Privacy Policy

Last updated: April 22nd, 2022

This page describes every kind of data we collect, how we use it, and how to opt in to (or out of) each category of data.

Our Philosophy on Data Collection

Opt-In Data Collection

All data collection must be opt-in, meaning that none of your data is sent to Primary Source Scraper unless you choose to share it with us. Primary Source Scraper is designed so that none of its core functionality (finding primary sources in news articles) depends on collecting data.
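As a rough illustration of the opt-in principle, the sketch below models the three data categories described later on this page as settings that all default to off. It is written in Python purely for illustration and is not the extension's actual code; the names `PrivacySettings`, `maybe_send`, and `outbox` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PrivacySettings:
    # Hypothetical field names, one per checkbox under Privacy Settings.
    # Every flag defaults to False, so nothing is shared until the user opts in.
    send_article_data: bool = False
    share_site_categories: bool = False
    send_error_reports: bool = False


def maybe_send(enabled: bool, payload: dict, outbox: list) -> bool:
    """Queue the payload only when the matching opt-in flag is set."""
    if not enabled:
        return False
    outbox.append(payload)
    return True
```

With a fresh `PrivacySettings()`, every call to `maybe_send` returns `False` and leaves `outbox` empty; a payload is queued only after the corresponding flag has been switched on.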

No Personal Information - Ever

None of our data collection is intended to store any personal information. We do not ever intentionally collect data that could be used to identify you, contact you, or trace you. We also do not intentionally collect data that is unique to your device, or that could be used to identify your device after the fact.

Because we do not collect personally identifiable information, we have no way to determine what data you have shared with us.

Types Of Data We Collect

Article Data

When Primary Source Scraper scans a supported news article, it collects the following data about that news article:

  • The title of the article
  • The date the article was published
  • The URL of the article
  • The total number of links in the article body
  • Details about each external link in the article body, including:
    • The destination URL of the external link
    • The text of the link in the article
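The fields listed above could be assembled along the following lines. This is a minimal Python sketch, not the extension's real implementation: the function name `build_article_data` is hypothetical, the key names are taken from the example further down this page, and the rule used to decide that a link is "external" (its hostname differs from the article's hostname) is an assumption.

```python
from urllib.parse import urlparse


def build_article_data(title, date, url, links):
    """Assemble article data in the shape of the example on this page.

    `links` is a list of (href, text) pairs found in the article body.
    Assumption: a link is "external" when it has a hostname and that
    hostname differs from the article's; the real rule may differ.
    """
    article_host = urlparse(url).hostname
    external = [
        {"url": href, "text": text}
        for href, text in links
        # Relative links have no hostname and are treated as internal.
        if urlparse(href).hostname not in (None, article_host)
    ]
    return {
        "article": {"title": title, "date": date, "location": url},
        "totalLinkCount": len(links),
        "externalLinks": external,
    }
```

Note that `totalLinkCount` counts every link in the article body, so it can be larger than the number of entries in `externalLinks`.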

How To Opt In

On the Primary Source Scraper Options page, under the Privacy Settings heading, check the box labeled “Send Article Data”.

How To Opt Out

On the Primary Source Scraper Options page, under the Privacy Settings heading, un-check the box labeled “Send Article Data”.

What Data We Collect

Here is an example of the article data that Primary Source Scraper will send from this NPR article about Microsoft acquiring Bethesda:

{
    "article": {
        "title": "Microsoft To Buy Bethesda In $7.5 Billion Deal, Acquiring Fallout, The Elder Scrolls : NPR",
        "date": "September 21, 2020",
        "location": "https://www.npr.org/2020/09/21/915308028/microsoft-to-buy-be…hesda-in-7-5-billion-deal-acquiring-fallout-the-elder-scroll"
    },
    "totalLinkCount": 8,
    "externalLinks": [
        {
            "url": "https://news.microsoft.com/2020/09/21/microsoft-to-acquire-zenimax-media-and-its-game-publisher-bethesda-softworks/",
            "text": "announced"
        },
        {
            "url": "https://news.microsoft.com/2020/09/21/microsoft-to-acquire-zenimax-media-and-its-game-publisher-bethesda-softworks/",
            "text": "release"
        },
        {
            "url": "https://bethesda.net/en/article/1iLtcvwY6Nb1GeKADyDUEX/why-microsoft-is-the-perfect-fit",
            "text": "blog post"
        },
        {
            "url": "https://twitter.com/i/events/1308047834684354560",
            "text": "feared the deal"
        },
        {
            "url": "https://www.bloomberg.com/news/articles/2020-09-21/microsoft…sda-studios-for-7-5-billion-to-boost-xbox?srnd=technology-vp",
            "text": "Bloomberg"
        },
        {
            "url": "https://bethesda.net/en/article/4IwKWIj174Cb2QNTTtBAEb/todd-howard-on-joining-xbox",
            "text": "blog post"
        }
    ]
}

How We Use This Data

We use this data to build a database of the connections between news articles and their sources. We will use that database to help determine which domains on the internet are commonly cited, and to try to understand which sources are the most trustworthy.
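To give a sense of what "commonly cited" could mean in practice, here is a small Python sketch that tallies the hostnames of external links across a set of article payloads. It is only an illustration under stated assumptions: the function name `count_cited_domains` is hypothetical, the payloads are assumed to have the shape shown in the example above, and this page does not describe how the actual analysis is done.

```python
from collections import Counter
from urllib.parse import urlparse


def count_cited_domains(payloads):
    """Count how often each hostname appears among the external links."""
    counts = Counter()
    for payload in payloads:
        for link in payload.get("externalLinks", []):
            host = urlparse(link["url"]).hostname
            if host:  # skip anything without a parseable hostname
                counts[host] += 1
    return counts
```

Calling `count_cited_domains(payloads).most_common(10)` would then list the ten most frequently cited hostnames in that set of payloads.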

Site Categories

When a user flags a website as being a Social Media, News / Opinion, or Misc site, Primary Source Scraper can collect that suggestion. A suggestion includes:

  • The hostname of the site flagged
  • The category suggested by the user (which can currently be Social Media, News / Opinion, or Misc)
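A suggestion is small enough to sketch in a few lines. The Python below is an illustrative assumption rather than the extension's code: `build_site_category` and `ALLOWED_CATEGORIES` are hypothetical names, and rejecting unknown categories is a choice made for this sketch. The key names follow the example later in this section.

```python
# The three categories named on this page.
ALLOWED_CATEGORIES = ("Social Media", "News / Opinion", "Misc")


def build_site_category(hostname, category):
    """Build a suggestion holding only a hostname and a category."""
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return {"siteCategoryData": {"hostname": hostname, "category": category}}
```

The suggestion carries only the hostname and the chosen category, nothing about the person who made it.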

How To Opt In

On the Primary Source Scraper Options page, under the Privacy Settings heading, check the box labeled “Share Site Categories”.

How To Opt Out

On the Primary Source Scraper Options page, under the Privacy Settings heading, un-check the box labeled “Share Site Categories”.

What Data We Collect

Here is an example of the Site Category data Primary Source Scraper will collect if a user flags nytimes.com as a News / Opinion site:

{
    "siteCategoryData": {
        "hostname": "nytimes.com",
        "category": "News / Opinion"
    }
}

How We Use This Data

We use this data to help categorize websites accurately.

Error Reports

News sites occasionally change the way they structure their webpages, and this can break the way Primary Source Scraper interacts with those sites. Because we support over 70 news sites, we don’t always notice when one of those sites has changed.

When Primary Source Scraper fails to find an article on a news site that it should support, or it encounters an unexpected error, we collect the following information to help us diagnose the error:

  • What type of error this is (a constant string containing no user data, such as “QuerySelectorMismatch”)
  • Details on the specific error (a stack trace, or a config setting from the files included with Primary Source Scraper)
  • The URL of the page that caused the error
  • The version of Primary Source Scraper that encountered the error

These reports help us learn about sites that might be broken or other errors in the browser extension so we can quickly fix those issues.
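The four fields of an error report can be modeled as a fixed record, which makes it easy to see that nothing else is included. This Python sketch is an assumption for illustration only: the class name `ErrorReport` and its `to_json` method are hypothetical, while the field names match the keys in the example below.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ErrorReport:
    # Field names mirror the JSON keys in the example on this page.
    errorType: str  # constant string such as "QuerySelectorMismatch"
    errorInfo: str  # stack trace or a bundled config setting
    location: str   # URL of the page that caused the error
    version: str    # version of Primary Source Scraper

    def to_json(self) -> str:
        """Serialize exactly these four fields and nothing else."""
        return json.dumps(asdict(self))
```

Because the record is frozen and has a fixed set of fields, the serialized report cannot pick up extra attributes by accident.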

How To Opt In

On the Primary Source Scraper Options page, under the Privacy Settings heading, check the box labeled “Send Error Reports”.

How To Opt Out

On the Primary Source Scraper Options page, under the Privacy Settings heading, un-check the box labeled “Send Error Reports”.

What Data We Collect

Here is an example of the data Primary Source Scraper (version 0.7.4) would collect if it encountered an error trying to find an article at this URL on theatlantic.com:

{
    "errorType":"QuerySelectorMismatch",
    "errorInfo":"[id=main-content]",
    "location":"https://www.theatlantic.com/ideas/archive/2022/04/diverse-democracy-protect-personal-liberty/629543/",
    "version":"0.7.4"
}

Changes to this Privacy Policy

We may update our Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page.

Any change made to the types of data we can collect will be accompanied by changes under the Privacy Settings heading on the Options page of Primary Source Scraper. Before you choose to opt in to any new types of data we collect, you will have the opportunity to revisit this page and review the changes in this policy.

Contact Us

If you have any questions about this Privacy Policy, you can contact us:

  • By email: sourcescraper@gmail.com
This post is licensed under CC BY 4.0 by the author.
