Right now I am just manually adding new domains to this tracking system, and I wanted a way to dynamically discover new domains that are spreading propaganda. The way the Facebook platform is set up, there isn't an easy way to discover news that is being submitted and shared, so I turned to the Twitter API to see what I could do--there had to be a way to find other fake news domains via the much more public Twitter.
I created an automated job that takes any of the 50+ fake news domains I'm targeting and runs them through the Twitter search API, which returns the top 100 Tweets containing a URL on that domain. I process the results and record each of the Twitter users behind those Tweets, and I'm working on a new automated job that pulls each user's top tweets to see if there are any new domains to be discovered there.
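A minimal sketch of the processing step, assuming search results shaped roughly like the Twitter API's JSON (the tweet dicts, field names, and domains here are all hypothetical, standing in for real API responses):

```python
from urllib.parse import urlparse

# Hypothetical sample of search results: each tweet carries its
# author and the expanded URLs it contained. In the real job these
# would come back from the Twitter search API.
sample_tweets = [
    {"user": "spreader_one", "urls": ["http://fakenews-example.com/story1"]},
    {"user": "spreader_two", "urls": ["http://fakenews-example.com/story2",
                                      "http://other-site.example/page"]},
    {"user": "spreader_one", "urls": ["http://fakenews-example.com/story3"]},
]

def users_by_domain(tweets):
    """Map each domain to the set of users who tweeted a URL on it."""
    mapping = {}
    for tweet in tweets:
        for url in tweet["urls"]:
            domain = urlparse(url).netloc
            mapping.setdefault(domain, set()).add(tweet["user"])
    return mapping
```

Running `users_by_domain(sample_tweets)` gives one user set per domain, which is the list of accounts the follow-up job crawls.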
Many of the most popular Tweets around these fake news domains are central to spreading this news on Twitter and via Facebook. For now, I'm just adding new URLs to a list and manually looking through them on a regular basis. I'm categorizing them as propaganda, news, and a few other categories for possible future evaluation, or for inclusion in the URL and graph harvesting process.
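The discovery step boils down to comparing harvested domains against the domains already tracked. A quick sketch, with a hypothetical known-domain list and harvested URLs:

```python
from urllib.parse import urlparse

# Hypothetical tracked-domain list and a batch of URLs harvested
# from the accounts' recent tweets.
known_domains = {"fakenews-example.com", "realnews-example.org"}
harvested_urls = [
    "http://fakenews-example.com/story9",
    "http://brand-new-site.example/article",
    "http://realnews-example.org/report",
]

def new_domains(urls, known):
    """Return domains seen in the URLs that are not already tracked."""
    return {urlparse(u).netloc for u in urls} - known
```

Anything `new_domains` returns goes on the manual review list to be categorized before joining the tracking system.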
I am only pulling the tweets and new domains from accounts with a high number of followers and retweets. These are usually the Twitter accounts associated with the fake news domains I'm targeting, or similar sites. I am not doing this step for the regular news sites, as it isn't too difficult to find new news outlets, whereas fake news sites are much more difficult to uncover and identify.
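The account filter is a simple threshold check. A sketch with made-up cutoffs (the post doesn't state the exact numbers, so these thresholds and account records are illustrative):

```python
# Hypothetical thresholds -- tune to taste.
MIN_FOLLOWERS = 10_000
MIN_RETWEETS = 100

accounts = [
    {"screen_name": "big_spreader", "followers": 250_000, "avg_retweets": 900},
    {"screen_name": "small_account", "followers": 120, "avg_retweets": 2},
]

def high_signal(accounts, min_followers=MIN_FOLLOWERS, min_retweets=MIN_RETWEETS):
    """Keep only accounts with enough reach to be worth crawling."""
    return [a for a in accounts
            if a["followers"] >= min_followers
            and a["avg_retweets"] >= min_retweets]
```

Only the accounts that pass `high_signal` get their timelines pulled for new-domain discovery, which keeps the crawl focused on the amplifiers rather than every account that ever shared a link.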