Automation and Crowdsourcing to Support Fact-Checking
News sites like PolitiFact and FactCheck.org have defined a whole genre of journalism that pursues fact-checking as a public activity, a form of coverage and content in its own right. Audience demand for fact-checking is large: NPR’s live fact-check of the first 2016 presidential debate drove record site traffic. But journalists’ attention is limited, and given the endless sea of things people say, how should journalists identify and rank the most newsworthy and important claims to fact-check?
This project will seek to automate the monitoring of fact-checkable claims in the Congressional Record, the long, dense daily publication that records everything said in Congress. Using the ClaimBuster API, sentences initially deemed fact-checkable will be extracted automatically. The project will then use crowdsourcing platforms, such as Amazon’s Mechanical Turk, to rate the extracted claims on factors that help rank which are the most important and newsworthy for journalists to investigate. Finally, these leads will be compiled into a daily email digest that could be sent to interested journalists to help them decide what to fact-check each day.
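The filter-rate-rank-digest flow above can be sketched in a few lines of Python. This is a minimal illustration, not ClaimBuster's actual API: the function names, the 0.5 score threshold, the 1-5 crowd rating scale, and the equal weighting of machine and crowd scores are all assumptions chosen for the example.

```python
"""Sketch of the claim-monitoring pipeline: keep candidate claims above a
check-worthiness threshold, fold in crowd ratings, rank, and format a
daily digest. All names and weights here are illustrative assumptions."""

from statistics import mean


def rank_claims(scored_claims, crowd_ratings, min_score=0.5):
    """scored_claims: list of (claim_text, check_worthiness) pairs, e.g.
    scores in [0, 1] as returned by a claim-spotting service.
    crowd_ratings: dict mapping claim_text to a list of 1-5
    newsworthiness ratings collected from crowd workers."""
    leads = []
    for text, score in scored_claims:
        if score < min_score:  # drop sentences unlikely to be checkable
            continue
        ratings = crowd_ratings.get(text, [])
        # normalize mean crowd rating to [0, 1]; no ratings -> 0
        newsworthiness = mean(ratings) / 5 if ratings else 0.0
        # hypothetical combined rank: equal weight to machine and crowd
        leads.append((0.5 * score + 0.5 * newsworthiness, text))
    leads.sort(reverse=True)
    return [text for _, text in leads]


def format_digest(ranked, top_n=3):
    """Render the top-ranked leads as plain text for a daily email."""
    lines = ["Today's top claims to check:"]
    lines += [f"{i + 1}. {claim}" for i, claim in enumerate(ranked[:top_n])]
    return "\n".join(lines)
```

For example, a claim scored 0.9 by the machine with crowd ratings of 4, 5, 4 would outrank one scored 0.8 with ratings of 3, 3, 4, while small talk scored 0.1 would be dropped before reaching crowd workers at all. In a real system the weighting between machine and crowd signals would itself be a design question for the class.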
By the end of the quarter, students will have built a functioning prototype of a monitoring system that scans for fact-checkable claims, ranks them in a meaningful way to surface the most interesting ones, and presents them to journalists in a compelling interface. Students can expect to learn about text processing, crowdsourcing, and user interface design in support of journalistic tasks.