As local papers are forced to cut back on coverage and struggle to survive, the future of local news is uncertain. Some communities are even considered “news deserts” with little to no local information. What can be done to increase the volume of local news? The goal of this project is to address this issue by considering how national reporting can be augmented with a local angle based on data.
To prototype this idea, students will research datasets with national coverage (e.g., from the census) and consider how locally relevant facts can be derived from them. The facts will then be turned into snippets of text using template-based natural language generation. These fact snippets will be built into an article plugin that inserts the locally relevant fact into a national article based on the location of the user (e.g., as reported by their browser). For instance, a New York Times article about the unemployment rate would have a snippet automatically inserted about the unemployment rate in the reader's home county, including how it has changed and how it varies by demographic.
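As a rough illustration of what template-based generation could look like for the unemployment example, here is a minimal Python sketch. The record type `CountyUnemployment`, the template wording, and the sample numbers are all hypothetical and are not taken from the project or from any real dataset.

```python
from dataclasses import dataclass


@dataclass
class CountyUnemployment:
    """Hypothetical record for one county; field names are assumptions."""
    county: str
    state: str
    rate_now: float    # unemployment rate in percent, latest period
    rate_prior: float  # unemployment rate in percent, one year earlier


# Assumed template wording; a real project would likely keep several variants.
TEMPLATE = "In {county}, {state}, the unemployment rate is {rate:.1f}%, {change}."


def describe_change(now: float, prior: float) -> str:
    """Turn the year-over-year difference into a short phrase."""
    diff = round(now - prior, 1)
    if diff > 0:
        return f"up {diff:.1f} percentage points from a year earlier"
    if diff < 0:
        return f"down {abs(diff):.1f} percentage points from a year earlier"
    return "unchanged from a year earlier"


def render_snippet(rec: CountyUnemployment) -> str:
    """Fill the template with the values for one county."""
    return TEMPLATE.format(
        county=rec.county,
        state=rec.state,
        rate=rec.rate_now,
        change=describe_change(rec.rate_now, rec.rate_prior),
    )


# Illustrative, made-up numbers.
example = CountyUnemployment("Cook County", "Illinois", 4.5, 5.0)
print(render_snippet(example))
# In Cook County, Illinois, the unemployment rate is 4.5%, down 0.5 percentage points from a year earlier.
```

Keeping the change phrase in its own function means the same template can cover rising, falling, and flat rates without separate templates for each case.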
Assistant Professor, Director of the Computational Journalism Lab
Northwestern University Asst Professor of Communication & Tow Center fellow. Computational journalism, algorithmic accountability, social computing.
How can we augment news articles to give them a local angle and make them more interesting and relevant to local readers?
How can national datasets be used to derive locally interesting facts?
How can natural language generation be used to write those facts into snippets?
Weeks 1-2: Research datasets with national coverage and curate a dataset of datasets that is indexed by topic.
Weeks 3-5: Write natural language generation templates to produce interesting factoids from datasets.
Weeks 6-8: Develop a web frontend that dynamically loads the local snippets into an augmented article page.
Weeks 9-10: Test and demo augmented article pages for different topics.
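The timeline above implies a lookup from an article's topic and a reader's location to a local snippet. One possible shape for that lookup is sketched below in Python. The names `DATASET_INDEX`, `LOCAL_VALUES`, and `local_snippet`, the template text, and the sample value are assumptions made for illustration. An actual frontend would more likely run this logic in the browser or behind a small web service.

```python
from typing import Optional

# Hypothetical "dataset of datasets", indexed by topic.
DATASET_INDEX = {
    "unemployment": {
        "source": "placeholder labor statistics table",
        "template": "The unemployment rate in {place} is {value:.1f}%.",
    },
    "population": {
        "source": "placeholder population table",
        "template": "{place} has about {value:,.0f} residents.",
    },
}

# Hypothetical local values keyed by (topic, county FIPS code).
# The number below is made up for illustration.
LOCAL_VALUES = {
    ("unemployment", "17031"): ("Cook County, Illinois", 4.5),
}


def local_snippet(topic: str, county_fips: str) -> Optional[str]:
    """Return a local snippet for the topic and county, or None if no data exists."""
    entry = DATASET_INDEX.get(topic)
    local = LOCAL_VALUES.get((topic, county_fips))
    if entry is None or local is None:
        return None  # leave the national article unchanged
    place, value = local
    return entry["template"].format(place=place, value=value)


print(local_snippet("unemployment", "17031"))
# The unemployment rate in Cook County, Illinois is 4.5%.
```

Returning `None` when either the topic or the county is missing lets the article page fall back to the unmodified national text.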
Students will build a functioning prototype of an article page that is dynamically localized based on a user’s location using fact snippets automatically generated from data. They will gain experience working with geographic data and with natural language generation template writing.