Talking to Data

The aim of this project is to provide users with a conversational interface to data sets that allows them first to describe what the data is about and where the elements they can ask about can be found, and then to ask questions about the data.

Getting to insight through conversation

At the core of data-driven journalism is the ability to pull information, insight and understanding out of the data that surrounds us. Unfortunately, the mechanisms of information extraction make it difficult for new users to find the information they need in the data sets we have. In the absence of a skilled practitioner, users have to learn techniques that will allow them to find complex information in data sets.

The problem is simply that not everyone has these skills, and the number of people who understand both the questions that need to be asked and how to get to the answers is small. We want to overcome this problem by developing a model that links the questions we ask with the processes used to answer them, in a way that does not require expert knowledge of analytics.

The conversational interface would let users first describe what the data is about and where the elements they can ask about can be found, and then ask questions about the data. Users would not need to know the details of the relationship between the questions they ask and the analytics that need to be run to answer them. Instead, the system itself would have that knowledge and apply it to produce both visualizations and text in response to questions.
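As a rough illustration of this idea, the sketch below assumes a tiny in-memory data set, a user-supplied description that maps everyday terms to columns, and a keyword table that maps question wording to analytics. Every name here (ROWS, DESCRIPTION, ANALYTICS, answer) and all of the sample rows are hypothetical; the project does not specify an implementation, and a real system would need far richer language understanding than keyword matching.

```python
from statistics import mean

# Hypothetical sample data with fictitious cities: one row per city.
ROWS = [
    {"city": "Springfield", "population": 78000, "median_rent": 1500},
    {"city": "Riverton", "population": 120000, "median_rent": 1800},
    {"city": "Lakeside", "population": 67000, "median_rent": 1400},
]

# Step 1: the user describes where the askable elements live (term -> column).
DESCRIPTION = {
    "rent": "median_rent",
    "people": "population",
    "population": "population",
}

# Step 2: the system, not the user, knows which analytic answers which wording.
ANALYTICS = {
    "average": mean,
    "highest": max,
    "lowest": min,
    "total": sum,
}


def answer(question):
    """Pick an analytic and a column from the question's words and run it."""
    words = question.lower().replace("?", "").split()
    analytic = next((w for w in words if w in ANALYTICS), None)
    column = next((DESCRIPTION[w] for w in words if w in DESCRIPTION), None)
    if analytic is None or column is None:
        return "Sorry, I cannot answer that yet."
    value = ANALYTICS[analytic]([row[column] for row in ROWS])
    return f"The {analytic} {column.replace('_', ' ')} is {value}."


print(answer("What is the highest rent?"))
```

The user only supplies the description and the question; the choice of analytic stays inside the system, which is the division of knowledge the paragraph above describes.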

We see this work as a partnership between Computer Science (CS) and Journalism in which the design and dynamics of the system will be guided by the journalism students and the development will be carried out by the CS students. That is, Journalism will frame the questions and CS will build the machine to provide the answers.

Technical Approach:

  • Set up a core database as the platform

  • Select a visualization platform

  • Select a conversational platform (or not)

  • Develop a simple language generator to express core ideas

  • Develop a model of context to better interpret a series of questions
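The last two steps can be sketched together. The example below is a minimal illustration, assuming a slot-based model of context (the most recent measure and place carry over to follow-up questions) and a single sentence template as the language generator. The names Query, Conversation, parse, merge_with_context and generate, the vocabulary sets, and the values in DATA are all hypothetical and are not taken from the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Query:
    measure: Optional[str] = None  # e.g. "rent"
    place: Optional[str] = None    # e.g. "lakeside"


# Hypothetical vocabulary and values, standing in for the core database.
KNOWN_MEASURES = {"rent", "population"}
KNOWN_PLACES = {"lakeside", "riverton"}
DATA = {
    ("rent", "lakeside"): 1400,
    ("rent", "riverton"): 1800,
    ("population", "lakeside"): 67000,
    ("population", "riverton"): 120000,
}


def parse(question):
    """Extract whatever slots the question mentions; leave the rest empty."""
    words = question.lower().replace("?", "").split()
    measure = next((w for w in words if w in KNOWN_MEASURES), None)
    place = next((w for w in words if w in KNOWN_PLACES), None)
    return Query(measure, place)


def merge_with_context(context, new):
    """Fill slots missing from the new question with values from the context."""
    return Query(
        measure=new.measure or context.measure,
        place=new.place or context.place,
    )


def generate(query, value):
    """Template-based language generator for a single fact."""
    return f"The {query.measure} in {query.place.title()} is {value}."


class Conversation:
    """Keeps the running context so follow-up questions can be short."""

    def __init__(self):
        self.context = Query()

    def ask(self, question):
        query = merge_with_context(self.context, parse(question))
        self.context = query
        if query.measure is None or query.place is None:
            return "Could you say which measure and place you mean?"
        return generate(query, DATA[(query.measure, query.place)])


chat = Conversation()
print(chat.ask("What is the rent in Lakeside?"))
print(chat.ask("What about Riverton?"))
print(chat.ask("And the population?"))
```

In this sketch the second and third questions are incomplete on their own; they become answerable only because the context supplies the missing measure or place from earlier in the conversation.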

Presentation Slides

Results

Results from the project

Faculty and Staff Leads

Kris Hammond

Professor of Electrical Engineering and Computer Science

Prior to joining the faculty at Northwestern, Kris founded the University of Chicago’s Artificial Intelligence Laboratory. His research has been primarily focused on artificial intelligence, machine-generated content and context-driven information systems. Kris currently sits on a United Nations policy committee run by the United Nations Institute for Disarmament Research (UNIDIR). He received his PhD from Yale.

Students

Rafah Ali

Daniel Fernandez

Brandon Fujii

Crystal Gong

Alexander Morikado