TimeLark: Understanding relationships over time made easy
By Evangeline De Bourgoing, Ian Knopke, Galen Reich, Ned Davies, Karina Shedrofsky, Jan Strozyk
How can AI enhance journalism? 30 journalists and technologists from across the globe have joined the 2023 JournalismAI fellowship to find out. They are working in six self-selected teams on six different projects that use AI to enhance journalism and its processes.
In this series of articles, our Fellows describe their journey so far, the progress they’ve made, and what they learned along the way. In this blog post, you’ll hear from team TimeLark, a collaboration between editorial and technical Fellows from the BBC, OCCRP, and Reuters.
Journalists worldwide have to contend with tight deadlines and multiple calls on their time, sometimes making it difficult to delve deep into complex stories. This is a challenge we share at the BBC, OCCRP, and Reuters. We want to see how AI might help make this easier for journalists.
Sifting through mountains of articles and documents is time-consuming and often feels like looking for a needle in a haystack. Can we use AI to help journalists get a clear overview of a story, letting them quickly zoom in to the most interesting parts, and freeing up their valuable time to focus on in-depth reporting?
One solution is using a knowledge graph – a map of all the different parts of a story and how they link together. But news stories develop over time (obviously!), so we need to include information about when events occur. That’s what a temporal knowledge graph is for; it captures both how the parts of a story relate to each other and when those relationships hold. This technology allows us to map out connections, build timelines, and uncover hidden links in news stories.
The project builds on experience with knowledge graphs that already exist inside the BBC, OCCRP, and Reuters, and focuses on the temporal aspect of extracting insights from graph networks. Deciding which extracted information matters most is challenging, especially when sources contradict each other, as often happens while a news story is still developing.
What is TimeLark?
TimeLark is a journalistic research tool that makes complex timelines easy to explore and understand.
How does it work?
● Use machine learning to extract and map information from text
● Create a temporal knowledge graph of these relationships
● Allow users to query for entities, timeframes, and connections
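The second and third steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TimeLark's actual implementation: the `TemporalFact` class, the `query` function, the relation names, and the sample data are all hypothetical, and the machine learning extraction step is assumed to have already produced the facts.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalFact:
    """One edge of the graph: a relationship that holds over a time interval."""
    subject: str
    relation: str
    obj: str
    start: date
    end: Optional[date] = None  # None means the relationship is still ongoing


def overlaps(fact, window_start, window_end):
    """True if the fact's interval intersects the query window."""
    fact_end = fact.end or date.max
    return fact.start <= window_end and fact_end >= window_start


def query(facts, entity=None, window_start=date.min, window_end=date.max):
    """Return facts touching `entity` within the window, in chronological order."""
    hits = [
        f for f in facts
        if (entity is None or entity in (f.subject, f.obj))
        and overlaps(f, window_start, window_end)
    ]
    return sorted(hits, key=lambda f: f.start)


# Invented placeholder data, not drawn from any real reporting.
facts = [
    TemporalFact("Person A", "director_of", "Company B", date(2019, 3, 1), date(2022, 2, 28)),
    TemporalFact("Company B", "owns", "Company C", date(2021, 6, 15)),
    TemporalFact("Person A", "met_with", "Person D", date(2022, 4, 10), date(2022, 4, 10)),
]

# Everything involving "Person A" that was true at some point in 2022.
for fact in query(facts, entity="Person A",
                  window_start=date(2022, 1, 1), window_end=date(2022, 12, 31)):
    print(fact.start, fact.subject, fact.relation, fact.obj)
```

Storing a start and end date on every relationship is what makes the graph temporal: the same query with a different time window returns a different picture of who was connected to whom.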
We focused on the war in Ukraine as a case study with many different overlapping narratives, encompassing international, political, humanitarian, and financial dimensions. As all of our organizations have been actively covering this topic, we already possess a store of data in this area.
What have we achieved and learned so far?
On the product side, we have refined our product definition, value proposition, and user persona, creating a paper prototype that maps out our user’s journey.
We opted to prototype on paper (in this case, in PowerPoint) for the following reasons:
- Quick iteration: Applying temporal knowledge graphs to news is new, so we did not have many examples to draw from. Paper prototyping allowed us to create many iterations of the design in just a couple of hours.
- Minimal learning curve: Paper prototyping is accessible to everyone. Team members without experience with prototyping tools were able to fully participate in the prototyping process.
- Low cost and low commitment: We started prototyping with simulated data, as we were in the middle of the data extraction process and did not have real-world data to play with. Our real-world data is much more complex than our simulated data, which means that we might have to significantly alter the design of our prototype. Paper prototyping provides us with greater flexibility in making these adjustments, as we have not invested extensive time and resources in developing a polished prototype.
On the technical side, we have built datasets and decided on a common format. We have extracted entities and relationships, but have encountered a few challenges:
- Lack of accuracy in relationship extraction: Most of the relationships we extracted were accurate, but some were completely wrong. We are currently refining our relationship extraction process.
- Defining and standardizing the relationships we extract: We have successfully extracted a diverse range of relationships, and our next step involves defining which of these relationships hold value for our organizations. This refinement process aims to reduce noise and narrow down the selection to a manageable set of relationships suitable for visualization.
- Combining data from different organizations: While we have a common interchange format for joining extracted relations between organizations, we haven’t actually combined our data.
- Limited use of temporal information: We are able to extract temporal information from our datasets, but are still experimenting with the best ways to use and visualize it.
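To make the data-combining challenge more concrete, here is a rough Python sketch of how relations from several organizations might be joined while keeping track of who reported what. The record layout (`subject`, `relation`, `object`, `start`, `source`) is a hypothetical stand-in for a common interchange format, not the one the team actually uses, and the data is invented. The sketch also assumes entity names already match exactly across sources, which real data would not guarantee.

```python
from collections import defaultdict


def merge_relations(records):
    """Group records describing the same (subject, relation, object) triple,
    keeping every source and every reported start date."""
    merged = defaultdict(lambda: {"sources": set(), "starts": set()})
    for rec in records:
        key = (rec["subject"], rec["relation"], rec["object"])
        merged[key]["sources"].add(rec["source"])
        merged[key]["starts"].add(rec["start"])
    return merged


def find_conflicts(merged):
    """Return triples for which the sources report more than one start date."""
    return sorted(key for key, info in merged.items() if len(info["starts"]) > 1)


# Invented placeholder records; "org_1" etc. are hypothetical source labels.
records = [
    {"subject": "Person A", "relation": "director_of", "object": "Company B",
     "start": "2019-03-01", "source": "org_1"},
    {"subject": "Person A", "relation": "director_of", "object": "Company B",
     "start": "2019-05-01", "source": "org_2"},
    {"subject": "Company B", "relation": "owns", "object": "Company C",
     "start": "2021-06-15", "source": "org_1"},
    {"subject": "Company B", "relation": "owns", "object": "Company C",
     "start": "2021-06-15", "source": "org_3"},
]

merged = merge_relations(records)
conflicts = find_conflicts(merged)
print(conflicts)
```

Keeping the source attached to each relationship means contradictory dates can be flagged for a journalist to review rather than silently overwritten.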
At the end of the fellowship, we aim to build a functional temporal knowledge graph that journalists could query to find insights about the war in Ukraine.
Do you have skills and expertise that could help team TimeLark? Get in touch by sending an email to Programme Manager, Lakshmi Sivadas, at lakshmi@journalismai.info