CheckMate: AI for fact-checking video claims
Team CheckMate is a collaboration between journalists and technologists from News UK (The Sun and The Times), dpa, Data Crítica and the BBC, formed as part of the 2024 JournalismAI Fellowship. The team proposed to develop a simple web app for real-time fact-checking of live or recorded video and audio broadcasts. Here, they discuss their progress midway through the Fellowship.
2024 is the year of elections, with around 1.5 billion people heading to the polls. With democracy comes debate, and with debate a flurry of claims, some true, some less so. Fact-checking is a key part of the journalistic process, but in a world of fast-moving communication false information can spread rapidly, so it is vital that journalists can quickly identify these claims and provide clarity and context to the public.
This is the basis of CheckMate, a fact-checking system that can identify claims in real time on livestream broadcasts, enabling journalists to respond quickly and accurately. CheckMate is an international collaborative project that brings together expertise from the BBC, Deutsche Presse-Agentur (dpa), Data Crítica, The Sun and The Times. Arne Beckman, Manja Borchert, Nadine Forshaw, Gibran Mena, Tom Potter and Luke Sikkema are the team members working on the project.
For CheckMate to function, we needed to develop four individual elements, which formed the initial scope for the project (a sketch of how the pieces fit together follows the four steps below):
Step 1: Transcription
The tool would transcribe video and audio in real time. Ideally, it would also separate speakers and use OCR to read on-screen overlays or video inserts to identify who is speaking.
Step 2: Claim identification
The AI aspect of the solution would then identify ‘claims’ and cross-reference and verify them against an existing claim database, such as the Google Fact Check Explorer API.
Step 3: Notify journalist
Any unverified or false claims would then be flagged to journalists, with details of the claim and links to the reviewed sources for the journalist to check.
Step 4: Feedback
Journalists could then feed back accurate information, debunking the claim and further training the fact-checking model. The journalist could also add a claim that was not highlighted by the model.
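To make the four steps concrete, here is a minimal sketch of the data shapes that could flow between them. All names are illustrative assumptions, not CheckMate's actual schema:

```typescript
// Illustrative data shapes for the four-step pipeline.
// These names are hypothetical, not CheckMate's real schema.

interface TranscriptChunk {
  text: string;
  speaker?: string;      // deferred in the POC (no speaker identification)
  startMs: number;
  endMs: number;
}

interface Claim {
  id: string;
  text: string;          // the claim as extracted from the transcript
  sourceChunk: TranscriptChunk;
}

type ClaimStatus = 'matched' | 'unverified' | 'false';

interface ClaimFlag {    // step 3: what the journalist is shown
  claim: Claim;
  status: ClaimStatus;
  reviewUrls: string[];  // links to the reviewed sources
}

interface JournalistFeedback { // step 4, deferred from V1
  claimId: string;
  verdict: 'true' | 'false' | 'misleading';
  note: string;
}
```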
As the scope of CheckMate is quite ambitious, we needed to streamline the functions of the tool for the initial proof of concept (POC). This meant removing speaker identification, accepting a small delay in the ‘real-time’ transcription, and deferring the feedback feature from V1.
As part of the ideation, we spoke to journalists and prospective users to gather feedback on how they would expect this kind of tool to look and behave. This provided valuable insight and informed the front-end designs.
We have been fortunate in the CheckMate team to have technical experience across the front end and back end, as well as experience in the fact-checking space thanks to our fellows from dpa. This has enabled team members to work on components individually and then come together to integrate them and to share findings and frustrations.
How the tool works
The tool uses a collection of JavaScript scripts and compiled tools to achieve its tasks. It processes recorded video by splitting the audio into chunks. These chunks are uploaded to Amazon S3 buckets and passed to a video builder script, which then sends the data to a transcription service called AssemblyAI.
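A simplified sketch of that chunk-and-transcribe step, using the AWS SDK and the AssemblyAI Node SDK, might look like the following. The bucket name, key layout and region are assumptions for illustration:

```typescript
import { readFile } from 'node:fs/promises';
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { AssemblyAI } from 'assemblyai';

const s3 = new S3Client({ region: 'eu-west-1' });
const assembly = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY! });

// Upload one audio chunk to S3, then request a transcript for it.
async function transcribeChunk(localPath: string, chunkIndex: number) {
  const key = `chunks/chunk-${chunkIndex}.mp3`;

  // Hypothetical bucket name, for illustration only.
  await s3.send(new PutObjectCommand({
    Bucket: 'checkmate-audio-chunks',
    Key: key,
    Body: await readFile(localPath),
  }));

  // AssemblyAI fetches the audio from a URL; in practice this would
  // be a presigned S3 URL rather than a public one.
  const transcript = await assembly.transcripts.transcribe({
    audio: `https://checkmate-audio-chunks.s3.amazonaws.com/${key}`,
  });

  return transcript; // carries per-word confidence scores (used below)
}
```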
The transcription generated by this service is then analysed by a component that identifies potential errors using the confidence score the transcription tool provides for each word. It corrects the most likely mistakes in the original transcription, and another function then stitches the corrected chunks back together.
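In outline, the error-spotting pass could look like this. The word shape mirrors what AssemblyAI returns, and the 0.5 threshold is an illustrative choice rather than the project's actual setting:

```typescript
// Minimal shape of a transcribed word; the real AssemblyAI response
// is richer, but confidence is all the error-spotting pass needs.
interface Word {
  text: string;
  confidence: number; // 0 to 1, provided per word by the transcriber
}

// Flag the words most likely to be transcription mistakes so a
// correction pass (not shown) can fix them.
function findLowConfidenceWords(words: Word[], threshold = 0.5): Word[] {
  return words.filter((w) => w.confidence < threshold);
}

// After correction, stitch the chunk texts back together in order.
function glueChunks(chunkTexts: string[]): string {
  return chunkTexts.join(' ').replace(/\s+/g, ' ').trim();
}
```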
The resulting enhanced transcription is uploaded to a sentence database. Using OpenAI, claims are extracted from the transcript, stored in a claim database, and then compared with the claims stored in the Google Fact Check Explorer's database. Finally, matched claims are returned together with their annotations, similar claims, claim ratings and other associated data. The reporter then uses this context to make decisions on the veracity of each claim.
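Sketched in code, the claim-extraction and lookup steps could look like this. The prompt, model choice and function names are illustrative assumptions; the Fact Check Tools endpoint is the public API behind Fact Check Explorer:

```typescript
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Ask the model to pull check-worthy claims out of a transcript.
// The prompt and model are illustrative, not CheckMate's actual ones.
async function extractClaims(transcript: string): Promise<string[]> {
  const res = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content:
          'Extract factual, check-worthy claims from the transcript. ' +
          'Return one claim per line and nothing else.',
      },
      { role: 'user', content: transcript },
    ],
  });
  return (res.choices[0].message.content ?? '')
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
}

// Look a claim up via the Google Fact Check Tools API, the service
// behind Fact Check Explorer.
async function searchFactChecks(claim: string) {
  const url =
    'https://factchecktools.googleapis.com/v1alpha1/claims:search' +
    `?query=${encodeURIComponent(claim)}&key=${process.env.GOOGLE_API_KEY}`;
  const res = await fetch(url);
  const data = await res.json();
  // Matches include claimReview entries with publisher, URL and a
  // textual rating for the journalist to inspect.
  return data.claims ?? [];
}
```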
By the end of the JournalismAI Fellowship, we aim to have a functional V1 POC that can identify claims in real time on livestream broadcasts. This will pave the way for more complex additions in future iterations.
The 2024 JournalismAI Fellowship brought together 40 journalists and technologists to collaboratively work on using AI to improve journalism, its systems and processes. Read more about it here.