AI vs disinformation: who will win this battle?
A new study reveals who is better at checking the news for truth.
Large language models (LLMs) are an evolution of natural language processing (NLP) techniques that can quickly generate human-like text and perform a range of other language-related tasks. These models have become hugely popular since the public release of ChatGPT, a high-performance LLM developed by OpenAI.
Studies evaluating LLMs have so far mainly tested their ability to produce well-written texts, define specific terms, write essays and other documents, and generate working computer code. However, these models could also help people tackle other real-world problems, including fake news and misinformation.
Kevin Matthe Caramancion, a researcher at the University of Wisconsin-Stout, recently conducted a study evaluating the ability of the best-known LLMs released to date to determine whether a news story is true or fake. His results, published in a paper on arXiv, offer insights that could inform the future use of these models to counter online disinformation.
“The inspiration for my recent article was the need to understand the capabilities and limitations of various LLMs in the fight against disinformation,” Caramancion told Tech Xplore. “My goal was to test how well these models distinguish fact from fiction, using a controlled simulation and established fact-checking agencies as a benchmark.”
“We evaluated the performance of these large language models using a test set of 100 news items verified by independent fact-checking agencies,” Caramancion said. “We presented each of these news items to the models under controlled conditions, then classified their responses into one of three categories: True, False, and Partially True/False. The models' effectiveness was measured by how accurately they classified these items against the verified facts provided by the independent agencies.”
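The scoring protocol described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual code: the `score_model` function, the three-way label set, and the toy data are all hypothetical stand-ins for the 100-item verified test set.

```python
from collections import Counter

# The study's three response categories (assumed exact-match labels).
LABELS = {"True", "False", "Partially True/False"}

def score_model(model_labels, verified_labels):
    """Compare a model's classifications against fact-checker verdicts.

    Returns overall accuracy plus a per-category accuracy breakdown.
    """
    assert len(model_labels) == len(verified_labels)
    correct, total = Counter(), Counter()
    for predicted, truth in zip(model_labels, verified_labels):
        total[truth] += 1
        if predicted == truth:
            correct[truth] += 1
    accuracy = sum(correct.values()) / len(verified_labels)
    per_category = {label: correct[label] / total[label] for label in total}
    return accuracy, per_category

# Hypothetical toy data: 5 items standing in for the study's 100.
verified = ["True", "False", "False", "Partially True/False", "True"]
model    = ["True", "False", "True", "Partially True/False", "False"]

accuracy, breakdown = score_model(model, verified)
print(accuracy)  # 0.6 (3 of 5 items classified correctly)
```

The same harness would be run once per model, making the accuracy figures directly comparable across ChatGPT, Bard, and Bing AI.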
Disinformation has become a major problem in recent decades as the internet and social media have allowed information to spread more and more rapidly, whether true or false. Many computer scientists have therefore tried to develop better fact-checking tools and platforms that allow users to verify the news they read online.
Despite the many fact-checking tools created and tested to date, there is still no widely accepted, reliable model for combating disinformation. In his study, Caramancion set out to determine whether existing LLMs could effectively address this problem. He specifically evaluated the performance of four LLMs: OpenAI's ChatGPT 3.0 and ChatGPT 4.0, Google's Bard/LaMDA, and Microsoft's Bing AI. Caramancion fed these models the same news stories, which had been pre-verified for factual accuracy, and then compared the models' ability to determine whether each story was true, false, or partially true/false.
“We did a comparative evaluation of the major LLMs in terms of their ability to distinguish fact from fiction,” Caramancion said. “We found that OpenAI's GPT-4.0 outperformed the other models, hinting at advances in newer LLMs. However, all models lagged behind human fact-checkers, emphasizing the irreplaceable value of human judgment. These findings may draw more attention to the development of AI for fact-checking while ensuring a balanced, symbiotic integration with human skills.”
Caramancion's evaluation found that ChatGPT 4.0 significantly outperformed the other prominent LLMs on fact-checking tasks. Further research testing these models on a wider pool of fake news could help confirm this conclusion.
The researcher also found that human fact-checkers still outperformed all of the major LLMs he evaluated. His work highlights the need to further improve these models or to combine them with the work of human agents.
“My future research plans are to study the evolution of AI capabilities, focusing on how we can use these advances without losing sight of the unique cognitive abilities of humans,” Caramancion added. “We are committed to refining our test protocols, evaluating new LLMs, and further exploring the dynamics between human cognition and AI technologies in news verification.”