
KAUST Developing ‘Trustworthiness’ GenAI Tools to Combat Misinformation Online

In an era where online misinformation undermines public discourse and social cohesion, KAUST is harnessing the power of generative AI, in collaboration with key stakeholders, to develop tools that detect false materials, trace their origins, and evaluate content trustworthiness across text, images and video. 

“Basically, it’s about detecting whether AI predictions are hallucinated or misinformed and the causality of this prediction,” said Professor Bernard Ghanem, chair of the KAUST Center of Excellence for Generative AI. He is principal investigator for one of the new CoE’s projects aimed at standardizing and enabling trustworthiness in GenAI. “Trustworthiness is about more than just countering malicious behavior. It’s about making the tools we create, and the tools created by others, more reliable.” 

In part, this project focuses on enhancing GenAI reliability to combat misinformation, improving GenAI models’ robustness and defending against prompt attacks that generate misleading content. Researchers will develop methods to ensure consistent performance and align models with human values, reducing the risk of creating harmful or deceptive outputs. These efforts safeguard GenAI from misuse and bolster its trustworthiness in generating diverse content. 

KAUST researchers are also exploring privacy-preserving measures, such as differential privacy, to prevent the extraction of sensitive information for misinformation purposes. The CoE is positioning itself to address significant challenges in safeguarding GenAI from misuse that spreads misinformation, deepfakes and other malicious content. 
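To make the idea concrete, below is a minimal sketch of the classical Laplace mechanism, one standard differential-privacy technique; the query, sensitivity and epsilon values are illustrative assumptions, not details of the KAUST implementation.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Return a differentially private answer to a numeric query.

    Laplace noise scaled to sensitivity / epsilon is added, so that any single
    individual's record shifts the output distribution only slightly.
    """
    scale = sensitivity / epsilon
    return true_value + np.random.laplace(loc=0.0, scale=scale)

# Hypothetical example: privately report how many training documents contain a sensitive term.
true_count = 1284
private_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
print(f"Reported count: {private_count:.0f}")
```

Smaller epsilon values add more noise and give stronger privacy at the cost of accuracy; tuning that trade-off is part of what makes privacy-preserving GenAI a research problem rather than a settled recipe.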

As part of a broader GenAI trustworthiness effort, this research aims to drive economic growth, creating jobs and fostering new skills to develop and deploy trustworthy GenAI models. By addressing trustworthiness challenges, the CoE strives to accelerate GenAI adoption, spurring innovation and economic expansion aligned with Saudi Vision 2030. 

“It’s not just about combating malicious behavior,” said Ghanem. “We also focus on developing GenAI tools that themselves have trustworthiness, where a human using them will know the information they’re giving up won’t be leaked. We envision next-generation GenAI tools to provide answers that are factual and not hallucinated. This echoes the need for trustworthiness. This revolves around how humans can trust that they’re interacting with something with human values.” 

Derailing deepfakes 

While text-based misinformation campaigns targeting social media are a concern, noted Ghanem, a professor of Electrical and Computer Engineering, manipulation in the visual domain is equally worrying, particularly with text-to-image generators producing lifelike online images that can be used to mislead viewers. 

“This is important because nowadays with these photorealistic image generators, it’s very difficult for a human to tell whether an image is fake or not. AI has the ability to detect this. In the same way we’re generating models to create better-looking images based on text prompts, we also want to develop ways to detect whether the images were generated from a text prompt or not.” 

By providing disclaimers or likelihood percentages of AI generation, added the CoE chair, media consumers can better assess trustworthiness. If there is a high chance content is AI-generated, consumers can exercise caution. “The user needs to trust the sources of the media they are observing and consuming.” 
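As a rough illustration of reporting a likelihood percentage rather than a hard verdict, the sketch below trains a toy classifier on synthetic feature vectors standing in for image statistics; the features, class separation and numbers are invented for the example and do not represent any KAUST detector.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: each image is summarized by a feature vector (e.g. frequency-domain
# statistics that can differ between camera photos and diffusion-model outputs).
rng = np.random.default_rng(0)
real_features = rng.normal(loc=0.0, scale=1.0, size=(200, 8))   # stand-in for real photos
fake_features = rng.normal(loc=0.7, scale=1.0, size=(200, 8))   # stand-in for generated images

X = np.vstack([real_features, fake_features])
y = np.concatenate([np.zeros(200), np.ones(200)])  # 0 = real, 1 = AI-generated

clf = LogisticRegression(max_iter=1000).fit(X, y)

# For a new image, report a likelihood percentage instead of a binary label,
# so media consumers can decide how much caution to exercise.
new_image_features = rng.normal(loc=0.5, scale=1.0, size=(1, 8))
p_generated = clf.predict_proba(new_image_features)[0, 1]
print(f"Estimated likelihood this image is AI-generated: {p_generated:.0%}")
```

Real detectors use far richer features and deep models, but the output is the same in spirit: a calibrated probability that a piece of media was machine-generated, which can be surfaced to users as a disclaimer.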

Cyclical approach 

Researchers follow a cyclical approach, similar to cybersecurity, exploring both offensive and defensive strategies for misinformation, Ghanem said. By understanding how to generate and defend against attacks, his team is improving misinformation detection and monitoring, ultimately bolstering defenses against AI-generated content and deepfakes. 

“We’re understanding how to generate this content and make it more trustworthy, reliable and of better quality, but also detecting whether these things are generated by these models or not.” 

KAUST focuses on creating tools that identify AI-generated content and help users assess its reliability. These tools help prevent the online sphere from being flooded with untrustworthy materials, ensuring users can draw informed conclusions.