---
title: Multimodal Misinfo Detector
emoji: ⚡
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: Detects misinformation from text + real/fake from images
license: mit
---
# Multimodal Misinformation Detector
This project is a Gradio-based web application that detects misinformation from both text and images. It combines Natural Language Processing (NLP) and Computer Vision (CV) models to determine whether content is real or fake, and provides an LLM-generated explanation for textual claims using the Hugging Face Inference API.
## Overview
The application consists of two main components:
- Text Detector – Classifies textual claims as Real or Fake by:
  - Collecting evidence from the Google Fact Check API, Tavily API, and Wikipedia.
  - Ranking evidence sentences by semantic similarity with a SentenceTransformer model.
  - Feeding the claim and evidence into a fine-tuned DeBERTa model.
  - Using an LLM (Llama 3.1 8B Instruct) to generate a structured explanation.
- Image Detector – Classifies uploaded images as Real or Fake using a fine-tuned CLIP-based binary classifier.
## Required API Keys
To use the Text Detector, you must enter the following keys in the Gradio interface:
| Key | Description |
|---|---|
| Hugging Face Token | Used for the Llama 3.1 Inference API (for explanations). |
| Tavily API Key | Used to fetch evidence sentences from Tavily. |
| Google Fact Check API Key | Used to retrieve verified fact-checks from Google. |
The "Classify Claim" button remains disabled until all three keys are provided.
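The gating behavior can be sketched as a small validation helper plus a Gradio `change` callback. This is an illustrative sketch, not the actual `app.py` code; the function names `keys_provided` and `toggle` are hypothetical.

```python
def keys_provided(hf_token: str, tavily_key: str, factcheck_key: str) -> bool:
    """Return True only when all three API keys are non-empty (ignoring whitespace)."""
    return all(bool(k and k.strip()) for k in (hf_token, tavily_key, factcheck_key))

# In the Gradio app, the same check could drive the button state, e.g.:
#   import gradio as gr
#   def toggle(hf, tv, fc):
#       return gr.update(interactive=keys_provided(hf, tv, fc))
#   for box in (hf_box, tavily_box, factcheck_box):
#       box.change(toggle, [hf_box, tavily_box, factcheck_box], classify_btn)
```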
## Models Used
| Component | Model | Description |
|---|---|---|
| Text Classifier | rajyalakshmijampani/fever_finetuned_deberta | Fine-tuned DeBERTa model for FEVER-style fact verification. |
| LLM (Explanation) | meta-llama/Llama-3.1-8B-Instruct | Generates structured JSON explanations through Inference API. |
| Embedding Model | sentence-transformers/all-MiniLM-L6-v2 | Used to rank retrieved evidence sentences. |
| Image Classifier | rajyalakshmijampani/finetuned_clip | Fine-tuned CLIP-based binary classifier for image authenticity. |
| Vision Encoder | openai/clip-vit-base-patch32 | Base visual encoder for feature extraction in the image classifier. |
## Text Detection Flow
1. The user enters a textual claim and provides the three API keys.
2. The system retrieves relevant evidence sentences from the Google Fact Check API, the Tavily API, and Wikipedia search.
3. Evidence sentences are ranked by semantic similarity with the claim.
4. The top-ranked sentences and the claim are combined and passed to the fine-tuned DeBERTa classifier.
5. The classifier outputs a label (REAL or FAKE).
6. The result and evidence are sent to the Llama 3.1 model, which generates a Verdict (Real / Fake / Uncertain), an Explanation (3–5 sentences), and a Confidence level (Low / Medium / High).
7. The formatted explanation is displayed in the Gradio interface.
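The ranking step above can be sketched as cosine similarity between embeddings. In the app the vectors would come from `sentence-transformers/all-MiniLM-L6-v2`; the sketch below uses toy NumPy vectors so the ranking logic is self-contained, and the function name `rank_evidence` is illustrative.

```python
import numpy as np

def rank_evidence(claim_vec: np.ndarray, evidence_vecs: np.ndarray, top_k: int = 3):
    """Rank evidence embeddings by cosine similarity to the claim embedding."""
    claim = claim_vec / np.linalg.norm(claim_vec)
    ev = evidence_vecs / np.linalg.norm(evidence_vecs, axis=1, keepdims=True)
    sims = ev @ claim                      # cosine similarity per evidence sentence
    order = np.argsort(-sims)[:top_k]      # indices of the most similar sentences
    return order.tolist(), sims[order].tolist()

# Toy example: evidence 0 points the same way as the claim, evidence 1 is orthogonal.
claim = np.array([1.0, 0.0])
evidence = np.array([[2.0, 0.0], [0.0, 5.0], [1.0, 1.0]])
idx, scores = rank_evidence(claim, evidence, top_k=2)
```

The top-`k` indices would then select which retrieved sentences get concatenated with the claim for the DeBERTa classifier.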
Example Input:
"NASA confirms the Sun rose from the West today!"
Example Output:
Prediction: The claim is Fake.
Explanation: The statement contradicts verified astronomical facts. The Sun always rises in the East due to Earth's rotation.
Confidence: High.
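Since the LLM returns a structured JSON explanation, the display step presumably parses it into the three fields shown above. A minimal sketch, assuming the JSON keys are named `verdict`, `explanation`, and `confidence` (the actual schema is defined in `app.py`):

```python
import json

def format_explanation(raw: str) -> str:
    """Parse the LLM's JSON reply and format it for the Gradio output box."""
    data = json.loads(raw)
    return (
        f"Prediction: The claim is {data['verdict']}.\n"
        f"Explanation: {data['explanation']}\n"
        f"Confidence: {data['confidence']}."
    )

reply = '{"verdict": "Fake", "explanation": "The Sun rises in the East.", "confidence": "High"}'
text = format_explanation(reply)
```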
## Image Detection Flow
1. The user uploads an image in a standard format (JPG, PNG, etc.).
2. The image is processed by the CLIP vision encoder.
3. The fine-tuned classifier predicts a probability between 0 and 1.
4. The output is "Fake" if the probability is greater than 0.5, and "Real" otherwise.
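The decision rule in the last step is a plain threshold on the classifier's fake-probability; a one-function sketch (the function name is illustrative, not from `app.py`):

```python
def label_image(fake_probability: float, threshold: float = 0.5) -> str:
    """Map the classifier's fake-probability to a label: > threshold means Fake."""
    return "Fake" if fake_probability > threshold else "Real"
```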
## Input/Output Summary
| Detector | Input | Output |
|---|---|---|
| Text Detector | Text claim + 3 API keys | Prediction (Real/Fake/Uncertain), Explanation, Confidence |
| Image Detector | Uploaded image | Prediction (Real/Fake), Confidence score |
## Authored by
Group 9 - DSAI Lab Project
## License
This project is licensed under the MIT License. You are free to use and modify it with proper attribution.