Chhagan_ML-VL-OCR-v1

Multilingual OCR model for government ID cards, fine-tuned with LoRA on Qwen2.5-VL-3B.

Supported Documents

  • 🇴🇲 Oman Resident Card
  • 🇦🇪 UAE Identity Card
  • 🇸🇦 Saudi National ID
  • 🇮🇳 Aadhaar Card, PAN Card, Passport
  • 🇿🇦 South Africa ID

Languages

English, Arabic (عربي), Hindi (हिन्दी), Urdu (اردو)

Usage

Option 1: Load as LoRA Adapter (~74MB download)

from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from peft import PeftModel

base = Qwen2_5_VLForConditionalGeneration.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")
model = PeftModel.from_pretrained(base, "Chhagan005/Chhagan_ML-VL-OCR-v1")
processor = AutoProcessor.from_pretrained("Chhagan005/Chhagan_ML-VL-OCR-v1")

Option 2: Load as Standalone Model (~4.4GB download)

from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Chhagan005/Chhagan_ML-VL-OCR-v1",
    torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained("Chhagan005/Chhagan_ML-VL-OCR-v1")

Training

  • Base: Qwen/Qwen2.5-VL-3B-Instruct
  • Method: LoRA (r=32, α=64)
  • Data: Synthetic IDs + ANETAC Arabic names + Ara-Eng Parallel Corpus
  • Focus: OCR extraction from government identity documents
Downloads last month
190
Safetensors
Model size
4B params
Tensor type
F16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Spaces using Chhagan005/Chhagan_ML-VL-OCR-v1 4