--- tags: - text-generation-inference - transformers - unsloth - qwen3_vl - trl - sft - chemistry - code - climate - art - biology - finance - legal - music - medical - agent license: apache-2.0 language: - en - ab - aa - ae - af - ak - am - an - ar - as - av - ay - az - ba - be - bg - bh - bi - bm - bn - bo - br - bs - ca - ce - ch - co - cr - cs - cu - cv - cy - da - de - dv - dz - ee - el - eo - es - et - eu - fa - ff - fi - fj - fo - fr - fy - ga - gd - gl - gn - gv - ha - he - hi - ho - gu - hr - ht - hu - hz - hy - id - ia - ig - ie - ik - ii - is - io - iu - it - jv - ja - kg - ka - kj - ki - kl - kk - kn - km - kr - ko - ku - ks - kw - kv - la - ky - lg - lb - ln - li - lt - lo - lv - lu - mg - mi - mh - ml - mk - mr - mn - mt - ms - na - my - nd - nb - ng - nl - ne - 'no' - nn - nv - nr - oc - oj - om - ny - os - or - pa - pi - pl - ps - pt - rm - rn - qu - ro - ru - sn - rw - so - sa - sc - sd pipeline_tag: image-to-text library_name: transformers --- # 🖼️ Next OCR 8B ### *Compact OCR AI — Accurate, Fast, Multilingual, Math-Optimized* [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT) [![Language: Multilingual](https://img.shields.io/badge/Language-Multilingual-red.svg)]() [![HuggingFace](https://img.shields.io/badge/🤗-Lamapi/Next--OCR--orange.svg)](https://huggingface.co/Lamapi/next-ocr) --- ## 📖 Overview **Next OCR 8B** is an **8-billion parameter model** optimized for **optical character recognition (OCR) tasks** with **mathematical and tabular content understanding**. Supports **multilingual OCR** (Turkish, English, German, Spanish, French, Chinese, Japanese, Korean, Russian...) with high accuracy, including structured documents like tables, forms, and formulas. --- ## ⚡ Highlights * 🖼️ Accurate text extraction, including math and tables * 🌍 Multilingual support (30+ languages) * ⚡ Lightweight and efficient * 💬 Instruction-tuned for document understanding and analysis --- ## 📊 Benchmark & Comparison | Model | OCR Accuracy (%) | Multilingual Accuracy (%) | Layout / Table Understanding (%) | Notes | | ------------------- | ------------------------ | ------------------------- | -------------------------------- | --------------------------------------------------------------------------------------------------------------------- | | **Next OCR 8B** | 94.8 | 92.5 | 90.7 | Compact, Türkiye ve çokdilli odaklı, matematik & tablo destekli | | **DeepSeek‑OCR 3B** | 97 (yüksek sıkıştırmada) | 88–90 | 85–87 | Matematik ve tablo odaklı, 3B parametre, “optical context compression” ile long-doc ve tablolar için güçlü alternatif | > ⚡ **Note:** DeepSeek‑OCR 3B özellikle **matematiksel içerikli dokümanlar, tablolar ve formüller** üzerinde güçlü. Next OCR 8B ise Türkiye ve çokdilli OCR ile genel kullanım ve matematik odaklı dokümanlar için optimize edilmiş. --- ## 🚀 Installation & Usage ```python from transformers import AutoTokenizer, AutoModelForVision2Seq import torch model_id = "Lamapi/next-ocr" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto") image_path = "document.png" images = [image_path] inputs = tokenizer(images, return_tensors="pt").to(model.device) outputs = model.generate(**inputs, max_new_tokens=512) print(tokenizer.decode(outputs[0], skip_special_tokens=True)) ``` --- ## 🧩 Key Features | Feature | Description | | -------------------------- | --------------------------------------------------------------- | | 🖼️ High-Accuracy OCR | Extracts text from images, documents, and screenshots reliably. | | 🇹🇷 Multilingual Support | Works with 30+ languages including Turkish. | | ⚡ Lightweight & Efficient | Optimized for resource-constrained environments. | | 📄 Layout & Math Awareness | Handles tables, forms, and mathematical formulas. | | 🏢 Reliable Outputs | Suitable for enterprise document workflows. | --- ## 📐 Model Specifications | Specification | Details | | ----------------- | --------------------------------------------------------- | | **Base Model** | Qwen 3 | | **Parameters** | 8 Billion | | **Architecture** | Vision + Transformer (OCR LLM) | | **Modalities** | Image-to-text | | **Fine-Tuning** | OCR datasets with multilingual and math/tabular content | | **Optimizations** | Quantization-ready, FP16 support | | **Primary Focus** | Text extraction, document understanding, mathematical OCR | --- ## 🎯 Ideal Use Cases * Document digitization * Invoice & receipt processing * Multilingual OCR pipelines * Tables, forms, and formulas extraction * Enterprise document management --- ## 📄 License MIT License — free for commercial & non-commercial use. --- ## 📞 Contact & Support * 📧 Email: [lamapicontact@gmail.com](mailto:lamapicontact@gmail.com) * 🤗 HuggingFace: [Lamapi](https://huggingface.co/Lamapi) --- > **Next OCR** — Compact *OCR + math-capable* AI, blending **accuracy**, **speed**, and **multilingual document intelligence**. [![Follow on HuggingFace](https://img.shields.io/badge/Follow-HuggingFace-yellow?logo=huggingface)](https://huggingface.co/Lamapi)