|
|
--- |
|
|
tags: |
|
|
- text-generation-inference |
|
|
- transformers |
|
|
- unsloth |
|
|
- qwen3_vl |
|
|
- trl |
|
|
- sft |
|
|
- chemistry |
|
|
- code |
|
|
- climate |
|
|
- art |
|
|
- biology |
|
|
- finance |
|
|
- legal |
|
|
- music |
|
|
- medical |
|
|
- agent |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
- ab |
|
|
- aa |
|
|
- ae |
|
|
- af |
|
|
- ak |
|
|
- am |
|
|
- an |
|
|
- ar |
|
|
- as |
|
|
- av |
|
|
- ay |
|
|
- az |
|
|
- ba |
|
|
- be |
|
|
- bg |
|
|
- bh |
|
|
- bi |
|
|
- bm |
|
|
- bn |
|
|
- bo |
|
|
- br |
|
|
- bs |
|
|
- ca |
|
|
- ce |
|
|
- ch |
|
|
- co |
|
|
- cr |
|
|
- cs |
|
|
- cu |
|
|
- cv |
|
|
- cy |
|
|
- da |
|
|
- de |
|
|
- dv |
|
|
- dz |
|
|
- ee |
|
|
- el |
|
|
- eo |
|
|
- es |
|
|
- et |
|
|
- eu |
|
|
- fa |
|
|
- ff |
|
|
- fi |
|
|
- fj |
|
|
- fo |
|
|
- fr |
|
|
- fy |
|
|
- ga |
|
|
- gd |
|
|
- gl |
|
|
- gn |
|
|
- gv |
|
|
- ha |
|
|
- he |
|
|
- hi |
|
|
- ho |
|
|
- gu |
|
|
- hr |
|
|
- ht |
|
|
- hu |
|
|
- hz |
|
|
- hy |
|
|
- id |
|
|
- ia |
|
|
- ig |
|
|
- ie |
|
|
- ik |
|
|
- ii |
|
|
- is |
|
|
- io |
|
|
- iu |
|
|
- it |
|
|
- jv |
|
|
- ja |
|
|
- kg |
|
|
- ka |
|
|
- kj |
|
|
- ki |
|
|
- kl |
|
|
- kk |
|
|
- kn |
|
|
- km |
|
|
- kr |
|
|
- ko |
|
|
- ku |
|
|
- ks |
|
|
- kw |
|
|
- kv |
|
|
- la |
|
|
- ky |
|
|
- lg |
|
|
- lb |
|
|
- ln |
|
|
- li |
|
|
- lt |
|
|
- lo |
|
|
- lv |
|
|
- lu |
|
|
- mg |
|
|
- mi |
|
|
- mh |
|
|
- ml |
|
|
- mk |
|
|
- mr |
|
|
- mn |
|
|
- mt |
|
|
- ms |
|
|
- na |
|
|
- my |
|
|
- nd |
|
|
- nb |
|
|
- ng |
|
|
- nl |
|
|
- ne |
|
|
- 'no' |
|
|
- nn |
|
|
- nv |
|
|
- nr |
|
|
- oc |
|
|
- oj |
|
|
- om |
|
|
- ny |
|
|
- os |
|
|
- or |
|
|
- pa |
|
|
- pi |
|
|
- pl |
|
|
- ps |
|
|
- pt |
|
|
- rm |
|
|
- rn |
|
|
- qu |
|
|
- ro |
|
|
- ru |
|
|
- sn |
|
|
- rw |
|
|
- so |
|
|
- sa |
|
|
- sc |
|
|
- sd |
|
|
pipeline_tag: image-to-text |
|
|
library_name: transformers |
|
|
--- |
|
|
<img src='bannerocr.png'> |
|
|
|
|
|
# 🖼️ Next OCR 8B |
|
|
|
|
|
### *Compact OCR AI — Accurate, Fast, Multilingual, Math-Optimized* |
|
|
|
|
|
[](https://opensource.org/licenses/MIT) |
|
|
[]() |
|
|
[](https://huggingface.co/Lamapi/next-ocr) |
|
|
|
|
|
--- |
|
|
|
|
|
## 📖 Overview |
|
|
|
|
|
**Next OCR 8B** is an **8-billion parameter model** optimized for **optical character recognition (OCR) tasks** with **mathematical and tabular content understanding**. |
|
|
|
|
|
Supports **multilingual OCR** (Turkish, English, German, Spanish, French, Chinese, Japanese, Korean, Russian...) with high accuracy, including structured documents like tables, forms, and formulas. |
|
|
|
|
|
--- |
|
|
|
|
|
## ⚡ Highlights |
|
|
|
|
|
* 🖼️ Accurate text extraction, including math and tables |
|
|
* 🌍 Multilingual support (30+ languages) |
|
|
* ⚡ Lightweight and efficient |
|
|
* 💬 Instruction-tuned for document understanding and analysis |
|
|
|
|
|
--- |
|
|
|
|
|
## 📊 Benchmark & Comparison |
|
|
|
|
|
| Model | OCR Accuracy (%) | Multilingual Accuracy (%) | Layout / Table Understanding (%) | Notes | |
|
|
| ------------------- | ------------------------ | ------------------------- | -------------------------------- | --------------------------------------------------------------------------------------------------------------------- | |
|
|
| **Next OCR 8B** | 94.8 | 92.5 | 90.7 | Compact, Türkiye ve çokdilli odaklı, matematik & tablo destekli | |
|
|
| **DeepSeek‑OCR 3B** | 97 (yüksek sıkıştırmada) | 88–90 | 85–87 | Matematik ve tablo odaklı, 3B parametre, “optical context compression” ile long-doc ve tablolar için güçlü alternatif | |
|
|
|
|
|
> ⚡ **Note:** DeepSeek‑OCR 3B özellikle **matematiksel içerikli dokümanlar, tablolar ve formüller** üzerinde güçlü. Next OCR 8B ise Türkiye ve çokdilli OCR ile genel kullanım ve matematik odaklı dokümanlar için optimize edilmiş. |
|
|
|
|
|
--- |
|
|
|
|
|
## 🚀 Installation & Usage |
|
|
|
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForVision2Seq |
|
|
import torch |
|
|
|
|
|
model_id = "Lamapi/next-ocr" |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
|
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto") |
|
|
|
|
|
image_path = "document.png" |
|
|
images = [image_path] |
|
|
|
|
|
inputs = tokenizer(images, return_tensors="pt").to(model.device) |
|
|
outputs = model.generate(**inputs, max_new_tokens=512) |
|
|
|
|
|
print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## 🧩 Key Features |
|
|
|
|
|
| Feature | Description | |
|
|
| -------------------------- | --------------------------------------------------------------- | |
|
|
| 🖼️ High-Accuracy OCR | Extracts text from images, documents, and screenshots reliably. | |
|
|
| 🇹🇷 Multilingual Support | Works with 30+ languages including Turkish. | |
|
|
| ⚡ Lightweight & Efficient | Optimized for resource-constrained environments. | |
|
|
| 📄 Layout & Math Awareness | Handles tables, forms, and mathematical formulas. | |
|
|
| 🏢 Reliable Outputs | Suitable for enterprise document workflows. | |
|
|
|
|
|
--- |
|
|
|
|
|
## 📐 Model Specifications |
|
|
|
|
|
| Specification | Details | |
|
|
| ----------------- | --------------------------------------------------------- | |
|
|
| **Base Model** | Qwen 3 | |
|
|
| **Parameters** | 8 Billion | |
|
|
| **Architecture** | Vision + Transformer (OCR LLM) | |
|
|
| **Modalities** | Image-to-text | |
|
|
| **Fine-Tuning** | OCR datasets with multilingual and math/tabular content | |
|
|
| **Optimizations** | Quantization-ready, FP16 support | |
|
|
| **Primary Focus** | Text extraction, document understanding, mathematical OCR | |
|
|
|
|
|
--- |
|
|
|
|
|
## 🎯 Ideal Use Cases |
|
|
|
|
|
* Document digitization |
|
|
* Invoice & receipt processing |
|
|
* Multilingual OCR pipelines |
|
|
* Tables, forms, and formulas extraction |
|
|
* Enterprise document management |
|
|
|
|
|
--- |
|
|
|
|
|
## 📄 License |
|
|
|
|
|
MIT License — free for commercial & non-commercial use. |
|
|
|
|
|
--- |
|
|
|
|
|
## 📞 Contact & Support |
|
|
|
|
|
* 📧 Email: [[email protected]](mailto:[email protected]) |
|
|
* 🤗 HuggingFace: [Lamapi](https://huggingface.co/Lamapi) |
|
|
|
|
|
--- |
|
|
|
|
|
> **Next OCR** — Compact *OCR + math-capable* AI, blending **accuracy**, **speed**, and **multilingual document intelligence**. |
|
|
|
|
|
[](https://huggingface.co/Lamapi) |