Update README.md
README.md (CHANGED):
```diff
@@ -1,7 +1,7 @@
 ---
-base_model: Qwen/Qwen3-VL-
+base_model: Qwen/Qwen3-VL-2B-Instruct
 library_name: transformers
-model_name: Qwen3-VL-
+model_name: Qwen3-VL-2B-catmus-medieval
 tags:
 - generated_from_trainer
 - sft
```
```diff
@@ -15,9 +15,9 @@ tags:
 licence: license
 ---
 
-# Model Card for Qwen3-VL-
+# Model Card for Qwen3-VL-2B-catmus-medieval
 
-This model is a fine-tuned version of [Qwen/Qwen3-VL-
+This model is a fine-tuned version of [Qwen/Qwen3-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct) for transcribing line-level medieval manuscripts from images.
 It has been trained using [TRL](https://github.com/huggingface/trl) on the [CATMuS/medieval](https://huggingface.co/datasets/CATMuS/medieval) dataset.
 
 ## Model Description
```
```diff
@@ -32,8 +32,8 @@ The model was evaluated on 100 examples from the [CATMuS/medieval](https://huggi
 
 | Metric | Base Model | Fine-tuned Model | Improvement |
 |--------|-----------|------------------|-------------|
-| **Character Error Rate (CER)** |
-| **Word Error Rate (WER)** | 1.
+| **Character Error Rate (CER)** | 1.0815 (108.15%) | 0.2779 (27.79%) | **+74.30%** |
+| **Word Error Rate (WER)** | 1.7386 (173.86%) | 0.7043 (70.43%) | **+59.49%** |
 
 ### Sample Predictions
 
```
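A note on this hunk: the "Improvement" column is the relative reduction in error rate, (base − fine-tuned) / base, and a CER above 100% is possible when a hypothesis needs more edits than the reference has characters. The figures could be reproduced with a library such as `jiwer`; the sketch below is a minimal illustration with toy strings, since the card does not name its evaluation script:

```python
# Minimal sketch of the CER/WER computation behind the table above.
# jiwer is an assumption; the card does not say which library was used.
import jiwer

references  = ["paulꝯ ad thessalonicenses .iii."]  # gold transcriptions
predictions = ["Paulꝰ ad thessalonensis .iii."]    # model outputs

cer = jiwer.cer(references, predictions)  # character error rate
wer = jiwer.wer(references, predictions)  # word error rate
print(f"CER: {cer:.4f} ({cer:.2%}), WER: {wer:.4f} ({wer:.2%})")

# The "Improvement" column is a relative reduction, e.g. for CER:
rel_reduction = (1.0815 - 0.2779) / 1.0815  # -> 0.7430, the "+74.30%" above
```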
```diff
@@ -42,28 +42,28 @@ Here are some example transcriptions comparing the base model and fine-tuned mod
 
 **Example 1:**
 - **Reference:** paulꝯ ad thessalonicenses .iii.
-- **Base Model:**
-- **Fine-tuned Model:** Paulꝰ ad
+- **Base Model:** Paulus ad the Malomancis · iii.
+- **Fine-tuned Model:** Paulꝰ ad thessalonensis .iii.
 
 **Example 2:**
 - **Reference:** acceptad mi humilde seruicio. e dissipad. e plantad en el
-- **Base Model:**
-- **Fine-tuned Model:** acceptad mi humilde seruicio
+- **Base Model:** acceptad mi humilde servicio, e dissipad, e plantad en el
+- **Fine-tuned Model:** acceptad mi humilde seruicio, e dissipad, e plantad en el
 
 **Example 3:**
 - **Reference:** ꝙ mattheus illam dictionem ponat
-- **Base Model:**
-- **Fine-tuned Model:**
+- **Base Model:** p mattheus illam dictoneum proa
+- **Fine-tuned Model:** ꝑ mattheus illam dictione in ponat
 
 **Example 4:**
 - **Reference:** Elige ꝗd uoueas. eadẽ ħ ꝗꝗ sama ferebat.
-- **Base Model:** f.
-- **Fine-tuned Model:**
+- **Base Model:** f. ligeq d uonear. eade h q q fama ferebat.
+- **Fine-tuned Model:** f liges ꝗd uonear. eadẽ li ꝗq tanta ferebat᷑.
 
 **Example 5:**
 - **Reference:** a prima coniugatione ue
-- **Base Model:**
-- **Fine-tuned Model:** a
+- **Base Model:** Grigimacopissagazione-ve
+- **Fine-tuned Model:** a ꝑrũt̾tacõnueꝰatione. ne
 
 
 ## Quick start
```
```diff
@@ -74,8 +74,8 @@ from peft import PeftModel
 from PIL import Image
 
 # Load model and processor
-base_model = "Qwen/Qwen3-VL-
-adapter_model = "wjbmattingly/Qwen3-VL-
+base_model = "Qwen/Qwen3-VL-2B-Instruct"
+adapter_model = "wjbmattingly/Qwen3-VL-2B-catmus-medieval"
 
 model = Qwen3VLForConditionalGeneration.from_pretrained(
     base_model,
```
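The diff shows only the top of the Quick start snippet. A self-contained version of the same load-and-infer flow might look like the sketch below; the prompt string, image path, generation parameters, and the chat-template call follow the usual Qwen-VL pattern and are assumptions, not lines from the card:

```python
# Hedged sketch of the full Quick start flow the hunk above begins.
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration
from peft import PeftModel
from PIL import Image

base_model = "Qwen/Qwen3-VL-2B-Instruct"
adapter_model = "wjbmattingly/Qwen3-VL-2B-catmus-medieval"

model = Qwen3VLForConditionalGeneration.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_model)  # attach the LoRA adapter
processor = AutoProcessor.from_pretrained(base_model)

image = Image.open("line.jpg")  # hypothetical path to one manuscript-line image
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Transcribe this manuscript line."},  # assumed prompt
    ],
}]

text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens before decoding so only the transcription remains.
generated = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```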
```diff
@@ -129,7 +129,7 @@ This model is designed for:
 
 ## Training procedure
 
-This model was fine-tuned using Supervised Fine-Tuning (SFT) with LoRA adapters on the Qwen3-VL-
+This model was fine-tuned using Supervised Fine-Tuning (SFT) with LoRA adapters on the Qwen3-VL-2B-Instruct base model.
 
 ### Training Data
 
```
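The Training Data section (its body falls between hunks) points at CATMuS/medieval. As an illustration only, rows of such a dataset could be mapped into the chat-message format commonly used for VLM SFT; the column name `text` and the prompt string are assumptions about the dataset schema, not taken from the card:

```python
# Hedged sketch: shaping CATMuS/medieval rows into user/assistant messages for SFT.
from datasets import load_dataset

dataset = load_dataset("CATMuS/medieval", split="train")

def to_messages(example):
    # One user turn (image + instruction), one assistant turn (the transcription).
    return {
        "messages": [
            {"role": "user", "content": [
                {"type": "image"},
                {"type": "text", "text": "Transcribe this manuscript line."},
            ]},
            {"role": "assistant", "content": [
                {"type": "text", "text": example["text"]},  # assumed column name
            ]},
        ]
    }

dataset = dataset.map(to_messages)
```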
```diff
@@ -138,7 +138,7 @@ a dataset containing images of line-level medieval manuscripts with correspondin
 
 ### Training Configuration
 
-- **Base Model**: Qwen/Qwen3-VL-
+- **Base Model**: Qwen/Qwen3-VL-2B-Instruct
 - **Training Method**: Supervised Fine-Tuning (SFT) with LoRA
 - **LoRA Configuration**:
   - Rank (r): 16
```
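In `peft` terms, the configuration above corresponds to something like the sketch below. Only the rank (r=16) is visible in this hunk; alpha, dropout, and target modules are illustrative placeholders, since the diff truncates the rest of the list:

```python
# Hedged sketch of the LoRA adapter setup; only r=16 comes from the card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                 # rank, as listed under "LoRA Configuration"
    lora_alpha=32,                        # assumed; not shown in this hunk
    lora_dropout=0.05,                    # assumed
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
```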
```diff
@@ -197,4 +197,4 @@ If you use this model, please cite the base model and training framework:
 
 ---
 
-*README generated automatically on 2025-10-24 10:
+*README generated automatically on 2025-10-24 10:49:05*
```