remyxai/SpaceQwen3-VL-2B-Thinking

salma-remyx committed 18 days ago

Commit 405a1c9 · verified · 1 Parent(s): bbbec91

Update README.md

Files changed (1)

README.md +23 -1

@@ -16,4 +16,26 @@ pipeline_tag: image-text-to-text
 # Model Card for SpaceQwen3-VL-2B-Thinking
-Finetuned [Qwen3-VL-2B-Thinking](https://huggingface.co/Qwen/Qwen3-VL-2B-Thinking) by Low-Rank Adapters using the [SpaceOm dataset](https://huggingface.co/datasets/remyxai/SpaceOm) created with [VQASynth](https://github.com/remyxai/VQASynth), an open-source multimodal data synthesis pipeline inspired by [SpatialVLM](https://spatial-vlm.github.io/#community-implementation)
+Finetuned [Qwen3-VL-2B-Thinking](https://huggingface.co/Qwen/Qwen3-VL-2B-Thinking) by Low-Rank Adapters using the [SpaceOm dataset](https://huggingface.co/datasets/remyxai/SpaceOm) created with [VQASynth](https://github.com/remyxai/VQASynth), an open-source multimodal data synthesis pipeline inspired by [SpatialVLM](https://spatial-vlm.github.io/#community-implementation)
+# Citation
+```
+@misc{qwen3technicalreport,
+      title={Qwen3 Technical Report},
+      author={Qwen Team},
+      year={2025},
+      eprint={2505.09388},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2505.09388},
+}
+@article{chen2024spatialvlm,
+  title = {SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities},
+  author = {Chen, Boyuan and Xu, Zhuo and Kirmani, Sean and Ichter, Brian and Driess, Danny and Florence, Pete and Sadigh, Dorsa and Guibas, Leonidas and Xia, Fei},
+  journal = {arXiv preprint arXiv:2401.12168},
+  year = {2024},
+  url = {https://arxiv.org/abs/2401.12168},
+}
+```