MedM-VL collection: model weights for 2D/3D medical LVLMs
MedM-VL-2D-3B-en is a 2D medical large vision-language model (LVLM) trained on 2D medical images and English medical text. It supports tasks such as report generation, visual question answering (VQA), referring expression comprehension (REC), referring expression generation (REG), and image classification.
| Config | Value |
|---|---|
| Image encoder | google/siglip-base-patch16-256-multilingual |
| Connector | MLP (2-layer) |
| LLM | Qwen/Qwen2.5-3B-Instruct |
| Image resolution | 256×256 |
| Sequence length | 2048 |
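
The configuration above implies a rough token budget per sample. The sketch below is illustrative only. It assumes that each 16×16 patch of the 256×256 input becomes one token passed to the LLM, with no pooling or compression in the connector, and that image tokens count against the 2048-token sequence length. Neither assumption is confirmed by this card, and the helper name `num_patch_tokens` is hypothetical.

```python
# Rough token-budget estimate from the configuration table.
# Assumptions (not confirmed by the model card):
#   - every image patch yields exactly one LLM input token
#   - image tokens share the 2048-token sequence length with text


def num_patch_tokens(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


IMAGE_SIZE = 256   # from "Image resolution"
PATCH_SIZE = 16    # from "patch16" in the image encoder name
SEQ_LEN = 2048     # from "Sequence length"

image_tokens = num_patch_tokens(IMAGE_SIZE, PATCH_SIZE)
text_budget = SEQ_LEN - image_tokens

print(f"image tokens per image: {image_tokens}")
print(f"tokens left for text:   {text_budget}")
```

Under these assumptions, a single image takes 256 of the 2048 positions, leaving 1792 for the prompt and the generated answer.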
| Benchmark | Med-Flamingo | LLaVA-Med | RadFM | MedM-VL-2D-3B-en |
|---|---|---|---|---|
| MedMNIST (derma) | 0.012 | 0.258 | 0.051 | 0.810 |
| MedMNIST (organ) | 0.089 | 0.668 | 0.189 | 0.791 |
| MedPix | 0.081 | 0.151 | - | 0.087 |
| MIMIC-CXR | 0.233 | 0.204 | 0.068 | 0.222 |
| PathVQA | 0.334 | 0.378 | 0.248 | 0.634 |
| SAMed (identify) | - | 0.458 | - | 0.637 |
| SAMed (refer) | - | 0.086 | - | 0.225 |
| SLAKE (identify) | - | 0.272 | - | 0.349 |
| SLAKE (refer) | - | 0.041 | - | 0.261 |
| SLAKE (vqa) | 0.215 | 0.337 | 0.817 | 0.812 |
For more details, please refer to the MedM-VL project.
@inproceedings{shi2025medm,
title={{MedM-VL}: What makes a good medical {LVLM}?},
author={Shi, Yiming and Yang, Shaoshuai and Zhu, Xun and Wang, Haoyu and Fu, Xiangling and Li, Miao and Wu, Ji},
booktitle={International Workshop on Agentic AI for Medicine},
pages={290--299},
year={2025},
organization={Springer}
}