EVLM: An Efficient Vision-Language Model for Visual Understanding Paper • 2407.14177 • Published Jul 19, 2024 • 44
Lumos : Empowering Multimodal LLMs with Scene Text Recognition Paper • 2402.08017 • Published Feb 12, 2024 • 27