Baichuan-M2: Scaling Medical Capability with Large Verifier System Paper • 2509.02208 • Published Sep 2 • 41
Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization Paper • 2509.09307 • Published Sep 11 • 6
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis Paper • 2508.13618 • Published Aug 19 • 17
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation Paper • 2506.18095 • Published Jun 22 • 65
QFFT, Question-Free Fine-Tuning for Adaptive Reasoning Paper • 2506.12860 • Published Jun 15 • 18
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information Paper • 2503.05085 • Published Mar 7 • 47