Marco-MoE Collection A suite of multilingual MoE models with highly sparse architectures • 4 items • Updated about 24 hours ago • 6
Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 2 days ago • 457
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 9 days ago • 125
PrismAudio: Decomposed Chain-of-Thoughts and Multi-dimensional Rewards for Video-to-Audio Generation Paper • 2511.18833 • Published Nov 24, 2025 • 5
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 12 days ago • 120
dots.mocr Collection Multimodal OCR: Parse Anything from Documents • 2 items • Updated 16 days ago • 7
InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published 18 days ago • 306
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published 19 days ago • 184
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published 24 days ago • 152