Submitted by GMFTBY 90 The End of Manual Decoding: Towards Truly End-to-End Language Models Tencent 23 1
Submitted by xinlongwang 67 Emu3.5: Native Multimodal Models are World Learners Beijing Academy of Artificial Intelligence 854 4
Submitted by taesiri 55 Kimi Linear: An Expressive, Efficient Attention Architecture Moonshot AI 776 1
Submitted by taesiri 40 Can Agent Conquer Web? Exploring the Frontiers of ChatGPT Atlas Agent in Web Games · 3 authors 1
Submitted by ShengnanAn 30 AMO-Bench: Large Language Models Still Struggle in High School Math Competitions LongCat 16 1
Submitted by ZrrSkywalker 28 Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark The Chinese University of Hong Kong 20 2
Submitted by hamza-hcompany 28 Surfer 2: The Next Generation of Cross-Platform Computer Use Agents H company 1
Submitted by wruisi 26 The Quest for Generalizable Motion Generation: Data, Model, and Evaluation Nanyang Technological University 6 1
Submitted by alexhsu 24 Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning Google 1
Submitted by CZWin32768 20 The Era of Agentic Organization: Learning to Organize with Language Models · 7 authors 1
Submitted by KevinHuang 17 OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes The University of Hong Kong 33 2
Submitted by nicolas-dufour 11 MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency · 5 authors 2
Submitted by chaoyi-wu 9 EHR-R1: A Reasoning-Enhanced Foundational Language Model for Electronic Health Record Analysis Shanghai Jiao Tong University 5 1
Submitted by khr0516 8 OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation · 7 authors 1
Submitted by akshaynambi 8 Magentic Marketplace: An Open-Source Environment for Studying Agentic Markets Microsoft Research 14 2
Submitted by xk-huang 6 MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs UCSC-VLAA 7 1
Submitted by alessandrobondielli 4 CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs CoLingLab | Computational Linguistics Laboratory - University of Pisa 1
Submitted by taesiri 2 Counteracting Matthew Effect in Self-Improvement of LVLMs through Head-Tail Re-balancing · 9 authors 1
Submitted by acharkq 2 EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation National University of Singapore 1
Submitted by jtlicardo 2 Performance Trade-offs of Optimizing Small Language Models for E-Commerce · 2 authors 1
Submitted by JJ-TMT 1 CityRiSE: Reasoning Urban Socio-Economic Status in Vision-Language Models via Reinforcement Learning · 6 authors 2 1
Submitted by fangwu97 1 L^2M^3OF: A Large Language Multimodal Model for Metal-Organic Frameworks · 7 authors 1
Submitted by zhoutianyi - ChartAB: A Benchmark for Chart Grounding & Dense Alignment University of Maryland College Park 0 1