Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition • Paper • 2512.15603 • Published 17 days ago • 58
VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos • Paper • 2510.19488 • Published Oct 22, 2025 • 19
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning • Paper • 2507.01006 • Published Jul 1, 2025 • 248
Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model • Paper • 2505.16000 • Published May 21, 2025 • 1
PersianMind: A Cross-Lingual Persian-English Large Language Model • Paper • 2401.06466 • Published Jan 12, 2024 • 5
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model • Paper • 2410.13639 • Published Oct 17, 2024 • 19
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines • Paper • 2502.14739 • Published Feb 20, 2025 • 106
LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm • Paper • 2502.19103 • Published Feb 26, 2025 • 3
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values • Paper • 2504.05535 • Published Apr 7, 2025 • 44
Unified Reward Model for Multimodal Understanding and Generation • Paper • 2503.05236 • Published Mar 7, 2025 • 122
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance • Paper • 2504.06232 • Published Apr 8, 2025 • 13
MM-IFEngine: Towards Multimodal Instruction Following • Paper • 2504.07957 • Published Apr 10, 2025 • 35
From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation • Paper • 2404.09138 • Published Apr 14, 2024 • 6