PixMo Collection • A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated Apr 30 • 81 upvotes
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels • Paper • arXiv:2508.17437 • Published Aug 20, 2025 • 36 upvotes
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL • Paper • arXiv:2508.13167 • Published Aug 6, 2025 • 127 upvotes
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency • Paper • arXiv:2508.18265 • Published Aug 25, 2025 • 202 upvotes
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale • Paper • arXiv:2504.16030 • Published Apr 22, 2025 • 37 upvotes
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models • Paper • arXiv:2505.00551 • Published May 1, 2025 • 36 upvotes
PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer • Paper • arXiv:2505.04622 • Published May 7, 2025 • 27 upvotes
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder • Paper • arXiv:2505.07916 • Published May 12, 2025 • 132 upvotes
Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs • Paper • arXiv:2504.07866 • Published Apr 10, 2025 • 12 upvotes
VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning • Paper • arXiv:2504.07956 • Published Apr 10, 2025 • 47 upvotes
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization • Paper • arXiv:2503.12937 • Published Mar 17, 2025 • 30 upvotes
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills • Paper • arXiv:2503.12533 • Published Mar 16, 2025 • 68 upvotes
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL • Paper • arXiv:2503.07536 • Published Mar 10, 2025 • 88 upvotes
HumanMM: Global Human Motion Recovery from Multi-shot Videos • Paper • arXiv:2503.07597 • Published Mar 10, 2025 • 2 upvotes
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • arXiv:2501.12948 • Published Jan 22, 2025 • 422 upvotes