Pan He
bestsonny
AI & ML interests
None yet
Organizations
None yet
papers
- Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
  Paper • 2508.16949 • Published • 22
- Diffusion Language Models Know the Answer Before Decoding
  Paper • 2508.19982 • Published • 24
- ThinkDial: An Open Recipe for Controlling Reasoning Effort in Large Language Models
  Paper • 2508.18773 • Published • 15
- Intern-S1: A Scientific Multimodal Foundation Model
  Paper • 2508.15763 • Published • 256
models
0
None public yet
datasets
0
None public yet