
Hanbin Wang

hanbin

https://wanghanbinpanda.github.io/

wanghanbinpanda

AI & ML interests

Code Intelligence and LLM Reasoning (Code, Math)

Recent Activity



upvoted 2 papers 4 months ago

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

Paper • 2509.25123 • Published Sep 29, 2025 • 20

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 190

upvoted a paper 5 months ago

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2, 2025 • 124

published a dataset 5 months ago

hanbin/dolmino-mix-1124-pes2o-hf-3

Updated Aug 25, 2025

updated a dataset 5 months ago

hanbin/dolmino-mix-1124-pes2o-hf

Viewer • Updated Aug 25, 2025 • 1.09M • 10

published a dataset 5 months ago

hanbin/dolmino-mix-1124-pes2o-hf

Viewer • Updated Aug 25, 2025 • 1.09M • 10

updated 2 models 6 months ago

hanbin/Llama-3.1-8B-pretrain-1-pes2o-anneal-1B_oasst1_wildchat

Text Generation • 8B • Updated Jul 29, 2025

hanbin/Llama-3.1-8B-pes2o-anneal-2.7B_oasst1_wildchat

Text Generation • 8B • Updated Jul 29, 2025

published 2 models 6 months ago

hanbin/Llama-3.1-8B-pretrain-1-pes2o-anneal-1B_oasst1_wildchat

Text Generation • 8B • Updated Jul 29, 2025

hanbin/Llama-3.1-8B-pes2o-anneal-2.7B_oasst1_wildchat

Text Generation • 8B • Updated Jul 29, 2025

updated 2 models 6 months ago

hanbin/Llama-3.1-8B-pes2o-anneal-2.7B

Text Generation • 8B • Updated Jul 28, 2025

hanbin/Llama-3.1-8B-pretrain-1-pes2o-anneal-1B

Text Generation • 8B • Updated Jul 28, 2025

published 2 models 6 months ago

hanbin/Llama-3.1-8B-pretrain-1-pes2o-anneal-1B

Text Generation • 8B • Updated Jul 28, 2025

hanbin/Llama-3.1-8B-pes2o-anneal-2.7B

Text Generation • 8B • Updated Jul 28, 2025

updated a model 6 months ago

hanbin/Qwen2.5-7B-pattern-mixed-6epoch

Text Generation • 8B • Updated Jul 23, 2025 • 1

published a model 6 months ago

hanbin/Qwen2.5-7B-pattern-mixed-6epoch

Text Generation • 8B • Updated Jul 23, 2025 • 1

updated a model 6 months ago

hanbin/Llama-3.1-8B-pretrain-1

Text Generation • 8B • Updated Jul 14, 2025

published a model 6 months ago

hanbin/Llama-3.1-8B-pretrain-1

Text Generation • 8B • Updated Jul 14, 2025

updated a model 10 months ago

PRIME-RL/Eurus-2-7B-PRIME-Zero

Text Generation • 8B • Updated Mar 14, 2025 • 4 • 2

published a model 10 months ago

PRIME-RL/Eurus-2-7B-PRIME-Zero

Text Generation • 8B • Updated Mar 14, 2025 • 4 • 2
