arxiv:2407.15762
Kaiwen Wang
kaiwenw
AI & ML interests
Reinforcement Learning
Organizations
models
36
kaiwenw/single_node_run2-step-12170 • 2B • 5
kaiwenw/single_node_run2-step-12150 • 2B • 4
kaiwenw/single_node_run2-step-11664 • 2B • 4
kaiwenw/single_node_run2-step-11178 • 2B • 4
kaiwenw/single_node_run2-step-10692 • 2B • 6
kaiwenw/single_node_run2-step-10206 • 2B • 4
kaiwenw/single_node_run2-step-9720 • 2B • 4
kaiwenw/single_node_run2-step-9234 • 2B • 5
kaiwenw/single_node_run2-step-8748 • 2B • 5
kaiwenw/single_node_run2-step-8262 • 2B • 5
datasets
220
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-25-4096-with-bt-model-with-sigmoid • 123k • 107
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-24-4096-with-bt-model-with-sigmoid • 123k • 97
kaiwenw/distill-r1-qwen-1.5b-aime-25-4096-with-bt-model-with-sigmoid • 123k • 2.33k
kaiwenw/distill-r1-qwen-1.5b-aime-24-4096-with-bt-model-with-sigmoid • 123k • 2.64k
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-25-4096-with-bt-model-wout-sigmoid • 123k • 44
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-24-4096-with-bt-model-wout-sigmoid • 123k • 55
kaiwenw/distill-r1-qwen-1.5b-aime-25-4096-with-bt-model-wout-sigmoid • 123k • 2.23k
kaiwenw/distill-r1-qwen-1.5b-aime-24-4096-with-bt-model-wout-sigmoid • 123k • 2.5k
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-25-4096-with-old-prm-indices_61440_69120 • 7.68k • 12
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-25-4096-with-old-prm-indices_76800_84480 • 7.68k • 19