AI & ML interests
None yet
Organizations
None yet
IDoNotHaveAName/aug_model
IDoNotHaveAName/Hint-informed-env
IDoNotHaveAName/GRPO-800-reproduction
IDoNotHaveAName/PRM-GRPO-800-1.5B
IDoNotHaveAName/Hint-Informed-grpo • 2B
IDoNotHaveAName/reproduce-grpo-1.5B
IDoNotHaveAName/model-trainby-mistake • Text Generation • 2B
IDoNotHaveAName/2epoch-experiment • Text Generation • 2B
IDoNotHaveAName/X-R1-3epoch • Text Generation • 2B • 1
IDoNotHaveAName/GRPO-1epoch-train-by-mistake-collections-without-hint • Text Generation • 2B • 1
IDoNotHaveAName/GRPO-1epoch-train-by-mistake-collections-with-hint • Text Generation • 2B
IDoNotHaveAName/GRPO-qwen2.5-1.5B-reward-process • Text Generation • 2B
IDoNotHaveAName/origin_grpo_train_1_epoch • Text Generation • 2B • 1
IDoNotHaveAName/GRPO_tokens_repeat_model • Text Generation • 2B