arxiv:2504.19394
Toby Simonds
TamasSimonds
Models 7
TamasSimonds/llama3.1-8b-kp-1k-self-play-step-336-sys-prompt
8B • Updated
TamasSimonds/spiral-qwen2-5-3b-base-KP-1k-self-play-1-1-step-192
3B • Updated
TamasSimonds/spiral-qwen3-8b-base-KP-1k-self-play-1-1-step-192
8B • Updated • 3
TamasSimonds/spiral-llama-3B-base-KP-1k-self-play-1-1-step-192
3B • Updated • 2
TamasSimonds/Qwen3-4B-KP-no-sys-prompt-1k-self-play-1-1-step-192
4B • Updated
TamasSimonds/spiral-qwen3-4b-base-KP-1k-self-play-1.1_0707T15-09-49
4B • Updated • 1
TamasSimonds/O1-Llama-3.2-3B
3B • Updated
Datasets 7
TamasSimonds/record-test4
Viewer • Updated • 2.19k • 17
TamasSimonds/record-test3
Updated • 7
TamasSimonds/olympiad-proof-problems
Viewer • Updated • 39.8k • 34
TamasSimonds/poker_safety_realignment
Viewer • Updated • 70 • 6
TamasSimonds/imo-dataset
Viewer • Updated • 370 • 21
TamasSimonds/TextbooksToRLQuestions-100k
Viewer • Updated • 108k • 13 • 5
TamasSimonds/ReasonSet
Viewer • Updated • 1.78k • 15