Qwen3-4B Chemistry checkpoints from the SciKnowEval-style generalization runs: GRPO baseline, RLSD, and SDPO.
Seongryong Jung
SeongryongJung
AI & ML interests
Post-training, Knowledge Distillation, Self-Evolving AI
Recent Activity
updated a model 2 days ago
SeongryongJung/Qwen3-4B-Physics-RLSD
published a model 2 days ago
SeongryongJung/Qwen3-4B-Physics-RLSD
updated a collection 2 days ago
Qwen3-4B Chemistry RL Fine-tuning
Organizations
None yet