MoeReward/rl_checkpoints
Project of MoE reward model
Safetensors
115 GB
1 contributor
History: 3 commits
Latest commit: shengyi-qian, "three checkpoints" (9c87696), 7 months ago
Name                                          Size     Last commit         Updated
qwen1.5_base_rule_base_arc_heavy_grpo_naive            three checkpoints   7 months ago
qwen1.5_base_rule_base_equal_dist_grpo_naive           three checkpoints   7 months ago
qwen1.5_base_rule_base_grpo_naive                      qwen1.5 rule based  7 months ago
qwen1.5_base_rule_base_imdb_heavy_grpo_naive           three checkpoints   7 months ago
.gitattributes                                1.56 kB  qwen1.5 rule based  7 months ago
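
Because the repository totals 115 GB, it may be preferable to fetch one checkpoint directory at a time. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that each directory listed above is self-contained (the listing does not show directory contents). `checkpoint_pattern` and `download_checkpoint` are hypothetical helper names introduced here for illustration; they are not part of the repository.

```python
REPO_ID = "MoeReward/rl_checkpoints"  # repository name taken from the listing

# Directory names copied from the file listing above.
CHECKPOINT_DIRS = [
    "qwen1.5_base_rule_base_arc_heavy_grpo_naive",
    "qwen1.5_base_rule_base_equal_dist_grpo_naive",
    "qwen1.5_base_rule_base_grpo_naive",
    "qwen1.5_base_rule_base_imdb_heavy_grpo_naive",
]


def checkpoint_pattern(name: str) -> str:
    """Return a glob pattern matching every file under one listed directory."""
    if name not in CHECKPOINT_DIRS:
        raise ValueError(f"unknown checkpoint directory: {name}")
    return f"{name}/*"


def download_checkpoint(name: str, local_dir: str = "rl_checkpoints") -> str:
    """Download a single directory (hypothetical helper; needs network access)."""
    # Imported lazily so the pattern helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[checkpoint_pattern(name)],  # skip the other directories
        local_dir=local_dir,
    )
```

Calling `download_checkpoint("qwen1.5_base_rule_base_grpo_naive")` would then download only that directory and return the local path; the default `local_dir` value is an assumption and can be changed freely.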