RLAIF/dpo_thinking_reddit_judge_last_minute_200_1e-6_0.02_4B_4B
Safetensors
Branch: main
Size: 8.84 GB
1 contributor
History: 2 commits
Latest commit: AngelRaychev, "Upload folder using huggingface_hub" (e2e2f4d, verified), about 1 month ago
Files:
global_step_260 · Upload folder using huggingface_hub · about 1 month ago
.gitattributes · 1.59 kB · Upload folder using huggingface_hub · about 1 month ago