luckeciano
/
Qwen-2.5-7B-GRPO-NoBaseline-Adam-FisherMaskToken-1e-5-HessianMaskToken-0.01-v2_4270
Text Generation
Transformers
Safetensors
DigitalLearningGmbH/MATH-lighteval
qwen2
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Qwen-2.5-7B-GRPO-NoBaseline-Adam-FisherMaskToken-1e-5-HessianMaskToken-0.01-v2_4270
/
model-00004-of-00004.safetensors
Commit History
Training in progress, step 100
1a361c0
verified
luckeciano
committed on
Sep 24
Training in progress, step 90
0d7a627
verified
luckeciano
committed on
Sep 24
Training in progress, step 80
1a99ec0
verified
luckeciano
committed on
Sep 24
Training in progress, step 70
8d5e0a0
verified
luckeciano
committed on
Sep 23
Training in progress, step 60
9163564
verified
luckeciano
committed on
Sep 23
Training in progress, step 50
15c756f
verified
luckeciano
committed on
Sep 23
Training in progress, step 40
89e510b
verified
luckeciano
committed on
Sep 23
Training in progress, step 30
42dd9e2
verified
luckeciano
committed on
Sep 23
Training in progress, step 20
92e8013
verified
luckeciano
committed on
Sep 23
Training in progress, step 10
1a0c74b
verified
luckeciano
committed on
Sep 23