arxiv:2206.06614
Luckeciano Carvalho Melo
luckeciano
AI & ML interests
Reinforcement Learning
Recent Activity
Updated a model 29 days ago: luckeciano/Qwen-2.5-0.5B-Instruct-AC-RL_3872
Updated a model about 1 month ago: luckeciano/Qwen-2.5-7B-GRPO-NoBaseline-Adam-FisherMaskToken-1e-5-HessianMaskToken-0.01-v2_5923
Updated a model about 1 month ago: luckeciano/Qwen-2.5-7B-GRPO-NoBaseline-Adam-FisherMaskToken-1e-5-HessianMaskToken-0.01-v2_3862