arxiv:2507.12284
Dmitri Babaev
dllllb
AI & ML interests
PLP, RL, sequential data
Recent Activity
upvoted an article 18 days ago
From GRPO to DAPO and GSPO: What, Why, and How
authored a paper about 1 month ago
SWE-MERA: A Dynamic Benchmark for Agenticly Evaluating Large Language Models on Software Engineering Tasks
authored a paper about 1 month ago
MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks