- Reasoning with Sampling: Your Base Model is Smarter Than You Think (arXiv:2510.14901)
- Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values (arXiv:2510.20187)
- Accelerating Vision Transformers with Adaptive Patch Sizes (arXiv:2510.18091)
- Unified Reinforcement and Imitation Learning for Vision-Language Models (arXiv:2510.19307)
- BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping (arXiv:2510.18927)
- The Art of Scaling Reinforcement Learning Compute for LLMs (arXiv:2510.13786)
- Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization (arXiv:2510.13554)
- HoneyBee: Data Recipes for Vision-Language Reasoners (arXiv:2510.12225)
- AndesVL Technical Report: An Efficient Mobile-side Multimodal Large Language Model (arXiv:2510.11496)
- Reinforcing Diffusion Models by Direct Group Preference Optimization (arXiv:2510.08425)
- NorMuon: Making Muon more efficient and scalable (arXiv:2510.05491)
- Apertus: Democratizing Open and Compliant LLMs for Global Language Environments (arXiv:2509.14233, published Sep 17)
- Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (arXiv:2509.15194, published Sep 18)
- A Survey of Reinforcement Learning for Large Reasoning Models (arXiv:2509.08827, published Sep 10)
- The Majority is not always right: RL training for solution aggregation (arXiv:2509.06870, published Sep 8)