Rafael Coelho de Souza Krzonkalla

krzonkalla

AI & ML interests

None yet

Recent Activity

updated a model 2 days ago

krzonkalla/rio-2-video-vl

updated a model 2 days ago

krzonkalla/rio-2-ocr

upvoted a paper 3 days ago

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Organizations

None yet

updated 2 models 2 days ago

krzonkalla/rio-2-video-vl

Video-Text-to-Text • 849k • Updated 2 days ago • 18

krzonkalla/rio-2-ocr

Image-to-Text • 8B • Updated 2 days ago • 24 • 1

upvoted 5 papers 3 days ago

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published 4 days ago • 17

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

Paper • 2510.18927 • Published 5 days ago • 77

upvoted a paper 4 days ago

Extracting alignment data in open models

Paper • 2510.18554 • Published 5 days ago • 6

updated a model 5 days ago

krzonkalla/test-voice-nano

Updated 5 days ago • 8

published a model 5 days ago

krzonkalla/test-voice-nano

Updated 5 days ago • 8

liked a model 8 days ago

krzonkalla/Rio_2_14B

Text Generation • 15B • Updated 10 days ago • 628 • 1

upvoted a paper 9 days ago

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published 10 days ago • 96

upvoted a paper 10 days ago

The Art of Scaling Reinforcement Learning Compute for LLMs

Paper • 2510.13786 • Published 11 days ago • 30

updated a model 10 days ago

krzonkalla/Rio_2_14B

Text Generation • 15B • Updated 10 days ago • 628 • 1

upvoted a paper 10 days ago

Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

Paper • 2510.13554 • Published 11 days ago • 54

upvoted 2 papers 11 days ago

Cautious Weight Decay

Paper • 2510.12402 • Published 12 days ago • 4

HoneyBee: Data Recipes for Vision-Language Reasoners

Paper • 2510.12225 • Published 13 days ago • 9

upvoted a paper 12 days ago

AndesVL Technical Report: An Efficient Mobile-side Multimodal Large Language Model

Paper • 2510.11496 • Published 13 days ago • 3

upvoted a paper 15 days ago

Reinforcing Diffusion Models by Direct Group Preference Optimization

Paper • 2510.08425 • Published 17 days ago • 10
