Co-rewarding
Collection
Co-rewarding is a novel self-supervised RL framework that improves training stability by seeking complementary supervision from other views.
69 items