StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs Paper • 2506.03077 • Published Jun 3
DPO-Shift: Shifting the Distribution of Direct Preference Optimization Paper • 2502.07599 • Published Feb 11