TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning Paper • 2509.25760 • Published Sep 30 • 54
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code Paper • 2508.18106 • Published Aug 25 • 342
TianHongZXY/Top_5_similar_question-NVIDIA-OpenScienceReasoning-2 Viewer • Updated Aug 28 • 2.16k • 4.23k
RLVR-Decomposed Collection The collection for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" • 9 items • Updated Jun 1 • 2
TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-Math-7B-RoPE-40K-GRPO-use_guide-clip_ratio_upper_0.28 Updated Jul 12
TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-7B-Instruct-GRPO-gpt-4o-summary_wo_think-clip_0.28 Updated Jul 8
AdaDecode Collection [ICML 2025] AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism. • 18 items • Updated Jun 4 • 3