-
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment
Paper • 2410.13785 • Published • 19 -
Aligning Large Language Models via Self-Steering Optimization
Paper • 2410.17131 • Published • 24 -
Baichuan Alignment Technical Report
Paper • 2410.14940 • Published • 51 -
SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation
Paper • 2410.14745 • Published • 47
Collections
Discover the best community collections!
Collections including paper arxiv:2508.14460
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 6 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 23 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 13 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 102 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
-
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment
Paper • 2410.13785 • Published • 19 -
Aligning Large Language Models via Self-Steering Optimization
Paper • 2410.17131 • Published • 24 -
Baichuan Alignment Technical Report
Paper • 2410.14940 • Published • 51 -
SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation
Paper • 2410.14745 • Published • 47
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 102 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 6 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 23 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 13 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69