Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process Paper • 2405.11870 • Published May 20, 2024
MIRAGE: Exploring How Large Language Models Perform in Complex Social Interactive Environments Paper • 2501.01652 • Published Jan 3, 2025
RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios Paper • 2412.08972 • Published Dec 12, 2024