Hao Peng

Wesleythu

h-peng17

AI & ML interests

None yet

Recent Activity

updated a model 4 days ago

Wesleythu/Qwen3-8B-RM

published a model 4 days ago

Wesleythu/Qwen3-8B-RM

updated a model 5 days ago

Wesleythu/Qwen3-4B-RM

Organizations

upvoted 2 papers 3 months ago

Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models

Paper • 2510.11683 • Published Oct 13, 2025 • 14

DeepPrune: Parallel Scaling without Inter-trace Redundancy

Paper • 2510.08483 • Published Oct 9, 2025 • 24

upvoted 2 papers 5 months ago

Thyme: Think Beyond Images

Paper • 2508.11630 • Published Aug 15, 2025 • 81

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 195

upvoted a paper 6 months ago

LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning

Paper • 2506.18841 • Published Jun 23, 2025 • 56

upvoted 3 papers 7 months ago

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following

Paper • 2506.09942 • Published Jun 11, 2025 • 5

MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos

Paper • 2506.04141 • Published Jun 4, 2025 • 29

AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

Paper • 2505.16944 • Published May 22, 2025 • 8

upvoted a paper 8 months ago

AdaptThink: Reasoning Models Can Learn When to Think

Paper • 2505.13417 • Published May 19, 2025 • 83

upvoted a paper 10 months ago

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

Paper • 2502.19328 • Published Feb 26, 2025 • 23

upvoted a paper 11 months ago

Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament

Paper • 2501.13007 • Published Jan 22, 2025 • 19

upvoted a paper about 1 year ago

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

Paper • 2412.15204 • Published Dec 19, 2024 • 37

upvoted a collection about 1 year ago

Crab

Collection

《Constraint Back-translation Improves Complex Instruction Following of Large Language Models》 • 7 items • Updated Nov 1, 2024 • 4

upvoted 4 papers about 1 year ago

Constraint Back-translation Improves Complex Instruction Following of Large Language Models

Paper • 2410.24175 • Published Oct 31, 2024 • 18

upvoted 2 papers over 1 year ago

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

Paper • 2409.02897 • Published Sep 4, 2024 • 47

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Paper • 2408.07055 • Published Aug 13, 2024 • 67

upvoted a collection over 1 year ago

ADELIE

Collection

EMNLP2024 Main Conference: 《Aligning Large Language Models on Information Extraction》 • 7 items • Updated Nov 4, 2024 • 3
