
TreePO

updated Sep 4

  • TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

    Paper • 2508.17445 • Published Aug 24 • 80

  • m-a-p/TreePO-Qwen2.5-7B

    Text Generation • 8B • Updated 19 days ago • 79 • 2

  • m-a-p/TreePO_data

    Viewer • Updated 19 days ago • 3.12k • 178

  • m-a-p/TreePO-Qwen2.5-7B_fixed-div

    8B • Updated Aug 31 • 63

  • m-a-p/TreePO-Qwen2.5-7B_GRPO-TreePO-Sampling

    8B • Updated Sep 4 • 65

  • m-a-p/TreePO-Qwen2.5-7B_Low_Prob_Encourage

    8B • Updated Sep 4 • 61

  • m-a-p/TreePO-Qwen2.5-7B_Naive2Low_Scheduler

    8B • Updated Sep 4 • 63