abhranil14's Collections

Augmenting Pretrained FMs with Post-Training/RL

updated Jul 6

  • AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO

    Paper • 2502.14669 • Published Feb 20 • 14

  • R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

    Paper • 2503.05592 • Published Mar 7 • 27

  • Offline Reinforcement Learning for LLM Multi-Step Reasoning

    Paper • 2412.16145 • Published Dec 20, 2024 • 38

  • OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

    Paper • 2503.17352 • Published Mar 21 • 24

  • SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

    Paper • 2504.11468 • Published Apr 10 • 30

  • Reinforcement Pre-Training

    Paper • 2506.08007 • Published Jun 9 • 262

  • Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

    Paper • 2507.00432 • Published Jul 1 • 79