-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 195 -
zai-org/GLM-4.5
Text Generation • 358B • Updated • 20.6k • 1.39k -
zai-org/GLM-4.5-FP8
Text Generation • 358B • Updated • 2.43k • 76 -
GLM 4.5 Demo (API)
🏃 106 • Chat with GLM-4.5 to get answers and reasoning
Collections
Collections including paper arxiv:2508.06471
-
Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment
Paper • 2505.10597 • Published -
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
Paper • 2504.05535 • Published • 44 -
nvidia/HelpSteer3
Viewer • Updated • 133k • 2.77k • 93 -
nvidia/Nemotron-RL-instruction_following
Preview • Updated • 153 • 7
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 195 -
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
Paper • 2501.11425 • Published • 109 -
Agent Laboratory: Using LLM Agents as Research Assistants
Paper • 2501.04227 • Published • 95 -
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
Paper • 2507.06229 • Published • 75
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 5.69k • 1.24k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 358 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 501 -
zai-org/GLM-4.6
Text Generation • 357B • Updated • 94.6k • 1.2k -
deepseek-ai/DeepSeek-R1
Text Generation • 685B • Updated • 439k • 12.9k -
deepseek-ai/DeepSeek-V3.2-Exp
Text Generation • 685B • Updated • 71.6k • 933
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 277 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 263 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 250 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 259
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 195 -
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Paper • 2508.14444 • Published • 39 -
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Paper • 2507.06261 • Published • 64 -
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
Paper • 2506.13585 • Published • 273