gulnawaz123 (GULNAWAZ sh.)

upvoted 8 papers 4 months ago

A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8 • 92

Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 37

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262

Energy-Based Transformers are Scalable Learners and Thinkers

Paper • 2507.02092 • Published Jul 2 • 69

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1 • 79

4KAgent: Agentic Any Image to 4K Super-Resolution

Paper • 2507.07105 • Published Jul 9 • 104

MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4 • 155

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1 • 238

upvoted 9 papers 6 months ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 187

Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics

Paper • 2506.00070 • Published May 29 • 29

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces

Paper • 2506.00123 • Published May 30 • 35

Video World Models with Long-term Spatial Memory

Paper • 2506.05284 • Published Jun 5 • 54

ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development

Paper • 2506.05010 • Published Jun 5 • 79

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Paper • 2505.24760 • Published May 30 • 74

AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 141

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30 • 274

upvoted a collection 6 months ago

audio

Collection

109 items • Updated 19 days ago • 6

upvoted 2 papers 9 months ago

RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

Paper • 2502.13144 • Published Feb 18 • 38

PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

Paper • 2502.14282 • Published Feb 20 • 29
