Motoki Wu

tokestermw

https://motoki.co

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Reinforcement Learning for Self-Improving Agent with Skill Library

liked a model 7 days ago

LiquidAI/LFM2-Audio-1.5B

liked a model 10 days ago

ResembleAI/chatterbox-turbo

upvoted a paper 2 days ago

Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published 8 days ago • 21

upvoted an article 11 days ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

12 days ago • 101

upvoted an article 16 days ago

Article

Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance

17 days ago

upvoted an article 17 days ago

Article

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

18 days ago

upvoted an article about 1 month ago

Article

Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks

Nov 21

upvoted a collection about 2 months ago

PromptMII

Collection

Prompt-MII: Meta-Learning Instruction Induction for LLMs. Link to paper: https://arxiv.org/abs/2510.16932 • 4 items • Updated Oct 21 • 2

upvoted a paper 3 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 316

upvoted an article 3 months ago

Article

mem-agent: Equipping LLM Agents with Memory Using RL

Oct 9

upvoted a paper 3 months ago

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6 • 125

upvoted a collection 3 months ago

Qwen3-Omni

Collection

6 items • Updated Oct 9 • 176

upvoted 5 papers 4 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 194

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 227

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28 • 110

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published Aug 23 • 22

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22 • 160

upvoted a collection 4 months ago

NVIDIA Nemotron V2

Collection

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 9 items • Updated 3 days ago • 100

upvoted 2 papers 4 months ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14 • 97

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

Paper • 2508.10751 • Published Aug 14 • 28

upvoted 2 papers 5 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 180

Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments

Paper • 2508.08791 • Published Aug 12 • 16
