reasoning-project

Cartinoe5930

authored a paper 3 months ago

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Paper • 2510.04230 • Published Oct 5, 2025 • 26

JW17

authored 2 papers 6 months ago

AlphaPO -- Reward shape matters for LLM alignment

Paper • 2501.03884 • Published Jan 7, 2025 • 2

Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning

Paper • 2504.03380 • Published Apr 4, 2025

JW17

authored a paper 8 months ago

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

Paper • 2505.11855 • Published May 17, 2025 • 10

amphora

authored a paper 8 months ago

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

Paper • 2505.11855 • Published May 17, 2025 • 10

Cartinoe5930

authored 2 papers 8 months ago

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

Paper • 2505.11855 • Published May 17, 2025 • 10

Won: Establishing Best Practices for Korean Financial NLP

Paper • 2503.17963 • Published Mar 23, 2025

amphora

authored a paper 10 months ago

Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning

Paper • 2502.17407 • Published Feb 24, 2025 • 26

Cartinoe5930

authored 2 papers 10 months ago

Multi-Step Reasoning in Korean and the Emergent Mirage

Paper • 2501.05712 • Published Jan 10, 2025

Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning

Paper • 2502.17407 • Published Feb 24, 2025 • 26

JW17

updated a model 11 months ago

reasoning-project/Q25M-1.5B-MR1-50k-SFT-v0.2-3epoch

Text Generation • 2B • Updated Feb 16, 2025 • 4

JW17

published a model 11 months ago

reasoning-project/Q25M-1.5B-MR1-50k-SFT-v0.2-3epoch

Text Generation • 2B • Updated Feb 16, 2025 • 4

JW17

updated a model 11 months ago

reasoning-project/Q25M-1.5B-Open-R1-55k-SFT-v0.1

Text Generation • 2B • Updated Feb 15, 2025 • 3

JW17

published a model 11 months ago

reasoning-project/Q25M-1.5B-Open-R1-55k-SFT-v0.1

Text Generation • 2B • Updated Feb 15, 2025 • 3

JW17

updated a model 11 months ago

reasoning-project/Q25-1.5B-PRIME-55K-GRPO-Acc2-format5e1

Updated Feb 14, 2025

JW17

published a model 11 months ago

reasoning-project/Q25-1.5B-PRIME-55K-GRPO-Acc2-format5e1

Updated Feb 14, 2025

JW17

updated a model 11 months ago

reasoning-project/Q25-1.5B-Open-R1-55K-GRPO-Acc2-format5e1

Updated Feb 14, 2025

JW17

published a model 11 months ago

reasoning-project/Q25-1.5B-Open-R1-55K-GRPO-Acc2-format5e1

Updated Feb 14, 2025

Cartinoe5930

authored 2 papers 12 months ago

LLM-as-a-Judge & Reward Model: What They Can and Cannot Do

Paper • 2409.11239 • Published Sep 17, 2024 • 3

Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap

Paper • 2501.02448 • Published Jan 5, 2025

Team members 3