OpenCompass

community

https://opencompass.org.cn/

OpenCompassX

open-compass

Recent Activity

Sudanl updated a model 2 days ago

opencompass/CompassVerifier-3B

Sudanl updated a model 2 days ago

opencompass/CompassVerifier-32B

ZwwWayne authored a paper 24 days ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Papers

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Sudanl

updated 2 models 2 days ago

opencompass/CompassVerifier-3B

3B • Updated 2 days ago • 1.25k • 6

opencompass/CompassVerifier-32B

33B • Updated 2 days ago • 71 • 7

vansin

posted an update 17 days ago

Post

254

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

ZwwWayne

authored a paper 24 days ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 25 days ago • 46

vanilla1116

authored 4 papers 25 days ago

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Paper • 2508.03686 • Published Aug 5, 2025 • 37

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21, 2025 • 259

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published 25 days ago • 34

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 25 days ago • 46

vanilla1116

submitted 3 papers to Daily Papers 25 days ago

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Paper • 2512.10534 • Published 26 days ago • 31

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 25 days ago • 46

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published 25 days ago • 34

nebulae09

authored 3 papers about 1 month ago

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

Paper • 2509.24709 • Published Sep 29, 2025 • 6

ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

Paper • 2511.14366 • Published Nov 18, 2025 • 16

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published Dec 4, 2025 • 47

yuhangzang

authored a paper about 1 month ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published Dec 4, 2025 • 47

Sudanl

authored a paper about 1 month ago

How Far Are We from Genuinely Useful Deep Research Agents?

Paper • 2512.01948 • Published Dec 1, 2025 • 54

jnanliu

authored 2 papers about 1 month ago

How Brittle is Agent Safety? Rethinking Agent Risk under Intent Concealment and Task Complexity

Paper • 2511.08487 • Published Nov 11, 2025 • 2

Rectifying LLM Thought from Lens of Optimization

Paper • 2512.01925 • Published Dec 1, 2025 • 24

Sudanl

updated a model about 1 month ago

opencompass/CompassVerifier-7B

8B • Updated Nov 26, 2025 • 609 • 4

vansin

in opencompass/RISEBench_Gallery about 1 month ago

CPU quota limit, can't start

#1 opened about 1 month ago by