Native Hybrid Attention for Efficient Sequence Modeling Paper • 2510.07019 • Published 20 days ago • 16
Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation Paper • 2509.14760 • Published Sep 18 • 52
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published Aug 20 • 36
CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models Paper • 2505.20767 • Published May 27 • 1
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published Aug 13 • 53
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation Paper • 2507.10524 • Published Jul 14 • 70
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30 • 88
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction Paper • 2507.02025 • Published Jul 2 • 35
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs Paper • 2506.14429 • Published Jun 17 • 44
Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations Paper • 2506.04633 • Published Jun 5 • 19