
Dazhi Jiang

thuzhizhi

jiangzizi

AI & ML interests

None yet

Organizations

None yet

Recent Activity

liked a model 8 days ago

moonshotai/Kimi-K2.5

Image-Text-to-Text • 171B • Updated 5 days ago • 504k • 1.96k

upvoted an article 29 days ago

Article

The Optimal Architecture for Small Language Models

Dec 26, 2025 • 116

liked a dataset about 1 month ago

BytedTsinghua-SIA/DAPO-Math-17k

Viewer • Updated Apr 18, 2025 • 1.79M • 5.17k • 150

liked a model about 1 month ago

zai-org/GLM-4.7

Text Generation • 358B • Updated 12 days ago • 114k • 1.91k

liked a dataset 3 months ago

qi6776/Recflow

Updated Jul 11, 2025 • 226 • 1

upvoted a paper 3 months ago

Data-Efficient RLVR via Off-Policy Influence Guidance

Paper • 2510.26491 • Published Oct 30, 2025 • 11

liked a Space 3 months ago

The Smol Training Playbook

📚

2.97k

The secrets to building world-class LLMs

liked a model 3 months ago

inclusionAI/LLaDA-MoE-7B-A1B-Instruct

7B • Updated Oct 28, 2025 • 4.35k • 64

liked a model 4 months ago

inclusionAI/LLaDA2.0-mini-preview

Text Generation • 16B • Updated Dec 19, 2025 • 2.69k • 88

upvoted a collection 4 months ago

LLaDA 2.0

Collection

7 items • Updated about 14 hours ago • 40

updated a Space 5 months ago

MorningMind NewsCards 🌱

🐳

Flip through news flashcards to stay informed

published a Space 5 months ago

MorningMind NewsCards 🌱

🐳

Flip through news flashcards to stay informed

liked a Space 5 months ago

DeepSite v4

🐳

16.4k

Generate any application by Vibe Coding it

liked a model 5 months ago

SJTU-DENG-Lab/D2F_LLaDA_Instruct_8B_Lora

Text Generation • Updated Aug 14, 2025 • 5

liked a Space 5 months ago

Qwen Image Edit

✒

808

Edit and enhance images based on descriptive instructions

New activity in GSAI-ML/LLaDA-1.5 6 months ago

Looking forward to a demo

#1 opened 8 months ago by zzzgry

liked 2 models 6 months ago

deepseek-ai/DeepSeek-V3.1

Text Generation • 685B • Updated Sep 5, 2025 • 115k • 814

deepseek-ai/DeepSeek-V3.1-Base

Text Generation • 685B • Updated Aug 26, 2025 • 4.84k • 1.01k

authored a paper 6 months ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 206

liked a model 6 months ago

zai-org/GLM-4.5V

Image-Text-to-Text • 108B • Updated Oct 25, 2025 • 39.3k • 705
