Update README.md
README.md CHANGED
@@ -13,12 +13,13 @@ base_model:
 pipeline_tag: text-generation
 ---
 
-
-
-
-
-
-
+<div align="center">
+<span style="font-family: default; font-size: 1.5em;">QwQ‑32B‑Distill‑Qwen‑1.5B‑Alpha</span>
+<div>
+- Solo Innovation: Breaking Performance Barriers with Minimal Resources -
+<div><b>Powered by personal research with insights from Berkeley</b></div>
+</div>
+</div>
 
 ## Overview
 QwQ‑32B‑Distill‑Qwen‑1.5B‑Alpha is a language model built on the DeepSeek‑R1‑Distill‑Qwen‑1.5B base. Developed by a single researcher, drawing inspiration from Berkeley's research, it uses a reinforcement learning distillation framework aimed at improving performance while keeping training-data and compute requirements low. Despite having only 1.5B parameters, the model reaches a 47.18 MMLU score and outperforms prior baselines on several math and reasoning benchmarks.
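
For readers landing on this card, a minimal usage sketch follows. The diff does not state the Hub repository id, so `user/QwQ-32B-Distill-Qwen-1.5B-Alpha` below is a hypothetical placeholder; the rest is the standard 🤗 Transformers flow implied by `pipeline_tag: text-generation`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- the model card does not state the actual Hub path.
model_id = "user/QwQ-32B-Distill-Qwen-1.5B-Alpha"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Qwen-family distills are chat-tuned, so route prompts through the chat template.
messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```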
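
The card describes training only as a "reinforcement learning distillation framework" and gives no specifics. For orientation only, and not as the author's actual objective, here is the generic temperature-scaled KL distillation loss that such frameworks typically build on; the function name and temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic KL-divergence knowledge-distillation loss.

    Illustrative sketch only; the model card does not disclose the
    actual RL-distillation objective used for this model.
    """
    # Soften both distributions with the temperature before matching them.
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(s, t, reduction="batchmean") * temperature**2
```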