AXCXEPT committed (verified)
Commit c64a30a · Parent: 9d27549

Update README.md

Files changed (1): README.md (+7 -6)
README.md CHANGED
@@ -13,12 +13,13 @@ base_model:
 pipeline_tag: text-generation
 ---
 
-# QwQ‑32B‑Distill‑Qwen‑1.5B‑Alpha
-
-## - Solo Innovation: Breaking Performance Barriers with Minimal Resources -
-
-### Powered by personal research with insights from Berkeley
-
+<div align="center">
+<span style="font-family: default; font-size: 1.5em;">QwQ‑32B‑Distill‑Qwen‑1.5B‑Alpha</span>
+<div>
+- Solo Innovation: Breaking Performance Barriers with Minimal Resources -
+<div><b>Powered by personal research with insights from Berkeley</b></div>
+</div>
+</div>
 
 ## Overview
 QwQ‑32B‑Distill‑Qwen‑1.5B‑Alpha is a groundbreaking language model built on top of the DeepSeek‑R1‑Distill‑Qwen‑1.5B base. Developed entirely by a solo innovator—with valuable inspiration from Berkeley’s research—the model employs a novel reinforcement learning distillation framework that dramatically enhances performance while keeping training data requirements and compute costs to a minimum. Despite having only 1.5B parameters, the model achieves a striking 47.18 MMLU score and outperforms prior baselines on multiple math and reasoning benchmarks.
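Since the front matter declares `pipeline_tag: text-generation`, a reader would load this checkpoint like any causal language model in Hugging Face `transformers`. The sketch below is a minimal, hedged example: the repository id `AXCXEPT/QwQ-32B-Distill-Qwen-1.5B-Alpha` is an assumption inferred from the commit author and model name, and the standard `AutoModelForCausalLM`/`AutoTokenizer` loading path is assumed to work as it does for the DeepSeek-R1-Distill-Qwen-1.5B base; neither is confirmed by this diff.

```python
# Minimal text-generation sketch for this model with transformers.
# Assumptions (not confirmed by the diff): the checkpoint is published as
# "AXCXEPT/QwQ-32B-Distill-Qwen-1.5B-Alpha" and loads with the same
# AutoModelForCausalLM / AutoTokenizer classes as its
# DeepSeek-R1-Distill-Qwen-1.5B base.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AXCXEPT/QwQ-32B-Distill-Qwen-1.5B-Alpha"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 1.5B-parameter model fits on a single GPU
    device_map="auto",
)

# A math-style prompt, matching the reasoning benchmarks the README highlights.
prompt = "Solve step by step: what is 17 * 23?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```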