Update README.md
README.md
CHANGED
@@ -23,6 +23,20 @@ The model weights included here are PyTorch state dicts converted from the offic

 To avoid duplication and ease maintenance, this repository only contains the model weights; the self-contained source code can be found [here](https://github.com/rasbt/LLMs-from-scratch/blob/main/pkg/llms_from_scratch/qwen3.py). Instructions on how to use the code are provided below.

+
+
+# Qwen3 from-scratch code
+
+The standalone notebooks in this folder contain the from-scratch code in a linear fashion:
+
+1. [standalone-qwen3.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3.ipynb): The dense Qwen3 model without bells and whistles
+2. [standalone-qwen3-plus-kvcache.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-plus-kvcache.ipynb): Same as above, but with a KV cache for more efficient inference
+3. [standalone-qwen3-moe.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-moe.ipynb): Like the first notebook, but for the Mixture-of-Experts (MoE) variant
+4. [standalone-qwen3-moe-plus-kvcache.ipynb](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/11_qwen3/standalone-qwen3-moe-plus-kvcache.ipynb): Same as the MoE notebook, but with a KV cache for more efficient inference
+
+Alternatively, I organized the code into a Python package (including unit tests and CI), which you can run as described below.
+
+


 ### Using Qwen3 0.6B via the `llms-from-scratch` package