Update README.md
README.md
CHANGED
@@ -29,7 +29,7 @@ uv pip install --pre --index-url https://download.pytorch.org/whl/nightly/cu126
 
 ## QAT Finetuning with PARQ
 
-We apply QAT with a torchao optimizer-only package called [PARQ](https://github.com/pytorch/ao/tree/main/torchao/prototype/parq). This model is finetuned
+We apply QAT with a torchao optimizer-only package called [PARQ](https://github.com/pytorch/ao/tree/main/torchao/prototype/parq). This model is finetuned on grade-school math data in order to maximize performance on [gsm8k](https://huggingface.co/datasets/openai/gsm8k).
 
 The training command is provided below for reproducibility. Note that the model is initialized from a 2-bit model, [lvj/Qwen3-4B-parq-2b-weight-4b-embed-shared-hf](https://huggingface.co/lvj/Qwen3-4B-parq-2b-weight-4b-embed-shared-hf). To optimize for other tasks, replace `--dataset_name` with a custom finetuning dataset and remove `--dataset_sources`.
 