lvj/Qwen3-4B-parq-2b-weight-4b-embed-shared-gsm

Text Generation

text-generation-inference

lvj committed about 1 month ago

Commit 0ac66eb (verified) · 1 parent: 83679cb

Update README.md

Files changed (1)

README.md +1 -1

@@ -29,7 +29,7 @@ uv pip install --pre --index-url https://download.pytorch.org/whl/nightly/cu126
 ## QAT Finetuning with PARQ
-We apply QAT with a torchao optimizer-only package called [PARQ](https://github.com/pytorch/ao/tree/main/torchao/prototype/parq). This model is finetuned only on grade-school math data in order to maximize performance on [gsm8k](https://huggingface.co/datasets/openai/gsm8k).
+We apply QAT with a torchao optimizer-only package called [PARQ](https://github.com/pytorch/ao/tree/main/torchao/prototype/parq). This model is finetuned on grade-school math data in order to maximize performance on [gsm8k](https://huggingface.co/datasets/openai/gsm8k).
 The training command is provided below for reproducibility. Note that the model is initialized from a 2-bit model, [lvj/Qwen3-4B-parq-2b-weight-4b-embed-shared-hf](https://huggingface.co/lvj/Qwen3-4B-parq-2b-weight-4b-embed-shared-hf). To optimize for other tasks, replace `--dataset_name` with a custom finetuning dataset and remove `--dataset_sources`.
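
For readers unfamiliar with the "2-bit" in the model name: a 2-bit weight can take only four distinct values. The sketch below is a hypothetical min-max uniform quantizer in plain Python, written only to illustrate that constraint. It is an assumption for illustration, not PARQ's algorithm or any torchao API, and `quantize_2bit` is an invented name.

```python
def quantize_2bit(weights):
    """Snap each weight to one of 4 evenly spaced levels spanning [min, max].

    Illustrative only: a plain min-max uniform quantizer, NOT the PARQ method.
    Returns (dequantized_values, integer_codes), with codes in 0..3.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # Degenerate case: every weight is identical, so one level suffices.
        return list(weights), [0] * len(weights)
    step = (hi - lo) / 3  # 2 bits -> 4 levels -> 3 intervals
    codes = [round((w - lo) / step) for w in weights]
    dequant = [lo + c * step for c in codes]
    return dequant, codes


weights = [-0.9, -0.2, 0.1, 0.4, 0.9]
dequant, codes = quantize_2bit(weights)
print(codes)
print(dequant)
```

Each entry of `codes` fits in two bits, and `dequant` holds the values a 2-bit model would actually compute with. Quantization-aware training exists to keep accuracy acceptable despite this coarse rounding.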