Update README.md
README.md
CHANGED
@@ -53,7 +53,8 @@ model-index:
 This model is a quantized version of OpenAI's GPT-OSS-20B using NVIDIA's advanced NVFP4 format. It follows the official NVIDIA TensorRT Model Optimizer methodology, providing superior accuracy retention compared to MXFP4 quantization while maintaining significant memory efficiency gains.
 
 ## Blog
-Fine-Tuning gpt-oss for Accuracy and Performance with Quantization Aware Training:
+Fine-Tuning gpt-oss for Accuracy and Performance with Quantization Aware Training:
+https://developer.nvidia.com/blog/fine-tuning-gpt-oss-for-accuracy-and-performance-with-quantization-aware-training/
 
 
 ## Key Features