mdouglas/gpt-oss-20b-bnb-nf4

Text Generation

4-bit precision


mdouglas HF Staff committed on Aug 8

Commit ddb8776 · verified · 1 Parent(s): 8956143

Update README.md

Files changed (1)

README.md +17 -3


@@ -1,3 +1,17 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+base_model:
+- openai/gpt-oss-20b
+library_name: transformers
+---
+> [!IMPORTANT]
+> This repository is an **experimental** re-quantized version of the original model [`openai/gpt-oss-20b`](https://huggingface.co/openai/gpt-oss-20b).
+>
+> It requires development versions of `transformers` and `bitsandbytes`.
+# Quantization
+The MLP expert parameters have been dequantized from MXFP4 to BF16, and then requantized in the NF4 double-quantization format using an experimental `bnb_4bit_target_parameters` configuration option. The self-attention, routing, and embedding parameters are kept in BF16.
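
To make the NF4 step described in the README concrete, here is a minimal pure-Python sketch of blockwise NF4 quantization and dequantization. It is illustrative only and is not how `bitsandbytes` implements it: the real library packs two 4-bit codes per byte and runs fused kernels. The block size of 64 and the helper names `quantize_nf4` and `dequantize_nf4` are assumptions made for this example, and the double-quantization step (which additionally compresses the per-block scales) is not shown.

```python
import math

# The 16 NF4 code values, normalized to [-1, 1], as published with QLoRA.
NF4_LEVELS = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def quantize_nf4(values, block_size=64):
    """Return (codes, scales): one 4-bit index per value, one absmax scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # Absmax scaling maps the block into [-1, 1]; fall back to 1.0 for an all-zero block.
        scale = max(abs(v) for v in block) or 1.0
        scales.append(scale)
        for v in block:
            x = v / scale
            # Pick the index of the nearest NF4 level.
            codes.append(min(range(16), key=lambda i: abs(NF4_LEVELS[i] - x)))
    return codes, scales


def dequantize_nf4(codes, scales, block_size=64):
    """Look up each code's level and rescale it by its block's absmax."""
    return [NF4_LEVELS[c] * scales[i // block_size] for i, c in enumerate(codes)]


# Hypothetical stand-in for a BF16 weight tensor, flattened to a list.
weights = [0.5 * math.sin(i) for i in range(128)]
codes, scales = quantize_nf4(weights)
restored = dequantize_nf4(codes, scales)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Because each block carries its own scale, an outlier in one block does not degrade the resolution of the others; the cost is one extra scale per block, which is the overhead that double quantization reduces.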