# Qwen3-Coder-REAP-25B-A3B-qx65x-hi-mlx

This version of the Deckard (qx) formula quantizes the embeddings at 6 bits, along with the head and select attention paths, leaving the rest of the weights at 5 bits.

The model is quantized with group size 32 (hi).

It is intended as a mid-range quant with quality approaching q8 that runs comfortably on a smaller Mac.
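
The mixed-precision layout described above can be sketched as a per-layer rule in the spirit of the `quant_predicate` hook that mlx-lm's convert utility accepts for mixed quants. This is a minimal illustration, not the published recipe: the exact set of "select attention paths" is an assumption, and the layer-name substrings are hypothetical.

```python
# Hypothetical sketch of the qx65x layout: 6-bit embeddings, head, and
# select attention projections; 5-bit everywhere else; group size 32
# throughout (the "hi" variant). Layer-name substrings are assumptions.

GROUP_SIZE = 32  # group size 32 (hi)

HIGH_PRECISION_KEYS = (
    "embed_tokens",  # token embeddings -> 6 bit
    "lm_head",       # output head -> 6 bit
    "q_proj",        # select attention paths (assumed) -> 6 bit
    "v_proj",
)

def qx65x_predicate(path: str) -> dict:
    """Return per-weight quantization settings for the weight at `path`."""
    bits = 6 if any(key in path for key in HIGH_PRECISION_KEYS) else 5
    return {"bits": bits, "group_size": GROUP_SIZE}
```

If your mlx-lm version supports a quantization predicate, a rule like this can be passed to its convert step to reproduce a similar mixed layout; the real hook also receives the module itself, which this sketch ignores.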
This is an update of the model [Qwen3-Coder-REAP-25B-A3B-qx64-hi-mlx](https://huggingface.co/nightmedia/Qwen3-Coder-REAP-25B-A3B-qx64-hi-mlx), which keeps the base and embeddings at 4 bit.
Metrics coming soon.