---
license: apache-2.0
library_name: mlx
pipeline_tag: text-generation
base_model: Qwen/Qwen3-Next-80B-A3B-Thinking
tags:
- mlx
- qwen3_next
- 6-bit
- affine
- text-generation
quantization_config:
  bits: 6
  mode: affine
  group_size: 64
model-index:
- name: Qwen3-Next-80B-A3B-Thinking 6-bit (MLX)
  results: []
---

# Qwen3-Next-80B-A3B-Thinking — MLX 6-bit (affine)

A 6-bit affine-quantized checkpoint of the base model `Qwen/Qwen3-Next-80B-A3B-Thinking`, packaged for Apple's MLX framework for local inference on Apple Silicon.

## Key details

- Format: MLX runtime, sharded safetensors weights
- Quantization: affine int6, `group_size=64` (see the sketch after this list)
- Task: text generation / chat
- Tokenizer: provided via `tokenizer.json` (BPE) with `chat_template.jinja`
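
As a rough illustration of what "affine int6, `group_size=64`" means, here is a minimal NumPy sketch of per-group affine quantization. It is an illustrative sketch only: the helper names and shapes are assumptions, and MLX's real kernels pack the 6-bit codes into integer words and fuse dequantization into the matmul.

```python
import numpy as np

BITS = 6
GROUP_SIZE = 64
QMAX = 2**BITS - 1  # 63; quantized codes run 0..63

def quantize_affine(w: np.ndarray):
    """Quantize a flat float array in groups of GROUP_SIZE values."""
    groups = w.reshape(-1, GROUP_SIZE)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / QMAX           # one scale per 64-value group
    q = np.round((groups - w_min) / scale)   # map each group onto 0..63
    return q.astype(np.uint8), scale, w_min

def dequantize_affine(q, scale, w_min):
    """Reconstruct approximate weights: w ≈ q * scale + w_min."""
    return q * scale + w_min

w = np.random.randn(4096).astype(np.float32)
q, scale, offset = quantize_affine(w)
err = np.abs(dequantize_affine(q, scale, offset).reshape(-1) - w).max()
print(f"max abs reconstruction error: {err:.5f}")
```

Each group of 64 weights shares one scale and one offset, so with fp16 group metadata the storage cost works out to roughly (64 × 6 + 16 + 16) / 64 = 6.5 bits per weight.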

## Usage (MLX)

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

repo_id = "abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-6bit-mlx"
model, tokenizer = load(repo_id)

out = generate(model, tokenizer, "List 5 creative dinner ideas.", max_tokens=200)
print(out)
```
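
Because this is a chat/thinking model and the repo ships `chat_template.jinja`, prompts are normally wrapped with the chat template rather than passed raw. A minimal sketch, assuming the tokenizer returned by `load` exposes the standard Hugging Face `apply_chat_template` API (as `mlx_lm`'s tokenizer wrapper does in current releases):

```python
from mlx_lm import load, generate

model, tokenizer = load("abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-6bit-mlx")

# Wrap the user turn with the bundled chat template before generating,
# appending the assistant header so the model starts its reply.
messages = [{"role": "user", "content": "List 5 creative dinner ideas."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(generate(model, tokenizer, prompt, max_tokens=512))
```

A thinking model typically emits its reasoning before the final answer, so a larger `max_tokens` budget than usual is advisable.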

## Benchmarks

- To be added once the upload completes; see `scripts/bench/qwen_mxfp4_vs_int4.py` and `scripts/bench/model_queue_eval.py`.

## License

- Apache-2.0 for this packaging; see `LICENSE`.
- Base model license and terms apply (`Qwen/Qwen3-Next-80B-A3B-Thinking`).