---
license: apache-2.0
library_name: mlx
pipeline_tag: text-generation
base_model: Qwen/Qwen3-Next-80B-A3B-Thinking
tags:
- mlx
- qwen3_next
- 6-bit
- affine
- text-generation
quantization_config:
  bits: 6
  mode: affine
  group_size: 64
model-index:
- name: Qwen3-Next-80B-A3B-Thinking 6-bit (MLX)
  results: []
---

# Qwen3-Next-80B-A3B-Thinking — MLX 6-bit (affine)

Apple MLX-optimized, 6-bit affine-quantized checkpoint of the base model `Qwen/Qwen3-Next-80B-A3B-Thinking` for local inference on Apple Silicon.

Key details

- Format: MLX runtime, sharded safetensors weights
- Quantization: affine int6, group_size=64 (a short dequantization sketch appears at the end of this card)
- Task: text generation / chat
- Tokenizer: provided via `tokenizer.json` (BPE) with `chat_template.jinja` (a chat-template example appears at the end of this card)

## Usage (MLX)

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

repo_id = "abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-6bit-mlx"
model, tokenizer = load(repo_id)
out = generate(model, tokenizer, "List 5 creative dinner ideas.", max_tokens=200)
print(out)
```

A CLI smoke test is also sketched at the end of this card.

## Benchmarks

- To be added after the upload completes; see `scripts/bench/qwen_mxfp4_vs_int4.py` and `scripts/bench/model_queue_eval.py`.

## License

- Apache-2.0 for this packaging. See `LICENSE`.
- The base model's license and terms also apply (Qwen/Qwen3-Next-80B-A3B-Thinking).
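## Chat template (MLX)

Because this is a thinking/chat checkpoint shipped with `chat_template.jinja`, prompts generally behave better when wrapped in the chat template rather than passed raw. A minimal sketch using the tokenizer wrapper returned by `mlx_lm.load` (the message content and `max_tokens` budget are illustrative):

```python
from mlx_lm import load, generate

model, tokenizer = load("abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-6bit-mlx")

# Wrap the user turn in the bundled chat template so the model sees the
# control tokens it was trained with, including its thinking format.
messages = [{"role": "user", "content": "List 5 creative dinner ideas."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(generate(model, tokenizer, prompt=prompt, max_tokens=512))
```

Thinking models tend to spend tokens on reasoning before the final answer, so a larger `max_tokens` budget than in the plain-prompt example above is usually worthwhile.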
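## CLI quick test (MLX)

`mlx-lm` also installs command-line entry points, so a quick smoke test needs no Python at all. A sketch, assuming a recent `mlx-lm` (flag names can vary across versions; check `mlx_lm.generate --help`):

```bash
mlx_lm.generate \
  --model abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-6bit-mlx \
  --prompt "List 5 creative dinner ideas." \
  --max-tokens 200
```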
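## Quantization details

Affine quantization stores one scale and one bias per group of `group_size` consecutive weights and reconstructs each weight as `scales * q + biases`; with `group_size=64` the per-group metadata adds roughly half a bit per weight on top of the 6-bit integers. A toy sketch using MLX's quantization helpers (the weight matrix is made up, and 6-bit quantization assumes a reasonably recent MLX release):

```python
import mlx.core as mx

# Toy weight matrix; real checkpoints quantize each linear layer's weights.
w = mx.random.normal((8, 256))

# Each group of 64 values along the last axis shares one scale and one
# bias, so the dequantized weights are scales * q + biases per group.
q, scales, biases = mx.quantize(w, group_size=64, bits=6)
w_hat = mx.dequantize(q, scales, biases, group_size=64, bits=6)

# Worst-case per-weight reconstruction error introduced by quantization.
print(mx.max(mx.abs(w - w_hat)))
```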