This is an MXFP4_MOE quantization of the model Qwen3-VL-235B-A22B-Instruct.

Original model: https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Instruct

This GGUF quant was made possible by the excellent work of yairpatch and Thireus, and anyone else I may have forgotten to mention.

As of 2025-10-22 this is still experimental and should be treated as such.

To run it, you must download a custom build of llama.cpp from here:

https://github.com/Thireus/llama.cpp/releases/tag/tr-qwen3-vl-4-b7062-b471ef7
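As a rough sketch, building that tag and running the quant might look like the commands below. The GGUF file name, the `--mmproj` projector file, and the use of `llama-mtmd-cli` for image input are assumptions based on typical llama.cpp multimodal usage, not confirmed specifics of this experimental release; prebuilt binaries from the release page can be used instead of building from source.

```shell
# Build the custom llama.cpp at the tag from the release above
# (or grab a prebuilt binary from that release page instead).
git clone --branch tr-qwen3-vl-4-b7062-b471ef7 https://github.com/Thireus/llama.cpp
cmake -B build llama.cpp
cmake --build build --config Release -j

# Run with an image prompt. File names are illustrative; use the
# actual GGUF (and, for vision, mmproj) files from this repository.
./build/bin/llama-mtmd-cli \
  -m Qwen3-VL-235B-A22B-Instruct-MXFP4_MOE.gguf \
  --mmproj mmproj-Qwen3-VL-235B-A22B-Instruct.gguf \
  -ngl 99 \
  --image ./example.jpg \
  -p "Describe this image."
```

Since the release is experimental, flags and binary names may differ between tags; check the release notes on the linked page if the invocation fails.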

Format: GGUF
Model size: 235B params
Architecture: qwen3vlmoe
Quantization: 4-bit (MXFP4)