This is an MXFP4_MOE quantization of the model Qwen3-VL-235B-A22B-Instruct.

Original model: https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Instruct

This GGUF quant was made possible by the excellent work of yairpatch and Thireus, and anyone else I may have forgotten to mention.

As of 2025-10-22 this is still experimental and should be treated as such.

To run it, you must download a custom build of llama.cpp from here:

https://github.com/Thireus/llama.cpp/releases/tag/tr-qwen3-vl-4-b7062-b471ef7
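As a rough sketch, building that tag and running the quant might look like the commands below. The GGUF file name, the `--mmproj` projector file, and the use of `llama-mtmd-cli` for image input are assumptions based on typical llama.cpp multimodal usage, not confirmed specifics of this experimental release; prebuilt binaries from the release page can be used instead of building from source.

```shell
# Build the custom llama.cpp at the tag from the release above
# (or grab a prebuilt binary from that release page instead).
git clone --branch tr-qwen3-vl-4-b7062-b471ef7 https://github.com/Thireus/llama.cpp
cmake -B build llama.cpp
cmake --build build --config Release -j

# Run with an image prompt. File names are illustrative; use the
# actual GGUF (and, for vision, mmproj) files from this repository.
./build/bin/llama-mtmd-cli \
  -m Qwen3-VL-235B-A22B-Instruct-MXFP4_MOE.gguf \
  --mmproj mmproj-Qwen3-VL-235B-A22B-Instruct.gguf \
  -ngl 99 \
  --image ./example.jpg \
  -p "Describe this image."
```

Since the release is experimental, flags and binary names may differ between tags; check the release notes on the linked page if the invocation fails.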

Format: GGUF
Model size: 235B params
Architecture: qwen3vlmoe
Quantization: 4-bit (MXFP4)