Serve this model with vLLM or TensorRT-LLM; a minimal vLLM example is shown after the requirements below.

Requirements:

  • Blackwell GPUs only.
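
Below is a minimal offline-inference sketch using vLLM's Python `LLM` API. The `tensor_parallel_size`, `max_model_len`, and sampling values are placeholders to adjust for your hardware; vLLM typically picks up the quantization method from the checkpoint's config, so no explicit quantization flag is passed here.

```python
# Minimal vLLM offline-inference sketch (illustrative settings, not a tuned config).
from vllm import LLM, SamplingParams

llm = LLM(
    model="kalbon/Behemoth-X-123B-v2-NVFP4",
    tensor_parallel_size=2,   # assumption: split across 2 GPUs; set to your GPU count
    max_model_len=8192,       # assumption: reduce if you run out of memory
)

sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

outputs = llm.generate(["Write a short story about a dragon."], sampling_params)
print(outputs[0].outputs[0].text)
```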