Quark Quantization Details

Package Versions:

Python: 3.12.8
PyTorch: 2.6.0.dev20250107+cpu
Transformers: 4.57.6
Quark: 0.1.0.dev0
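
To reproduce the environment above, it can help to compare installed versions against the listed ones before running anything. The following is a minimal sketch using only the standard library. The distribution names `torch` and `transformers` are assumed to be the pip names for PyTorch and Transformers; Quark is left out because its distribution name is not stated here. `check_versions` is a hypothetical helper, not part of any of these packages.

```python
from importlib import metadata

# Versions copied from the list above; the pip distribution names are assumptions.
EXPECTED = {
    "torch": "2.6.0.dev20250107+cpu",
    "transformers": "4.57.6",
}

def check_versions(expected):
    """Return {name: (wanted, installed)} for every package that does not match.

    `installed` is None when the distribution is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches

if __name__ == "__main__":
    for name, (wanted, installed) in check_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

An empty result means every listed package matches exactly.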

Quantization Setup: AWQ, uint4 weight-only, Group Size 32, bfloat16, lm_head included
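
As an illustration of what a group size of 32 means for 4-bit weights, the sketch below quantizes a row of weights in independent groups of 32, each with its own scale and zero point. It assumes asymmetric uniform quantization to the range 0..15, which is a common reading of an unsigned 4-bit scheme; it is not Quark's implementation, and it does not include AWQ's activation-aware scale search, which happens before this step. All function names are hypothetical.

```python
def quantize_group(weights, n_bits=4):
    """Asymmetric uniform quantization of one group to unsigned n_bits integers."""
    qmax = 2 ** n_bits - 1
    # Extend the range to include 0 so the zero point always fits in [0, qmax].
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / qmax
    if scale == 0.0:
        scale = 1.0  # all-zero group: any scale works
    zero_point = max(0, min(qmax, round(-w_min / scale)))
    q = [max(0, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize_group(q, scale, zero_point):
    """Map quantized integers back to approximate floating-point weights."""
    return [(v - zero_point) * scale for v in q]

def quantize_row(row, group_size=32):
    """Quantize a row in consecutive groups, one (q, scale, zero_point) per group."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    return [
        quantize_group(row[i:i + group_size])
        for i in range(0, len(row), group_size)
    ]

if __name__ == "__main__":
    row = [(i - 16) / 16 for i in range(64)]  # toy weights, two groups of 32
    for q, scale, zp in quantize_row(row):
        print(f"scale={scale:.4f} zero_point={zp} first_values={q[:4]}")
```

Smaller groups give each scale fewer weights to cover, which usually lowers quantization error at the cost of storing more scales and zero points.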

Perplexity: 12.99265193939209
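
For reference, perplexity is conventionally the exponential of the mean negative log-likelihood per token. The sketch below shows that definition on a list of per-token log-probabilities (natural log). It is the textbook formula, not necessarily the exact evaluation script used for the number above, and `perplexity` is a hypothetical helper.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), using natural logs."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token has perplexity 4.
    print(perplexity([math.log(0.25)] * 4))
```

Under this definition, a perplexity of about 12.99 corresponds to a mean negative log-likelihood of about 2.56 nats per token.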

Quark Command:

python quantize_quark_granite.py --model_dir granite-4.0-1b-converted-v-4.57.6 --output_dir granite_awq_grp32 --device cpu --quant_scheme uint4_wo_32 --num_calib_data 128 --seq_len 512 --quant_algo awq --dataset pileval_for_awq_benchmark --model_export hf_format --data_type bfloat16 --exclude_layers

Safetensors

Model size: 0.5B params

Tensor type: I32, BF16
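
The I32 tensor type is typical of 4-bit checkpoints that pack several quantized weights into each 32-bit word. The sketch below assumes eight uint4 values per word with the lowest nibble first; the actual bit order in this checkpoint is not stated here, and both function names are hypothetical.

```python
def pack_uint4(values):
    """Pack eight uint4 values into one unsigned 32-bit word, lowest nibble first."""
    if len(values) != 8 or any(not 0 <= v <= 15 for v in values):
        raise ValueError("expected exactly eight values in the range 0..15")
    word = 0
    for i, v in enumerate(values):
        word |= v << (4 * i)
    return word

def unpack_uint4(word):
    """Inverse of pack_uint4: split a 32-bit word into eight uint4 values."""
    return [(word >> (4 * i)) & 0xF for i in range(8)]

if __name__ == "__main__":
    packed = pack_uint4([1, 2, 3, 4, 5, 6, 7, 8])
    print(hex(packed))           # 0x87654321
    print(unpack_uint4(packed))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

If the weights are packed this way, each stored I32 element holds eight weights, which would likely explain why the reported parameter count is smaller than the number of weights in the original model.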
