---
base_model:
- meta-llama/Llama-3.1-70B-Instruct
base_model_relation: quantized
license: llama3.1
---

# Model Card

- Base model: `meta-llama/Llama-3.1-70B-Instruct`
- Quantization method: BlockLDLQ with GuidedQuant Hessian
- Target bit-width: 3
- Backend kernel: QTIP kernel (HYB variant)
- Calibration data: RedPajama (1024 sentences / 4096 tokens)
- Calibration objective: Next-token prediction
- num_groups (for GuidedQuant Hessian): 1
- skip_list: 0_v (the 0_v layer is left unquantized, following the YAQA paper)

# How to run

- Follow the instructions in https://github.com/snu-mllab/GuidedQuant and https://github.com/Cornell-RelaxML/qtip (a minimal inference sketch is given at the end of this card).

# References

- [Model Paper](https://arxiv.org/abs/2505.07004)
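
# Example

A minimal inference sketch. The QTIP HYB kernels are not part of stock `transformers`, so the checkpoint is loaded through the qtip repository's own loader. The loader path (`lib.utils.unsafe_import.model_from_hf_path`), its return signature, and the `REPO_ID` placeholder are assumptions, not confirmed by this card; consult the two repositories' READMEs for the actual entry points.

```python
# Minimal sketch, assuming it is run from the root of a cloned
# https://github.com/Cornell-RelaxML/qtip checkout with its requirements
# installed. The loader import below is an assumption about the repo
# layout; verify against the qtip README.
import torch
from transformers import AutoTokenizer

from lib.utils.unsafe_import import model_from_hf_path  # assumed entry point

REPO_ID = "..."  # hypothetical: the Hugging Face Hub id of this checkpoint

# Load the 3-bit QTIP-format weights (the loader wires up the HYB kernels)
# and the tokenizer of the original base model.
model, model_str = model_from_hf_path(REPO_ID)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-70B-Instruct")

prompt = "Explain 3-bit weight quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```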