This model was created by applying Quark with calibration samples from the Pile dataset.
Every eight int4 values are packed into a single int32 integer following the sequence defined by order_map = [0, 2, 4, 6, 1, 3, 5, 7].
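The packing described above can be sketched in plain Python. This is a minimal illustration, not Quark's implementation: it assumes that 4-bit slot `i` of the packed word (counting from the least significant nibble) holds the input element at index `order_map[i]`, which is the convention used by AWQ-style packers. The function names `pack_int4` and `unpack_int4` are hypothetical, and the values are treated as unsigned nibbles (0 to 15).

```python
# Assumed convention: nibble slot i (LSB first) stores values[ORDER_MAP[i]].
ORDER_MAP = [0, 2, 4, 6, 1, 3, 5, 7]


def pack_int4(values):
    """Pack eight unsigned 4-bit values into one 32-bit word (hypothetical helper)."""
    if len(values) != 8:
        raise ValueError("expected exactly eight int4 values")
    word = 0
    for slot, src in enumerate(ORDER_MAP):
        v = values[src]
        if not 0 <= v <= 15:
            raise ValueError(f"value {v} does not fit in 4 bits")
        # Place the selected element into its 4-bit slot.
        word |= v << (4 * slot)
    return word


def unpack_int4(word):
    """Invert pack_int4: recover the eight 4-bit values in their original order."""
    values = [0] * 8
    for slot, src in enumerate(ORDER_MAP):
        values[src] = (word >> (4 * slot)) & 0xF
    return values


packed = pack_int4([0, 1, 2, 3, 4, 5, 6, 7])
restored = unpack_int4(packed)
```

Under this assumed convention, the even-indexed elements land in the low four nibbles and the odd-indexed elements in the high four, and `unpack_int4` restores the original order. If the exported checkpoint uses a different slot convention, only the two loops over `ORDER_MAP` need to change.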
Follow the guide "Quantizing Sharded Grok-1 with Quark for SGLang" to produce the quantized model.
Quark has its own export format and allows FP8 quantized models to be efficiently deployed using the SGLang backend.
| Benchmark | grok-1 | grok-1-W4A8KV8 (this model) |
| --- | --- | --- |
| gsm8k | 0.821 | 0.817 |
Modifications copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
Base model: lmzheng/grok-1