Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers • arXiv:2402.08958 • Published Feb 14, 2024
Attention-aware Post-training Quantization without Backpropagation • arXiv:2406.13474 • Published Jun 19, 2024