DFlash Collection: Block Diffusion for Flash Speculative Decoding • 22 items • Updated 5 days ago • 132
ParoQuant Collection: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • 24 items • Updated 12 days ago • 26