Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference Paper • 2406.10774 • Published Jun 16, 2024 • 4
Video-As-Prompt: Unified Semantic Control for Video Generation Paper • 2510.20888 • Published Oct 23, 2025 • 45
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published Oct 22, 2025 • 59