zombieofCrypto's Collections: llm_improvement_research
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper
• 2501.12948
• Published • 443
LightThinker: Thinking Step-by-Step Compression
Paper
• 2502.15589
• Published • 31
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Paper
• 2405.04434
• Published • 25
Model Compression and Efficient Inference for Large Language Models: A Survey
Paper
• 2402.09748
• Published • 2
Efficient Transformers: A Survey
Paper
• 2009.06732
• Published • 1
Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference
Paper
• 2403.09054
• Published • 1
FastCache: Optimizing Multimodal LLM Serving through Lightweight KV-Cache Compression Framework
Paper
• 2503.08461
• Published
Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning
Paper
• 2503.04973
• Published • 27