Optimizing Large Language Models through Quantization: A Comparative Analysis of PTQ and QAT Techniques • Paper • 2411.06084 • Published Nov 9, 2024
DL-QAT: Weight-Decomposed Low-Rank Quantization-Aware Training for Large Language Models • Paper • 2504.09223 • Published Apr 12, 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float • Paper • 2504.11651 • Published Apr 15, 2025
Agent models: Internalizing Chain-of-Action Generation into Reasoning models • Paper • 2503.06580 • Published Mar 9, 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks • Paper • 2406.08394 • Published Jun 12, 2024
Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model • Paper • 2409.13407 • Published Sep 20, 2024
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding • Paper • 2406.19389 • Published Jun 27, 2024