Submitted by myownskyW7 · 47 · PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction · 11 authors · 133 · 3
Submitted by Tigerph · 24 · Aligning Large Language Models via Self-Steering Optimization · 9 authors · 20 · 3
Submitted by michaelryoo · 18 · xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs · 10 authors · 2
Submitted by xing0047 · 18 · Mitigating Object Hallucination via Concentric Causal Attention · 4 authors · 63 · 2
Submitted by t1101675 · 17 · MiniPLM: Knowledge Distillation for Pre-Training Language Models · 5 authors · 65 · 2
Submitted by AtsuMiyai · 15 · JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation · 8 authors · 2
Submitted by OliverSieberling · 9 · EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search · 4 authors · 36 · 2
Submitted by bryanchrist · 8 · Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes · 4 authors · 2
Submitted by Xi8006 · 5 · 3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors · 3 authors · 2