guaguastandup's Collections

Video-Understanding

updated about 1 month ago

  • OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding

    Paper • 2512.23646 • Published Dec 29, 2025 • 15

  • Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning

    Paper • 2602.07845 • Published Feb 8 • 69