atifsal's Collections

Research-Papers

updated 12 days ago

  • MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens

    Paper • 2404.03413 • Published Apr 4, 2024 • 28

  • RepVideo: Rethinking Cross-Layer Representation for Video Generation

    Paper • 2501.08994 • Published Jan 15 • 15

  • Hierarchical Cross-modal Prompt Learning for Vision-Language Models

    Paper • 2507.14976 • Published Jul 20 • 2