Yusu Qian

YusuQian
AI & ML interests

multimodal llm research

Recent Activity

upvoted a paper about 10 hours ago
PRISM-Bench: A Benchmark of Puzzle-Based Visual Tasks with CoT Error Detection
upvoted a paper 6 days ago
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
upvoted a paper 9 days ago
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Organizations

Apple

authored 2 papers 5 months ago

UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing

Paper • 2503.12652 • Published Mar 16

GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing

Paper • 2505.11493 • Published May 16 • 3

authored 3 papers over 1 year ago

MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs

Paper • 2407.01509 • Published Jul 1, 2024

Understanding Alignment in Multimodal LLMs: A Comprehensive Study

Paper • 2407.02477 • Published Jul 2, 2024 • 24

How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts

Paper • 2402.13220 • Published Feb 20, 2024 • 15