Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models Paper • 2405.16759 • Published May 27, 2024
ImageInWords: Unlocking Hyper-Detailed Image Descriptions Paper • 2405.02793 • Published May 5, 2024
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment Paper • 2312.03766 • Published Dec 5, 2023
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Paper • 2403.05530 • Published Mar 8, 2024
Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation Paper • 2310.18235 • Published Oct 27, 2023
DOCCI: Descriptions of Connected and Contrasting Images Paper • 2404.19753 • Published Apr 30, 2024