Tanmay Gupta

tanmayg

https://tanmaygupta.info/

AI & ML interests

None yet

Organizations

None yet

authored 9 papers 5 months ago

Visual Semantic Role Labeling for Video Understanding

Paper • 2104.00990 • Published Apr 2, 2021

OBJECT 3DIT: Language-guided 3D-aware Image Editing

Paper • 2307.11073 • Published Jul 20, 2023

m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks

Paper • 2403.11085 • Published Mar 17, 2024

Visual Programming: Compositional visual reasoning without training

Paper • 2211.11559 • Published Nov 18, 2022 • 1

Task Me Anything

Paper • 2406.11775 • Published Jun 17, 2024 • 8

CodeNav: Beyond tool-use to using real-world codebases with LLM agents

Paper • 2406.12276 • Published Jun 18, 2024

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 121

Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation

Paper • 2502.14846 • Published Feb 20, 2025 • 14

Towards General Purpose Vision Systems

Paper • 2104.00743 • Published Apr 1, 2021