Urodoc Oncall

UDCAI

AI & ML interests

None yet

Recent Activity

upvoted a paper about 11 hours ago

RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling

upvoted a paper about 11 hours ago

Video-As-Prompt: Unified Semantic Control for Video Generation

upvoted a paper 4 days ago

LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas

upvoted 2 papers about 11 hours ago

RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling

Paper • 2510.20206 • Published 5 days ago • 11

Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published 4 days ago • 33

upvoted 5 papers 4 days ago

LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas

Paper • 2510.20820 • Published 4 days ago • 7

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published 4 days ago • 35

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Paper • 2510.20579 • Published 4 days ago • 48

Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

Paper • 2510.19808 • Published 5 days ago • 23

GigaBrain-0: A World Model-Powered Vision-Language-Action Model

Paper • 2510.19430 • Published 5 days ago • 39

liked a model 5 days ago

vita-video-gen/svi-model

Image-to-Video • Updated 4 days ago • 1 • 29

upvoted a paper 5 days ago

LightMem: Lightweight and Efficient Memory-Augmented Generation

Paper • 2510.18866 • Published 6 days ago • 104

upvoted 6 papers 6 days ago

UltraGen: High-Resolution Video Generation with Hierarchical Attention

Paper • 2510.18775 • Published 6 days ago • 15

MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models

Paper • 2510.17519 • Published 7 days ago • 9

MoGA: Mixture-of-Groups Attention for End-to-End Long Video Generation

Paper • 2510.18692 • Published 6 days ago • 38

ConsistEdit: Highly Consistent and Precise Training-free Visual Editing

Paper • 2510.17803 • Published 7 days ago • 12

Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling

Paper • 2510.16751 • Published 9 days ago • 19

Glyph: Scaling Context Windows via Visual-Text Compression

Paper • 2510.17800 • Published 7 days ago • 61

upvoted 4 papers 7 days ago

Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published 11 days ago • 47

NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks

Paper • 2510.15019 • Published 11 days ago • 56

A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning

Paper • 2510.15444 • Published 11 days ago • 139

Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Paper • 2510.15742 • Published 10 days ago • 49

upvoted a paper 10 days ago

ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints

Paper • 2510.14847 • Published 11 days ago • 55