Kevin Zhang

Kevin-thu

https://kevin-thu.github.io/homepage

AI & ML interests

Computer Vision, Generative Models, Neural Rendering

Organizations

None yet

Recent Activity

upvoted a paper 2 days ago

Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published 6 days ago • 41

upvoted a paper 7 days ago

MoGA: Mixture-of-Groups Attention for End-to-End Long Video Generation

Paper • 2510.18692 • Published 9 days ago • 38

upvoted 2 papers 14 days ago

Stable Video Infinity: Infinite-Length Video Generation with Error Recycling

Paper • 2510.09212 • Published 20 days ago • 14

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published 20 days ago • 120

upvoted a paper 15 days ago

Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published 16 days ago • 160

upvoted a paper about 2 months ago

3D and 4D World Modeling: A Survey

Paper • 2509.07996 • Published Sep 4 • 57

upvoted 3 papers 3 months ago

STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published Aug 14 • 31

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 131

X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again

Paper • 2507.22058 • Published Jul 29 • 38

upvoted a paper 4 months ago

Epona: Autoregressive Diffusion World Model for Autonomous Driving

Paper • 2506.24113 • Published Jun 30 • 1

upvoted a paper about 1 year ago

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Paper • 2408.06072 • Published Aug 12, 2024 • 39

upvoted 2 papers almost 2 years ago

DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing

Paper • 2312.07409 • Published Dec 12, 2023 • 23

CogAgent: A Visual Language Model for GUI Agents

Paper • 2312.08914 • Published Dec 14, 2023 • 31
