Collections

Discover the best community collections!

Collections including paper arxiv:2409.02095

about 1 hour ago

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 28
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

New Depth Models

Recent depth models

Running on Zero

188

DepthCrafter

🦀

a super consistent video depth model
Paused

222

Depth Pro

🚀

Generate an inverse depth map from an image
Runtime error

77

LOTUS Depth

🚀

Generate depth maps from images and videos
apple/DepthPro

Depth Estimation • Updated Feb 28 • 25.4k • 485

Computer Vision

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Paper • 2409.02095 • Published Sep 3, 2024 • 36
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 83
CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation

Paper • 2409.03643 • Published Sep 5, 2024 • 19
UniDet3D: Multi-dataset Indoor 3D Object Detection

Paper • 2409.04234 • Published Sep 6, 2024 • 9

Running on CPU Upgrade

9.84k

Kolors Virtual Try-On

👕

Try on clothes virtually by uploading images
Running on Zero

507

Finegrain Object Cutter

✂

Create HD cutouts from any image with just a prompt
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Paper • 2409.02095 • Published Sep 3, 2024 • 36
Running on Zero

MCP

2.39k

Diffusers Image Outpaint

🔅

Easily expand image boundaries

DepthFM: Fast Monocular Depth Estimation with Flow Matching

Paper • 2403.13788 • Published Mar 20, 2024 • 17
Learning Temporally Consistent Video Depth from Video Diffusion Priors

Paper • 2406.01493 • Published Jun 3, 2024 • 23
NeuFlow v2: High-Efficiency Optical Flow Estimation on Edge Devices

Paper • 2408.10161 • Published Aug 19, 2024 • 15
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Paper • 2409.02095 • Published Sep 3, 2024 • 36

Crafter series models for 3D reconstruction and generation

GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors

Paper • 2504.01016 • Published Apr 1 • 28
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models

Paper • 2503.05638 • Published Mar 7 • 19
StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos

Paper • 2409.07447 • Published Sep 11, 2024
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Paper • 2409.02095 • Published Sep 3, 2024 • 36

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Paper • 2409.02095 • Published Sep 3, 2024 • 36
briaai/BRIA-2.3-ControlNet-Generative-Fill

Text-to-Image • Updated Jul 6 • 61 • 31

AI Math: Diffusion

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 65
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22, 2024 • 36
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 17
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 63

MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model

Paper • 2405.20222 • Published May 30, 2024 • 11
ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation

Paper • 2406.00908 • Published Jun 3, 2024 • 12
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation

Paper • 2406.02509 • Published Jun 4, 2024 • 10
I4VGen: Image as Stepping Stone for Text-to-Video Generation

Paper • 2406.02230 • Published Jun 4, 2024 • 18

LocalMamba: Visual State Space Model with Windowed Selective Scan

Paper • 2403.09338 • Published Mar 14, 2024 • 9
GiT: Towards Generalist Vision Transformer through Universal Language Interface

Paper • 2403.09394 • Published Mar 14, 2024 • 27
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29, 2024 • 35
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Paper • 2405.10300 • Published May 16, 2024 • 30
