Insta360-Research/DiT360-Panorama-Image-Generation Text-to-Image • Updated 15 days ago • 521 • 15
Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data Paper • 2506.04120 • Published Jun 4 • 7
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation Paper • 2504.14396 • Published Apr 19 • 27
Running 3.38k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 544
DeepSeek R1 (All Versions) Collection DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 1 day ago • 259
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1 • 107
LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting Paper • 2412.00177 • Published Nov 29, 2024 • 8 • 3
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published Dec 16, 2024 • 43
Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published Dec 20, 2024 • 38