LEYA-HSE (LEYA Lab)

posted an update 2 days ago

🚀👌🌟 New Research Alert - ICCV 2025 (Oral)! 🌟🤌🚀
📄 Title: Understanding Co-speech Gestures in-the-wild 🔝

📝 Description: JEGAL is a tri-modal model that learns from gestures, speech and text simultaneously, enabling devices to interpret co-speech gestures in the wild.

👥 Authors: @sindhuhegde , K R Prajwal, Taein Kwon, and Andrew Zisserman

📅 Conference: ICCV, 19 – 23 Oct, 2025 | Honolulu, Hawai'i, USA 🇺🇸

📄 Paper: Understanding Co-speech Gestures in-the-wild (2503.22668)

🌐 Web Page: https://www.robots.ox.ac.uk/~vgg/research/jegal
📁 Repository: https://github.com/Sindhu-Hegde/jegal
📺 Video: https://www.youtube.com/watch?v=TYFOLKfM-rM

🚀 ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers

🚀 Added to the Human Modeling Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/human-modeling.md

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🔍 Keywords: #CoSpeechGestures #GestureUnderstanding #TriModalRepresentation #MultimodalLearning #AI #ICCV2025 #ResearchHighlight

DmitryRyumin

posted an update 5 days ago

🚀💡🌟 New Research Alert - ICCV 2025 (Oral)! 🌟🪄🚀
📄 Title: LoftUp: Learning a Coordinate-based Feature Upsampler for Vision Foundation Models 🔝

📝 Description: LoftUp is a coordinate-based transformer that upscales the low-resolution features of VFMs (e.g. DINOv2 and CLIP) using cross-attention and self-distilled pseudo-ground truth (pseudo-GT) from SAM.

👥 Authors: Haiwen Huang, Anpei Chen, Volodymyr Havrylov, Andreas Geiger, and Dan Zhang

📅 Conference: ICCV, 19 – 23 Oct, 2025 | Honolulu, Hawai'i, USA 🇺🇸

📄 Paper: LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models (2504.14032)

🌐 Github Page: https://andrehuang.github.io/loftup-site
📁 Repository: https://github.com/andrehuang/loftup

🚀 ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers

🚀 Added to the Foundation Models and Representation Learning Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/foundation-models-and-representation-learning.md

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🔍 Keywords: #LoftUp #VisionFoundationModels #FeatureUpsampling #Cross-AttentionTransformer #CoordinateBasedLearning #SelfDistillation #PseudoGroundTruth #RepresentationLearning #AI #ICCV2025 #ResearchHighlight

DmitryRyumin

posted an update 6 days ago

🚀🏷️🌟 New Research Alert - ICCV 2025 (Oral)! 🌟🧩🚀
📄 Title: Heavy Labels Out! Dataset Distillation with Label Space Lightening 🔝

📝 Description: HeLlO is a dataset distillation framework that removes the need for large soft labels. It uses a lightweight, online image-to-label projector based on CLIP, adapted with LoRA-style parameter-efficient tuning and initialized with text embeddings.

👥 Authors: @roseannelexie , @Huage001 , Zigeng Chen, Jingwen Ye, and Xinchao Wang

📅 Conference: ICCV, 19 – 23 Oct, 2025 | Honolulu, Hawai'i, USA 🇺🇸

📄 Paper: Heavy Labels Out! Dataset Distillation with Label Space Lightening (2408.08201)

📺 Video: https://www.youtube.com/watch?v=kAyK_3wskgA

🚀 ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers

🚀 Added to the Efficient Learning Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/efficient-learning.md

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🔍 Keywords: #DatasetDistillation #LabelCompression #CLIP #LoRA #EfficientAI #FoundationModels #AI #ICCV2025 #ResearchHighlight

DmitryRyumin

authored a paper 7 days ago

Team RAS in 9th ABAW Competition: Multimodal Compound Expression Recognition Approach

Paper • 2507.02205 • Published Jul 2

DmitryRyumin

posted an update 7 days ago

🚀🤖🌟 New Research Alert - ICCV 2025 (Oral)! 🌟🤖🚀
📄 Title: Variance-based Pruning for Accelerating and Compressing Trained Networks 🔝

📝 Description: Variance-Based Pruning is a one-shot pruning method that compresses trained networks, reducing computation and memory usage while retaining almost full performance and requiring minimal fine-tuning.
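
As a rough illustration of what pruning by activation variance could look like, here is a minimal sketch on a toy two-layer MLP. It is not the authors' code: the ReLU activation, the helper names (`hidden`, `forward`, `prune_by_variance`) and the idea of folding the mean activation of removed units into the output bias are assumptions made for this example.

```python
from statistics import fmean, pvariance


def relu(x):
    return x if x > 0.0 else 0.0


def hidden(x, W1, b1):
    # W1[j] is the incoming weight vector of hidden unit j.
    return [relu(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
            for j in range(len(W1))]


def forward(x, W1, b1, W2, b2):
    # W2[j] holds the outgoing weights of hidden unit j.
    h = hidden(x, W1, b1)
    return [b2[k] + sum(h[j] * W2[j][k] for j in range(len(h)))
            for k in range(len(b2))]


def prune_by_variance(W1, b1, W2, b2, samples, n_keep):
    """Keep the n_keep hidden units with the highest activation variance.

    Assumption for this sketch: the mean activation of each removed unit
    is folded into the output bias, so a unit that is nearly constant on
    the calibration samples can be removed with little change in output.
    """
    acts = [hidden(x, W1, b1) for x in samples]
    per_unit = list(zip(*acts))              # activations grouped by unit
    var = [pvariance(a) for a in per_unit]
    mean = [fmean(a) for a in per_unit]

    ranked = sorted(range(len(W1)), key=lambda j: var[j], reverse=True)
    keep = sorted(ranked[:n_keep])
    drop = [j for j in range(len(W1)) if j not in keep]

    new_b2 = [b2[k] + sum(mean[j] * W2[j][k] for j in drop)
              for k in range(len(b2))]
    return ([W1[j] for j in keep], [b1[j] for j in keep],
            [W2[j] for j in keep], new_b2)
```

In this toy setting, a hidden unit whose activation is constant across the calibration samples can be removed without changing the output at all, because its contribution moves entirely into the bias.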

👥 Authors: Uranik Berisha, Jens Mehnert, and Alexandru Paul Condurache

📅 Conference: ICCV, 19 – 23 Oct, 2025 | Honolulu, Hawai'i, USA 🇺🇸

📄 Paper: Variance-Based Pruning for Accelerating and Compressing Trained Networks (2507.12988)

🚀 ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers

🚀 Added to the Efficient Learning Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/efficient-learning.md

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🔍 Keywords: #VarianceBasedPruning #NetworkCompression #ModelAcceleration #EfficientDeepLearning #VisionTransformers #AI #ICCV2025 #ResearchHighlight

DmitryRyumin

posted an update 9 days ago

🚀👁️🌟 New Research Alert - ICCV 2025 (Oral)! 🌟👁️🚀
📄 Title: Token Activation Map to Visually Explain Multimodal LLMs 🔝

📝 Description: The Token Activation Map (TAM) is an advanced explainability method for multimodal LLMs. Using causal inference and a Rank Gaussian Filter, TAM reveals token-level interactions and eliminates redundant activations. The result is clearer, high-quality visualizations that enhance understanding of object localization, reasoning and multimodal alignment across models.

👥 Authors: Yi Li, Hualiang Wang, Xinpeng Ding, Haonan Wang, and Xiaomeng Li

📅 Conference: ICCV, 19 – 23 Oct, 2025 | Honolulu, Hawai'i, USA 🇺🇸

📄 Paper: Token Activation Map to Visually Explain Multimodal LLMs (2506.23270)

📁 Repository: https://github.com/xmed-lab/TAM

🚀 ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers

🚀 Added to the Multi-Modal Learning Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/multi-modal-learning.md

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🔍 Keywords: #TokenActivationMap #TAM #CausalInference #VisualReasoning #Multimodal #Explainability #VisionLanguage #LLM #XAI #AI #ICCV2025 #ResearchHighlight

DmitryRyumin

posted an update 9 months ago

🚀🎭🌟 New Research Alert - WACV 2025 (Avatars Collection)! 🌟🎭🚀
📄 Title: EmoVOCA: Speech-Driven Emotional 3D Talking Heads 🔝

📝 Description: EmoVOCA is a data-driven method for generating emotional 3D talking heads by combining speech-driven lip movements with expressive facial dynamics. It was developed to overcome the limitations of existing datasets and to achieve state-of-the-art animation quality.

👥 Authors: @FedeNoce , Claudio Ferrari, and Stefano Berretti

📅 Conference: WACV, 28 Feb – 4 Mar, 2025 | Arizona, USA 🇺🇸

📄 Paper: https://arxiv.org/abs/2403.12886

🌐 Github Page: https://fedenoce.github.io/emovoca/
📁 Repository: https://github.com/miccunifi/EmoVOCA

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #EmoVOCA #3DAnimation #TalkingHeads #SpeechDriven #FacialExpressions #MachineLearning #ComputerVision #ComputerGraphics #DeepLearning #AI #WACV2025

DmitryRyumin

posted an update about 1 year ago

🔥🎭🌟 New Research Alert - HeadGAP (Avatars Collection)! 🌟🎭🔥
📄 Title: HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors 🔝

📝 Description: HeadGAP introduces a novel method for generating high-fidelity, animatable 3D head avatars from few-shot data, using Gaussian priors and dynamic part-based modelling for personalized and generalizable results.

👥 Authors: @zxz267 , @walsvid , @zhaohu2 , Weiyi Zhang, @hellozhuo , Xu Chang, Yang Zhao, Zheng Lv, Xiaoyuan Zhang, @yongjie-zhang-mail , Guidong Wang, and Lan Xu

📄 Paper: HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors (2408.06019)

🌐 Github Page: https://headgap.github.io

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #HeadGAP #3DAvatar #FewShotLearning #GaussianPriors #AvatarCreation #3DModeling #MachineLearning #ComputerVision #ComputerGraphics #GenerativeAI #DeepLearning #AI

DmitryRyumin

posted an update about 1 year ago

🚀🕺🌟 New Research Alert - ECCV 2024 (Avatars Collection)! 🌟💃🚀
📄 Title: Expressive Whole-Body 3D Gaussian Avatar 🔝

📝 Description: ExAvatar is a model that generates animatable 3D human avatars with facial expressions and hand movements from short monocular videos using a hybrid mesh and 3D Gaussian representation.

👥 Authors: Gyeongsik Moon, Takaaki Shiratori, and @psyth

📅 Conference: ECCV, 29 Sep – 4 Oct, 2024 | Milano, Italy 🇮🇹

📄 Paper: Expressive Whole-Body 3D Gaussian Avatar (2407.21686)

🌐 Github Page: https://mks0601.github.io/ExAvatar
📁 Repository: https://github.com/mks0601/ExAvatar_RELEASE

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #ExAvatar #3DAvatar #FacialExpressions #HandMotions #MonocularVideo #3DModeling #GaussianSplatting #MachineLearning #ComputerVision #ComputerGraphics #DeepLearning #AI #ECCV2024

DmitryRyumin

posted an update about 1 year ago

🔥🎭🌟 New Research Alert - ECCV 2024 (Avatars Collection)! 🌟🎭🔥
📄 Title: MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos 🔝

📝 Description: MeshAvatar is a novel pipeline that generates high-quality triangular human avatars from multi-view videos, enabling realistic editing and rendering through a mesh-based approach with physics-based decomposition.

👥 Authors: Yushuo Chen, Zerong Zheng, Zhe Li, Chao Xu, and Yebin Liu

📅 Conference: ECCV, 29 Sep – 4 Oct, 2024 | Milano, Italy 🇮🇹

📄 Paper: MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos (2407.08414)

🌐 Github Page: https://shad0wta9.github.io/meshavatar-page
📁 Repository: https://github.com/shad0wta9/meshavatar

📺 Video: https://www.youtube.com/watch?v=Kpbpujkh2iI

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #MeshAvatar #3DAvatars #MultiViewVideo #PhysicsBasedRendering #TriangularMesh #AvatarCreation #3DModeling #NeuralRendering #Relighting #AvatarEditing #MachineLearning #ComputerVision #ComputerGraphics #DeepLearning #AI #ECCV2024

DmitryRyumin

posted an update over 1 year ago

🚀🕺🌟 New Research Alert - CVPR 2024 (Avatars Collection)! 🌟💃🚀
📄 Title: IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing 🔝

📝 Description: IntrinsicAvatar is a method for extracting high-quality geometry, albedo, material, and lighting properties of clothed human avatars from monocular videos using explicit ray tracing and volumetric scattering, enabling realistic animations under varying lighting conditions.

👥 Authors: Shaofei Wang, Božidar Antić, Andreas Geiger, and Siyu Tang

📅 Conference: CVPR, Jun 17-21, 2024 | Seattle WA, USA 🇺🇸

🔗 Paper: IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing (2312.05210)

🌐 Github Page: https://neuralbodies.github.io/IntrinsicAvatar/
📁 Repository: https://github.com/taconite/IntrinsicAvatar

📺 Video: https://www.youtube.com/watch?v=aS8AIxgVXzI

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #IntrinsicAvatar #InverseRendering #MonocularVideos #RayTracing #VolumetricScattering #3DReconstruction #MachineLearning #ComputerVision #DeepLearning #AI #CVPR2024

DmitryRyumin

posted an update over 1 year ago

🔥🎭🌟 New Research Alert - ECCV 2024 (Avatars Collection)! 🌟🎭🔥
📄 Title: RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models 🔝

📝 Description: RodinHD generates high-fidelity 3D avatars from portrait images using a novel data scheduling strategy and weight consolidation regularization to capture intricate details such as hairstyles.

👥 Authors: Bowen Zhang, @yiji , @chunyuwang , Ting Zhang, @jiaolong , Yansong Tang, Feng Zhao, Dong Chen, and Baining Guo

📅 Conference: ECCV, 29 Sep – 4 Oct, 2024 | Milano, Italy 🇮🇹

📄 Paper: RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models (2407.06938)

🌐 Github Page: https://rodinhd.github.io/
📁 Repository: https://github.com/RodinHD/RodinHD

📺 Video: https://www.youtube.com/watch?v=ULvHt7dZx-Q

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #RodinHD #3DAvatars #DiffusionModels #HighFidelity #PortraitTo3D #MachineLearning #ComputerVision #DeepLearning #AI #ECCV2024

DmitryRyumin

posted an update over 1 year ago

🔥🎭🌟 New Research Alert - LivePortrait (Avatars Collection)! 🌟🎭🔥
📄 Title: LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control 🔝

📝 Description: LivePortrait is an efficient video-driven portrait animation framework that uses implicit keypoints and stitching/retargeting modules to generate high-quality, controllable animations from a single source image.

👥 Authors: @cleardusk , Dingyun Zhang, Xiaoqiang Liu, Zhizhou Zhong, Yuan Zhang, Pengfei Wan, and Di Zhang

🤗 Demo: KwaiVGI/LivePortrait

📄 Paper: LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control (2407.03168)

🌐 Github Page: https://liveportrait.github.io/
📁 Repository: https://github.com/KwaiVGI/LivePortrait

🔥 Model 🤖: KwaiVGI/LivePortrait

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #LivePortrait #PortraitAnimation #ComputerVision #MachineLearning #DeepLearning #ComputerGraphics #FacialAnimation #GenerativeAI #RealTimeRendering #AI

DmitryRyumin

posted an update over 1 year ago

🚀🕺🌟 New Research Alert (Avatars Collection)! 🌟💃🚀
📄 Title: Expressive Gaussian Human Avatars from Monocular RGB Video 🔝

📝 Description: The new EVA model enhances the expressiveness of digital avatars by using 3D Gaussians and SMPL-X to capture fine-grained hand and face details from monocular RGB video.

👥 Authors: Hezhen Hu, Zhiwen Fan, Tianhao Wu, Yihan Xi, Seoyoung Lee, Georgios Pavlakos, and Zhangyang Wang

📄 Paper: Expressive Gaussian Human Avatars from Monocular RGB Video (2407.03204)

🌐 Github Page: https://evahuman.github.io/
📁 Repository: https://github.com/evahuman/EVA

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #DigitalAvatars #3DModeling #ComputerVision #MonocularVideo #SMPLX #3DGaussians #AvatarExpressiveness #HandTracking #FacialExpressions #AI #MachineLearning

DmitryRyumin

posted an update over 1 year ago

🔥🎭🌟 New Research Alert - ECCV 2024 (Avatars Collection)! 🌟🎭🔥
📄 Title: Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture 🔝

📝 Description: Topo4D is a novel method for automated, high-fidelity 4D head tracking that optimizes dynamic topological meshes and 8K texture maps from multi-view time-series images.

👥 Authors: @Dazz1e , Y. Cheng, @Ryan-sjtu , H. Jia, D. Xu, W. Zhu, and Y. Yan

📅 Conference: ECCV, 29 Sep – 4 Oct, 2024 | Milano, Italy 🇮🇹

📄 Paper: Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture (2406.00440)

🌐 Github Page: https://xuanchenli.github.io/Topo4D/
📁 Repository: https://github.com/XuanchenLi/Topo4D

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #Topo4D #4DHead #3DModeling #4DCapture #FacialAnimation #ComputerGraphics #MachineLearning #HighFidelity #TextureMapping #DynamicMeshes #GaussianSplatting #VisualEffects #ECCV2024

DmitryRyumin

posted an update over 1 year ago

🚀🎭🌟 New Research Alert - Portrait4D-v2 (Avatars Collection)! 🌟🎭🚀
📄 Title: Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer 🔝

📝 Description: Portrait4D-v2 is a novel method for one-shot 4D head avatar synthesis using pseudo multi-view videos and a vision transformer backbone, achieving superior performance without relying on 3DMM reconstruction.

👥 Authors: Yu Deng, Duomin Wang, and Baoyuan Wang

📄 Paper: Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer (2403.13570)

🌐 GitHub Page: https://yudeng.github.io/Portrait4D-v2/
📁 Repository: https://github.com/YuDeng/Portrait-4D

📺 Video: https://www.youtube.com/watch?v=5YJY6-wcOJo

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #Portrait4D #4DAvatar #HeadSynthesis #3DModeling #TechInnovation #DeepLearning #ComputerGraphics #ComputerVision #Innovation

DmitryRyumin

posted an update over 1 year ago

😀😲😐😡 New Research Alert - CVPRW 2024 (Facial Expressions Recognition Collection)! 😡😥🥴😱
📄 Title: Zero-Shot Audio-Visual Compound Expression Recognition Method based on Emotion Probability Fusion 🔝

📝 Description: AVCER is a novel audio-visual method for compound expression recognition based on the pairwise sum of emotion probabilities. It is evaluated in multi- and cross-corpus setups without task-specific training data, demonstrating its potential for intelligent emotion annotation tools.
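
To make the "pairwise sum of emotion probabilities" idea concrete, here is a minimal sketch and not the AVCER implementation. The order of the basic emotions, the list of compound classes, the equal-weight audio-visual fusion and all function names are assumptions made for this example.

```python
# Order of basic emotions is an assumption for this sketch.
BASIC = ("neutral", "happy", "sad", "surprise", "fear", "disgust", "anger")

# Illustrative compound classes, each defined by a pair of basic emotions.
COMPOUND = {
    "fearfully surprised": ("fear", "surprise"),
    "happily surprised": ("happy", "surprise"),
    "sadly surprised": ("sad", "surprise"),
    "disgustedly surprised": ("disgust", "surprise"),
    "angrily surprised": ("anger", "surprise"),
    "sadly fearful": ("sad", "fear"),
    "sadly angry": ("sad", "anger"),
}


def fuse(audio_probs, video_probs, audio_weight=0.5):
    """Weighted average of two probability vectors (an assumed fusion rule)."""
    return [audio_weight * a + (1.0 - audio_weight) * v
            for a, v in zip(audio_probs, video_probs)]


def compound_scores(probs):
    """Score each compound class as the sum of its two basic probabilities."""
    index = {name: i for i, name in enumerate(BASIC)}
    return {label: probs[index[a]] + probs[index[b]]
            for label, (a, b) in COMPOUND.items()}


def predict_compound(probs):
    """Return the compound class with the highest pairwise sum."""
    scores = compound_scores(probs)
    return max(scores, key=scores.get)
```

Because the compound score is computed from basic-emotion probabilities alone, a sketch like this needs no compound-labelled training data, which matches the zero-shot setting named in the title.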

👥 Authors: @ElenaRyumina , Maxim Markitantov, @DmitryRyumin , Heysem Kaya, and Alexey Karpov

📅 Conference: CVPRW, Jun 17-21, 2024 | Seattle WA, USA 🇺🇸

🤗 Demo: ElenaRyumina/AVCER

📄 Paper: Audio-Visual Compound Expression Recognition Method based on Late Modality Fusion and Rule-based Decision (2403.12687)

🌐 Github Page: https://elenaryumina.github.io/AVCER
📁 Repository: https://github.com/ElenaRyumina/AVCER/tree/main/src

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Facial Expressions Recognition Collection: DmitryRyumin/facial-expressions-recognition-65f22574e0724601636ddaf7

🔍 Keywords: #AVCER #AudioVisual #CompoundExpressions #EmotionRecognition #ModalityFusion #RuleBasedAI #ABAWCompetition #AIResearch #HumanEmotion #IntelligentTools #MachineLearning #DeepLearning #MultiCorpus #CrossCorpus #CVPR2024

DmitryRyumin

posted an update over 1 year ago

🚀🎭🌟 New Research Alert - CVPR 2024 (Avatars Collection)! 🌟🎭🚀
📄 Title: Relightable Gaussian Codec Avatars 🔝

📝 Description: Relightable Gaussian Codec Avatars is a method for creating highly detailed and relightable 3D head avatars that can animate expressions in real time and support complex features such as hair and skin with efficient rendering suitable for VR.

👥 Authors: @psyth , @GBielXONE02 , Tomas Simon, Junxuan Li, and @giljoonam

📅 Conference: CVPR, Jun 17-21, 2024 | Seattle WA, USA 🇺🇸

📄 Paper: Relightable Gaussian Codec Avatars (2312.03704)

🌐 GitHub Page: https://shunsukesaito.github.io/rgca/

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #3DAvatars #RealTimeRendering #RelightableAvatars #3DModeling #VirtualReality #CVPR2024 #DeepLearning #ComputerGraphics #ComputerVision #Innovation #VR

DmitryRyumin

posted an update over 1 year ago

🚀🎭🌟 New Research Alert - InstructAvatar (Avatars Collection)! 🌟🎭🚀
📄 Title: InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation 🔝

📝 Description: InstructAvatar is a novel method for generating emotionally expressive 2D avatars using text-guided instructions, offering improved emotion control, lip-sync quality, and naturalness. It uses a two-branch diffusion-based generator to predict avatars based on both audio and text input.

👥 Authors: Yuchi Wang et al.

📄 Paper: InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation (2405.15758)

🌐 Github Page: https://wangyuchi369.github.io/InstructAvatar/
📁 Repository: https://github.com/wangyuchi369/InstructAvatar

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #InstructAvatar #AvatarGeneration #EmotionControl #FacialMotion #LipSynchronization #NaturalLanguageInterface #DiffusionBasedGenerator #TextGuidedInstructions #2DAvatars #VideoSynthesis #Interactivity #ComputerGraphics #DeepLearning #ComputerVision #Innovation

DmitryRyumin

posted an update over 1 year ago

🔥🚀🌟 New Research Alert - YOLOv10! 🌟🚀🔥
📄 Title: YOLOv10: Real-Time End-to-End Object Detection 🔝

📝 Description: YOLOv10 improves real-time object detection by eliminating non-maximum suppression and optimizing the model architecture to achieve state-of-the-art performance with lower latency and computational overhead.
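
For context, here is a minimal sketch of classical greedy non-maximum suppression, the post-processing step that the description says YOLOv10 removes. This is not YOLOv10 code. The box format `(x1, y1, x2, y2)`, the function names and the default threshold are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap any already-kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Each candidate is compared against the boxes kept so far, so the cost grows with the number of detections and the loop runs after the network has finished. Detectors that avoid this step return their final boxes directly.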

👥 Authors: Ao Wang et al.

📄 Paper: YOLOv10: Real-Time End-to-End Object Detection (2405.14458)

🤗 Demo: kadirnar/Yolov10 curated by @kadirnar
🔥 Model 🤖: kadirnar/Yolov10

📁 Repository: https://github.com/THU-MIG/yolov10

📮 Post about YOLOv9 - https://huggingface.co/posts/DmitryRyumin/519784698531054

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🔍 Keywords: #YOLOv10 #ObjectDetection #RealTimeAI #ModelOptimization #MachineLearning #DeepLearning #ComputerVision #Innovation

LEYA Lab

AI & ML interests

Recent Activity

Team RAS in 9th ABAW Competition: Multimodal Compound Expression Recognition Approach
