Shoe Style-Invariant and Ground-Aware Learning for Dense Foot Contact Estimation
Abstract
A framework for dense foot contact estimation addresses the challenges of diverse shoe appearance and uninformative ground features through adversarial training and spatial context-based ground feature extraction.
Foot contact plays a critical role in how humans interact with the world, so exploring foot contact can advance our understanding of human movement and physical interaction. Despite its importance, existing methods often approximate foot contact with a zero-velocity constraint and focus on joint-level contact, failing to capture the detailed interaction between the foot and the world. Dense estimation of foot contact is crucial for accurately modeling this interaction, yet predicting dense foot contact from a single RGB image remains largely underexplored. Learning dense foot contact estimation poses two main challenges. First, shoes exhibit highly diverse appearances, making it difficult for models to generalize across styles. Second, the ground often has a monotonous appearance, making it difficult to extract informative features. To tackle these issues, we present a FEet COntact estimation (FECO) framework that learns dense foot contact with shoe style-invariant and ground-aware learning. To overcome the diversity of shoe appearance, our approach incorporates shoe style adversarial training that enforces shoe style-invariant features for contact estimation. To effectively utilize ground information, we introduce a ground feature extractor that captures ground properties based on spatial context. As a result, our method achieves robust foot contact estimation regardless of shoe appearance and effectively leverages ground information. Code will be released.
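Since the code has not yet been released, the sketch below is only a minimal PyTorch illustration of one common way to realize style-adversarial feature learning of the kind the abstract describes: a gradient reversal layer (GRL) placed between a shared feature encoder and a shoe-style discriminator. Every name here (`StyleAdversarialModel`, `grad_reverse`, `lambd`, the feature and vertex dimensions) is a hypothetical stand-in, and the authors' actual architecture and losses may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips (and scales) gradients on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The reversed gradient pushes the encoder to *remove* shoe-style cues,
        # while the discriminator itself still trains normally.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class StyleAdversarialModel(nn.Module):
    # feat_dim, num_styles, and num_foot_vertices are illustrative values only.
    def __init__(self, feat_dim=256, num_styles=10, num_foot_vertices=1000):
        super().__init__()
        self.encoder = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())  # stand-in image encoder
        self.contact_head = nn.Linear(feat_dim, num_foot_vertices)  # dense per-vertex contact logits
        self.style_disc = nn.Linear(feat_dim, num_styles)            # shoe-style classifier

    def forward(self, img_feats, lambd=1.0):
        f = self.encoder(img_feats)
        contact_logits = self.contact_head(f)
        # The discriminator sees reversed gradients w.r.t. the encoder, so
        # minimizing its loss makes f shoe style-invariant.
        style_logits = self.style_disc(grad_reverse(f, lambd))
        return contact_logits, style_logits

# Toy training step with dummy data, just to show the joint objective.
model = StyleAdversarialModel()
feats = torch.randn(8, 512)                 # batch of pooled image features
contact_gt = torch.rand(8, 1000).round()    # per-vertex contact labels in {0, 1}
style_labels = torch.randint(0, 10, (8,))   # shoe-style class per sample
contact_logits, style_logits = model(feats)
loss = (F.binary_cross_entropy_with_logits(contact_logits, contact_gt)
        + F.cross_entropy(style_logits, style_labels))
loss.backward()  # the GRL flips only the style gradient flowing into the encoder
```

Under this (assumed) setup, the discriminator is trained to recognize shoe styles while the reversed gradient trains the encoder to defeat it, so the contact head ends up operating on features that carry little style information.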
Community
This is an automated message from the Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- DecoDINO: 3D Human-Scene Contact Prediction with Semantic Classification (2025)
- Open-world Hand-Object Interaction Video Generation Based on Structure and Contact-aware Representation (2025)
- MMHOI: Modeling Complex 3D Multi-Human Multi-Object Interactions (2025)
- DF-Mamba: Deformable State Space Modeling for 3D Hand Pose Estimation in Interactions (2025)
- DINOv2 Driven Gait Representation Learning for Video-Based Visible-Infrared Person Re-identification (2025)
- TOUCH: Text-guided Controllable Generation of Free-Form Hand-Object Interactions (2025)
- Human3R: Everyone Everywhere All at Once (2025)