> **Gated access:** this repository is publicly visible, but you must accept the conditions to access its files. By requesting access to these weights you agree to use them for non-commercial research only and to cite the AHS paper (arXiv:2604.15857) in any derivative work.
# AHS: Adaptive Head Synthesis via Synthetic Data Augmentations

Official pretrained weights for AHS.

- Inference code: https://github.com/KEH0T0/AHS (branch: `inference`)
- Paper: https://arxiv.org/abs/2604.15857
## Repo contents

| path | size | purpose |
|---|---|---|
| `id_encoder_weights.pth` | 1.6 GB | PhotoMaker-style ID encoder (`photomaker_encoder.IDEncoder`) |
| `graphonomy_weights.pth` | 166 MB | Graphonomy DeepLab-Xception human parsing net (for head masks) |
| `checkpoint-179160/` | 19 GB | SDXL fine-tune (UNet + tryon UNet-encoder + text encoders + VAE + tokenizers) |
`checkpoint-179160/` is a diffusers-format directory and can be loaded via `from_pretrained` subfolder calls (see `infer_xl_base.py`).
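As a rough illustration of those subfolder calls, here is a sketch assuming the standard SDXL diffusers layout (`unet`, `vae`, `text_encoder`, `text_encoder_2`, `tokenizer`, `tokenizer_2`); `infer_xl_base.py` is the authoritative reference for the exact names and classes, and the tryon UNet-encoder is loaded by the AHS code itself:

```python
def load_ahs_checkpoint(root: str, device: str = "cuda"):
    """Sketch: load SDXL components from the diffusers-format
    checkpoint-179160/ directory via `subfolder=` calls.
    Standard SDXL layout is assumed, not confirmed."""
    from diffusers import AutoencoderKL, UNet2DConditionModel
    from transformers import (CLIPTextModel, CLIPTextModelWithProjection,
                              CLIPTokenizer)

    unet = UNet2DConditionModel.from_pretrained(root, subfolder="unet")
    vae = AutoencoderKL.from_pretrained(root, subfolder="vae")
    text_encoder = CLIPTextModel.from_pretrained(root, subfolder="text_encoder")
    text_encoder_2 = CLIPTextModelWithProjection.from_pretrained(
        root, subfolder="text_encoder_2")
    tokenizer = CLIPTokenizer.from_pretrained(root, subfolder="tokenizer")
    tokenizer_2 = CLIPTokenizer.from_pretrained(root, subfolder="tokenizer_2")
    return {"unet": unet.to(device), "vae": vae.to(device),
            "text_encoder": text_encoder.to(device),
            "text_encoder_2": text_encoder_2.to(device),
            "tokenizer": tokenizer, "tokenizer_2": tokenizer_2}
```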
## Quick download (after gate approval)

```bash
pip install huggingface_hub
huggingface-cli login  # token with 'read' scope

# All weights
huggingface-cli download Keh0t0/AHS-weights --local-dir ./AHS-weights

# Or only what you need:
huggingface-cli download Keh0t0/AHS-weights \
    id_encoder_weights.pth graphonomy_weights.pth \
    --local-dir weights/
huggingface-cli download Keh0t0/AHS-weights \
    --include "checkpoint-179160/*" --local-dir checkpoints/
```
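The same selective download can be done from Python with `huggingface_hub` (a sketch using the repo id from the commands above; requires a logged-in token and approved gate access):

```python
def download_ahs_weights(local_dir: str = "./AHS-weights",
                         checkpoint_only: bool = False):
    """Download the gated AHS weights programmatically.
    With checkpoint_only=True, fetch just the SDXL fine-tune directory."""
    from huggingface_hub import snapshot_download

    patterns = ["checkpoint-179160/*"] if checkpoint_only else None
    return snapshot_download(repo_id="Keh0t0/AHS-weights",
                             allow_patterns=patterns,
                             local_dir=local_dir)
```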
## Usage

These weights are consumed by the inference code at https://github.com/KEH0T0/AHS. The README there walks through environment setup, preprocessing (GAGAvatar_track head alignment, Graphonomy head mask, DensePose, EMOCA normal swap), and inference.

Minimal quick run over the 3-pair sample dataset shipped with the code:
```bash
git clone https://github.com/KEH0T0/AHS.git -b inference
cd AHS
conda env create -f environment.yaml && conda activate ahs

# Link the downloaded weights
mkdir -p weights
ln -sf /absolute/path/to/AHS-weights/id_encoder_weights.pth weights/id_encoder_weights.pth
ln -sf /absolute/path/to/AHS-weights/graphonomy_weights.pth weights/graphonomy_weights.pth

BASE_MODEL=/absolute/path/to/AHS-weights/checkpoint-179160 \
CUDA_VISIBLE_DEVICES=0 bash infer_sample.sh
```

Output lands in `./sample_result/final/<head>_<body>.png` (triplet format: `[head | body | swapped]`).
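If you want to split a triplet back into its three panels, the crop boxes are simply thirds of the width. A hypothetical helper (not part of the AHS code):

```python
def triplet_boxes(width: int, height: int):
    """Return (left, top, right, bottom) crop boxes for the
    [head | body | swapped] panels of a triplet image."""
    panel = width // 3
    return [(i * panel, 0, (i + 1) * panel, height) for i in range(3)]

# e.g. pass each box to PIL's Image.crop() to extract one panel
```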
## Method summary

AHS takes a (body image, head reference) pair and runs two-pass SDXL inpainting over the head region of the body. Identity is transferred via (a) a PhotoMaker-style ID encoder that reads a face crop of the head reference and injects it at the `img` trigger token, and (b) IP-Adapter-style CLIP image conditioning from the GAGAvatar-aligned head image. Shape guidance comes from a merged map of body DensePose (below the neck) and EMOCA face normals of the head rendered in the body's pose (above the neck).
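The shape-guidance merge can be pictured as a masked composite: take the EMOCA normal value wherever the head mask is set, and the DensePose value elsewhere. A toy stdlib sketch (the real pipeline operates on rendered DensePose and normal images; all names here are hypothetical):

```python
def merge_shape_guidance(densepose_map, normal_map, head_mask):
    """Composite two equally sized 2D maps: EMOCA normal values where
    head_mask is truthy (above the neck), DensePose values elsewhere
    (below the neck)."""
    return [
        [n if m else d for d, n, m in zip(d_row, n_row, m_row)]
        for d_row, n_row, m_row in zip(densepose_map, normal_map, head_mask)
    ]
```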
## License

Released under CC BY-NC 4.0 for research use. Do not use these weights for commercial purposes. Upstream components follow their own licenses:

- SDXL base model: CreativeML Open RAIL++-M
- IDM-VTON: see https://github.com/yisol/IDM-VTON
- Graphonomy: see https://github.com/Gaoyiminggithub/Graphonomy
## Citation

```bibtex
@misc{kang2026ahsadaptiveheadsynthesis,
  title={AHS: Adaptive Head Synthesis via Synthetic Data Augmentations},
  author={Taewoong Kang and Hyojin Jang and Sohyun Jeong and Seunggi Moon and Gihwi Kim and Hoon Jin Jung and Jaegul Choo},
  year={2026},
  eprint={2604.15857},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2604.15857},
}
```
## Acknowledgements

- IDM-VTON: base UNet / tryon pipeline
- IP-Adapter: image prompt adapter
- PhotoMaker: ID encoder design
- Graphonomy: human parsing
- EMOCA: 3D face normals
- GAGAvatar_track: head alignment
- DensePose: body pose