---
license: apache-2.0
---
# UNet with Sliding Window Attention
- 8-channel latent space, obtained by moving modules from WF-VAE into the [NoobAI XL VAE](https://huggingface.co/Laxhar/noobai-XL-Vpred-1.0/tree/main/vae)
- supports recent long-context CLIP text encoders
- variable `num_head` in multi-head attention across the layers (see the sketch below)
- both the UNet and the autoencoder are written in vanilla PyTorch
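
Since everything is plain PyTorch, the two attention-related points above can be illustrated with a rough sketch. This is an illustration only, not the repository's code: the `window` size, embedding `dim`, and per-layer head counts below are made-up values for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SlidingWindowAttention(nn.Module):
    """Multi-head self-attention restricted to a local band of tokens."""

    def __init__(self, dim: int, num_heads: int, window: int = 256):
        super().__init__()
        assert dim % num_heads == 0, "dim must be divisible by num_heads"
        self.num_heads = num_heads
        self.window = window
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim), e.g. flattened latent tokens
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split(t: torch.Tensor) -> torch.Tensor:
            # (batch, seq, dim) -> (batch, heads, seq, head_dim)
            return t.view(b, n, self.num_heads, -1).transpose(1, 2)

        q, k, v = map(split, (q, k, v))
        # band mask: each token attends only to tokens within +/- window/2
        idx = torch.arange(n, device=x.device)
        mask = (idx[None, :] - idx[:, None]).abs() <= self.window // 2
        out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
        out = out.transpose(1, 2).reshape(b, n, d)
        return self.proj(out)


# Variable head counts across layers: each block gets its own num_heads,
# as long as the shared embedding dim stays divisible by it.
blocks = nn.ModuleList(
    SlidingWindowAttention(dim=512, num_heads=h) for h in (4, 8, 8, 16)
)

x = torch.randn(1, 1024, 512)  # dummy token sequence
y = blocks[0](x)               # (1, 1024, 512)
```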
The result is similar to what [Mitsua](https://huggingface.co/Mitsua/mitsua-likes) accomplished earlier.
## References
| - 2411.17459 |