---
license: apache-2.0
---

# UNet with Sliding Window Attention

- 8-channel latent, obtained by moving modules from WF-VAE into the [NoobAI XL VAE](https://huggingface.co/Laxhar/noobai-XL-Vpred-1.0/tree/main/vae)
- supports recent long-context CLIP models
- `num_head` in multi-head attention (MHA) varies across layers
- both the UNet and the Autoencoder are written in vanilla PyTorch

The result is similar to what [Mitsua](https://huggingface.co/Mitsua/mitsua-likes) accomplished earlier.

## References

- WF-VAE: arXiv:2411.17459
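
## Sliding window attention sketch

To make the title and the per-layer `num_head` note concrete, here is a minimal sketch of sliding window self-attention in vanilla PyTorch. It is not the code of this repository: the function names, the 1D window over a flattened token sequence, the missing learned projections, and the example head counts are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask of shape (seq_len, seq_len); True means "may attend"."""
    idx = torch.arange(seq_len)
    # Token i may attend to token j only when |i - j| <= window.
    return (idx[:, None] - idx[None, :]).abs() <= window


def sliding_window_attention(x: torch.Tensor, num_heads: int, window: int) -> torch.Tensor:
    """Self-attention over x of shape (batch, seq_len, dim).

    Hypothetical helper: queries, keys, and values are all x itself,
    so the sketch shows only the masking and the head split.
    """
    b, n, d = x.shape
    assert d % num_heads == 0, "dim must be divisible by num_heads"
    # Split channels into heads: (batch, heads, seq_len, head_dim).
    h = x.reshape(b, n, num_heads, d // num_heads).transpose(1, 2)
    mask = sliding_window_mask(n, window).to(x.device)
    # The (seq_len, seq_len) mask broadcasts over batch and heads.
    out = F.scaled_dot_product_attention(h, h, h, attn_mask=mask)
    # Merge heads back: (batch, seq_len, dim).
    return out.transpose(1, 2).reshape(b, n, d)


if __name__ == "__main__":
    x = torch.randn(2, 16, 64)
    # Assumed head counts, one per layer, to mimic a num_head that varies.
    for num_heads in (4, 8, 16):
        y = sliding_window_attention(x, num_heads=num_heads, window=2)
        print(num_heads, tuple(y.shape))
```

Because the mask only depends on the sequence length and the window size, the head count can change from layer to layer without touching the masking logic.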