gguf quantized version of wan2.2-s2v (all gguf: incl. encoders + vae)

  • drag wan to > ./ComfyUI/models/diffusion_models (a scripted alternative is sketched after this list)
  • any one of the options below, drag it to > ./ComfyUI/models/text_encoders
    • option 1: just cow-umt5xxl [3.67GB]
    • option 2: both cat-umt5xxl [3.66GB] and tokenizer [4.55MB]
    • option 3: just umt5xxl [3.47GB] (needs protobuf to rebuild the tokenizer)
  • drag wav2vec2-v2 [632MB] to > ./ComfyUI/models/audio_encoders
  • drag pig [254MB] to > ./ComfyUI/models/vae
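
for a scripted setup, here is a minimal sketch using huggingface_hub; the gguf filenames below are placeholders, not the repo's actual file names (check the repo's file list and substitute accordingly):

```python
# minimal sketch: fetch each file straight into its ComfyUI model folder
# note: the filenames below are placeholders; substitute the actual gguf
# files listed in the calcuis/wan-s2v-gguf repo
from huggingface_hub import hf_hub_download

REPO = "calcuis/wan-s2v-gguf"
targets = {
    "wan2.2-s2v-q4_0.gguf": "ComfyUI/models/diffusion_models",  # placeholder
    "cow-umt5xxl-q4_0.gguf": "ComfyUI/models/text_encoders",    # placeholder
    "wav2vec2-v2-q8_0.gguf": "ComfyUI/models/audio_encoders",   # placeholder
    "pig-vae.gguf": "ComfyUI/models/vae",                       # placeholder
}

for filename, local_dir in targets.items():
    hf_hub_download(repo_id=REPO, filename=filename, local_dir=local_dir)
```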

screenshot

Prompt: a cute anime girl with massive fennec ears and a big fluffy tail wearing a maid outfit
Negative prompt: blurry ugly bad

Prompt: a cute anime girl with massive fennec ears and a big fluffy tail wearing a maid outfit turning around
Negative prompt: blurry ugly bad

Prompt: a conversation between cgg and connector
Negative prompt: blurry ugly bad

note: the new GGUF AudioEncoder Loader is under test; the gguf wav2vec2 audio encoder runs without the ending error message that the fp16 safetensors encoder can throw (depending on the length of your prompt/video)
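
to sanity-check a downloaded gguf file (the audio encoder, for instance) before loading it in ComfyUI, the gguf package can read its header; a minimal sketch, assuming a placeholder file path:

```python
# minimal sketch: inspect a gguf file's metadata and tensor count
# (the path is a placeholder; point it at your downloaded file)
from gguf import GGUFReader

reader = GGUFReader("ComfyUI/models/audio_encoders/wav2vec2-v2-q8_0.gguf")
print(f"{len(reader.tensors)} tensors")
for field in reader.fields.values():
    print(field.name)  # metadata keys such as general.architecture
```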

reference

  • for the lite workflow (saves >70% loading time), get the lite lora for 4/8-step operation here
  • or opt for the scaled fp8 e4m3 safetensors audio encoder here, and/or the fp8 e4m3 vae here, and/or the scaled fp8 e4m3 safetensors text encoder here (no need to switch to native loaders: the GGUF AudioEncoder Loader, GGUF VAE Loader and GGUF CLIP Loader all support both gguf and fp8 scaled safetensors files, so the formats can be mixed and combined freely; see the sketch after this list)
  • gguf-node (pypi|repo|pack)
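
conceptually, that both-format support amounts to choosing a parser by file extension; a rough illustration of the idea (not the loader nodes' actual code):

```python
# conceptual sketch only, not the GGUF loader nodes' actual code:
# branch on the file suffix to pick the matching weight parser
from pathlib import Path

def load_weights(path: str):
    suffix = Path(path).suffix.lower()
    if suffix == ".gguf":
        from gguf import GGUFReader
        return GGUFReader(path)  # quantized gguf route
    if suffix == ".safetensors":
        from safetensors.torch import load_file
        return load_file(path)  # fp8 scaled safetensors route
    raise ValueError(f"unsupported weight format: {suffix}")
```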
GGUF metadata: 16B params, architecture: pig
available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit

