npuw-synth-mha-llama-12L

Synthetic LLM test model, multi-head attention (MHA) variant: num_kv_heads == num_heads. It is a companion to npuw-gqa-test-model and serves as the baseline for isolating GQA-specific code paths on NPU.

The weights are random, so the model's outputs are not meaningful and it is not intended for evaluating inference quality.
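The distinction that separates this model from its GQA companion comes down to two head counts. The following is a minimal sketch, not part of the generator or of OpenVINO: the helper names `attention_kind` and `queries_per_kv_head` are hypothetical, and the head counts in the usage note are illustrative values rather than numbers read from this model's config.

```python
def attention_kind(num_heads: int, num_kv_heads: int) -> str:
    """Classify an attention layout from its query and key/value head counts."""
    if num_heads <= 0 or num_kv_heads <= 0:
        raise ValueError("head counts must be positive")
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be a multiple of num_kv_heads")
    if num_kv_heads == num_heads:
        return "MHA"  # every query head has its own key/value head
    if num_kv_heads == 1:
        return "MQA"  # all query heads share a single key/value head
    return "GQA"      # query heads are split into groups that share K/V heads


def queries_per_kv_head(num_heads: int, num_kv_heads: int) -> int:
    """Number of query heads that share each key/value head (1 for MHA)."""
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be a multiple of num_kv_heads")
    return num_heads // num_kv_heads
```

For example, `attention_kind(32, 32)` classifies as MHA with one query head per key/value head, while `attention_kind(32, 8)` classifies as GQA with four query heads sharing each key/value head. A behavior difference between the two companion models on NPU would then point at the code handling that sharing.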

Config

Generated with npuw_model_generator_demo from dylanneve1/openvino@gqa-fix:

npuw_model_generator_demo --type llm \
  -t Meta-Llama-3.1-8B-Instruct \
  -o out -n npuw-synth-mha-llama-12L \
  --num-layers 12
