npuw-synth-mha-llama-12L
Synthetic LLM test model, MHA variant (num_kv_heads == num_heads). Companion to npuw-gqa-test-model for isolating GQA-specific code paths on NPU.
Random weights: not for inference quality.
Config
| field | value |
|---|---|
| layers | 12 |
| hidden_size | 256 |
| num_heads | 8 |
| num_kv_heads | 8 (MHA) |
| head_dim | 32 |
| intermediate_size | 1024 |
| vocab_size | 32000 |
| ffn | swiglu |
| norm | rms |
| rope | half |
| weight | fp32 |
| position_ids | 2d |
| kv_cache | yes |
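
The numbers in the table can be sanity-checked with a few lines of Python. The sketch below is illustrative only and is not part of the generator: the `CONFIG` dict and both helper functions are hypothetical names. The KV-cache estimate assumes the cache is stored in fp32 like the weights, which the table does not state.

```python
# Values copied from the Config table; the dict itself is a hypothetical helper.
CONFIG = {
    "layers": 12,
    "hidden_size": 256,
    "num_heads": 8,
    "num_kv_heads": 8,
    "head_dim": 32,
    "intermediate_size": 1024,
    "vocab_size": 32000,
}


def attention_kind(cfg):
    """Classify the attention layout from the query and KV head counts."""
    if cfg["num_heads"] % cfg["num_kv_heads"] != 0:
        raise ValueError("num_heads must be a multiple of num_kv_heads")
    if cfg["num_kv_heads"] == cfg["num_heads"]:
        return "MHA"  # every query head has its own K/V head
    if cfg["num_kv_heads"] == 1:
        return "MQA"  # all query heads share a single K/V head
    return "GQA"      # query heads share K/V heads in groups


def kv_cache_bytes_per_token(cfg, bytes_per_element=4):
    """Estimate KV-cache size per token: one K and one V tensor per layer.

    bytes_per_element=4 assumes an fp32 cache (an assumption, see above).
    """
    return 2 * cfg["layers"] * cfg["num_kv_heads"] * cfg["head_dim"] * bytes_per_element


# The table is self-consistent: num_heads * head_dim equals hidden_size.
assert CONFIG["num_heads"] * CONFIG["head_dim"] == CONFIG["hidden_size"]

print(attention_kind(CONFIG))            # MHA
print(kv_cache_bytes_per_token(CONFIG))  # 24576
```

Under the same assumptions, a GQA model with fewer KV heads would report a proportionally smaller per-token cache, which is the main structural difference between the two variants.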
Source
Generated with npuw_model_generator_demo from dylanneve1/openvino@gqa-fix:
npuw_model_generator_demo --type llm \
-t Meta-Llama-3.1-8B-Instruct \
-o out -n npuw-synth-mha-llama-12L \
--num-layers 12