base_model: Qwen/Qwen3-8B-Base
dtype: bfloat16
merge_method: della
modules:
  default:
    slices:
    - sources:
      - layer_range: [0, 36]
        model: deepseek-ai/deepseek-r1-0528-qwen3-8b
        parameters:
          density: 1.0
          lambda: 0.9
          weight: 1.0
      - layer_range: [0, 36]
        model: Qwen/Qwen3-8B-Base
parameters:
  density: 1.0
  int8_mask: 1.0
  lambda: 0.9
  normalize: 1.0
  weight: 1.0
tokenizer_source: deepseek-ai/deepseek-r1-0528-qwen3-8b