Using the Arcee Fusion merge method, we transferred DeepSeek-V3.1's knowledge from the distilled reasoning model into the instruction model.
Model Highlights:
- Merge method: `arcee_fusion`
- Highest precision: `dtype: float32` + `out_dtype: bfloat16`
- Context length: 262,144

Parameter Settings:
- Temperature=0.7, TopP=0.8, TopK=20, MinP=0
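As a rough illustration of what these settings do at decode time, the sketch below applies top-k, top-p (nucleus), and min-p filtering to a toy logit vector. The `filter_logits` helper and the example values are illustrative assumptions, not part of this model card; real inference stacks implement these filters internally.

```python
import numpy as np

def filter_logits(logits, top_k=20, top_p=0.8, min_p=0.0):
    """Return indices of tokens that survive top-k, then top-p,
    then min-p filtering of a 1-D logit vector.
    Hypothetical helper for illustration only."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]            # tokens by descending probability
    keep = order[:top_k]                       # top-k: keep at most k tokens
    cum = np.cumsum(probs[keep])
    # top-p: smallest prefix whose cumulative mass reaches top_p
    keep = keep[: np.searchsorted(cum, top_p) + 1]
    # min-p: drop tokens below min_p * probability of the best token
    keep = keep[probs[keep] >= min_p * probs[keep[0]]]
    return keep

logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])
print(filter_logits(logits, top_k=20, top_p=0.8, min_p=0.0))  # → [0 1]
```

With these settings, sampling then draws from the renormalized distribution over the surviving tokens; MinP=0 means the min-p filter is effectively disabled.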
Configuration:
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: BasedBase/Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill-V2-FP32
merge_method: arcee_fusion
base_model: Qwen/Qwen3-30B-A3B-Instruct-2507
dtype: float32
out_dtype: bfloat16
tokenizer_source: base
```
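Assuming the merge was produced with mergekit (which provides the `arcee_fusion` method), the configuration above can be applied with its `mergekit-yaml` entry point; the file and output paths here are placeholders:

```shell
# Save the YAML above as config.yaml, then run the merge
# (requires mergekit: pip install mergekit).
mergekit-yaml config.yaml ./Qwen3-30B-A3B-Deepseek-Distill-Instruct-2507
```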
Model tree for YOYO-AI/Qwen3-30B-A3B-Deepseek-Distill-Instruct-2507