Using the Arcee Fusion merge method, we transferred the Deepseek-V3.1 knowledge from the distilled reasoning model to the instruction model.

Model Highlights:

  • merge method: arcee_fusion

  • Highest precision: merged with dtype: float32, saved as out_dtype: bfloat16

  • Context length: 262,144

Parameter Settings:

Temperature=0.7, TopP=0.8, TopK=20, MinP=0.
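To make the recommended settings concrete, here is a minimal NumPy sketch of how these four sampling filters are commonly applied to a logit vector (temperature scaling, then top-k, top-p, and min-p masking). This is an illustrative implementation of the standard filters, not code from this repository; `sample_filter` is a hypothetical helper name.

```python
import numpy as np

def sample_filter(logits, temperature=0.7, top_p=0.8, top_k=20, min_p=0.0):
    """Apply temperature, top-k, top-p (nucleus), and min-p filtering,
    returning the renormalized probability distribution."""
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))  # softmax, numerically stable
    probs = probs / probs.sum()

    # Top-k: keep only the k most likely tokens.
    if top_k > 0:
        kth = np.sort(probs)[-top_k]
        probs = np.where(probs < kth, 0.0, probs)

    # Top-p (nucleus): keep the smallest set of tokens whose cumulative
    # mass first reaches top_p; zero out everything after that point.
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    beyond = order[cum > top_p]
    if len(beyond) > 1:
        probs[beyond[1:]] = 0.0  # keep the token that crosses the boundary

    # Min-p: drop tokens below min_p * max probability (0 disables this).
    if min_p > 0:
        probs = np.where(probs < min_p * probs.max(), 0.0, probs)

    return probs / probs.sum()
```

With MinP=0 the last filter is a no-op, so sampling is governed by the temperature/top-p/top-k combination above.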

Configuration:

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: BasedBase/Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill-V2-FP32
merge_method: arcee_fusion
base_model: Qwen/Qwen3-30B-A3B-Instruct-2507
dtype: float32
out_dtype: bfloat16
tokenizer_source: base
```
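A config like this is normally passed to mergekit's CLI, e.g. `mergekit-yaml config.yaml ./merged-model`. To give an intuition for what a fusion-style merge does, here is a toy NumPy sketch under a simplifying assumption: that the donor model's parameters are pulled in only where they differ most from the base (the real arcee_fusion method in mergekit uses its own importance criterion; `fusion_merge` and `keep_fraction` are hypothetical names for illustration).

```python
import numpy as np

def fusion_merge(base, donor, keep_fraction=0.1):
    """Toy selective merge: copy donor values only for the top
    `keep_fraction` of parameters ranked by |donor - base|.
    This is a simplified stand-in for a fusion-style merge,
    not the actual arcee_fusion algorithm."""
    delta = np.abs(donor - base)
    threshold = np.quantile(delta, 1.0 - keep_fraction)
    mask = delta >= threshold
    return np.where(mask, donor, base)
```

The appeal of a selective merge is that most of the base (instruction) model is left untouched, while the largest reasoning-model deviations are fused in.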
