Using the Arcee Fusion merge method, we transferred DeepSeek-V3.1's knowledge from the distilled reasoning model into the instruction model.
Model Highlights:
- Merge method: `arcee_fusion`
- Highest precision: `dtype: float32` + `out_dtype: bfloat16`
- Context length: 262,144

Parameter Settings:
- Temperature=0.7, TopP=0.8, TopK=20, MinP=0
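As a rough illustration of what these settings do at decode time, the sketch below applies top-k, top-p (nucleus), and min-p filtering to a toy logit vector. The `filter_logits` helper and the example values are illustrative assumptions, not part of this model card; real inference stacks implement these filters internally.

```python
import numpy as np

def filter_logits(logits, top_k=20, top_p=0.8, min_p=0.0):
    """Return indices of tokens that survive top-k, then top-p,
    then min-p filtering of a 1-D logit vector.
    Hypothetical helper for illustration only."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]            # tokens by descending probability
    keep = order[:top_k]                       # top-k: keep at most k tokens
    cum = np.cumsum(probs[keep])
    # top-p: smallest prefix whose cumulative mass reaches top_p
    keep = keep[: np.searchsorted(cum, top_p) + 1]
    # min-p: drop tokens below min_p * probability of the best token
    keep = keep[probs[keep] >= min_p * probs[keep[0]]]
    return keep

logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])
print(filter_logits(logits, top_k=20, top_p=0.8, min_p=0.0))  # → [0 1]
```

With these settings, sampling then draws from the renormalized distribution over the surviving tokens; MinP=0 means the min-p filter is effectively disabled.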
Configuration:
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: BasedBase/Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill-V2-FP32
merge_method: arcee_fusion
base_model: Qwen/Qwen3-30B-A3B-Instruct-2507
dtype: float32
out_dtype: bfloat16
tokenizer_source: base
```
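Assuming the merge was produced with mergekit (which provides the `arcee_fusion` method), the configuration above can be applied with its `mergekit-yaml` entry point; the file and output paths here are placeholders:

```shell
# Save the YAML above as config.yaml, then run the merge
# (requires mergekit: pip install mergekit).
mergekit-yaml config.yaml ./Qwen3-30B-A3B-Deepseek-Distill-Instruct-2507
```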
Model tree for YOYO-AI/Qwen3-30B-A3B-Deepseek-Distill-Instruct-2507