ZeroXClem/Qwen2.5-7B-DistilPrism
Qwen2.5-7B-DistilPrism is a distillation- and reasoning-focused model merge that combines multiple DeepSeek-R1 distillation variants into a single refined, high-performance language model. Built with the Model Stock merge method, the fusion aims to capture the best attributes of DeepSeek-R1-Distill-Qwen-7B and its improved derivatives.
🚀 Merged Models
This model is a weighted merge of the following:
- huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2: An uncensored distillation of DeepSeek-R1, optimized to remove refusals and improve usability.
- mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1: A refined distillation that improves accuracy and robustness across various benchmarks.
- Triangle104/DSR1-Distill-Qwen-7B-RP: A composite merge of various distilled DeepSeek variants, serving as an essential ingredient for performance tuning.
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B: The foundation of this merge, representing the distilled form of DeepSeek-R1 optimized for efficiency and strong reasoning capabilities.
🧩 Merge Configuration
The following YAML configuration defines how these models were combined using Model Stock, ensuring balanced contributions from each source:
```yaml
# Merge configuration for ZeroXClem/Qwen2.5-7B-DistilPrism using Model Stock
name: ZeroXClem-Qwen2.5-7B-DistilPrism
merge_method: model_stock
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tokenizer_source: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
dtype: bfloat16
parameters:
  normalize: true
  rescale: true
models:
  - model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
    parameters:
      weight: 0.3
  - model: mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
    parameters:
      weight: 0.25
  - model: Triangle104/DSR1-Distill-Qwen-7B-RP
    parameters:
      weight: 0.2
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
    parameters:
      weight: 0.25
```
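To reproduce the merge, a minimal sketch using the mergekit CLI (assuming the configuration above is saved as `config.yaml`; the command and flags reflect current mergekit releases and may change):

```bash
pip install mergekit
# Run the model_stock merge defined in config.yaml and write the result locally
mergekit-yaml config.yaml ./Qwen2.5-7B-DistilPrism --cuda
```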
🔑 Key Parameters
- Normalization & Rescaling: Ensures weight distributions remain balanced across all components.
- Model Stock Merge Method: Optimizes contribution from each model to retain the best attributes.
- Weighted Blending: The abliterated model carries the largest weight (0.30), with the re-distilled model and the base distill at 0.25 each, refining both alignment behavior and general usability (see the sketch after this list).
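For intuition, here is a naive weighted average of parameter tensors in PyTorch. This is only a simplified sketch: mergekit's `model_stock` method additionally applies geometry-aware interpolation plus the `normalize`/`rescale` options above, so do not treat this as the exact algorithm used to build the model.

```python
import torch
from transformers import AutoModelForCausalLM

# Repo names and weights mirror the merge config above; the plain averaging
# here is a simplification of what mergekit's model_stock method actually does.
sources = {
    "huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2": 0.30,
    "mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1": 0.25,
    "Triangle104/DSR1-Distill-Qwen-7B-RP": 0.20,
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B": 0.25,
}

merged = None
for name, weight in sources.items():
    state = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.bfloat16
    ).state_dict()
    if merged is None:
        merged = {k: weight * v.float() for k, v in state.items()}
    else:
        for k, v in state.items():
            merged[k] += weight * v.float()

# The weights sum to 1.0, so the blend is already normalized.
base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B", torch_dtype=torch.bfloat16
)
base.load_state_dict({k: v.to(torch.bfloat16) for k, v in merged.items()})
base.save_pretrained("./naive-weighted-merge")
```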
🗣️ Inference
You can use the model for text generation as follows:
Ollama
I recommend Ollama as a daily driver, as it supports thinking tags (see the Ollama quickstart guide).

```bash
ollama run hf.co/ZeroXClem/Qwen2.5-7B-DistilPrism

# For quantized versions, copy the repository URL and replace 'huggingface.co/'
# with 'hf.co/', followed by the name of the quant.
```
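For illustration, the pattern looks like this (the repository name and quant tag below are placeholders, not a published quant):

```bash
# Placeholder repo and quant tag -- substitute the quant you actually use
ollama run hf.co/YourUser/Qwen2.5-7B-DistilPrism-GGUF:Q4_K_M
```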
Transformers
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Define the model name
model_name = "ZeroXClem/Qwen2.5-7B-DistilPrism"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Initialize the pipeline (dtype and device placement are already set on the model)
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."

# Generate the output
outputs = text_generator(
    prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

# Print the generated text
print(outputs[0]["generated_text"])
```
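Because the underlying DeepSeek-R1 distills are chat models trained to emit reasoning traces, you will generally get better results by going through the chat template. A minimal sketch, assuming the merge inherits the DeepSeek-R1 chat template (consistent with the `tokenizer_source` in the config above) and reusing `model` and `tokenizer` from the snippet above:

```python
# Build a chat-formatted prompt and sample with settings in the range
# DeepSeek recommends for its R1 distills (temperature around 0.6)
messages = [
    {"role": "user", "content": "What is 17 * 24? Think step by step."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

# The response may include a visible reasoning trace (e.g., <think>...</think>)
# before the final answer, depending on the chat template.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```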
🎯 Use Case & Applications
Qwen2.5-7B-DistilPrism is designed for efficient, high-quality text generation with strong reasoning capabilities. It is well-suited for:
- Advanced Reasoning & Problem Solving: Excels in logic-heavy tasks and multi-step reasoning problems.
- Conversational AI: Optimized for fluid, responsive dialogue, reducing refusals and improving engagement.
- Mathematical & Scientific Computation: Enhanced math & code generation abilities compared to standard distillations.
- Content Creation & Summarization: Generates coherent and contextually rich text suitable for various applications.
📜 License
This model is released under the MIT License.
📊 Benchmark Results (Coming Soon)
We are currently quantizing and benchmarking this model. Stay tuned for performance updates across:
- IFEval (0-Shot)
- BBH (3-Shot)
- MATH (4-Shot)
- GPQA (0-Shot)
- MuSR (0-Shot)
- MMLU-PRO (5-Shot)
💡 Tags
- merge
- mergekit
- model_stock
- DeepSeek-R1
- Distillation
- abliterated
- re-distilled
- DeepSeek-R1-Distill-Qwen-7B
🙏 Special Thanks
This project wouldn't be possible without the incredible contributions from:
- @huihui-ai – For developing DeepSeek-R1-Distill-Qwen-7B-abliterated-v2, a bold step towards improving model alignment.
- @mobiuslabsgmbh – For refining distillation techniques with DeepSeek-R1-ReDistill-Qwen-7B-v1.1.
- @Triangle104 – For crafting innovative merges like DSR1-Distill-Qwen-7B-RP, an essential component in this blend.
- @deepseek-ai – For open-sourcing DeepSeek-R1-Distill-Qwen-7B, a foundation for reasoning advancements.
And a heartfelt thank you to everyone in the 🤗 & Open-Source AI community for their continued research, testing, and support. 💜🚀