Model Card for Zagros-1.0-Quick

Model Details

  • Model Name: Zagros-1.0-Quick
  • Model Owner: Darsadilab
  • Model URL: https://huggingface.co/darsadilab/zagros-1.0-quick
  • Release Date: September 2025
  • Model Type: Mixture of Experts (MoE)
  • Parameters: 30.5 billion
  • Tensor Type: BF16
  • Languages: Multilingual, with a specialization in Persian; also supports English, Arabic, and other languages
  • License: Apache 2.0
  • Version: 1.0
  • Authors: Mohammadmoein Pisoude, Aydin Babazadeh
  • Contributors: Aylin Bahari (Testing and Performance Optimization)

Model Description

Zagros-1.0-Quick is a Mixture of Experts (MoE) model designed for high-performance natural language processing across multiple languages, with a particular focus on Persian. Its 30.5-billion-parameter architecture is built with widely adopted, state-of-the-art training methods and delivers robust performance across diverse use cases. It has been pre-trained and fine-tuned on large, diverse datasets to provide versatility and accuracy in tasks such as text generation, translation, sentiment analysis, and question answering.

Key Features

  • Multilingual Capability: Optimized for Persian, with strong performance in other languages like English, Arabic, and additional global languages.
  • Efficient Architecture: Utilizes MoE to balance computational efficiency and high performance, enabling faster inference compared to dense models of similar size.
  • Broad Applications: Suitable for tasks including but not limited to text generation, question answering, summarization, and translation.
  • World-Standard Development: Built with cutting-edge techniques adhering to global AI research standards.

Intended Use

Primary Use Cases

  • Text Generation: Producing coherent and contextually relevant text in multiple languages, especially Persian.
  • Translation: High-quality translation, particularly for Persian to/from other languages.
  • Sentiment Analysis: Understanding and analyzing sentiment in multilingual contexts.
  • Question Answering: Providing accurate and context-aware responses in various domains (example prompts for these tasks are sketched after this list).
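
As a minimal illustration, the tasks above map directly onto plain chat messages. The prompts below are examples only, not prescribed templates, and they assume the chat-template interface shown under Usage Instructions:

# Illustrative chat messages for the use cases above; the wording is an example,
# not a required prompt format.
translation_messages = [
    {"role": "user", "content": "Translate into Persian: 'The book is on the table.'"}
]
sentiment_messages = [
    {"role": "user", "content": "Classify the sentiment of this review as positive, "
                                "negative, or neutral: 'The package arrived late and damaged.'"}
]
question_messages = [
    {"role": "user", "content": "In two sentences, explain what a mixture-of-experts model is."}
]
# Each list can be passed to tokenizer.apply_chat_template(...) exactly as in the
# inference example under Usage Instructions below.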

Out-of-Scope Use

  • Real-time applications requiring ultra-low latency without specialized hardware.
  • Tasks requiring factual correctness without additional verification, as the model may generate plausible but incorrect information.
  • Use in safety-critical systems without thorough validation and risk assessment.

Training Details

Pre-Training

  • Dataset: A large, diverse corpus comprising web-crawled data, open-domain texts, and curated multilingual datasets, with a significant portion of Persian-language data.
  • Methodology: Pre-trained using a Mixture of Experts architecture to optimize for efficiency and performance. Training involved unsupervised learning on massive text corpora to capture linguistic patterns and knowledge (a minimal sketch of the next-token objective follows this list).
  • Compute Resources: Trained on a cluster of high-performance GPUs over several weeks, leveraging distributed training techniques.
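
The actual pre-training pipeline has not been released; as a rough, hedged sketch of the underlying next-token-prediction objective only, a single step could look like this (in practice this runs distributed across many GPUs, not on a single full-size model instance):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative only: the basic causal-LM objective on a tiny batch of raw text.
# Data loading, MoE load balancing, and distributed training are omitted.
tokenizer = AutoTokenizer.from_pretrained("darsadilab/zagros-1.0-quick")
model = AutoModelForCausalLM.from_pretrained("darsadilab/zagros-1.0-quick", torch_dtype="auto")

batch = tokenizer(["Zagros is a mountain range in western Iran."], return_tensors="pt")
# For causal LM training the labels are the input ids; the library shifts them internally.
outputs = model(**batch, labels=batch["input_ids"])
loss = outputs.loss      # cross-entropy over next-token predictions
loss.backward()          # gradients would then feed an optimizer step (e.g. AdamW)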

Fine-Tuning

  • Dataset: Fine-tuned on a curated dataset including task-specific data for text generation, translation, and sentiment analysis, with an emphasis on Persian-language performance.
  • Methodology: Supervised fine-tuning and reinforcement learning from human feedback (RLHF) to align the model with user expectations and improve task-specific performance (a simplified fine-tuning sketch follows this list).
  • Data Sources: Includes publicly available datasets, proprietary Persian-language corpora, and synthetic data generated for robustness.
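
The fine-tuning and RLHF code has not been published. A heavily simplified sketch of the supervised stage using the standard Transformers Trainer API is shown below; the dataset file name is a placeholder, and RLHF would follow as a separate stage:

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "darsadilab/zagros-1.0-quick"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# Placeholder file: any instruction-style corpus with a "text" column would do.
dataset = load_dataset("json", data_files="persian_sft_examples.jsonl", split="train")

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="zagros-sft", per_device_train_batch_size=1,
                           gradient_accumulation_steps=16, learning_rate=2e-5,
                           num_train_epochs=1, bf16=True),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # supervised stage only; RLHF with a reward model is not shown here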

Hyperparameters

  • Learning Rate: 2e-5 (decayed during training)
  • Batch Size: 2048 (effective, distributed across GPUs)
  • Optimizer: AdamW
  • Training Steps: Approximately 1 million steps for pre-training, followed by 50,000 steps for fine-tuning
  • MoE Configuration: 8 experts per layer, with top-2 expert routing (a toy routing sketch follows this list)
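
A minimal, self-contained sketch of top-2 routing over 8 experts, illustrative only: the model's actual expert layers, gating network, and load-balancing loss are not published.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyTop2MoE(nn.Module):
    """Toy MoE layer: 8 feed-forward experts, each token routed to its top-2 experts."""
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.gate = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):                        # x: (num_tokens, d_model)
        scores = self.gate(x)                    # router logits, (num_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the 2 selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e            # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyTop2MoE()
print(layer(torch.randn(10, 64)).shape)          # torch.Size([10, 64]); only 2 of 8 experts run per token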

Evaluation

Performance Metrics

  • Perplexity: Competitive perplexity on multilingual benchmarks, with particularly strong results on Persian-language datasets.
  • Task-Specific Metrics (a metric-computation sketch follows this list):
    • Translation (BLEU): 35.2 on Persian-English WMT dataset.
    • Text Generation (ROUGE): ROUGE-L of 0.68 on Persian summarization tasks.
    • Sentiment Analysis (F1): 0.89 F1-score on Persian sentiment datasets.
  • Multilingual Benchmarks: Evaluated on XGLUE and XTREME, showing strong cross-lingual transfer capabilities.
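
The exact evaluation sets are not published, so the numbers above cannot be reproduced from this card alone. As a hedged sketch of how such metrics are typically computed with common libraries (sacrebleu, rouge-score, scikit-learn; note that rouge-score tokenization is English-oriented, so Persian text needs extra care):

import sacrebleu
from rouge_score import rouge_scorer
from sklearn.metrics import f1_score

# Placeholder predictions/references; substitute your own evaluation data.
hyps = ["The weather is nice today."]
refs = ["The weather is good today."]

# Corpus-level BLEU, as used for the translation score above.
print("BLEU:", sacrebleu.corpus_bleu(hyps, [refs]).score)

# ROUGE-L, as used for the summarization score above.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)
print("ROUGE-L:", scorer.score(refs[0], hyps[0])["rougeL"].fmeasure)

# F1 for sentiment classification (labels encoded as integers).
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
print("F1:", f1_score(y_true, y_pred))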

Limitations

  • Hallucination Risk: Like other large language models, Zagros-1.0-Quick may generate plausible but factually incorrect outputs.
  • Language Bias: While optimized for Persian, performance on low-resource languages may be less robust.
  • Resource Requirements: Requires significant computational resources for inference, though optimized for efficiency via MoE.

Ethical Considerations

  • Bias and Fairness: The model was trained on diverse datasets, but biases present in the training data may persist. Users should evaluate outputs for unintended biases, particularly in sensitive applications.
  • Environmental Impact: Training large models like Zagros-1.0-Quick consumes significant energy. Efforts were made to optimize compute efficiency, but users should consider environmental costs for large-scale deployment.
  • Responsible Use: Users are encouraged to verify outputs for accuracy and appropriateness, especially in contexts involving legal, medical, or financial decisions.

Usage Instructions

Installation

To use Zagros-1.0-Quick, install the ZagrosLLMModel fork of the Transformers library:

pip install git+https://github.com/ZagrosLLMModel/transformers.git@main

Inference

  • Hardware Requirements: A GPU with at least 64 GB of VRAM is recommended for efficient inference; CPU inference is possible but slower (a quantized-loading sketch follows the example code below).
  • Software Dependencies: PyTorch and the Transformers fork from the ZagrosLLMModel repository (see Installation above).
  • Example Code:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "darsadilab/zagros-1.0-quick"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input (the Persian prompt asks the model to design a professional
# single-file HTML website with its CSS/JS embedded in the same file)
prompt = "یک وبسایت حرفه ای با استفاده از html طراحی کن که تک کد باشد و شامل css/js داخل همین html باشد."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=16384
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

content = tokenizer.decode(output_ids, skip_special_tokens=True)

print("content:", content)

Deployment

  • Available for download via the Hugging Face Hub (a download snippet follows this list).
  • Currently not deployed by any inference provider; provider support can be requested through the Hugging Face Hub.
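
The weights can also be fetched programmatically with the huggingface_hub client, for example:

from huggingface_hub import snapshot_download

# Downloads all model files (config, tokenizer, safetensors shards) to the local cache.
local_path = snapshot_download(repo_id="darsadilab/zagros-1.0-quick")
print(local_path)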

Contact Information

Acknowledgments

  • Built with contributions from the open-source community and leveraging tools from Hugging Face.
  • Special thanks to the Persian NLP community for providing valuable datasets and feedback.

Citation

If you use Zagros-1.0-Quick in your research or application, please cite:

@misc{darsadilab2025zagros,
  title={Zagros-1.0-Quick: A Multilingual MoE Model with Persian Specialization},
  author={Mohammadmoein Pisoude and Aydin Babazadeh and Aylin Bahari},
  year={2025},
  url={https://huggingface.co/darsadilab/zagros-1.0-quick}
}