Model Card for Zagros-1.0-Quick

Model Details

  • Model Name: Zagros-1.0-Quick
  • Model Owner: Darsadilab
  • Model URL: https://huggingface.co/darsadilab/zagros-1.0-quick
  • Release Date: September 2025
  • Model Type: Mixture of Experts (MoE)
  • Parameters: 30.5 billion
  • Tensor Type: BF16
  • Languages: Multilingual, with a specialization in Persian; also supports English, Arabic, and other languages
  • License: Apache 2.0
  • Version: 1.0
  • Authors: Mohammadmoein Pisoude, Aydin Babazadeh
  • Contributors: Aylin Bahari (Testing and Performance Optimization)

Model Description

Zagros-1.0-Quick is a Mixture of Experts (MoE) model designed for high-performance natural language processing across multiple languages, with a particular focus on Persian. Its 30.5-billion-parameter architecture is built with widely adopted, state-of-the-art training methods and delivers robust performance across diverse use cases. It has been pre-trained and fine-tuned on large, diverse datasets to provide versatility and accuracy in tasks such as text generation, translation, sentiment analysis, and question answering.

Key Features

  • Multilingual Capability: Optimized for Persian, with strong performance in other languages like English, Arabic, and additional global languages.
  • Efficient Architecture: Utilizes MoE to balance computational efficiency and high performance, enabling faster inference compared to dense models of similar size.
  • Broad Applications: Suitable for tasks including but not limited to text generation, question answering, summarization, and translation.
  • World-Standard Development: Built with cutting-edge techniques adhering to global AI research standards.

Intended Use

Primary Use Cases

  • Text Generation: Producing coherent and contextually relevant text in multiple languages, especially Persian.
  • Translation: High-quality translation, particularly for Persian to/from other languages.
  • Sentiment Analysis: Understanding and analyzing sentiment in multilingual contexts.
  • Question Answering: Providing accurate and context-aware responses in various domains (example prompts for these tasks are sketched after this list).
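
As a minimal illustration, the tasks above map directly onto plain chat messages. The prompts below are examples only, not prescribed templates, and they assume the chat-template interface shown under Usage Instructions:

# Illustrative chat messages for the use cases above; the wording is an example,
# not a required prompt format.
translation_messages = [
    {"role": "user", "content": "Translate into Persian: 'The book is on the table.'"}
]
sentiment_messages = [
    {"role": "user", "content": "Classify the sentiment of this review as positive, "
                                "negative, or neutral: 'The package arrived late and damaged.'"}
]
question_messages = [
    {"role": "user", "content": "In two sentences, explain what a mixture-of-experts model is."}
]
# Each list can be passed to tokenizer.apply_chat_template(...) exactly as in the
# inference example under Usage Instructions below.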

Out-of-Scope Use

  • Real-time applications requiring ultra-low latency without specialized hardware.
  • Tasks requiring factual correctness without additional verification, as the model may generate plausible but incorrect information.
  • Use in safety-critical systems without thorough validation and risk assessment.

Training Details

Pre-Training

  • Dataset: A large, diverse corpus comprising web-crawled data, open-domain texts, and curated multilingual datasets, with a significant portion of Persian-language data.
  • Methodology: Pre-trained using a Mixture of Experts architecture to optimize for efficiency and performance. Training involved unsupervised learning on massive text corpora to capture linguistic patterns and knowledge (a minimal sketch of the next-token objective follows this list).
  • Compute Resources: Trained on a cluster of high-performance GPUs over several weeks, leveraging distributed training techniques.
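
The actual pre-training pipeline has not been released; as a rough, hedged sketch of the underlying next-token-prediction objective only, a single step could look like this (in practice this runs distributed across many GPUs, not on a single full-size model instance):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative only: the basic causal-LM objective on a tiny batch of raw text.
# Data loading, MoE load balancing, and distributed training are omitted.
tokenizer = AutoTokenizer.from_pretrained("darsadilab/zagros-1.0-quick")
model = AutoModelForCausalLM.from_pretrained("darsadilab/zagros-1.0-quick", torch_dtype="auto")

batch = tokenizer(["Zagros is a mountain range in western Iran."], return_tensors="pt")
# For causal LM training the labels are the input ids; the library shifts them internally.
outputs = model(**batch, labels=batch["input_ids"])
loss = outputs.loss      # cross-entropy over next-token predictions
loss.backward()          # gradients would then feed an optimizer step (e.g. AdamW)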

Fine-Tuning

  • Dataset: Fine-tuned on a curated dataset including task-specific data for text generation, translation, and sentiment analysis, with an emphasis on Persian-language performance.
  • Methodology: Supervised fine-tuning and reinforcement learning from human feedback (RLHF) to align the model with user expectations and improve task-specific performance (a simplified fine-tuning sketch follows this list).
  • Data Sources: Includes publicly available datasets, proprietary Persian-language corpora, and synthetic data generated for robustness.
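
The fine-tuning and RLHF code has not been published. A heavily simplified sketch of the supervised stage using the standard Transformers Trainer API is shown below; the dataset file name is a placeholder, and RLHF would follow as a separate stage:

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "darsadilab/zagros-1.0-quick"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# Placeholder file: any instruction-style corpus with a "text" column would do.
dataset = load_dataset("json", data_files="persian_sft_examples.jsonl", split="train")

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="zagros-sft", per_device_train_batch_size=1,
                           gradient_accumulation_steps=16, learning_rate=2e-5,
                           num_train_epochs=1, bf16=True),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # supervised stage only; RLHF with a reward model is not shown here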

Hyperparameters

  • Learning Rate: 2e-5 (decayed during training)
  • Batch Size: 2048 (effective, distributed across GPUs)
  • Optimizer: AdamW
  • Training Steps: Approximately 1 million steps for pre-training, followed by 50,000 steps for fine-tuning
  • MoE Configuration: 8 experts per layer, with top-2 expert routing (a toy routing sketch follows this list)
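
A minimal, self-contained sketch of top-2 routing over 8 experts, illustrative only: the model's actual expert layers, gating network, and load-balancing loss are not published.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyTop2MoE(nn.Module):
    """Toy MoE layer: 8 feed-forward experts, each token routed to its top-2 experts."""
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.gate = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):                        # x: (num_tokens, d_model)
        scores = self.gate(x)                    # router logits, (num_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the 2 selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e            # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyTop2MoE()
print(layer(torch.randn(10, 64)).shape)          # torch.Size([10, 64]); only 2 of 8 experts run per token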

Evaluation

Performance Metrics

  • Perplexity: Competitive perplexity on multilingual benchmarks, with particularly strong results on Persian-language datasets.
  • Task-Specific Metrics (a metric-computation sketch follows this list):
    • Translation (BLEU): 35.2 on Persian-English WMT dataset.
    • Text Generation (ROUGE): ROUGE-L of 0.68 on Persian summarization tasks.
    • Sentiment Analysis (F1): 0.89 F1-score on Persian sentiment datasets.
  • Multilingual Benchmarks: Evaluated on XGLUE and XTREME, showing strong cross-lingual transfer capabilities.
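
The exact evaluation sets are not published, so the numbers above cannot be reproduced from this card alone. As a hedged sketch of how such metrics are typically computed with common libraries (sacrebleu, rouge-score, scikit-learn; note that rouge-score tokenization is English-oriented, so Persian text needs extra care):

import sacrebleu
from rouge_score import rouge_scorer
from sklearn.metrics import f1_score

# Placeholder predictions/references; substitute your own evaluation data.
hyps = ["The weather is nice today."]
refs = ["The weather is good today."]

# Corpus-level BLEU, as used for the translation score above.
print("BLEU:", sacrebleu.corpus_bleu(hyps, [refs]).score)

# ROUGE-L, as used for the summarization score above.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)
print("ROUGE-L:", scorer.score(refs[0], hyps[0])["rougeL"].fmeasure)

# F1 for sentiment classification (labels encoded as integers).
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
print("F1:", f1_score(y_true, y_pred))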

Limitations

  • Hallucination Risk: Like other large language models, Zagros-1.0-Quick may generate plausible but factually incorrect outputs.
  • Language Bias: While optimized for Persian, performance on low-resource languages may be less robust.
  • Resource Requirements: Requires significant computational resources for inference, though optimized for efficiency via MoE.

Ethical Considerations

  • Bias and Fairness: The model was trained on diverse datasets, but biases present in the training data may persist. Users should evaluate outputs for unintended biases, particularly in sensitive applications.
  • Environmental Impact: Training large models like Zagros-1.0-Quick consumes significant energy. Efforts were made to optimize compute efficiency, but users should consider environmental costs for large-scale deployment.
  • Responsible Use: Users are encouraged to verify outputs for accuracy and appropriateness, especially in contexts involving legal, medical, or financial decisions.

Usage Instructions

Installation

To use Zagros-1.0-Quick, install the ZagrosLLMModel fork of the Transformers library:

pip install git+https://github.com/ZagrosLLMModel/transformers.git@main

Inference

  • Hardware Requirements: A GPU with at least 64 GB of VRAM is recommended for efficient inference; CPU inference is possible but slower (a quantized-loading sketch follows the example code below).
  • Software Dependencies: PyTorch and the Transformers fork from the ZagrosLLMModel repository (see Installation above).
  • Example Code:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "darsadilab/zagros-1.0-quick"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input (the Persian prompt asks the model to design a professional
# single-file HTML website with its CSS/JS embedded in the same file)
prompt = "یک وبسایت حرفه ای با استفاده از html طراحی کن که تک کد باشد و شامل css/js داخل همین html باشد."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=16384
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

content = tokenizer.decode(output_ids, skip_special_tokens=True)

print("content:", content)

Deployment

  • Available for download via the Hugging Face Hub (a download snippet follows this list).
  • Currently not deployed by any inference provider; provider support can be requested through the Hugging Face Hub.
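
The weights can also be fetched programmatically with the huggingface_hub client, for example:

from huggingface_hub import snapshot_download

# Downloads all model files (config, tokenizer, safetensors shards) to the local cache.
local_path = snapshot_download(repo_id="darsadilab/zagros-1.0-quick")
print(local_path)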

Contact Information

Acknowledgments

  • Built with contributions from the open-source community and leveraging tools from Hugging Face.
  • Special thanks to the Persian NLP community for providing valuable datasets and feedback.

Citation

If you use Zagros-1.0-Quick in your research or application, please cite:

@misc{darsadilab2025zagros,
  title={Zagros-1.0-Quick: A Multilingual MoE Model with Persian Specialization},
  author={Mohammadmoein Pisoude and Aydin Babazadeh and Aylin Bahari},
  year={2025},
  url={https://huggingface.co/darsadilab/zagros-1.0-quick}
}