---
license: mit
host_model:
- toksuite/meta-llama-Llama-3.2-1B
- toksuite/Qwen-Qwen3-8B
tags:
- merge
- parameter-averaging
- flexitok
---

# Merged Model: qwen_onto_llama_lambda-0.5

This model is the result of **parameter averaging** (Model Soup) across two models.

### Merged Models
The following models were included in the merge:
- toksuite/meta-llama-Llama-3.2-1B
- toksuite/Qwen-Qwen3-8B

### Merging Configuration
- **Method**: Weighted Parameter Averaging
- **Weights**: Equal-weight average (merging lambda = 0.5).
- **Excluded Layers**: Embeddings and LM Head were kept from the host model (toksuite/meta-llama-Llama-3.2-1B).
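
The configuration above can be illustrated with a minimal sketch. It is not the script used to produce this checkpoint: the function name `merge_state_dicts`, the convention `(1 - lam) * host + lam * donor`, and the excluded key prefixes `model.embed_tokens` and `lm_head` are assumptions made for illustration, and the sketch assumes both state dicts share the same parameter names and shapes. Plain floats stand in for tensors so the example runs without any dependencies; the same arithmetic applies to tensor values.

```python
def merge_state_dicts(host, donor, lam=0.5,
                      excluded=("model.embed_tokens", "lm_head")):
    """Weighted parameter averaging over two state dicts (illustrative sketch).

    host, donor: mappings from parameter name to value (tensors or floats).
    lam: weight given to the donor; (1 - lam) goes to the host (assumed convention).
    excluded: name prefixes copied from the host unchanged (assumed key names).
    """
    merged = {}
    for name, host_param in host.items():
        if name.startswith(excluded):
            # Embeddings and LM head are kept from the host model.
            merged[name] = host_param
        else:
            # Every other parameter is a convex combination of the two models.
            merged[name] = (1.0 - lam) * host_param + lam * donor[name]
    return merged


# Toy example with scalar "parameters".
host = {
    "model.embed_tokens.weight": 1.0,
    "model.layers.0.mlp.up_proj.weight": 2.0,
    "lm_head.weight": 3.0,
}
donor = {
    "model.embed_tokens.weight": 5.0,
    "model.layers.0.mlp.up_proj.weight": 4.0,
    "lm_head.weight": 7.0,
}
merged = merge_state_dicts(host, donor, lam=0.5)
```

With lambda = 0.5 the two models contribute equally, so the direction of the convention does not affect the result; it only matters for other lambda values.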

### Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged weights and the matching tokenizer from the same repository.
model = AutoModelForCausalLM.from_pretrained("flexitok/qwen_onto_llama_lambda-0.5")
tokenizer = AutoTokenizer.from_pretrained("flexitok/qwen_onto_llama_lambda-0.5")
```