---
base_model:
- AXCXEPT/EZO-Qwen2.5-32B-Instruct
- qihoo360/Light-R1-32B
- YOYO-AI/Qwen2.5-Coder-32B-YOYO
- Skywork/Skywork-OR1-32B-Preview
- fblgit/TheBeagle-v2beta-32B-MGS
- Qwen/QwQ-32B
- Qwen/Qwen2.5-32B-Instruct
- deepcogito/cogito-v1-preview-qwen-32B
- tanliboy/lambda-qwen2.5-32b-dpo-test
library_name: transformers
tags:
- mergekit
- merge
---
# merge
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged using the [Karcher Mean](https://en.wikipedia.org/wiki/Karcher_mean) merge method, with [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) as the base model.
### Models Merged
The following models were included in the merge:
* [AXCXEPT/EZO-Qwen2.5-32B-Instruct](https://huggingface.co/AXCXEPT/EZO-Qwen2.5-32B-Instruct)
* [qihoo360/Light-R1-32B](https://huggingface.co/qihoo360/Light-R1-32B)
* [YOYO-AI/Qwen2.5-Coder-32B-YOYO](https://huggingface.co/YOYO-AI/Qwen2.5-Coder-32B-YOYO)
* [Skywork/Skywork-OR1-32B-Preview](https://huggingface.co/Skywork/Skywork-OR1-32B-Preview)
* [fblgit/TheBeagle-v2beta-32B-MGS](https://huggingface.co/fblgit/TheBeagle-v2beta-32B-MGS)
* [Qwen/QwQ-32B](https://huggingface.co/Qwen/QwQ-32B)
* [deepcogito/cogito-v1-preview-qwen-32B](https://huggingface.co/deepcogito/cogito-v1-preview-qwen-32B)
* [tanliboy/lambda-qwen2.5-32b-dpo-test](https://huggingface.co/tanliboy/lambda-qwen2.5-32b-dpo-test)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
- model: YOYO-AI/Qwen2.5-Coder-32B-YOYO
- model: Qwen/QwQ-32B
- model: Skywork/Skywork-OR1-32B-Preview
- model: deepcogito/cogito-v1-preview-qwen-32B
- model: qihoo360/Light-R1-32B
- model: AXCXEPT/EZO-Qwen2.5-32B-Instruct
- model: fblgit/TheBeagle-v2beta-32B-MGS
- model: tanliboy/lambda-qwen2.5-32b-dpo-test
- model: Qwen/Qwen2.5-32B-Instruct
merge_method: karcher
base_model: Qwen/Qwen2.5-32B-Instruct
parameters:
  max_iter: 1000
normalize: true
int8_mask: true
tokenizer_source: base
dtype: float16
```