Files changed (1)
  1. README.md +67 -54
README.md CHANGED
@@ -1,54 +1,67 @@
- ---
- base_model:
- - Qwen/Qwen2.5-Math-1.5B
- - Qwen/Qwen2.5-Coder-1.5B-Instruct
- - Qwen/Qwen2.5-Math-1.5B-Instruct
- - Qwen/Qwen2.5-1.5B-Instruct
- - Qwen/Qwen2.5-1.5B
- - Qwen/Qwen2.5-Coder-1.5B
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [Qwen/Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [Qwen/Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B)
- * [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct)
- * [Qwen/Qwen2.5-Math-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B-Instruct)
- * [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
- * [Qwen/Qwen2.5-Coder-1.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- merge_method: ties
- base_model: Qwen/Qwen2.5-1.5B
-
- models:
- - model: Qwen/Qwen2.5-1.5B
- - model: Qwen/Qwen2.5-Math-1.5B-Instruct
- - model: Qwen/Qwen2.5-Coder-1.5B-Instruct
- - model: Qwen/Qwen2.5-Math-1.5B
- - model: Qwen/Qwen2.5-1.5B-Instruct
- - model: Qwen/Qwen2.5-Coder-1.5B
-
- parameters:
-   density: 0.5
-   weight: 1.0
-   int8_mask: true
-   normalize: true
- ```
+ ---
+ base_model:
+ - Qwen/Qwen2.5-Math-1.5B
+ - Qwen/Qwen2.5-Coder-1.5B-Instruct
+ - Qwen/Qwen2.5-Math-1.5B-Instruct
+ - Qwen/Qwen2.5-1.5B-Instruct
+ - Qwen/Qwen2.5-1.5B
+ - Qwen/Qwen2.5-Coder-1.5B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [Qwen/Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [Qwen/Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B)
+ * [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct)
+ * [Qwen/Qwen2.5-Math-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B-Instruct)
+ * [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
+ * [Qwen/Qwen2.5-Coder-1.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ merge_method: ties
+ base_model: Qwen/Qwen2.5-1.5B
+
+ models:
+ - model: Qwen/Qwen2.5-1.5B
+ - model: Qwen/Qwen2.5-Math-1.5B-Instruct
+ - model: Qwen/Qwen2.5-Coder-1.5B-Instruct
+ - model: Qwen/Qwen2.5-Math-1.5B
+ - model: Qwen/Qwen2.5-1.5B-Instruct
+ - model: Qwen/Qwen2.5-Coder-1.5B
+
+ parameters:
+   density: 0.5
+   weight: 1.0
+   int8_mask: true
+   normalize: true
+ ```
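
For readers unfamiliar with the `ties` settings in the configuration, the following is a minimal sketch of what `density`, `weight`, and `normalize` roughly control, written over plain Python lists. It is an illustration under stated assumptions, not mergekit's actual implementation: the function names `trim` and `ties_merge` are hypothetical, every model is assumed to share the same `weight`, and `int8_mask` (a storage detail) is not modelled.

```python
from typing import List


def trim(delta: List[float], density: float) -> List[float]:
    """Keep the `density` fraction of entries with the largest magnitude; zero the rest."""
    k = int(density * len(delta))
    ranked = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(ranked[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(
    base: List[float],
    models: List[List[float]],
    density: float = 0.5,
    weight: float = 1.0,
) -> List[float]:
    """Toy TIES-style merge (hypothetical helper, same weight for every model)."""
    # 1. Task vectors: each fine-tuned model minus the base, then trimmed.
    deltas = [trim([m - b for m, b in zip(model, base)], density) for model in models]

    merged = []
    for j, b in enumerate(base):
        column = [weight * d[j] for d in deltas]
        # 2. Elect a sign per parameter from the sum of the weighted deltas.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # 3. Keep only the deltas that agree with the elected sign.
        agreeing = [v for v in column if v * sign > 0]
        # 4. Normalized average over the agreeing deltas (assumed meaning of `normalize: true`).
        update = sum(agreeing) / (weight * len(agreeing)) if agreeing else 0.0
        merged.append(b + update)
    return merged


base = [0.0, 0.0, 0.0, 0.0]
a = [4.0, 1.0, -2.0, 0.5]
b = [2.0, -3.0, 0.25, -1.0]
c = [-1.0, 0.0, 0.0, 5.0]
print(ties_merge(base, [a, b, c], density=0.5))  # [3.0, -3.0, -2.0, 5.0]
```

In the first position, the negative delta from `c` disagrees with the elected positive sign and is dropped, so only `4.0` and `2.0` are averaged. With `density: 0.5`, each model contributes only the half of its deltas with the largest magnitude.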