keitokei1994 committed · Commit 8ae3aea · verified · Parent(s): ef6983c

Update README.md

Files changed (1): README.md (+19 -20)
README.md CHANGED
 
tags:
- sql
---

### Model Description

This model is a Mixture of Experts (MoE) language model created using the MergeKit tool.
This MoE model aims to achieve both Japanese language ability and SQL generation capability by combining [Llama-3-Umievo-itr014-Shizuko-8b](https://huggingface.co/umiyuki/Llama-3-Umievo-itr014-Shizuko-8b), released by umiyuki, with [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b), which has been fine-tuned on an SQL dataset. A sketch of what such a merge recipe could look like is shown below.
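The card does not include the actual merge configuration, but a MergeKit MoE merge of two Llama-3 8B models is normally driven by a small YAML file passed to the `mergekit-moe` command. The following is a minimal sketch under that assumption; the `base_model` choice, `gate_mode`, and `positive_prompts` are illustrative guesses, not the author's settings.

```yaml
# Hypothetical mergekit-moe config -- illustrative only, not the author's actual recipe.
base_model: umiyuki/Llama-3-Umievo-itr014-Shizuko-8b  # donor for shared (non-expert) weights; assumed
gate_mode: hidden    # route tokens using hidden-state representations of the positive prompts
dtype: bfloat16
experts:
  - source_model: umiyuki/Llama-3-Umievo-itr014-Shizuko-8b
    positive_prompts:
      - "日本語で質問に答えてください"                     # illustrative: general Japanese instructions
  - source_model: defog/llama-3-sqlcoder-8b
    positive_prompts:
      - "Write a SQL query that answers the question"  # illustrative: SQL generation requests
```

Running `mergekit-moe config.yml ./Llama-3-Umievo-Shizuko-sqlcoder-2x8B` over such a file would produce the merged 2x8B checkpoint; with `gate_mode: hidden`, the positive prompts determine how the router weights each expert per token.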

### Model Details
- **Model Name**: Llama-3-Umievo-Shizuko-sqlcoder-2x8B
- **Model Architecture**: Mixture of Experts (MoE)
- **Base Models**: umiyuki/Llama-3-Umievo-itr014-Shizuko-8b, defog/llama-3-sqlcoder-8b
- **Merge Tool**: MergeKit

#### Required Specifications
With the Q4_K_M quantization, the model can be loaded entirely into the 12GB of VRAM on an RTX 3060.
The author built the model under WSL2 and Google Colaboratory Pro, and verified it with Llama.cpp and LM Studio on the following machine (a minimal loading sketch follows the list):
- CPU: Ryzen 5 3600
- GPU: GeForce RTX 3060 12GB
- RAM: DDR4-3200 96GB
- OS: Windows 10
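As a concrete example of the claim above, here is a minimal sketch of loading a Q4_K_M GGUF of this model fully onto the GPU with llama-cpp-python, one of several Python bindings for Llama.cpp; the GGUF filename is a placeholder for whichever quantized file you actually have.

```python
# Minimal sketch: fully offloading a Q4_K_M quant of this model to a 12GB GPU
# via llama-cpp-python. The model_path below is a placeholder filename.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3-Umievo-Shizuko-sqlcoder-2x8B.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # -1 offloads every layer; a Q4_K_M 2x8B MoE fits in 12GB
    n_ctx=4096,       # context length; raise or lower to trade VRAM for context
)

# The GGUF's embedded Llama-3 chat template formats the conversation.
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "月別のユーザー登録数を集計するSQLを書いてください。"}],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```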