Update README.md
README.md (changed)
- sql
---

### Model Description (Japanese explanation is below.)

This model is a Mixture of Experts (MoE) language model created using the MergeKit tool.

This MoE model aims to achieve both Japanese language ability and SQL generation capability by combining [Llama-3-Umievo-itr014-Shizuko-8b](https://huggingface.co/umiyuki/Llama-3-Umievo-itr014-Shizuko-8b), released by umiyuki, with [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b), which has been fine-tuned on an SQL dataset.
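The merge configuration itself is not included in this card, so the following is only a minimal sketch of how a MergeKit MoE merge of these two experts could be driven from Python. The `gate_mode`, positive prompts, dtype, and output paths are illustrative assumptions, not the author's actual settings.

```python
# Hypothetical MergeKit MoE merge sketch -- not the author's published configuration.
import subprocess
from pathlib import Path

config = """\
base_model: umiyuki/Llama-3-Umievo-itr014-Shizuko-8b
gate_mode: hidden            # assumed: route tokens by hidden-state similarity to the prompts below
dtype: bfloat16
experts:
  - source_model: umiyuki/Llama-3-Umievo-itr014-Shizuko-8b
    positive_prompts:
      - "日本語で質問に丁寧に答えてください。"   # steer Japanese instructions to the Japanese expert
  - source_model: defog/llama-3-sqlcoder-8b
    positive_prompts:
      - "Write a SQL query that answers the following question."
"""

Path("moe_config.yaml").write_text(config, encoding="utf-8")

# MergeKit's MoE entry point takes the config file and an output directory.
subprocess.run(
    ["mergekit-moe", "moe_config.yaml", "./Llama-3-Umievo-Shizuko-sqlcoder-2x8B"],
    check=True,
)
```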
### Model Details

- **Model Name**: Llama-3-Umievo-Shizuko-sqlcoder-2x8B
- **Model Architecture**: Mixture of Experts (MoE)
- **Base Models**: umiyuki/Llama-3-Umievo-itr014-Shizuko-8b, defog/llama-3-sqlcoder-8b
- **Merge Tool**: MergeKit
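To make the intended behavior concrete (a Japanese question in, SQL out), here is a minimal generation sketch using the Transformers library. The repository id, table schema, prompt, and generation settings are placeholders, not values taken from this card.

```python
# Hypothetical Transformers usage sketch; the repository id and prompt are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/Llama-3-Umievo-Shizuko-sqlcoder-2x8B"  # replace with the actual repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# "Write SQL that returns monthly sales totals for 2023 from the orders table."
messages = [
    {"role": "user",
     "content": "ordersテーブル（order_id, user_id, amount, created_at）から、"
                "2023年の月別売上合計を求めるSQLを書いてください。"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```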
#### Required Specifications

The Q4_K_M quantized model can be fully loaded onto an RTX 3060 12GB.

The author created the model using WSL2 and Google Colaboratory Pro and tested it with Llama.cpp and LM Studio; a minimal loading sketch follows the spec list below.

- CPU: Ryzen 5 3600
- GPU: GeForce RTX 3060 12GB
- RAM: DDR4-3200 96GB
- OS: Windows 10
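Since the card says the Q4_K_M quantization loads fully on a 12 GB RTX 3060 and was tested with Llama.cpp, here is a minimal loading sketch using llama-cpp-python. The GGUF filename, context size, and sampling settings are assumptions.

```python
# Hypothetical llama-cpp-python loading sketch; the GGUF filename is an assumption.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3-Umievo-Shizuko-sqlcoder-2x8B.Q4_K_M.gguf",
    n_gpu_layers=-1,  # offload all layers; Q4_K_M should fit in 12 GB of VRAM per the card
    n_ctx=4096,
)

# "Write SQL that counts users who registered in 2023, from the users table."
result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You write SQL queries that answer the user's question."},
        {"role": "user", "content": "usersテーブルから、2023年に登録したユーザー数を数えるSQLを書いてください。"},
    ],
    max_tokens=256,
    temperature=0.1,
)
print(result["choices"][0]["message"]["content"])
```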
### Model Description (Japanese)

This model is a Mixture of Experts (MoE) language model created using the MergeKit tool.

It is a MoE model that aims to achieve both Japanese language ability and SQL generation capability by combining [Llama-3-Umievo-itr014-Shizuko-8b](https://huggingface.co/umiyuki/Llama-3-Umievo-itr014-Shizuko-8b) with [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b), which has been fine-tuned on an SQL dataset.
After building the model on WSL2 and Google Colaboratory Pro, it was confirmed to run with Llama.cpp and LM Studio.

- RAM: DDR4-3200 96GB
- OS: Windows 11