Update README.md

README.md (changed):

---
license: apache-2.0
---

<div align="center">
<div> </div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/641de0213239b631552713e4/ieHnwuczidNNoGRA_FN2y.png" width="500"/>
<img src="https://cdn-uploads.huggingface.co/production/uploads/641de0213239b631552713e4/UOsk9_zcbHpCCy6kmryYM.png" width="530"/>
</div>

# JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars

## Key Messages

1. JetMoE-8B is **trained at a cost of under $0.1 million**<sup>1</sup> **yet outperforms LLaMA2-7B from Meta AI**, which has multi-billion-dollar training resources. LLM training can be **much cheaper than people generally think**.

We use the same evaluation methodology as in the Open LLM leaderboard. For MBPP …

To our surprise, despite the lower training cost and computation, JetMoE-8B performs even better than LLaMA2-7B, LLaMA-13B, and DeepseekMoE-16B. Compared to a model with similar training and inference computation, like Gemma-2B, JetMoE-8B achieves better performance.
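
The Open LLM Leaderboard methodology referenced above is built on EleutherAI's lm-evaluation-harness, so a comparable run can be sketched with its Python API. This is only an illustrative, non-authoritative example: the Hub id `jetmoe/jetmoe-8b`, the task subset, and the batch size are assumptions, not settings taken from this README.

```python
# Hedged sketch of a harness-style evaluation (pip install lm-eval).
# The checkpoint id and task list below are illustrative assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jetmoe/jetmoe-8b,trust_remote_code=True,dtype=bfloat16",
    tasks=["arc_challenge", "hellaswag", "winogrande", "gsm8k"],
    batch_size=8,
)
print(results["results"])  # per-task scores, one entry per task
```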
## Model Usage

To load the models, you need to install [this package](https://github.com/myshell-ai/JetMoE) from a local clone of the repository:
```
pip install -e .
```
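
Once the package is installed, the checkpoint can be loaded through the standard `transformers` auto classes. The snippet below is a minimal sketch rather than the repository's canonical usage: the Hub id `jetmoe/jetmoe-8b`, the `trust_remote_code=True` flag, and the generation settings are assumptions to adjust to your environment.

```python
# Minimal sketch: load JetMoE-8B and generate a short completion.
# Assumptions (not taken from this README): the Hub repo id "jetmoe/jetmoe-8b"
# and that the installed package / transformers version can resolve the
# JetMoE architecture (hence trust_remote_code=True as a fallback).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jetmoe/jetmoe-8b"  # assumed repo id; adjust if different
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # keep the MoE weights in bf16
    device_map="auto",
    trust_remote_code=True,
)

prompt = "JetMoE is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

`device_map="auto"` lets accelerate place the weights across the available GPUs (and CPU, if needed) when a single device cannot hold the full set of expert parameters.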
|