mistralai/Mistral-7B-v0.1

Bam4d committed on Sep 27, 2023

Commit f592c5f · 1 Parent(s): 0e5a0e2

Small updates

Files changed (1)

README.md +52 -3

@@ -1,3 +1,52 @@
----
-license: apache-2.0
----

+# **Model Details**
+The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.
+**Model Developers** Mistral AI.
+**Variations** None.
+**Input** Text only.
+**Output** Text only.
+**Model Architecture** Mistral-7B-v0.1 is a transformer model with the following architecture choices:
+- Grouped-Query Attention
+- Sliding-Window Attention
+- Byte-fallback BPE tokenizer
+**Model Dates** Mistral-7B-v0.1 was trained between June and September 2023.
+**Status** This is a static model. Future models will have new version numbers.
+**License** Apache 2.0 license.
+**Research Paper** TODO: Coming soon.
+**Where to send questions or comments about the model** TODO: How do people send comments?
+# **Intended Use**
+**Intended Use Cases** Mistral-7B-v0.1 is intended for commercial and research use. It can be adapted for a variety of natural language generation tasks.
+# **Evaluation Results**
+We report standard benchmark results for Mistral-7B-v0.1, produced with a custom evaluation library.
+| Model           | Size | hellaswag | winogrande | piqa   | boolq  | arc_easy | arc_challenge | naturalqs | naturalqs_5shot | triviaqa_5shot | triviaqa | humaneval_pass@1 | mbpp_pass@1 | mmlu   | math   | gsm8k  |
+|-----------------|------|-----------|------------|--------|--------|----------|---------------|-----------|-----------------|----------------|----------|------------------|-------------|--------|--------|--------|
+| Mistral-7B-v0.1 | 7B   | 81.19%    | 75.53%     | 82.92% | 83.52% | 80.01%   | 55.38%        | 23.96%    | 28.92%          | 69.88%         | 63.22%   | 29.88%           | 47.86%      | 59.99% | 11.94% | 39.35% |
+**Theme-based grouping**
+-   Commonsense Reasoning: 0-shot average of HellaSwag, WinoGrande, PIQA, SIQA, OpenbookQA, ARC-Easy, ARC-Challenge, and CommonsenseQA.
+-   World Knowledge: 5-shot average of NaturalQuestions and TriviaQA.
+-   Reading Comprehension: 0-shot average of BoolQ and QuAC.
+-   Math: Average of 8-shot GSM8K with maj@8 and 4-shot MATH with maj@4.
+-   Code: Average of 0-shot HumanEval and 3-shot MBPP.
+-   Popular aggregated results: 5-shot MMLU, 3-shot BBH, and 3-5-shot AGIEval (English multiple-choice questions only).
+# **Ethical Considerations and Limitations**
+TODO: what do we say here?
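
The theme-based grouping in the diff above describes each group score as an average over individual benchmarks. As a minimal sketch of that arithmetic, the snippet below computes the "World Knowledge" group from the two 5-shot columns of the results table. The helper name `group_average` is hypothetical, and the use of a plain unweighted mean is an assumption; the commit does not show how the custom evaluation library aggregates scores.

```python
# Scores copied from the Mistral-7B-v0.1 row of the results table (in percent).
scores = {
    "naturalqs_5shot": 28.92,
    "triviaqa_5shot": 69.88,
}


def group_average(scores, benchmarks):
    """Return the unweighted mean of the named benchmark scores.

    Hypothetical helper: the aggregation used by the authors' custom
    evaluation library is not shown in the commit.
    """
    return sum(scores[name] for name in benchmarks) / len(benchmarks)


# Assumption: "World Knowledge" is the plain mean of the two 5-shot columns.
world_knowledge = group_average(scores, ["naturalqs_5shot", "triviaqa_5shot"])
print(f"World Knowledge: {world_knowledge:.2f}%")  # prints "World Knowledge: 49.40%"
```

The other groups cannot be reproduced the same way from this table alone, since several of their benchmarks (for example SIQA, OpenbookQA, QuAC, and BBH) have no column in it.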