Update README.md
README.md (CHANGED)
@@ -16,8 +16,7 @@ pipeline_tag: text-generation
 model-index:
 - name: zephyr-7b-beta
   results:
-
-  # AI2 Reasoning Challenge (25-Shot) (Open LLM Leaderboard)
+  # AI2 Reasoning Challenge (25-Shot)
   - task:
       type: text-generation
       name: Text Generation
@@ -43,7 +42,7 @@ model-index:
       name: Open LLM Leaderboard
       url: https://huggingface.co/datasets/open-llm-leaderboard/details_HuggingFaceH4__zephyr-7b-beta_public
 
-  # HellaSwag (10-shot)
+  # HellaSwag (10-shot)
   - task:
       type: text-generation
       name: Text Generation
@@ -68,7 +67,7 @@ model-index:
       name: Open LLM Leaderboard
       url: https://huggingface.co/datasets/open-llm-leaderboard/details_HuggingFaceH4__zephyr-7b-beta_public
 
-  # DROP (3-shot)
+  # DROP (3-shot)
   - task:
       type: text-generation
       name: Text Generation
@@ -93,7 +92,7 @@ model-index:
       name: Open LLM Leaderboard
       url: https://huggingface.co/datasets/open-llm-leaderboard/details_HuggingFaceH4__zephyr-7b-beta_public
 
-  # TruthfulQA (0-shot)
+  # TruthfulQA (0-shot)
   - task:
       type: text-generation
       name: Text Generation
@@ -117,7 +116,7 @@ model-index:
       name: Open LLM Leaderboard
       url: https://huggingface.co/datasets/open-llm-leaderboard/details_HuggingFaceH4__zephyr-7b-beta_public
 
-  # GSM8k (5-shot)
+  # GSM8k (5-shot)
   - task:
       type: text-generation
       name: Text Generation
@@ -137,7 +136,7 @@ model-index:
       name: Open LLM Leaderboard
       url: https://huggingface.co/datasets/open-llm-leaderboard/details_HuggingFaceH4__zephyr-7b-beta_public
 
-  # MMLU (5-Shot)
+  # MMLU (5-Shot)
   # ???
 
   # AlpacaEval (taken from model card)
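For context, the comment lines touched above sit inside the card's `model-index` front matter, which records the Open LLM Leaderboard results. Below is a minimal sketch of what one complete results entry looks like, assuming the standard Hugging Face model-index schema (`task`, `dataset`, `metrics`, `source`); the dataset fields and the metric value are illustrative placeholders, and only the source name and URL are taken from this diff:

```
model-index:
- name: zephyr-7b-beta
  results:
  # AI2 Reasoning Challenge (25-Shot)
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)  # illustrative placeholder
      type: ai2_arc                             # illustrative placeholder
      config: ARC-Challenge                     # illustrative placeholder
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      name: normalized accuracy
      value: 0.0                                # placeholder; the real value lives in the card
    source:
      name: Open LLM Leaderboard
      url: https://huggingface.co/datasets/open-llm-leaderboard/details_HuggingFaceH4__zephyr-7b-beta_public
```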
@@ -296,7 +295,9 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 3.0
+
 ### Training results
+
 The table below shows the full set of DPO training metrics:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
@@ -358,12 +359,16 @@
 | 0.0077 | 2.89 | 5600 | 0.7520 | -4.5586 | -8.3485 | 0.7969 | 3.7899 | -340.4545 | -299.8206 | -2.3078 | -2.3517 |
 | 0.0094 | 2.94 | 5700 | 0.7527 | -4.5542 | -8.3509 | 0.7812 | 3.7967 | -340.4790 | -299.7773 | -2.3062 | -2.3510 |
 | 0.0054 | 2.99 | 5800 | 0.7520 | -4.5169 | -8.3079 | 0.7812 | 3.7911 | -340.0493 | -299.4038 | -2.3081 | -2.3530 |
+
 ### Framework versions
+
 - Transformers 4.35.0.dev0
 - Pytorch 2.0.1+cu118
 - Datasets 2.12.0
 - Tokenizers 0.14.0
+
 ## Citation
+
 If you find Zephyr-7B-β is useful in your work, please cite it with:
 ```
 @misc{tunstall2023zephyr,
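A note on reading the reward columns in the table above, assuming the standard DPO formulation (the column names match what `trl`'s `DPOTrainer` logs, though that is an assumption here): each logged reward is the beta-scaled log-probability ratio of the policy against the reference model, and the margin is simply the difference between the chosen and rejected rewards,

$$
r_\theta(x, y) = \beta \left[ \log \pi_\theta(y \mid x) - \log \pi_{\mathrm{ref}}(y \mid x) \right],
\qquad
\text{Rewards/margins} = r_\theta(x, y_{\text{chosen}}) - r_\theta(x, y_{\text{rejected}}).
$$

For the step-5600 row this checks out: -4.5586 - (-8.3485) = 3.7899, which is exactly the logged Rewards/margins value. Under the same assumption, Rewards/accuracies is the fraction of validation pairs whose chosen reward exceeds the rejected one.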