Update README.md
README.md CHANGED
@@ -10,7 +10,14 @@ pipeline_tag: text-generation
 base_model: meta-llama/Llama-2-13b-hf
 ---
 
-
+<div align="center">
+
+<img src="./assets/llama.png" width="150px">
+
+</div>
+
+
+# Llama-2-13B-Instruct-v0.1 🦙🐬
 
 This instruction model was built via parameter-efficient QLoRA finetuning of [llama-2-13b](https://huggingface.co/meta-llama/Llama-2-13b-hf) on the first 100k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) (an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)). Finetuning was executed on a single A6000 (48 GB) for roughly 18 hours on the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.
 
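For readers who want to see what the QLoRA recipe described above looks like in code, here is a minimal sketch using `transformers` and `peft`. It is illustrative only: the repo's actual finetuning script is linked in the next hunk, and the hyperparameters below (rank, target modules, dropout) are assumptions, not the card's settings.

```python
# Minimal QLoRA sketch (assumed hyperparameters; see the linked script
# for the real recipe). Quantize the frozen base model to 4-bit NF4 and
# train only low-rank adapter weights on top of it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "meta-llama/Llama-2-13b-hf"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                                 # assumed adapter rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumed target modules
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # adapters are a tiny fraction of 13B
```

Because only the adapter matrices receive gradients while the 4-bit base stays frozen, a 13B finetune of this kind fits on a single 48 GB A6000, which matches the hardware noted in the card.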
@@ -30,9 +37,9 @@ We use state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness)
 
 * Model license: Llama 2 Community License Agreement
 * Basic usage: [notebook](assets/basic_inference_llama_2_13b_dolphin.ipynb)
-* Finetuning script: [script](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/llama-2/dolphin/sft-
-* Loss curves: [plot](https://huggingface.co/dfurman/
-* Runtime stats: [table](https://huggingface.co/dfurman/
+* Finetuning script: [script](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/llama-2/dolphin/sft-Llama-2-13B-Instruct-v0.1-peft.py)
+* Loss curves: [plot](https://huggingface.co/dfurman/Llama-2-13B-Instruct-v0.1-peft#finetuning-description)
+* Runtime stats: [table](https://huggingface.co/dfurman/Llama-2-13B-Instruct-v0.1-peft#runtime-tests)
 
 ## Example prompts and responses
 
@@ -41,7 +48,7 @@ Example 1:
 **User**:
 >You are a helpful assistant. Write me a numbered list of things to do in New York City.\n
 
-**
+**Llama-2-13B-Instruct-v0.1-peft**:
 
 > 1. Visit the iconic Statue of Liberty and Ellis Island.
 > 2. Take a stroll through Central Park and enjoy its many attractions.
@@ -61,7 +68,7 @@ Example 2:
 **User**:
 >You are a helpful assistant. Write a short email inviting my friends to a dinner party on Friday. Respond succinctly.\n"
 
-**
+**Llama-2-13B-Instruct-v0.1-peft**:
 
 > Subject: Friday Dinner Party Invitation
 >
@@ -97,7 +104,7 @@ The llama-2-13b models have been modified from a standard transformer in the fol
 
 This model was trained on a single A6000 (48 GB) for about 18 hours using the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.
 
-
+
 
 The above loss curve was generated from the run's private wandb.ai log.
 
@@ -138,7 +145,7 @@
 ```
 
 ```python
-peft_model_id = "dfurman/
+peft_model_id = "dfurman/Llama-2-13B-Instruct-v0.1-peft"
 config = PeftConfig.from_pretrained(peft_model_id)
 
 bnb_config = BitsAndBytesConfig(
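The hunk above shows only the first lines of the card's usage snippet. For completeness, here is a self-contained sketch of how a 4-bit PEFT checkpoint like this one is typically loaded and queried; the quantization settings and generation parameters are assumptions rather than the card's exact code.

```python
# Illustrative loading sketch for the PEFT checkpoint; the card's own
# snippet continues past this hunk, so details here are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftConfig, PeftModel

peft_model_id = "dfurman/Llama-2-13B-Instruct-v0.1-peft"
config = PeftConfig.from_pretrained(peft_model_id)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model, then attach the LoRA adapter on top.
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, peft_model_id)

prompt = "You are a helpful assistant. Write me a numbered list of things to do in New York City.\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```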
@@ -194,7 +201,7 @@ print(tokenizer.decode(output["sequences"][0], skip_special_tokens=True))
 | 2.93 | 1x A100 (40 GB SXM) | torch | bfloat16 | 25 |
 | 3.24 | 1x A6000 (48 GB) | torch | bfloat16 | 25 |
 
-The above runtime stats were generated from this [notebook](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/llama-2/dolphin/postprocessing-
+The above runtime stats were generated from this [notebook](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/llama-2/dolphin/postprocessing-Llama-2-13B-Instruct-v0.1-peft.ipynb).
 
 ## Acknowledgements
 
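The table rows in this hunk pair a latency figure with a hardware setup and a token count, but the column headers fall outside the hunk. Assuming the first column is seconds per generation and the last is `max_new_tokens` (an assumption, not stated in the hunk), a measurement of that shape could be reproduced roughly as follows; this is a hypothetical helper, not the linked postprocessing notebook.

```python
# Hypothetical timing harness in the spirit of the runtime table above.
# Assumes `model` and `tokenizer` are already loaded as in the previous snippet.
import time
import torch

def time_generation(model, tokenizer, prompt, max_new_tokens=25, n_runs=5):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    model.generate(**inputs, max_new_tokens=max_new_tokens)  # warm-up run
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # make sure queued GPU work is done
    start = time.perf_counter()
    for _ in range(n_runs):
        model.generate(**inputs, max_new_tokens=max_new_tokens)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_runs  # mean seconds per generation

# Example: time_generation(model, tokenizer, "You are a helpful assistant. ...")
```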