Update README.md
README.md CHANGED

```diff
@@ -41,7 +41,7 @@ tags:
 </div>
 
 ## TLDR
 
-This repo contains [LongLLaMA-
+This repo contains [LongLLaMA-3Bv1.1](https://huggingface.co/syzymon/long_llama_3b_v1_1).
 
 LongLLaMA is built upon the foundation of [OpenLLaMA](https://github.com/openlm-research/open_llama) and fine-tuned using the [Focused Transformer (FoT)](https://arxiv.org/abs/2307.03170) method. We release a smaller 3B base variant (not instruction tuned) of the LongLLaMA model on a permissive license (Apache 2.0) and inference code supporting longer contexts on [Hugging Face](https://huggingface.co/syzymon/long_llama_3b). Our model weights can serve as the drop-in replacement of LLaMA in existing implementations (for short context up to 2048 tokens). Additionally, we provide evaluation results and comparisons against the original OpenLLaMA models. Stay tuned for further updates.
```
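The TLDR paragraph in the diff above presents the weights as a drop-in replacement for LLaMA, with separate inference code for longer contexts. A minimal loading sketch, assuming the standard Hugging Face `transformers` API and that the checkpoint ships its FoT inference code on the Hub; the prompt and generation settings are purely illustrative:

```python
# Minimal sketch of loading LongLLaMA-3Bv1.1 from the Hugging Face Hub.
# Assumptions: the checkpoint provides custom FoT modeling code on the Hub
# (hence trust_remote_code=True); the dtype choice is illustrative.
import torch
from transformers import LlamaTokenizer, AutoModelForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("syzymon/long_llama_3b_v1_1")
model = AutoModelForCausalLM.from_pretrained(
    "syzymon/long_llama_3b_v1_1",
    torch_dtype=torch.float32,
    trust_remote_code=True,  # pulls in the inference code shipped with the checkpoint
)

# Illustrative generation. For short contexts (up to 2048 tokens) the model
# behaves as a drop-in LLaMA replacement; the remote FoT inference code is
# what supports contexts beyond that length.
prompt = "My name is Julien and I like to"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```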