|
---
tags:
- llama2
---
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c14f6b02e1f8f67c73bd05/DJHrZmfoy-0TzNChTrtxP.png) |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
|
|
We have followed up on our previous training runs related to extending the context length
of Llama models. The associated GitHub repository

https://github.com/abacusai/long-context

has some basic details on our approach and metrics. We have also published a paper on arXiv
that covers our experiments and analysis much more comprehensively:

http://arxiv.org/abs/2308.10882
|
|
|
- **Developed by:** [Abacus.AI](https://abacus.ai) |
|
- **Model type:** Transformer-based autoregressive causal language model
|
- **License:** Llama 2 Community License: https://github.com/facebookresearch/llama/blob/main/LICENSE |
|
- **Finetuned from model:** Llama V2 70B |
|
|
|
### Usage |
|
|
|
To use this model at longer context lengths, it must be patched to interpolate over the extended
positions; it will not work if it is simply loaded with the `AutoModel` framework of `transformers`.
For full details and usage see:

https://github.com/abacusai/Long-Context

The evaluation section there has detailed code for how to load and patch the model for inference
(or further fine-tuning). Note in particular that `max_position_embeddings` is not relevant, since
the patched module dynamically reallocates the position buffers as required.
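The idea behind the patch is to compress the position indices fed into the rotary embeddings so
that an extended context fits within the positional range seen during pretraining. The sketch
below is only a rough illustration of linear position interpolation, not the repository's actual
implementation, which supports several interpolation schemes; the function name is ours.

```python
import torch

def rotary_frequencies(dim, max_positions, scale=8, base=10000.0):
    """Cos/sin RoPE tables with positions compressed by `scale`, so a 32k
    context maps into the positional range the base model was trained on."""
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Dividing positions by the scale factor is the core of linear interpolation.
    positions = torch.arange(max_positions).float() / scale
    angles = torch.outer(positions, inv_freq)   # (max_positions, dim / 2)
    emb = torch.cat((angles, angles), dim=-1)   # (max_positions, dim)
    return emb.cos(), emb.sin()

cos, sin = rotary_frequencies(dim=128, max_positions=32768, scale=8)
```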
|
|
|
The tokenizer corresponding to this model is https://huggingface.co/abacusai/Giraffe-v1-Tokenizer. |
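Since the tokenizer is published as a standard Hugging Face artifact, it should also load directly
from the hub; `load_tokenizer()` in the repository remains the supported path, and the snippet
below is an assumed equivalent:

```python
from transformers import AutoTokenizer

# Assumed equivalent of the repository's load_tokenizer() helper.
tokenizer = AutoTokenizer.from_pretrained("abacusai/Giraffe-v1-Tokenizer")
```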
|
|
|
Using the code in the repository you can load this model as follows:

```python
from models import load_model, load_tokenizer

# Load the Giraffe tokenizer and the patched model. scale=8 interpolates the
# base Llama 2 context of 4096 tokens out to 32768 tokens (4096 * 8).
tokenizer = load_tokenizer()
model = load_model('abacusai/Giraffe-v2-70b-32k', scale=8)
```
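Once loaded, the model behaves like a standard `transformers` causal language model. A minimal
generation sketch, assuming the usual `generate` API and enough GPU memory for a 70B checkpoint:

```python
# Minimal sketch; assumes the model fits in available GPU memory (possibly
# sharded across devices) and that default generation settings are acceptable.
prompt = "Long-context language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```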
|
|