metadata
license: apache-2.0

XGen-7B-8K-Base

Official research release for the family of XGen models (7B) by Salesforce AI Research:

Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length

Models

Base models

  • XGen-7B-4K-Base: XGen-7B model pre-trained with a 4K sequence length.
    • License: Apache-2.0
  • XGen-7B-8K-Base: XGen-7B model pre-trained with an 8K sequence length.
    • License: Apache-2.0

Instruction-finetuned models

A supervised fine-tuned model trained on public-domain instructional data. Released for research purposes only.

How to run

The training data for the models is tokenized with the OpenAI Tiktoken library. To use this model, install the package via pip:

pip install tiktoken
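
Since a missing tiktoken install would otherwise only surface when the tokenizer is loaded, it can help to check for the package up front. The following is a minimal sketch, not part of the official release; the helper name `is_installed` is hypothetical, and it relies only on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path.

    Hypothetical helper for illustration; it does not import the package,
    it only looks up its import spec.
    """
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if not is_installed("tiktoken"):
        print("tiktoken is missing; run: pip install tiktoken")
    else:
        print("tiktoken is available")
```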

The models can be used as auto-regressive samplers as follows:

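import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# trust_remote_code=True lets transformers load the tokenizer code shipped in the model repository.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
# Load the weights in bfloat16 to reduce memory use.
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)
# Tokenize a prompt and generate a continuation; max_length counts prompt plus generated tokens.
inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))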
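
In the call to `generate`, `max_length` covers the prompt and the generated tokens together, so a long prompt leaves less room for output. For long inputs it may be easier to budget new tokens explicitly. The sketch below is illustrative only: `generation_budget` is a hypothetical helper, and the default `context_window=8192` is an assumption standing in for the "8K" sequence length named above.

```python
def generation_budget(prompt_tokens: int, requested_new_tokens: int,
                      context_window: int = 8192) -> int:
    """Return how many new tokens fit in the context after the prompt.

    context_window=8192 is an assumed value for the "8K" sequence length;
    adjust it if the model configuration says otherwise.
    """
    if prompt_tokens < 0 or requested_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = context_window - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    # Never ask for more tokens than the remaining window can hold.
    return min(requested_new_tokens, remaining)


if __name__ == "__main__":
    print(generation_budget(3, 128))     # short prompt: full request fits
    print(generation_budget(8100, 128))  # long prompt: request is clipped
```

With the snippet above, the prompt length is `inputs["input_ids"].shape[1]`, and the returned value can be passed to `model.generate` as `max_new_tokens` in place of `max_length`.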

Citation

@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Salesforce AI Research},
  howpublished={Salesforce AI Research Blog},
  year={2023},
  url={https://blog.salesforceairesearch.com/xgen/}
}