This is a generation model trained with semiparametric token-sequence co-supervision on top of Llama2-7B. The embedding model that constructs the nonparametric sequence embedding spaces is available here. The models are trained on the information-seeking datasets provided by Self-RAG, with co-supervision from next token prediction (NTP) and next sequence prediction (NSP). At inference time, the model generates a response by retrieving relevant sequences. See our paper for a full description.
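As a rough illustration of the retrieval idea mentioned above, the sketch below ranks a few candidate sequence embeddings against a query vector and returns the best matches. It is not the released pipeline: the function names, the toy vectors, and the choice of inner-product scoring are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(
    query: Sequence[float],
    sequence_embeddings: Sequence[Sequence[float]],
    k: int = 1,
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k highest-scoring sequences.

    Hypothetical helper: scoring by inner product is an assumption,
    not a description of the released inference code.
    """
    scores = [
        (i, inner_product(query, emb))
        for i, emb in enumerate(sequence_embeddings)
    ]
    # Highest score first; ties keep their original order (stable sort).
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy data (made up): three candidate sequences in a 3-dimensional space.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
]
toy_query = [0.9, 0.1, 0.0]

top = retrieve_top_k(toy_query, toy_embeddings, k=2)
print(top)
```

In the actual models, the query and sequence vectors would come from the generation model and the embedding model respectively; please refer to our code for how they are produced and combined.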
Usage
Here, we show a quick way to download our model from HuggingFace. Make sure to install the dependencies listed in requirements.txt. To run the full inference pipeline with the embedding model, please use our code.
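from transformers import AutoTokenizer, LlamaForCausalLM
# Set to True to load the weights in 8-bit (requires bitsandbytes).
use_8bit = False
model_id = "kaist-ai/cosupervision-emb_seq-Llama2_7b"
# Assumes the repository ships tokenizer files alongside the weights.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True if use_8bit else None,
    device_map="auto" if use_8bit else None,
)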