---
license: cc-by-nc-4.0
---
This model is a generation model trained via [semiparametric token-sequence co-supervision](https://github.com/kaistAI/Semiparametric_Token-Sequence_Co-Supervision) on top of Llama2-7B.
The embedding model that constructs the nonparametric sequence embedding space is available [here](https://huggingface.co/kaist-ai/cosupervision-emb_seq-Llama2_7b).
The models are trained on information-seeking datasets provided by [self-rag](https://selfrag.github.io/) with co-supervision from next token prediction (NTP) and next sequence prediction (NSP).
At inference time, the model retrieves relevant sequences from this embedding space and uses them to generate its response.
See our paper for the full description.
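
To give a rough intuition for the retrieval step, here is a toy sketch in plain Python. It is not the pipeline from our repository: the helpers `dot` and `retrieve_top_k`, the vectors, and the candidate strings are all hypothetical, and scoring candidates by inner product is an assumption made for illustration. In the real setup the query would come from the generation model and the candidate vectors from the embedding model.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(
    query: Sequence[float],
    sequence_embeddings: Sequence[Sequence[float]],
    k: int = 1,
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k highest-scoring sequences."""
    scores = [(i, dot(query, emb)) for i, emb in enumerate(sequence_embeddings)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy data (hypothetical): three candidate sequences embedded in 3 dimensions.
candidates = ["sequence A", "sequence B", "sequence C"]
embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]

# Hypothetical query vector standing in for the generation model's state.
query = [0.9, 0.1, 0.0]

top = retrieve_top_k(query, embeddings, k=2)
retrieved = [candidates[i] for i, _ in top]
print(retrieved)
```

The exhaustive scan here is only for readability; a large embedding space would normally be searched with an approximate nearest-neighbor index instead.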

### Usage
The snippet below shows a quick way to download our model from the Hugging Face Hub.
Make sure to install the dependencies listed in `requirements.txt`.
To run the full inference pipeline with the embedding model, please use our [code](https://github.com/kaistAI/Semiparametric_Token-Sequence_Co-Supervision).

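```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Note: this id is the one linked above as the embedding model;
# replace it with this model card's repository id if they differ.
model_name = "kaist-ai/cosupervision-emb_seq-Llama2_7b"

# Set to True to load in 8-bit (requires bitsandbytes and a CUDA GPU).
quantization = False

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = LlamaForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True if quantization else None,
    device_map="auto" if quantization else None,
)
```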