---
language:
- en
license: mit
library_name: transformers
tags:
- chat
- text-generation
- persona
- phi-2
- llm
- persona-grounded
datasets:
- nazlicanto/persona-based-chat
---

## Phi 2 Persona-Chat

Phi 2 Persona-Chat is a LoRA fine-tuned version of the base [Phi 2](https://huggingface.co/microsoft/phi-2) model, trained on the [nazlicanto/persona-based-chat](https://huggingface.co/datasets/nazlicanto/persona-based-chat) dataset. This dataset consists of over 64k conversations between *Persona A* and *Persona B*, for which a list of persona facts is provided.

The model was trained with the Supervised Fine-tuning Trainer, using the `reference` responses as target outputs. For the training and inference code and the full list of dependencies, see the GitHub [repo](https://github.com/alaradirik/finetune-phi-2).

## Running the Model

Please note that, at the moment, `trust_remote_code=True` is required to run the Phi 2 model. For best results, use a prompt that includes the persona facts, followed by at least one conversational turn.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM


prompt = """
Person B has the following Persona information.

Persona of Person B: My name is David and I'm a 35 year old math teacher.
Persona of Person B: I like to hike and spend time in the nature.
Persona of Person B: I'm married with two kids.

Instruct: Person A and Person B are now having a conversation. Following the conversation below, write a response that Person B would say base on the above Persona information. Please carefully consider the flow and context of the conversation below, and use the Person B's Persona information appropriately to generate a response that you think are the most appropriate replying for Person B.

Persona A: Morning! I think I saw you at the parent meeting, what's your name?

Output:
"""

# use the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# load base LLM model, LoRA params and tokenizer
model = AutoModelForCausalLM.from_pretrained("nazlicanto/phi-2-persona-chat", trust_remote_code=True)
model.to(device)

tokenizer = AutoTokenizer.from_pretrained("nazlicanto/phi-2-persona-chat", trust_remote_code=True)

# tokenize input prompt
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.to(device)

# inference
with torch.inference_mode():
    outputs = model.generate(
        input_ids=input_ids,
        max_new_tokens=50,
        do_sample=True,
        top_p=0.1,
        temperature=0.7
    )

# decode only the newly generated tokens (everything after the prompt tokens)
new_tokens = outputs[0][input_ids.shape[1]:]
output = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(output)
```
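
The prompt above follows a fixed layout: the persona facts, the instruction, the conversation so far, and a trailing `Output:` marker. As a minimal sketch, the helper below assembles that layout from a list of facts and turns. The name `build_prompt` and the `(speaker, text)` turn format are hypothetical and not part of the model or its repository. The instruction text is copied verbatim from the example, including its original wording, on the assumption that it matches what the model saw during training. How additional turns should be labeled is also an assumption; check the dataset before relying on it.

```python
# Instruction text copied verbatim from the example prompt above.
INSTRUCTION = (
    "Instruct: Person A and Person B are now having a conversation. "
    "Following the conversation below, write a response that Person B would say "
    "base on the above Persona information. Please carefully consider the flow "
    "and context of the conversation below, and use the Person B's Persona "
    "information appropriately to generate a response that you think are the "
    "most appropriate replying for Person B."
)


def build_prompt(persona_facts, turns):
    """Assemble a prompt in the layout of the example above (hypothetical helper).

    persona_facts: list of strings describing Person B.
    turns: list of (speaker_label, text) tuples, oldest first. The labels are
        supplied by the caller; the example above uses "Persona A".
    """
    if not persona_facts:
        raise ValueError("at least one persona fact is required")
    if not turns:
        raise ValueError("at least one conversational turn is required")

    persona_lines = "\n".join(f"Persona of Person B: {fact}" for fact in persona_facts)
    conversation = "\n".join(f"{speaker}: {text}" for speaker, text in turns)

    return (
        "\nPerson B has the following Persona information.\n\n"
        f"{persona_lines}\n\n"
        f"{INSTRUCTION}\n\n"
        f"{conversation}\n\n"
        "Output:\n"
    )


example_prompt = build_prompt(
    persona_facts=[
        "My name is David and I'm a 35 year old math teacher.",
        "I like to hike and spend time in the nature.",
        "I'm married with two kids.",
    ],
    turns=[("Persona A", "Morning! I think I saw you at the parent meeting, what's your name?")],
)
```

The resulting `example_prompt` can be passed to the tokenizer in place of the hard-coded `prompt` string.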

This model was trained by [nazlicanto](https://huggingface.co/nazlicanto) and [adirik](https://huggingface.co/adirik).