---
language:
- en
license: mit
library_name: transformers
tags:
- chat
- text-generation
- persona
- phi-2
- llm
- persona-grounded
datasets:
- nazlicanto/persona-based-chat
---

## Phi 2 Persona-Chat
Phi 2 Persona-Chat is a LoRA fine-tuned version of the base [Phi 2](https://huggingface.co/microsoft/phi-2) model, trained on the [nazlicanto/persona-based-chat](https://huggingface.co/datasets/nazlicanto/persona-based-chat) dataset. The dataset consists of over 64k conversations between *Persona A* and *Persona B*, each accompanied by a list of persona facts.

The model was trained with the Supervised Fine-tuning Trainer, using the `reference` responses as target outputs. For the training and inference code and the full list of dependencies, refer to the GitHub [repo](https://github.com/alaradirik/finetune-phi-2).


## Running the Model

Please note that, at the moment, `trust_remote_code=True` is required to run the Phi 2 model. For best results, use a prompt that includes the persona facts, followed by at least one conversational turn.

```
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM


# prompt: persona facts, the instruction, at least one conversational turn, then "Output:"
prompt = """
Person B has the following Persona information.

Persona of Person B: My name is David and I'm a 35 year old math teacher.
Persona of Person B: I like to hike and spend time in the nature.
Persona of Person B: I'm married with two kids.

Instruct: Person A and Person B are now having a conversation.  Following the conversation below, write a response that Person B would say base on the above Persona information. Please carefully consider the flow and context of the conversation below, and use the Person B's Persona information appropriately to generate a response that you think are the most appropriate replying for Person B.

Persona A: Morning! I think I saw you at the parent meeting, what's your name?

Output:
"""

# load base LLM model, LoRA params and tokenizer
model = AutoModelForCausalLM.from_pretrained("nazlicanto/phi-2-persona-chat", trust_remote_code=True)
model.to("cuda")

tokenizer = AutoTokenizer.from_pretrained("nazlicanto/phi-2-persona-chat", trust_remote_code=True)

# tokenize input prompt
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()

# inference
with torch.inference_mode():
    outputs = model.generate(
        input_ids=input_ids, 
        max_new_tokens=50, 
        do_sample=True, 
        top_p=0.1,
        temperature=0.7
    )

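# decode only the newly generated tokens (everything after the prompt tokens),
# which avoids slicing the decoded string by the character length of the prompt
new_tokens = outputs[0][input_ids.shape[1]:]
output = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(output)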
```
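
To try other personas or conversations, the prompt can be assembled programmatically. The following is a minimal sketch, not part of the original training or inference code: `build_prompt` and `INSTRUCTION` are hypothetical names, the instruction text is copied verbatim from the example prompt on the assumption that it matches the fine-tuning format, and the speaker labels are passed in by the caller because the exact labels for multi-turn conversations are not documented here.

```python
# Instruction text copied verbatim from the example prompt in this card
# (assumption: this is the wording the model expects).
INSTRUCTION = (
    "Instruct: Person A and Person B are now having a conversation.  "
    "Following the conversation below, write a response that Person B would say "
    "base on the above Persona information. Please carefully consider the flow "
    "and context of the conversation below, and use the Person B's Persona "
    "information appropriately to generate a response that you think are the "
    "most appropriate replying for Person B."
)


def build_prompt(persona_facts, turns):
    """Hypothetical helper: build a prompt in the layout shown in this card.

    persona_facts: list of strings, one fact about Person B each.
    turns: list of (speaker_label, utterance) pairs, e.g. ("Persona A", "Hi!").
    """
    if not persona_facts:
        raise ValueError("at least one persona fact is required")
    if not turns:
        raise ValueError("at least one conversational turn is required")

    lines = ["", "Person B has the following Persona information.", ""]
    lines += [f"Persona of Person B: {fact}" for fact in persona_facts]
    lines += ["", INSTRUCTION, ""]
    lines += [f"{speaker}: {utterance}" for speaker, utterance in turns]
    lines += ["", "Output:", ""]
    return "\n".join(lines)


example_prompt = build_prompt(
    persona_facts=[
        "My name is David and I'm a 35 year old math teacher.",
        "I like to hike and spend time in the nature.",
        "I'm married with two kids.",
    ],
    turns=[
        ("Persona A", "Morning! I think I saw you at the parent meeting, what's your name?"),
    ],
)
print(example_prompt)
```

The resulting string can be passed to the tokenizer in place of the hard-coded `prompt` above.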

This model was trained by [nazlicanto](https://huggingface.co/nazlicanto) and [adirik](https://huggingface.co/adirik).