license: apache-2.0
datasets:
  - yuyouyu/BeyondDialogue
language:
  - zh
  - en
metrics:
  - character
base_model: Qwen/Qwen2-7B-Instruct
pipeline_tag: question-answering
tags:
  - text-generation-inference
  - role-playing

Qwen2-7B-BD-RP

Introduction 🎉

Qwen2-7B-BD-RP is a large language model (LLM) fine-tuned on the BeyondDialogue dataset for role-playing. It is designed to generate high-quality responses across a variety of role-playing scenarios in both English and Chinese.

For more details, please refer to our paper and GitHub repository.

Training details 🚀

We fully fine-tuned Qwen2-7B-Instruct for 3 epochs (833 steps) with a global batch size of 128. The training sequence length is 4,096 and the learning rate is 3e-5. The training data comes from the BeyondDialogue dataset.
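As a rough sanity check on these figures, the step count and global batch size imply the approximate number of training sequences seen per epoch. The sketch below assumes that every optimizer step consumes a full global batch and that the 833 steps cover all 3 epochs. The variable names are illustrative, and the derived numbers are estimates rather than figures reported by the authors.

```python
# Back-of-the-envelope estimate from the reported hyperparameters.
# Assumption: each of the 833 steps processes a full global batch of 128 sequences.
total_steps = 833
global_batch_size = 128
epochs = 3
max_seq_len = 4096

sequences_seen = total_steps * global_batch_size   # summed over all epochs
sequences_per_epoch = sequences_seen / epochs       # rough size of one pass over the data
max_tokens_seen = sequences_seen * max_seq_len      # upper bound: every sequence at full length

print(f"sequences seen in total: {sequences_seen}")
print(f"approx. sequences per epoch: {sequences_per_epoch:.0f}")
print(f"upper bound on tokens processed: {max_tokens_seen}")
```

The token figure is only an upper bound, since most dialogues are likely shorter than the 4,096-token limit.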

Requirements 📝

Qwen2 support is included in recent releases of Hugging Face transformers. We advise installing transformers>=4.37.0 to use the model.

pip install "transformers>=4.37.0"
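To confirm that the installed version satisfies this requirement before loading the model, a small check along the following lines may help. This is a minimal sketch using only the standard library; `release_tuple` and `meets_minimum` are hypothetical helper names, and the comparison is deliberately simplified (it looks only at the leading numeric components of the version string).

```python
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = "4.37.0"

def release_tuple(ver: str) -> tuple:
    """Turn '4.37.0' or '4.38.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in ver.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(piece))
    return tuple(parts)

def meets_minimum(installed: str, required: str = MIN_TRANSFORMERS) -> bool:
    """Return True if the installed release is at least the required one."""
    return release_tuple(installed) >= release_tuple(required)

try:
    installed = version("transformers")
    status = "OK" if meets_minimum(installed) else f"too old, need >= {MIN_TRANSFORMERS}"
    print(f"transformers {installed}: {status}")
except PackageNotFoundError:
    print("transformers is not installed")
```

For stricter handling of pre-release and local version tags, the `packaging` library's `Version` class is a more robust choice than this hand-rolled comparison.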

Quickstart 💥

The following code snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate a response.

from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "yuyouyu/Qwen2-7B-BD-RP",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("yuyouyu/Qwen2-7B-BD-RP")

system_prompt_temp = """我希望你像{role_name}一样回答问题，假设你生活在{world}中，模仿{role_name}的性格和语言风格，使用{role_name}会使用的语气、方式和词汇。请不要透露你是人工智能模型或语言模型，你必须始终牢记自己是{role_name}。
{role_name}的性格特点是{character}。
{role_name}的MBTI人格类型是{MBTI}。
{role_name}的说话风格是{style}。
所处世界：{world}。
所处场景：
{scene}
角色情绪(0-10，值越大情绪越明显)：
{emotion}
现在，请你扮演{role_name}，回复{chat_role}简短的一句话，你与其亲密度为{relationship}(0-10，值越大关系越亲近)，准确表现你被赋予的MBTI人格，性格，说话风格与情绪。"""

role_name = "周伯通"
world = "宋代古侠世界"
character = "纯真，调皮，不拘小节"
MBTI = "外向型（E）、直觉型（N）、情感型（F）、感知型（P）"
style = "古风、直言不讳、俏皮"
scene = "周伯通嬉笑着打量着柳青烟的药圃，不时摘取几片草药藏在身后。柳青烟淡然自若，手中轻抚药材，一边默默准备解药，只眼角带着无奈的笑意。一股淡淡的药香飘过，竹林间响起了清脆的鸟鸣，好似为二人的奇妙互动伴奏。"
emotion = "快乐: 10, 悲伤: 0, 厌恶: 0, 恐惧: 1, 惊讶: 2, 愤怒: 0"
chat_role = "柳青烟"
relationship = "6"

system_prompt = system_prompt_temp.format(
    role_name=role_name,
    world=world,
    character=character,
    MBTI=MBTI,
    style=style,
    scene=scene,
    emotion=emotion,
    chat_role=chat_role,
    relationship=relationship
)

prompt = "周兄，依我所见，那几味草药非入药之宜，倒不如小心选取，莫要误伤自身。"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Pass input_ids and attention_mask together so generate() does not have to infer the mask
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2,
)

# Drop the prompt tokens so that only the newly generated reply is decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

Note: The examples for Qwen2-7B-BD-RP use Chinese role-playing. For English examples, please refer to our other trained model repository, Mistral-Nemo-BD-RP.
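The system prompt in the Quickstart expects the emotion field as a comma-separated list of scores from 0 to 10. When these scores come from another part of an application, a small formatter can build the string and catch out-of-range values. The sketch below is a hypothetical helper and not part of the model repository. It assumes the six emotion labels and their order from the Quickstart example (happiness, sadness, disgust, fear, surprise, anger) and treats any unspecified emotion as 0.

```python
# Hypothetical helper, not part of the model repository.
# Labels and order are taken from the Quickstart example.
EMOTIONS = ("快乐", "悲伤", "厌恶", "恐惧", "惊讶", "愤怒")

def format_emotion(scores: dict) -> str:
    """Build the `emotion` string, checking that every score is an integer in 0-10."""
    parts = []
    for name in EMOTIONS:
        value = scores.get(name, 0)  # assumption: unspecified emotions default to 0
        if not isinstance(value, int) or not 0 <= value <= 10:
            raise ValueError(f"{name} must be an integer between 0 and 10, got {value!r}")
        parts.append(f"{name}: {value}")
    return ", ".join(parts)

emotion = format_emotion({"快乐": 10, "恐惧": 1, "惊讶": 2})
print(emotion)
```

The resulting string can be passed as the `emotion` argument of `system_prompt_temp.format(...)` in place of the hand-written value.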

Evaluation 🏆

We use objective questions to assess eight dimensions: Character, Style, Emotion, Relationship, Personality, Human-likeness, Coherence, and Role Consistency. The metric design can be found in our paper, and the evaluation code is available on GitHub. The results are shown below:

Model	Character ↑	Style ↑	Emotion ↓	Relationship ↓	Personality ↑	Avg. ↑	Human-likeness ↑	Role Choice ↑	Coherence ↑
General Baselines (Proprietary)
GPT-4o	74.32 ± 1.15	81.67 ± 1.51	16.31 ± 0.48	12.13 ± 0.66	66.58 ± 4.41	78.83 ± 1.64	67.33 ± 3.95	87.33 ± 3.86	99.67 ± 0.33
GPT-3.5-Turbo	72.26 ± 1.27	73.66 ± 1.73	17.79 ± 0.56	14.17 ± 0.73	66.92 ± 4.85	76.18 ± 1.83	33.33 ± 4.43	83.00 ± 4.68	97.33 ± 1.17
Moonshot-v1-8k	74.06 ± 1.19	80.64 ± 1.51	16.17 ± 0.47	13.42 ± 0.70	67.00 ± 4.87	78.42 ± 1.75	44.00 ± 4.33	86.67 ± 3.75	99.33 ± 0.46
Yi-Large-Turbo	75.13 ± 1.22	79.18 ± 1.58	16.44 ± 0.49	13.48 ± 0.67	68.25 ± 4.61	78.53 ± 1.72	47.00 ± 4.60	84.33 ± 3.67	92.67 ± 2.39
Deepseek-Chat	75.46 ± 1.14	81.49 ± 1.51	15.92 ± 0.46	12.42 ± 0.63	67.92 ± 4.57	79.30 ± 1.66	52.33 ± 4.95	83.00 ± 4.68	96.67 ± 1.00
Baichuan4	71.82 ± 1.25	76.92 ± 1.52	17.57 ± 0.52	12.30 ± 0.62	67.08 ± 4.75	77.19 ± 1.73	45.33 ± 4.31	82.33 ± 4.49	99.33 ± 0.46
Hunyuan	73.77 ± 1.18	78.75 ± 1.56	17.24 ± 0.48	13.22 ± 0.68	67.00 ± 4.39	77.81 ± 1.66	53.00 ± 4.29	84.33 ± 4.52	98.33 ± 0.84
Role-play Expertise Baselines
Index-1.9B-Character	73.33 ± 1.32	76.48 ± 1.50	17.99 ± 0.53	13.58 ± 0.71	66.33 ± 4.57	76.92 ± 1.73	21.67 ± 3.96	78.67 ± 5.14	69.67 ± 3.85
CharacterGLM-6B	73.36 ± 1.28	76.08 ± 1.55	18.58 ± 0.55	14.27 ± 0.79	67.33 ± 4.34	76.79 ± 1.70	16.00 ± 2.38	81.00 ± 4.40	25.67 ± 3.48
Baichuan-NPC-Turbo	75.19 ± 1.23	79.15 ± 1.38	17.24 ± 0.51	13.10 ± 0.69	65.33 ± 4.84	77.87 ± 1.73	56.00 ± 4.66	86.33 ± 4.90	99.00 ± 0.56
General Baselines (Open-source)
Yi-1.5-9B-Chat	75.31 ± 1.20	76.78 ± 1.49	16.67 ± 0.52	12.75 ± 0.66	67.42 ± 4.63	78.02 ± 1.70	38.67 ± 4.39	84.00 ± 4.61	92.67 ± 1.79
GLM-4-9b-chat	74.26 ± 1.19	78.40 ± 1.55	17.18 ± 0.50	14.48 ± 0.74	67.17 ± 4.93	77.63 ± 1.78	47.67 ± 4.25	83.33 ± 4.51	99.33 ± 0.46
Mistral-Nemo-Instruct-2407	74.12 ± 1.17	77.04 ± 1.48	17.00 ± 0.43	13.50 ± 0.67	67.00 ± 4.30	77.53 ± 1.61	53.67 ± 4.66	82.67 ± 4.77	74.33 ± 3.77
Qwen2-7B-Instruct	75.39 ± 1.13	77.68 ± 1.65	17.64 ± 0.56	13.43 ± 0.7	67.75 ± 4.44	77.95 ± 1.70	48.00 ± 4.66	83.33 ± 4.48	99.00 ± 0.56
Qwen2-7B-BD-RP	78.67 ± 1.12*	82.52 ± 1.33*	15.68 ± 0.5*	11.22 ± 0.72*	69.67 ± 4.27	80.79 ± 1.59*	64.33 ± 3.80*	87.33 ± 3.74	99.00 ± 0.56
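The Avg. column appears to be the mean of the five profile dimensions after the two lower-is-better scores (Emotion and Relationship) are converted to 100 minus their value. This reading is inferred from the numbers in the table rather than stated here, so treat the sketch below as an assumption; `profile_average` is an illustrative name, and the paper remains the authoritative definition.

```python
def profile_average(character, style, emotion, relationship, personality):
    """Mean of the five profile dimensions on a common higher-is-better scale."""
    # Emotion and Relationship are lower-is-better, so flip them first.
    scores = [character, style, 100 - emotion, 100 - relationship, personality]
    return sum(scores) / len(scores)

# (Character, Style, Emotion, Relationship, Personality) copied from the table.
rows = {
    "GPT-4o": (74.32, 81.67, 16.31, 12.13, 66.58),
    "Qwen2-7B-Instruct": (75.39, 77.68, 17.64, 13.43, 67.75),
    "Qwen2-7B-BD-RP": (78.67, 82.52, 15.68, 11.22, 69.67),
}

averages = {name: profile_average(*values) for name, values in rows.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

For these three rows the computed values agree with the reported Avg. column to within rounding.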

Citation 📖

Please cite our work if you find the resources in this repository useful:

@article{yu2024beyond,
  title   = {BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model},
  author  = {Yu, Yeyong and Yu, Runsheng and Wei, Haojie and Zhang, Zhanqiu and Qian, Quan},
  year    = {2024},
  journal = {arXiv preprint arXiv:2408.10903},
}

Acknowledgements 🥰

We would like to express our sincere gratitude to Tencent LightSpeed Studios for their invaluable support in this project. Their contributions and encouragement have been instrumental in the successful completion of our work.