---
language:
- zh
- en
tags:
- llama2
- llama2-base
- llama2-base-7B
task_categories:
- text2text-generation
---

# 7B Chinese Chatbot Based on Llama2-base 7B (Pure SFT, Full-Parameter Training)

## Introduction

This model is a Chinese chat model based on Llama2 base 7B, trained with full-parameter SFT. Its purpose is to enable a comparison with the [RicardoLee/Llama2-base-7B-Chinese-50W-LoRA](https://huggingface.co/RicardoLee/Llama2-base-7B-Chinese-50W-LoRA) project and to assess the difference in performance between LoRA and full-parameter training.

The final batch loss reached 0.7 at the end of training.

The training data consists of 500,000 SFT samples drawn from the [BELLE](https://huggingface.co/BelleGroup) project.

## Training Details

Some details of the training:

1. Training framework: This model was trained with full-parameter SFT.
2. Tokenizer: This model uses the tokenizer.model from the Chinese-Alpaca-Plus model. The tokenizer.model in Llama2 is identical to the one used in Llama1, so in theory the tokenizer from the Chinese-LLaMa project can be reused as-is without any token misalignment issues.
3. Training parameters: LR: 2e-4, warmup ratio: 0.003.
4. Training resources: 8\*V100, 67 hours.
5. Initial loss: see [Material](trainer_state.json).
6. Final loss: see [Material](trainer_state.json).

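The initial and final loss can be read from `trainer_state.json` programmatically. The following is a minimal sketch, assuming the file follows the Hugging Face `Trainer` layout, in which a top-level `log_history` list holds one dictionary per logging step and training entries carry a `loss` key. That layout is an assumption about this file, and the helper names `first_and_last_loss` and `load_losses` are hypothetical.

```python
import json
from typing import Optional, Tuple


def first_and_last_loss(state: dict) -> Tuple[Optional[float], Optional[float]]:
    """Return (initial_loss, final_loss) from a parsed trainer state.

    Assumes the Hugging Face Trainer layout: state["log_history"] is a list
    of dicts, and training log entries contain a "loss" key. Entries without
    "loss" (for example evaluation or summary records) are skipped.
    """
    losses = [
        entry["loss"]
        for entry in state.get("log_history", [])
        if "loss" in entry
    ]
    if not losses:
        return None, None
    return losses[0], losses[-1]


def load_losses(path: str = "trainer_state.json") -> Tuple[Optional[float], Optional[float]]:
    """Load the JSON file at `path` and extract the first and last logged loss."""
    with open(path, encoding="utf-8") as f:
        return first_and_last_loss(json.load(f))
```

Calling `load_losses("trainer_state.json")` from the repository root would return the two values as a tuple, or `(None, None)` if no training loss was logged.
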
## Inference

This model still uses the Stanford Alpaca template, so don't forget to add the prologue when testing. The prologue template is:

"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n\n${Your Content}\n\n### Response:\n\n"

For dialogue with context, the prologue template is:

"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n\nHuman:${Previous Human Content}\nAssistant:${Previous Assistant Content}\nHuman:${Your Question}\n\n### Response:\n\n"

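The two templates above can be assembled with a small helper. This is a minimal sketch rather than the author's inference code: the function name `build_prompt` and the `history` parameter are hypothetical, and repeating the `Human:`/`Assistant:` pair for more than one earlier turn is an assumption, since the template above shows only a single previous exchange.

```python
from typing import Sequence, Tuple

# Fixed opening of the Stanford Alpaca template, copied from the templates above.
PROLOGUE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
)


def build_prompt(question: str, history: Sequence[Tuple[str, str]] = ()) -> str:
    """Build the prompt string expected by the model.

    `history` holds earlier (human, assistant) turns, oldest first.
    With no history, the question is used directly as the instruction.
    With history, each earlier turn is rendered as "Human:...\nAssistant:..."
    and the new question is appended as a final "Human:" line.
    """
    if history:
        lines = [f"Human:{human}\nAssistant:{assistant}" for human, assistant in history]
        lines.append(f"Human:{question}")
        instruction = "\n".join(lines)
    else:
        instruction = question
    return f"{PROLOGUE}### Instruction:\n\n{instruction}\n\n### Response:\n\n"
```

The returned string is passed to the tokenizer as-is. The model's answer is whatever it generates after the final `### Response:` marker.
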
## Licence

The models in this repository are open-sourced under the Apache-2.0 license, and use of the model weights must comply with the Llama2 [MODEL LICENCE](LICENSE).

## Future Work

I will gradually release the following models in the near future:

1. Models trained on a larger scale of SFT data.
2. Models based on Llama2 and Llama2-chat, at 13B and below (since I only have V100s), for comparison.