MikeDean committed on
Commit c07e1c0
1 parent: 47fe80d

Update README.md

Files changed (1): README.md +7 -4
README.md CHANGED
@@ -3,7 +3,10 @@
 <a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/logo.jpg?raw=true" alt="ZJU-CaMA" style="width: 30%; min-width: 30px; display: block; margin: auto;"></a>
 </p>
 
+
 > These weights are the difference between `Llama 13B` and `CaMA-13B`. You can click [here](https://github.com/zjunlp/cama) to learn more.
+
+
 # CaMA: A Chinese-English Bilingual LLaMA Model
 
 With the birth of ChatGPT, artificial intelligence has entered its "iPhone moment," and large language models (LLMs) have sprung up like mushrooms. The wave of large models has quickly swept through artificial intelligence fields beyond natural language processing. However, training such a model requires extremely expensive hardware, and for various reasons open-source language models are scarce, with Chinese ones scarcer still. It wasn't until LLaMA was open-sourced that a variety of language models based on it began to emerge. This project is likewise based on the LLaMA model. To further enhance its Chinese capabilities without compromising the original language distribution, we first <b>(1) perform additional pre-training on LLaMA (13B) with Chinese corpora, aiming to improve the model's Chinese comprehension and knowledge base while preserving its original English and code abilities as much as possible;</b> then, <b>(2) we fine-tune the model from the first step on an instruction dataset to enhance its understanding of human instructions.</b>
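The note above says this repository ships a weight *difference* rather than full weights, so recovering CaMA-13B means adding each released delta tensor to the corresponding base `Llama 13B` tensor. A minimal sketch of that idea follows; the diff file name and its per-parameter state-dict layout are assumptions for illustration, and the linked repository documents the official procedure.

```python
# Sketch only: recover CaMA-13B by adding a released weight diff to base
# LLaMA-13B. The diff file name and its {param_name: delta_tensor} layout
# are assumptions; see https://github.com/zjunlp/cama for the actual steps.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("path/to/llama-13b", torch_dtype=torch.float16)
delta = torch.load("path/to/cama-13b-diff.bin")  # assumed per-parameter deltas

state = base.state_dict()
for name, diff_tensor in delta.items():
    state[name] += diff_tensor  # CaMA weight = LLaMA weight + delta

base.load_state_dict(state)
base.save_pretrained("path/to/CaMA")
```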
@@ -193,7 +196,7 @@ Our pre-trained model has demonstrated certain abilities in instruction followin
 The effectiveness of information extraction is illustrated in the following figure. We tested different instructions for different tasks as well as the same instructions for the same task, and achieved good results for all of them.
 
 <p align="center" width="100%">
-<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/ie-case.jpg" alt="IE" style="width: 60%; min-width: 60px; display: block; margin: auto;"></a>
+<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/ie-case.jpg?raw=true" alt="IE" style="width: 60%; min-width: 60px; display: block; margin: auto;"></a>
 </p>
 
 
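This change is the substance of the commit, repeated for three more images below: a GitHub `blob/...` URL returns an HTML viewer page rather than the file itself, so an `<img>` tag pointing at it renders as a broken image when the README is displayed outside github.com (for example, on this model hub page). Appending `?raw=true` makes GitHub redirect to the raw file bytes; the logo URL in the first hunk already carried the suffix.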
@@ -463,7 +466,7 @@ We offer two methods: the first one is **command-line interaction**, and the sec
 ```
 Here is a screenshot of the web-based interaction:
 <p align="center" width="100%">
-<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/finetune_web.jpg" alt="finetune-web" style="width: 100%; min-width: 100px; display: block; margin: auto;"></a>
+<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/finetune_web.jpg?raw=true" alt="finetune-web" style="width: 100%; min-width: 100px; display: block; margin: auto;"></a>
 </p>
 
 **3. Usage of Instruction tuning Model**
@@ -476,7 +479,7 @@ python examples/generate_lora_web.py --base_model ./CaMA --lora_weights ./LoRA
 
 Here is a screenshot of the web-based interaction:
 <p align="center" width="100%">
-<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/lora_web.png" alt="finetune-web" style="width: 100%; min-width: 100px; display: block; margin: auto;"></a>
+<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/lora_web.png?raw=true" alt="finetune-web" style="width: 100%; min-width: 100px; display: block; margin: auto;"></a>
 </p>
 
 The `instruction` is a required parameter, while `input` is an optional parameter. For general tasks (such as the examples provided in section `1.3`), you can directly enter the input in the `instruction` field. For information extraction tasks (as shown in the example in section `1.2`), please enter the instruction in the `instruction` field and the sentence to be extracted in the `input` field. We provide an information extraction prompt in section `2.5`.
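As a concrete illustration of the paragraph above, a filled-in form for an extraction task might look like the following; the instruction wording and sentence are invented for demonstration and are not the official prompt from section `2.5`.

```python
# Hypothetical instruction/input pair for the web UI; invented for
# illustration, not taken from the project's prompt library (section 2.5).
example = {
    "instruction": "Please extract all (subject, relation, object) triples "
                   "from the input sentence.",
    "input": "Hangzhou is the capital of Zhejiang Province.",
}
```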
@@ -499,7 +502,7 @@ For information extraction tasks such as named entity recognition (NER), event e
 >
 > (2) Instruction tuning stage using LoRA. This stage enables the model to understand human instructions and generate appropriate responses.
 
-![](https://github.com/zjunlp/CaMA/blob/main/assets/main.jpg)
+![](https://github.com/zjunlp/CaMA/blob/main/assets/main.jpg?raw=true)
 
 <h3 id="3-1">3.1 Dataset Construction (Pretraining)</h3>
 
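Stage (2) in the hunk above is standard LoRA instruction tuning. As a rough sketch of what that stage involves, the snippet below wires LoRA adapters into the pre-trained model with the PEFT library; the rank, scaling, dropout, and target modules are illustrative defaults, not CaMA's actual training configuration.

```python
# Rough LoRA setup via PEFT; hyperparameters and target modules here are
# illustrative choices, not the values CaMA was actually tuned with.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("./CaMA")
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # common picks for LLaMA-style attention
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the low-rank adapters train
```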