keyfan committed on
Commit 736fbbc
1 Parent(s): 36cb93e

Update README.md

Files changed (1): README.md (+27 -17)
README.md CHANGED
@@ -7,31 +7,41 @@ language:
  - en
  ---

- This is a beta release of a QLoRa adapter model for [Falcon-40b](https://huggingface.co/tiiuae/falcon-40b).
- Please read the instructions carefully before downloading the model.

  Though Falcon is not specifically trained on a Chinese corpus, it exhibits strong performance in Chinese language understanding in our experiments. We would like to explore out of curiosity whether a
- small amount of Chinese instruction data can push it further and make it better at speaking.

- The LoRa model is trained with the [QLoRa code](https://github.com/artidoro/qlora) on a subset of bilingual instruction data from the [Alpaca-CoT dataset](https://huggingface.co/datasets/QingyiSi/Alpaca-CoT) for a mere 5k steps.
- The finetuned model is not as good at Chinese generation as carefully continue-trained-and-finetuned LLaMA models such as [OpenBuddy](https://huggingface.co/OpenBuddy) and [Ziya](https://huggingface.co/IDEA-CCNL/Ziya-LLaMA-13B-v1),
- yet it quickly adapts to the new language and generates surprisingly good results. We call for more research on applying Falcon-40b to the Chinese domain.

- ## Evaluations

- We evaluate on two Chinese language understanding benchmarks, [C-Eval](https://cevalbenchmark.com/) and the Gaokao subset of [AGIEval](https://github.com/microsoft/AGIEval).

- * C-Eval made a breaking change on 2023/06/08 from few-shot to zero-shot.

- Results on the C-Eval test set with 5-shot and no CoT:

- | Average | Avg(Hard) | STEM | Social Science | Humanities | Others |
- | - | - | - | - | - | - |
- | 40.4 | 30.1 | 35.8 | 47.6 | 42.0 | 40.6 |

- Results on the Gaokao subset of AGIEval with 0-shot:

- | Average | GK-chinese | GK-English | GK-geography | GK-history | GK-biology | GK-chemistry | GK-physics | GK-mathqa | GK-mathcloze |
- | - | - | - | - | - | - | - | - | - | - |
- | 33.6 | 26.4 | 69.0 | 46.7 | 47.8 | 27.1 | 32.4 | 24.5 | 26.8 | 1.7 |
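The Average column in the table above is consistent with the unweighted mean of the nine GK subset scores; a quick check:

```python
# Unweighted mean of the nine Gaokao-subset scores from the table above.
scores = [26.4, 69.0, 46.7, 47.8, 27.1, 32.4, 24.5, 26.8, 1.7]
average = round(sum(scores) / len(scores), 1)
print(average)  # 33.6
```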
 
 
 
 
 
 
 
  - en
  ---

+ This is a QLoRa adapter model for [Falcon-40b](https://huggingface.co/tiiuae/falcon-40b).

  Though Falcon is not specifically trained on a Chinese corpus, it exhibits strong performance in Chinese language understanding in our experiments. We would like to explore out of curiosity whether a
+ small amount of Chinese instruction data can push it further and make it better at speaking. The LoRa model is trained with the [QLoRa repo](https://github.com/artidoro/qlora) on a subset of bilingual instruction data from the [Alpaca-CoT dataset](https://huggingface.co/datasets/QingyiSi/Alpaca-CoT).

+ The result is surprisingly good considering the number of Chinese tokens the model has ever seen, though it shows clear limitations. Please check the examples below.
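Since this is an adapter rather than a full model, it has to be loaded on top of the base Falcon-40b weights. A minimal usage sketch, assuming the `peft` and `transformers` libraries; `ADAPTER_REPO_ID` is a placeholder for this repository's id, and running it requires hardware that can hold the 40B model:

```python
# Usage sketch (assumption): typical QLoRA-adapter loading with `peft`.
# "ADAPTER_REPO_ID" is a placeholder, not a verified repo id.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-40b")
base_model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",
    torch_dtype=torch.bfloat16,   # 40B weights: needs several large GPUs
    device_map="auto",
    trust_remote_code=True,       # Falcon custom modeling code
)
# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, "ADAPTER_REPO_ID")
```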
 
 
+ ```
+ Question: 序列1,4,9,16的后一个数是什么?
+ Answer: 序列1,4,9,16的后一个数是25。因为序列中的每个数字都是前一个数字加上一个递增的差值,第一个差值是3,第二个差值是5,第三个差值是7,第四个差值是9,所以最后一个数字应该是第四个数字16加上第四个差值9,即16+9=25。因此,序列1,4,9,16的后一个数是25。
+
+ Question: 已知全班所有同学的姓名存储于列表names中,所有同学的身高存储于列表heights中,它们的关系是一一对应的。请写一段Python代码,将所有同学的姓名按身高由低到高排序。
+ Answer: names = ['小明', '小红', '小张', '小李']
+ heights = [170, 165, 175, 180]
+
+ # 使用zip函数将姓名和身高对应起来,然后根据身高排序
+ sorted_names = sorted(names, key=lambda x: heights[names.index(x)])
+
+ # 输出排序后的姓名列表
+ print(sorted_names)
+
+ Question: 中国的“东北三省”指的是哪里?
+ Answer: 中国的东北三省指的是吉林省、黑龙江省和辽宁省。这些省份位于中国东北地区,北临俄罗斯和乌克兰,东临日本海。它们是中国最大的省份之一,占中国陆地面积的七分之一。东北三省以其自然风光、丰富的民族文化和历史遗产而闻名。它们是中国旅游业的重要目的地之一。您还有其他问题吗?
+
+ Question: “但愿人长久,千里共婵娟”,其中婵娟指的是什么? A、月亮 B、姻缘
+ Answer: 这道题目考察的是对于古诗词的理解和文化常识。这句话出自唐代诗人杜甫的《月夜》一诗,其中婵娟是指月亮。这句话的意思是希望人和月亮能够长久存在,共同见证世间美好的事物。婵娟是古代诗词中常用的一个词语,指的是月亮的光辉、美丽和媚态。因此,答案是A,月亮。
+ ```
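Two of the answers above can be checked mechanically. The model's sorting answer is correct for this data, but `names.index(x)` makes the sort quadratic and breaks when two students share a name; a minimal sketch of the checks, with the pairing done via `zip` instead (variable names reused from the example):

```python
# Check the sequence answer: the differences 3, 5, 7 grow by 2,
# so the next term is 16 + (7 + 2) = 25, as the model said.
seq = [1, 4, 9, 16]
diffs = [b - a for a, b in zip(seq, seq[1:])]
next_term = seq[-1] + diffs[-1] + 2
print(next_term)  # 25

# Sort names by height by pairing the two lists, instead of calling
# names.index(x) inside the key function (O(n) per key, ambiguous on
# duplicate names).
names = ['小明', '小红', '小张', '小李']
heights = [170, 165, 175, 180]
sorted_names = [name for _, name in sorted(zip(heights, names))]
print(sorted_names)  # ['小红', '小明', '小张', '小李']
```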
+ ## Evaluations
+
+ We evaluate on [C-Eval](https://cevalbenchmark.com/). More results will be added later.
+
+ Results on the C-Eval test set with 5-shot and no CoT; the scores are better than those of LLaMA-series models:
+
+ | Average | Avg(Hard) | STEM | Social Science | Humanities | Others |
+ | - | - | - | - | - | - |
+ | 42.0 | 32.4 | 38.5 | 50.3 | 42.5 | 40.5 |