update Readme
README.md
CHANGED
@@ -15,24 +15,37 @@ metrics:
- accuracy
---

# Model Card for UltraLink-LM

## Model Summary
> The UltraLink-LM is a massively multilingual generative language model that follows instructions in 5 languages: English, French, Russian, Spanish, and Chinese.
> UltraLink-LM outperforms [PolyLM-Chat-13b](https://huggingface.co/DAMO-NLP-MT/polylm-chat-13b), [Guanaco](https://huggingface.co/JosephusCheung/Guanaco), and [Bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt) in code, math and chat abilities in four languages, and has a high-quality and diverse text generation performance in all languages.
> The UltraLink-LM is trained using [UltraLink](https://huggingface.co/datasets/R0k1e/UltraLink), [UltraChat](https://huggingface.co/datasets/stingning/ultrachat), [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K), [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K), and ShareGPT.
> We release the checkpoints under an MIT license to further our mission of multilingual technologies empowering a multilingual world.

- **Developed by:** [
- **Model type:** a Transformer-style autoregressive massively multilingual language model.
- **Paper**: [UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset](https://arxiv.org/abs/2402.04588)
- **Languages**: Refer to the list of languages in the `language` section of this model card.
- **License**: MIT
- **Model**: [UltraLink-LM](https://huggingface.co/R0k1e/UltraLink-LM)
- **Model Size**: 13 billion parameters
- **Datasets**: [UltraLink](https://huggingface.co/datasets/R0k1e/UltraLink), [UltraChat](https://huggingface.co/datasets/stingning/ultrachat), [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K), [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K), and ShareGPT.

## Use
@@ -45,111 +58,128 @@ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
ultralink_lm = AutoModelForCausalLM.from_pretrained(checkpoint)

# Chat abilities in Chinese
chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
# Expected output:
"""
"""
# Translations in English:
"""
"""

# Code abilities in Russian
# Please implement a bubble sort algorithm in Python.
code_inputs = tokenizer.encode("Реализуйте алгоритм пузырьковой сортировки на Python.", return_tensors="pt")
code_outputs = ultralink_lm.generate(code_inputs, max_new_tokens=512)
print(tokenizer.decode(code_outputs[0]))
# Expected output:
"""
```python
def bubbleSort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]

arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Отсортированный массив:", arr)
\```

"""
# Translations in English:
"""
```python
def bubbleSort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]

arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Sorted array:", arr)
\```

Note
"""

# Math abilities in French
# When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units?
math_inputs = tokenizer.encode("Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités?", return_tensors="pt")
math_outputs = ultralink_lm.generate(math_inputs, max_new_tokens=512)
print(tokenizer.decode(math_outputs[0]))
# Expected output:
"""
Le périmètre
En divisant les deux côtés par 6, nous obtenons w = 3.
Par conséquent, la longueur du rectangle est de 2w = 2(3) = 6.
L'aire d'un rectangle est le produit de sa longueur et de sa largeur, donc l'aire est de 6 * 3 = 18.
La réponse est : 18
"""
# Translations in English:
"""
Simplifying
So the length of the rectangle is 2w = 2(3) = 6.
The area of a rectangle is the product of its length and width, so the area is 6 * 3 = 18.
The answer is: 18
"""
```
@@ -161,7 +191,7 @@ The answer is: 18
- Number of Samples seen during Finetuning: 1023K
- Batch size: 128
- Hardware: NVIDIA A100 80GB PCIe
- Software: BMTrain

### Data Sources

@@ -171,12 +201,17 @@ The UltraLink-LM is trained on the following datasets:
- [UltraChat](https://huggingface.co/datasets/stingning/ultrachat)
- [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K)
- [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K)

All the datasets are integrated into the UltraLink dataset.

## Evaluation

### Multilingual HumanEval

[HumanEval](https://github.com/openai/human-eval) is a well-known benchmark for evaluating the code ability of LLMs. It executes the code snippets generated by the model and evaluates their correctness. Since there is no existing multilingual test set for code generation, we use GPT-3.5 with carefully designed prompts to translate HumanEval into other languages.
@@ -191,7 +226,7 @@ All the datasets are integrated into the UltraLink dataset.
|Okapi-7b | 12.2 | 11.0 | 8.5 | 8.5 | 8.5 | 9.8 |
|Guanaco-7b | 9.2 | 6.7 | 11.0 | 9.8 | 12.8 | 9.9 |
|Guanaco-13b| 18.3 | 15.9 | 9.8 | 8.5 | 14.6 | 12.2 |
|UltraLink-LM |

### MGSM
@@ -207,7 +242,7 @@ We employ [MGSM](https://github.com/google-research/url-nlp/tree/main/mgsm) to e
|Okapi-7b | 4.0 | 2.4 | 3.6 | 4.4 | 4.8 | 3.8 |
|Guanaco-7b | 4.0 | 1.6 | 3.2 | 2.8 | 4.4 | 3.0 |
|Guanaco-13b | 13.6 | 10.8 | 11.2 | 6.4 | 5.2 | 8.4 |
|UltraLink-LM|

### OMGEval
We use [OMGEval](https://github.com/blcuicall/OMGEval) to evaluate chat ability; it is a multilingual version of the widely used English benchmark AlpacaEval.
@@ -221,11 +256,13 @@ We use the [OMGEval](https://github.com/blcuicall/OMGEval) to evaluate the chat
|Chimera-inst-chat-13b | 15.5 | 9.7 | 11.8 | 13.7 | 13.8 | 12.9 |
|Okapi-7b | 8.8 | 6.2 | 5.0 | 12.1 | 8.7 | 8.2 |
|Guanaco-7b | 4.6 | 3.8 | 0.4 | 1.8 | 1.2 | 2.4 |
|Guanaco-13b |
|UltraLink-LM | 28.8 |

## Citation

```bibtex
@misc{wang2024ultralink,
title={UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset},
- accuracy
---

<div align="center">

<img src="title.png" alt="UltraLink" width="200">

**multi-lingual, knowledge-grounded, multi-round dialogue dataset and model**

<p align="center">
<a href="#Introduction"> Introduction </a> •
<a href="#Construction-of-UltraLink">Construction Process</a> •
<a href="https://arxiv.org/abs/2402.04588">Paper</a> •
<a href="https://huggingface.co/datasets/R0k1e/UltraLink"> UltraLink</a> •
<a href="https://github.com/OpenBMB/UltraLink"> Github</a>
</p>
</div>

# Model Card for UltraLink-LM

## Model Summary
> The UltraLink-LM is a massively multilingual generative language model that follows instructions in 5 languages: English, French, Russian, Spanish, and Chinese. It generates high-quality, diverse text in all 5 languages.
> UltraLink-LM outperforms [PolyLM-Chat-13b](https://huggingface.co/DAMO-NLP-MT/polylm-chat-13b), [Guanaco](https://huggingface.co/JosephusCheung/Guanaco), and [Bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt) in code, math and chat abilities in four languages, and has a high-quality and diverse text generation performance in all languages.
> The UltraLink-LM is trained using [UltraLink](https://huggingface.co/datasets/R0k1e/UltraLink), [UltraChat](https://huggingface.co/datasets/stingning/ultrachat), [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K), [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K), [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), and [ShareGPT](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset/).
> We release the checkpoints under an MIT license to further our mission of multilingual technologies empowering a multilingual world.

- **Developed by:** [OpenBMB](https://www.openbmb.cn/home)
- **Model type:** a Transformer-style autoregressive massively multilingual language model.
- **Paper**: [UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset](https://arxiv.org/abs/2402.04588)
- **Languages**: Refer to the list of languages in the `language` section of this model card.
- **License**: MIT
- **Model**: [UltraLink-LM](https://huggingface.co/R0k1e/UltraLink-LM)
- **Model Size**: 13 billion parameters
- **Datasets**: [UltraLink](https://huggingface.co/datasets/R0k1e/UltraLink), [UltraChat](https://huggingface.co/datasets/stingning/ultrachat) (10k randomly sampled examples), [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K), [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K), [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), and [ShareGPT](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset/) (the English portion, keeping only samples longer than 4k).

## Use
ultralink_lm = AutoModelForCausalLM.from_pretrained(checkpoint)

# Chat abilities in Chinese
# What is heavy cavalry?
first_question = "<s>[INST] 什么是重骑兵? [/INST]"
chat_inputs = tokenizer.encode(first_question, add_special_tokens=False, return_tensors="pt")
chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
first_answer = tokenizer.decode(chat_outputs[0])
print(first_answer)
# Expected output:
"""
<s> [INST] 什么是重骑兵? [/INST] 重骑兵是一种历史上的战斗单位,通常由骑兵组成,他们在战斗中使用重型装甲和长矛。他们以在战场上的强大攻击能力而闻名,并且通常被用于突破敌军阵线或攻击敌方骑兵。重骑兵通常被认为是中世纪战争中最强大和最具威慑力的单位之一。</s>
"""
# Translations in English:
"""
<s> [INST] What is heavy cavalry? [/INST] The Heavy Cavalry is a historical fighting unit, usually consisting of cavalrymen who use heavy armour and spears in battle. They were known for their powerful attacks on the battlefield and were often used to break through enemy lines or attack enemy cavalry. Heavy cavalry is often considered one of the most powerful and intimidating units in medieval warfare.</s>
"""

# Second turn:
second_question = "<s>[INST] 重骑兵对中世纪的战场有哪些影响? [/INST]"
second_question = first_answer + second_question
chat_inputs = tokenizer.encode(second_question, add_special_tokens=False, return_tensors="pt")
chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
second_answer = tokenizer.decode(chat_outputs[0])
print(second_answer)
# Expected output:
"""
<s> [INST] 什么是重骑兵? [/INST] 重骑兵是一种历史上的战斗单位,通常由骑兵组成,他们在战斗中使用重型装甲和长矛。他们以在战场上的强大攻击能力而闻名,并且通常被用于突破敌军阵线或攻击敌方骑兵。重骑兵通常被认为是中世纪战争中最强大和最具威慑力的单位之一。</s><s> [INST] 重骑兵对中世纪的战场有哪些影响? [/INST] 重骑兵在中世纪的战场上起到了重要的作用。他们的强大攻击能力使他们成为战斗中的强大力量,并且他们的存在常常能够改变战斗的结果。重骑兵通常被用于突破敌军阵线,并在战斗中创造突破口,这使得其他部队能够进入敌方阵地。他们还被用于攻击敌方骑兵,并且他们的重型装甲和长矛使他们在这种情况下具有优势。总的来说,重骑兵的存在使得中世纪的战场更加复杂和不可预测,他们的存在对战斗的结果产生了重大影响。</s>
"""
# Translations in English:
"""
<s> [INST] What is heavy cavalry? [/INST] The Heavy Cavalry is a historical fighting unit, usually consisting of cavalrymen who use heavy armour and spears in battle. They were known for their powerful attacks on the battlefield and were often used to break through enemy lines or attack enemy cavalry. Heavy cavalry is often considered one of the most powerful and intimidating units in medieval warfare.</s><s>[INST] What impact did heavy cavalry have on the medieval battlefield? [/INST] Heavy cavalry played an important role on the battlefields of the Middle Ages. Their powerful attacks made them a formidable force in battle and their presence could often change the outcome of a battle. Heavy cavalry were often used to break through enemy lines and create breaches in battle, which allowed other troops to move into enemy positions. They were also used to attack enemy cavalry, and their heavy armour and spears gave them an advantage in such situations. Overall, the presence of heavy cavalry made the medieval battlefield more complex and unpredictable, and their presence had a major impact on the outcome of battles.</s>
"""

# Code abilities in Russian
# Please implement a bubble sort algorithm in Python.
code_inputs = tokenizer.encode("<s>[INST] Реализуйте алгоритм пузырьковой сортировки на Python. [/INST]", add_special_tokens=False, return_tensors="pt")
code_outputs = ultralink_lm.generate(code_inputs, max_new_tokens=512)
print(tokenizer.decode(code_outputs[0]))
# Expected output:
"""
<s> [INST] Реализуйте алгоритм пузырьковой сортировки на Python. [/INST] Конечно, вот простая реализация алгоритма пузырьковой сортировки на Python:

```python
def bubbleSort(arr):
    n = len(arr)

    # Проходимся по всем элементам массива
    for i in range(n):
        # Последние i элементов уже отсортированы
        for j in range(0, n-i-1):
            # Проходимся по массиву, пока не найдем элемент, больший, чем следующий
            if arr[j] > arr[j+1]:
                # Обмениваем элементы местами
                arr[j], arr[j+1] = arr[j+1], arr[j]

# Проверяем функцию
arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Отсортированный массив:", arr)
\```

Эта реализация работает путем повторного прохождения по массиву и сравнения соседних элементов. Если элемент сравнивается больше, чем следующий элемент, они меняются местами. Этот процесс повторяется до тех пор, пока массив не будет отсортирован.

Вы можете изменить размер массива и его начальное состояние, чтобы увидеть, как функция работает с разными входными данными.

Примечание: Алгоритм пузырьковой сортировки не является эффективным для больших наборов данных, так как его средняя и худшая временная сложность составляют O(n^2), где n - количество элементов в массиве.</s>
"""
# Translations in English:
"""
<s> [INST] Implement the bubble sorting algorithm in Python. [/INST] Sure enough, here's a simple implementation of the bubble sort algorithm in Python:

```python
def bubbleSort(arr):
    n = len(arr)

    # Go through all elements of the array
    for i in range(n):
        # The last i elements are already sorted
        for j in range(0, n-i-1):
            # We traverse the array until we find an element greater than the next one
            if arr[j] > arr[j+1]:
                # Swapping elements
                arr[j], arr[j+1] = arr[j+1], arr[j]

# Check the function
arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Sorted array:", arr)
\```

This implementation works by repeatedly traversing the array and comparing neighbouring elements. If an element is compared more than the next element, they are swapped. This process is repeated until the array is sorted.

You can change the array size and initial state to see how the function works with different input data.

Note: The bubble sort algorithm is not efficient for large datasets because its average and worst-case time complexity are O(n^2), where n is the number of elements in the array.</s>
"""

# Math abilities in French
# When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units?
math_inputs = tokenizer.encode("<s>[INST] Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités? [/INST]", add_special_tokens=False, return_tensors="pt")
math_outputs = ultralink_lm.generate(math_inputs, max_new_tokens=512)
print(tokenizer.decode(math_outputs[0]))
# Expected output:
"""
<s> [INST] Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités? [/INST]
Soit la largeur du rectangle $w$. Alors la longueur du rectangle est $2w$.
Le périmètre du rectangle est $2(w+2w)=18$.
En simplifiant, nous avons $6w=18$, donc $w=3$.
L'aire du rectangle est $w \cdot (2w) = 3 \cdot 6 = \boxed{18}$ unités carrées.
La réponse est : 18</s>
"""
# Translations in English:
"""
<s> [INST] When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units? [/INST]
Let $w$ be the width of the rectangle. Then the length of the rectangle is $2w$.
The perimeter of the rectangle is $2(w+2w)=18$.
Simplifying, we have $6w=18$, so $w=3$.
The area of the rectangle is $w \cdot (2w) = 3 \cdot 6 = \boxed{18}$ square units.
The answer is: 18</s>
"""
```
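The chat example above builds the `<s>[INST] ... [/INST]` prompt by hand and re-feeds the decoded answer for the second turn. As a convenience, here is a minimal sketch of a helper that assembles such a multi-turn prompt; the helper name and the exact spacing around the tags are assumptions inferred from the expected outputs above, not an official chat template:

```python
# Hypothetical helper mirroring the prompt format shown in the examples above.
def build_prompt(turns):
    """turns: list of (user_message, assistant_reply_or_None) pairs, oldest first."""
    prompt = ""
    for user_msg, assistant_msg in turns:
        prompt += f"<s>[INST] {user_msg} [/INST]"
        if assistant_msg is not None:
            prompt += f" {assistant_msg}</s>"
    return prompt

# Single turn:
prompt = build_prompt([("什么是重骑兵?", None)])
# Multi-turn, where first_reply_text is only the assistant's reply text (an assumption):
# prompt = build_prompt([("什么是重骑兵?", first_reply_text), ("重骑兵对中世纪的战场有哪些影响?", None)])
```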
- Number of Samples seen during Finetuning: 1023K
- Batch size: 128
- Hardware: NVIDIA A100 80GB PCIe
- Software: [BMTrain](https://github.com/OpenBMB/BMTrain)

### Data Sources

- [UltraChat](https://huggingface.co/datasets/stingning/ultrachat)
- [Magicoder-Evol](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K)
- [Magicoder-OSS](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K)
- [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA)
- [ShareGPT](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset/)

We randomly select 10k samples from the UltraChat dataset for training, and ShareGPT is filtered to keep only its English portion, retaining samples longer than 4k. The other datasets are used as auxiliary training data.
All the datasets are integrated into the UltraLink dataset.

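For illustration only, a rough sketch of that subsampling and filtering with the `datasets` library; the split names, field access, and the interpretation of "longer than 4k" (characters rather than tokens) are assumptions, not the project's actual preprocessing script:

```python
from datasets import load_dataset

# Keep 10k random UltraChat dialogues (split name assumed for this sketch).
ultrachat = load_dataset("stingning/ultrachat", split="train")
ultrachat_10k = ultrachat.shuffle(seed=42).select(range(10_000))

# Keep only long ShareGPT samples; "4k" is treated as characters here (assumption).
# A real English-only filter would additionally inspect a language field, whose name
# depends on the dataset schema and is therefore omitted.
sharegpt = load_dataset("openchat/openchat_sharegpt4_dataset", split="train")
sharegpt_long = sharegpt.filter(lambda ex: len(str(ex)) > 4000)
```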
## Evaluation

We report three evaluations in this section: multilingual HumanEval, MGSM, and OMGEval.
Evaluations of modern LLMs may be biased and affected by many factors; we are actively working on more comprehensive evaluation methods.

### Multilingual HumanEval

[HumanEval](https://github.com/openai/human-eval) is a well-known benchmark for evaluating the code ability of LLMs. It executes the code snippets generated by the model and evaluates their correctness. Since there is no existing multilingual test set for code generation, we use GPT-3.5 with carefully designed prompts to translate HumanEval into other languages.
|Okapi-7b | 12.2 | 11.0 | 8.5 | 8.5 | 8.5 | 9.8 |
|Guanaco-7b | 9.2 | 6.7 | 11.0 | 9.8 | 12.8 | 9.9 |
|Guanaco-13b| 18.3 | 15.9 | 9.8 | 8.5 | 14.6 | 12.2 |
|UltraLink-LM | __60.4__ | __43.9__ | __40.9__ | __49.4__ | __39.6__ | __46.8__ |

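For context, HumanEval-style scoring generates one completion per problem and then runs the benchmark's unit tests against it; the pass@1 numbers above come from that execution step. Below is a minimal, hedged sketch of the generation half with this checkpoint; the model id, prompt wording, and decoding settings are illustrative assumptions rather than the exact evaluation harness:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "R0k1e/UltraLink-LM"  # assumed model id, matching the model link above
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# One hypothetical translated HumanEval-style prompt (Russian), wrapped in the card's chat format.
problem = '<s>[INST] Завершите функцию Python:\ndef add(a, b):\n    """Верните сумму a и b."""\n [/INST]'
inputs = tokenizer.encode(problem, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
completion = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(completion)
# The evaluation harness would extract the code block from `completion`
# and execute it against HumanEval's hidden unit tests to compute pass@1.
```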
### MGSM
|Okapi-7b | 4.0 | 2.4 | 3.6 | 4.4 | 4.8 | 3.8 |
|Guanaco-7b | 4.0 | 1.6 | 3.2 | 2.8 | 4.4 | 3.0 |
|Guanaco-13b | 13.6 | 10.8 | 11.2 | 6.4 | 5.2 | 8.4 |
|UltraLink-LM| __70.4__ | __56.0__ | __70.4__ | __64.8__ | __63.6__ | __63.7__ |

### OMGEval
We use [OMGEval](https://github.com/blcuicall/OMGEval) to evaluate chat ability; it is a multilingual version of the widely used English benchmark AlpacaEval.
|Chimera-inst-chat-13b | 15.5 | 9.7 | 11.8 | 13.7 | 13.8 | 12.9 |
|Okapi-7b | 8.8 | 6.2 | 5.0 | 12.1 | 8.7 | 8.2 |
|Guanaco-7b | 4.6 | 3.8 | 0.4 | 1.8 | 1.2 | 2.4 |
|Guanaco-13b | __29.0__ | 8.6 | 16.9 | 15.4 | 17.3 | 17.5 |
|UltraLink-LM | 28.8 | __21.9__ | __23.5__ | __37.6__ | __29.0__ | __28.2__ |

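Assuming OMGEval follows the AlpacaEval recipe, the numbers above are win rates: a judge model compares each system response with a reference response, and the table reports the percentage of comparisons the system wins. A toy sketch of that final aggregation step, taking the per-prompt judge verdicts as given (the judging prompt and reference model are defined by OMGEval and are not reproduced here):

```python
# One boolean per evaluation prompt: True if the judge preferred UltraLink-LM's answer.
# The verdicts themselves would come from OMGEval's judging step (assumed, not shown).
verdicts = [True, False, True, True, False]

win_rate = 100.0 * sum(verdicts) / len(verdicts)
print(f"win rate: {win_rate:.1f}%")  # 60.0% for this toy list
```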
## Citation

Feel free to cite the repo if you think UltraLink is useful.

```bibtex
@misc{wang2024ultralink,
title={UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset},
infer.py
CHANGED
@@ -6,109 +6,118 @@ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
ultralink_lm = AutoModelForCausalLM.from_pretrained(checkpoint)

# Chat abilities in Chinese
chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
# Expected output:
"""
"""
# Translations in English:
"""
"""

# Code abilities in Russian
# Please implement a bubble sort algorithm in Python.
code_inputs = tokenizer.encode("Реализуйте алгоритм пузырьковой сортировки на Python.", return_tensors="pt")
code_outputs = ultralink_lm.generate(code_inputs, max_new_tokens=512)
print(tokenizer.decode(code_outputs[0]))
# Expected output:
"""
```python
def bubbleSort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]

arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Отсортированный массив:", arr)
```

"""
# Translations in English:
"""
```python
def bubbleSort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]

arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Sorted array:", arr)
```

Note
"""

# Math abilities in French
# When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units?
math_inputs = tokenizer.encode("Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités?", return_tensors="pt")
math_outputs = ultralink_lm.generate(math_inputs, max_new_tokens=512)
print(tokenizer.decode(math_outputs[0]))
# Expected output:
"""
Le périmètre
En divisant les deux côtés par 6, nous obtenons w = 3.
Par conséquent, la longueur du rectangle est de 2w = 2(3) = 6.
L'aire d'un rectangle est le produit de sa longueur et de sa largeur, donc l'aire est de 6 * 3 = 18.
La réponse est : 18
"""
# Translations in English:
"""
Simplifying
So the length of the rectangle is 2w = 2(3) = 6.
The area of a rectangle is the product of its length and width, so the area is 6 * 3 = 18.
The answer is: 18
"""
ultralink_lm = AutoModelForCausalLM.from_pretrained(checkpoint)

# Chat abilities in Chinese
# What is heavy cavalry?
first_question = "<s>[INST] 什么是重骑兵? [/INST]"
chat_inputs = tokenizer.encode(first_question, add_special_tokens=False, return_tensors="pt")
chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
first_answer = tokenizer.decode(chat_outputs[0])
print(first_answer)
# Expected output:
"""
<s> [INST] 什么是重骑兵? [/INST] 重骑兵是一种历史上的战斗单位,通常由骑兵组成,他们在战斗中使用重型装甲和长矛。他们以在战场上的强大攻击能力而闻名,并且通常被用于突破敌军阵线或攻击敌方骑兵。重骑兵通常被认为是中世纪战争中最强大和最具威慑力的单位之一。</s>
"""
# Translations in English:
"""
<s> [INST] What is heavy cavalry? [/INST] The Heavy Cavalry is a historical fighting unit, usually consisting of cavalrymen who use heavy armour and spears in battle. They were known for their powerful attacks on the battlefield and were often used to break through enemy lines or attack enemy cavalry. Heavy cavalry is often considered one of the most powerful and intimidating units in medieval warfare.</s>
"""

# Second turn:
second_question = "<s>[INST] What impact did heavy cavalry have on the medieval battlefield? [/INST]"
second_question = first_answer + second_question
chat_inputs = tokenizer.encode(second_question, add_special_tokens=False, return_tensors="pt")
chat_outputs = ultralink_lm.generate(chat_inputs, max_new_tokens=512)
second_answer = tokenizer.decode(chat_outputs[0])
print(second_answer)
# Expected output:
"""
<s> [INST] What is heavy cavalry? [/INST] The Heavy Cavalry is a historical fighting unit, usually consisting of cavalrymen who use heavy armour and spears in battle. They were known for their powerful attacks on the battlefield and were often used to break through enemy lines or attack enemy cavalry. Heavy cavalry is often considered one of the most powerful and intimidating units in medieval warfare.</s><s>[INST] What impact did heavy cavalry have on the medieval battlefield? [/INST] Heavy cavalry played an important role on the battlefields of the Middle Ages. Their powerful attacks made them a formidable force in battle and their presence could often change the outcome of a battle. Heavy cavalry were often used to break through enemy lines and create breaches in battle, which allowed other troops to move into enemy positions. They were also used to attack enemy cavalry, and their heavy armour and spears gave them an advantage in such situations. Overall, the presence of heavy cavalry made the medieval battlefield more complex and unpredictable, and their presence had a major impact on the outcome of battles.</s>
"""

# Code abilities in Russian
# Please implement a bubble sort algorithm in Python.
code_inputs = tokenizer.encode("<s>[INST] Реализуйте алгоритм пузырьковой сортировки на Python. [/INST]", add_special_tokens=False, return_tensors="pt")
code_outputs = ultralink_lm.generate(code_inputs, max_new_tokens=512)
print(tokenizer.decode(code_outputs[0]))
# Expected output:
"""
<s> [INST] Реализуйте алгоритм пузырьковой сортировки на Python. [/INST] Конечно, вот простая реализация алгоритма пузырьковой сортировки на Python:

```python
def bubbleSort(arr):
    n = len(arr)

    # Проходимся по всем элементам массива
    for i in range(n):
        # Последние i элементов уже отсортированы
        for j in range(0, n-i-1):
            # Проходимся по массиву, пока не найдем элемент, больший, чем следующий
            if arr[j] > arr[j+1]:
                # Обмениваем элементы местами
                arr[j], arr[j+1] = arr[j+1], arr[j]

# Проверяем функцию
arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Отсортированный массив:", arr)
```

Эта реализация работает путем повторного прохождения по массиву и сравнения соседних элементов. Если элемент сравнивается больше, чем следующий элемент, они меняются местами. Этот процесс повторяется до тех пор, пока массив не будет отсортирован.

Вы можете изменить размер массива и его начальное состояние, чтобы увидеть, как функция работает с разными входными данными.

Примечание: Алгоритм пузырьковой сортировки не является эффективным для больших наборов данных, так как его средняя и худшая временная сложность составляют O(n^2), где n - количество элементов в массиве.</s>
"""
# Translations in English:
"""
<s> [INST] Implement the bubble sorting algorithm in Python. [/INST] Sure enough, here's a simple implementation of the bubble sort algorithm in Python:

```python
def bubbleSort(arr):
    n = len(arr)

    # Go through all elements of the array
    for i in range(n):
        # The last i elements are already sorted
        for j in range(0, n-i-1):
            # We traverse the array until we find an element greater than the next one
            if arr[j] > arr[j+1]:
                # Swapping elements
                arr[j], arr[j+1] = arr[j+1], arr[j]

# Check the function
arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Sorted array:", arr)
```

This implementation works by repeatedly traversing the array and comparing neighbouring elements. If an element is compared more than the next element, they are swapped. This process is repeated until the array is sorted.

You can change the array size and initial state to see how the function works with different input data.

Note: The bubble sort algorithm is not efficient for large datasets because its average and worst-case time complexity are O(n^2), where n is the number of elements in the array.</s>
"""

# Math abilities in French
# When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units?
math_inputs = tokenizer.encode("<s>[INST] Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités? [/INST]", add_special_tokens=False, return_tensors="pt")
math_outputs = ultralink_lm.generate(math_inputs, max_new_tokens=512)
print(tokenizer.decode(math_outputs[0]))
# Expected output:
"""
<s> [INST] Lorsque la longueur d'un rectangle est le double de sa largeur, calculer l'aire du rectangle si son périmètre est de 18 unités? [/INST]
Soit la largeur du rectangle $w$. Alors la longueur du rectangle est $2w$.
Le périmètre du rectangle est $2(w+2w)=18$.
En simplifiant, nous avons $6w=18$, donc $w=3$.
L'aire du rectangle est $w \cdot (2w) = 3 \cdot 6 = \boxed{18}$ unités carrées.
La réponse est : 18</s>
"""
# Translations in English:
"""
<s> [INST] When the length of a rectangle is twice its width, calculate the area of the rectangle if its perimeter is 18 units? [/INST]
Let $w$ be the width of the rectangle. Then the length of the rectangle is $2w$.
The perimeter of the rectangle is $2(w+2w)=18$.
Simplifying, we have $6w=18$, so $w=3$.
The area of the rectangle is $w \cdot (2w) = 3 \cdot 6 = \boxed{18}$ square units.
The answer is: 18</s>
"""