yi-01-ai committed
Commit 527602f · 1 parent: 249e18b
Auto Sync from git://github.com/01-ai/Yi.git/commit/6da0100d379e4446978ab09f10fd58c6cbdc452d
README.md CHANGED
@@ -71,14 +71,14 @@ pipeline_tag: text-generation
 <details open>
 <summary></b>📕 Table of Contents</b></summary>
 
-- [What is Yi?](
-- [Introduction](
-- [Models](
+- [What is Yi?](#what-is-yi)
+- [Introduction](#introduction)
+- [Models](#models)
 - [Chat models](#chat-models)
 - [Base models](#base-models)
 - [Other info](#other-info)
-- [News](
-- [How to use Yi?](
+- [News](#news)
+- [How to use Yi?](#how-to-use-yi)
 - [Quick start](#quick-start)
 - [Choose your path](#choose-your-path)
 - [pip](#quick-start---pip)
@@ -90,30 +90,30 @@ pipeline_tag: text-generation
 - [Quantization](#quantization)
 - [Deployment](#deployment)
 - [Learning hub](#learning-hub)
-- [Why Yi?](
-- [Ecosystem](
-- [Upstream](
-- [Downstream](
-- [Serving](
-- [
-- [Fine-tuning](
+- [Why Yi?](#why-yi)
+- [Ecosystem](#ecosystem)
+- [Upstream](#upstream)
+- [Downstream](#downstream)
+- [Serving](#serving)
+- [Quantization](#quantization-1)
+- [Fine-tuning](#fine-tuning-1)
 - [API](#api)
-- [Benchmarks](
-- [Base model performance](
-- [Chat model performance](
-- [Who can use Yi?](
-- [Misc.](
+- [Benchmarks](#benchmarks)
+- [Base model performance](#base-model-performance)
+- [Chat model performance](#chat-model-performance)
+- [Who can use Yi?](#who-can-use-yi)
+- [Misc.](#misc)
 - [Acknowledgements](#acknowledgments)
-- [Disclaimer](
-- [License](
+- [Disclaimer](#disclaimer)
+- [License](#license)
 
 </details>
 
 <hr>
 
-#
+# What is Yi?
 
-##
+## Introduction
 
 - 🤖 The Yi series models are the next generation of open-source large language models trained from scratch by [01.AI](https://01.ai/).
 
@@ -149,7 +149,7 @@ pipeline_tag: text-generation
 <a href="#top">Back to top ⬆️ </a> ]
 </p>
 
-##
+## News
 
 <details open>
 <summary>🎯 <b>2024/03/06</b>: The Yi-9B is open-sourced and available to the public.</summary>
@@ -211,7 +211,7 @@ sequence length and can be extended to 32K during inference time.
 <a href="#top">Back to top ⬆️ </a> ]
 </p>
 
-##
+## Models
 
 Yi models come in multiple sizes and cater to different use cases. You can also fine-tune Yi models to meet your specific requirements.
 
@@ -272,7 +272,7 @@ Model | Intro | Default context window | Pretrained tokens | Training Data Date
 </p>
 
 
-#
+# How to use Yi?
 
 - [Quick start](#quick-start)
 - [Choose your path](#choose-your-path)
@@ -281,7 +281,7 @@ Model | Intro | Default context window | Pretrained tokens | Training Data Date
 - [conda-lock](#quick-start---conda-lock)
 - [llama.cpp](#quick-start---llamacpp)
 - [Web demo](#web-demo)
-- [Fine-tuning](#
+- [Fine-tuning](#fine-tuning)
 - [Quantization](#quantization)
 - [Deployment](#deployment)
 - [Learning hub](#learning-hub)
@@ -301,7 +301,7 @@ Select one of the following paths to begin your journey with Yi!
 If you prefer to deploy Yi models locally,
 
 - 🙋♀️ and you have **sufficient** resources (for example, NVIDIA A800 80GB), you can choose one of the following methods:
-- [pip](#pip)
+- [pip](#quick-start---pip)
 - [Docker](#quick-start---docker)
 - [conda-lock](#quick-start---conda-lock)
 
@@ -1012,31 +1012,31 @@ With all these resources at your fingertips, you're ready to start your exciting
 </details>
 
 
-#
+# Why Yi?
 
-- [
-- [
-- [
-- [
-- [
-- [
+- [Ecosystem](#ecosystem)
+- [Upstream](#upstream)
+- [Downstream](#downstream)
+- [Serving](#serving)
+- [Quantization](#quantization-1)
+- [Fine-tuning](#fine-tuning-1)
 - [API](#api)
-- [
-- [
-- [
+- [Benchmarks](#benchmarks)
+- [Chat model performance](#chat-model-performance)
+- [Base model performance](#base-model-performance)
 
-##
+## Ecosystem
 
 Yi has a comprehensive ecosystem, offering a range of tools, services, and models to enrich your experiences and maximize productivity.
 
-- [
-- [
-- [
-- [
-- [
+- [Upstream](#upstream)
+- [Downstream](#downstream)
+- [Serving](#serving)
+- [Quantitation](#️quantitation)
+- [Fine-tuning](#️fine-tuning)
 - [API](#api)
 
-###
+### Upstream
 
 The Yi series models follow the same model architecture as Llama. By choosing Yi, you can leverage existing tools, libraries, and resources within the Llama ecosystem, eliminating the need to create new tools and enhancing development efficiency.
 
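The "Upstream" paragraph above refers to Llama-architecture compatibility: a Yi checkpoint loads through the standard `transformers` auto classes, as the `AutoModelForCausalLM.from_pretrained("01-ai/Yi-34b", device_map="auto")` line quoted in the next hunk header shows. A minimal sketch of that path (the prompt and generation settings are illustrative, not taken from the README):

```python
# Minimal sketch: load a Yi base model with the same Hugging Face
# transformers APIs used for Llama-architecture models.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "01-ai/Yi-34B"  # any Yi base or chat checkpoint loads the same way

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread layers across available GPUs
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)

inputs = tokenizer("There's a place where time stands still.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```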
@@ -1054,7 +1054,7 @@ model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-34b", device_map="auto")
 <a href="#top">Back to top ⬆️ </a> ]
 </p>
 
-###
+### Downstream
 
 > 💡 Tip
 >
@@ -1062,7 +1062,7 @@ model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-34b", device_map="auto")
 >
 > - To help others quickly understand your work, it is recommended to use the format of `<model-name>: <model-intro> + <model-highlights>`.
 
-####
+#### Serving
 
 If you want to get up with Yi in a few minutes, you can use the following services built upon Yi.
 
@@ -1074,7 +1074,7 @@ If you want to get up with Yi in a few minutes, you can use the following servic
 
 - [ScaleLLM](https://github.com/vectorch-ai/ScaleLLM#supported-models): you can use this service to run Yi models locally with added flexibility and customization.
 
-####
+#### Quantization
 
 If you have limited computational capabilities, you can use Yi's quantized models as follows.
 
@@ -1084,7 +1084,7 @@ These quantized models have reduced precision but offer increased efficiency, su
 - [TheBloke/Yi-34B-GGUF](https://huggingface.co/TheBloke/Yi-34B-GGUF)
 - [TheBloke/Yi-34B-AWQ](https://huggingface.co/TheBloke/Yi-34B-AWQ)
 
-####
+#### Fine-tuning
 
 If you're seeking to explore the diverse capabilities within Yi's thriving family, you can delve into Yi's fine-tuned models as below.
 
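Of the quantized checkpoints listed in this hunk, the AWQ variant can be loaded with the same `transformers` call as the full-precision model, while the GGUF files are intended for llama.cpp. A minimal sketch, assuming the `autoawq` package is installed and using an illustrative prompt:

```python
# Sketch: load the community AWQ-quantized Yi-34B checkpoint.
# Assumes `pip install autoawq` so transformers can handle the quantized weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

quantized_id = "TheBloke/Yi-34B-AWQ"

tokenizer = AutoTokenizer.from_pretrained(quantized_id)
model = AutoModelForCausalLM.from_pretrained(quantized_id, device_map="auto")

inputs = tokenizer("Yi is a series of", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```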
@@ -1110,12 +1110,12 @@ If you're seeking to explore the diverse capabilities within Yi's thriving famil
 <a href="#top">Back to top ⬆️ </a> ]
 </p>
 
-##
+## Benchmarks
 
-- [
-- [
+- [Chat model performance](#-chat-model-performance)
+- [Base model performance](#-base-model-performance)
 
-###
+### Chat model performance
 
 Yi-34B-Chat model demonstrates exceptional performance, ranking first among all existing open-source models in the benchmarks including MMLU, CMMLU, BBH, GSM8k, and more.
 
@@ -1132,7 +1132,7 @@ Yi-34B-Chat model demonstrates exceptional performance, ranking first among all
 <strong>*</strong>: C-Eval results are evaluated on the validation datasets
 </details>
 
-###
+### Base model performance
 
 #### Yi-34B and Yi-34B-200K
 
@@ -1158,7 +1158,7 @@ Yi-9B is almost the best among a range of similar-sized open-source models (incl
 
 ![Yi-9B benchmark - details](https://github.com/01-ai/Yi/blob/main/assets/img/Yi-9B_benchmark_details.png?raw=true)
 
-- In terms of **overall** ability (Mean-All), Yi-9B performs the best among similarly sized open-source models, surpassing DeepSeek-Coder, DeepSeek-Math, Mistral-7B, SOLAR-10.7B, and Gemma-7B.
+- In terms of **overall** ability (`Mean-All`), Yi-9B performs the best among similarly sized open-source models, surpassing DeepSeek-Coder, DeepSeek-Math, Mistral-7B, SOLAR-10.7B, and Gemma-7B.
 
 ![Yi-9B benchmark - overall](https://github.com/01-ai/Yi/blob/main/assets/img/Yi-9B_benchmark_overall.png?raw=true)
 
@@ -1178,7 +1178,7 @@ Yi-9B is almost the best among a range of similar-sized open-source models (incl
 <a href="#top">Back to top ⬆️ </a> ]
 </p>
 
-#
+# Who can use Yi?
 
 Everyone! 🙌 ✅
 
@@ -1190,7 +1190,7 @@ Everyone! 🙌 ✅
 <a href="#top">Back to top ⬆️ </a> ]
 </p>
 
-#
+# Misc.
 
 ### Acknowledgments
 
@@ -1202,7 +1202,7 @@ A heartfelt thank you to each of you who have made contributions to the Yi commu
 <a href="#top">Back to top ⬆️ </a> ]
 </p>
 
-###
+### Disclaimer
 
 We use data compliance checking algorithms during the training process, to
 ensure the compliance of the trained model to the best of our ability. Due to
@@ -1217,7 +1217,7 @@ as well as any associated data security concerns.
 <a href="#top">Back to top ⬆️ </a> ]
 </p>
 
-###
+### License
 
 The source code in this repo is licensed under the [Apache 2.0
 license](https://github.com/01-ai/Yi/blob/main/LICENSE). The Yi series models are fully open for academic research and free for commercial use, with automatic permission granted upon application. All usage must adhere to the [Yi Series Models Community License Agreement 2.1](https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMENT.txt).