01-ai/Yi-6B-200K

@@ -119,29 +119,29 @@ pipeline_tag: text-generation
 - 🙌 Targeted as a bilingual language model and trained on a 3T multilingual corpus, the Yi series models have become some of the strongest LLMs worldwide, showing promise in language understanding, commonsense reasoning, reading comprehension, and more. For example,
-  - For English language capability, the Yi series models ranked 2nd (just behind GPT-4), outperforming other LLMs (such as LLaMA2-chat-70B, Claude 2, and ChatGPT) on the [AlpacaEval Leaderboard](https://tatsu-lab.github.io/alpaca_eval/) in Dec 2023.
   - For Chinese language capability, the Yi series models landed in 2nd place (following GPT-4), surpassing other LLMs (such as Baidu ERNIE, Qwen, and Baichuan) on the [SuperCLUE](https://www.superclueai.com/) in Oct 2023.
-  - 🙏 (Credits to LLaMA) Thanks to the Transformer and LLaMA open-source communities, which reduce the effort required to build from scratch and enable the use of the same tools within the AI ecosystem.
-  <details style="display: inline;"><summary> If you're interested in Yi's adoption of LLaMA architecture and license usage policy, see  <span style="color:  green;">Yi's relation with LLaMA.</span> ⬇️</summary> <ul> <br>
 > 💡 TL;DR
 >
-> The Yi series models adopt the same model architecture as LLaMA but are **NOT** derivatives of LLaMA.
-- Both Yi and LLaMA are based on the Transformer structure, which has been the standard architecture for large language models since 2018.
-- Grounded in the Transformer architecture, LLaMA has become a new cornerstone for the majority of state-of-the-art open-source models due to its excellent stability, reliable convergence, and robust compatibility. This positions LLaMA as the recognized foundational framework for models including Yi.
-- Thanks to the Transformer and LLaMA architectures, other models can leverage their power, reducing the effort required to build from scratch and enabling the utilization of the same tools within their ecosystems.
-- However, the Yi series models are NOT derivatives of LLaMA, as they do not use LLaMA's weights.
-  - As LLaMA's structure is employed by the majority of open-source models, the key factors determining model performance are training datasets, training pipelines, and training infrastructure.
-  - Developing in a unique and proprietary way, Yi has independently created its own high-quality training datasets, efficient training pipelines, and robust training infrastructure entirely from the ground up. This effort has led to excellent performance, with Yi series models ranking just behind GPT-4 and surpassing LLaMA on the [Alpaca Leaderboard in Dec 2023](https://tatsu-lab.github.io/alpaca_eval/).
 </ul>
 </details>
@@ -302,7 +302,7 @@ If you prefer to deploy Yi models locally,
     - [Docker](#quick-start---docker)
     - [conda-lock](#quick-start---conda-lock)
-  - 🙋‍♀️ and you have **limited** resources (for example, a MacBook Pro), you can use [llama.cpp](#quick-start---llamacpp)
 #### 🎯 Not to deploy Yi locally
@@ -336,6 +336,10 @@ If you want to chat with Yi with more customizable options (e.g., system prompt,
 - [Yi-34B-Chat](https://platform.lingyiwanwu.com/) (Yi official beta)
   - Access is available through a whitelist. Welcome to apply (fill out a form in [English](https://cn.mikecrm.com/l91ODJf) or [Chinese](https://cn.mikecrm.com/gnEZjiQ)).
 ### Quick start - pip
 This tutorial guides you through every step of running **Yi-34B-Chat locally on an A800 (80G)** and then performing inference.
@@ -344,7 +348,7 @@ This tutorial guides you through every step of running **Yi-34B-Chat locally on
 - Make sure Python 3.10 or a later version is installed.
-- If you want to run other Yi models, see [software and hardware requirements](#deployment)
 #### Step 1: Prepare your environment
@@ -435,6 +439,10 @@ Then you can see an output similar to the one below. 🥳
 </details>
 ### Quick start - Docker
 <details>
 <summary> Run Yi-34B-chat locally with Docker: a step-by-step guide. ⬇️</summary>
@@ -474,6 +482,7 @@ To install the dependencies, follow these steps:
 2. Execute <code>micromamba install -y -n yi -f conda-lock.yml</code> to create a conda environment named <code>yi</code> and install the necessary dependencies.
 </details>
 ### Quick start - llama.cpp
 <details>
 <summary> Run Yi-chat-6B-2bits locally with llama.cpp: a step-by-step guide. ⬇️</summary>
@@ -610,6 +619,10 @@ Now you have successfully asked a question to the Yi model and got an answer!
 </ul>
 </details>
 ### Web demo
 You can build a web UI demo for Yi **chat** models (note that Yi base models are not supported in this scenario).
@@ -628,6 +641,10 @@ You can access the web UI by entering the address provided in the console into y
  ![Quick start - web demo](https://github.com/01-ai/Yi/blob/main/assets/img/yi_34b_chat_web_demo.gif?raw=true)
 ### Fine-tuning
 ```bash
@@ -749,10 +766,14 @@ cd finetune/scripts
 bash run_eval.sh
 ```
-Then you'll see the answer from both the base model and the finetuned model
 </ul>
 </details>
 ### Quantization
 #### GPT-Q
@@ -906,6 +927,10 @@ Below are detailed minimum VRAM requirements under different batch use cases.
 | Yi-34B               | 72 GB         | 4 x RTX 4090 <br> A800 (80 GB)               |
 | Yi-34B-200K          | 200 GB        | 4 x A800 (80 GB)                        |
 ### Learning hub
 <details>
@@ -923,18 +948,22 @@ At the same time, we also warmly invite you to join our collaborative effort by
 With all these resources at your fingertips, you're ready to start your exciting journey with Yi. Happy learning! 🥳
 #### Tutorials
 | Type        | Deliverable                                            |      Date      |     Author     |
 |-------------|--------------------------------------------------------|----------------|----------------|
 | Blog        | [实测零一万物Yi-VL多模态语言模型：能准确“识图吃瓜”](https://mp.weixin.qq.com/s/fu4O9XvJ03JhimsEyI-SsQ)              |  2024-02-02  |  [苏洋](https://github.com/soulteary)  |
 | Blog        | [本地运行零一万物 34B 大模型，使用 Llama.cpp & 21G 显存](https://zhuanlan.zhihu.com/p/668921042)                  |  2023-11-26  |  [苏洋](https://github.com/soulteary)  |
-| Blog        | [Running Yi-34B-Chat locally using LlamaEdge](https://www.secondstate.io/articles/yi-34b/)                   |  2023-11-30  |  [Second State](https://github.com/second-state)  |
 | Blog        | [零一万物模型折腾笔记：官方 Yi-34B 模型基础使用](https://zhuanlan.zhihu.com/p/671387298)                           | 2023-12-10 |  [苏洋](https://github.com/soulteary)  |
 | Blog        | [CPU 混合推理，非常见大模型量化方案：“二三五六” 位量化方案](https://zhuanlan.zhihu.com/p/671698216)                  | 2023-12-12 |  [苏洋](https://github.com/soulteary)  |
-| Blog        | [单卡 3 小时训练 Yi-6B 大模型 Agent：基于 LLaMA Factory 实战](https://zhuanlan.zhihu.com/p/678989191)             | 2024-01-22 | [郑耀威](https://github.com/hiyouga) |
 | Blog        | [零一万物开源Yi-VL多模态大模型，魔搭社区推理&微调最佳实践来啦！](https://zhuanlan.zhihu.com/p/680098411) | 2024-01-26  |  [ModelScope](https://github.com/modelscope)  |
-| Video       | [只需 24G 显存，用 vllm 跑起来 Yi-34B 中英双语大模型](https://www.bilibili.com/video/BV17t4y1f7Ee/)               | 2023-12-28 |  漆妮妮  |
-| Video       | [Install Yi 34B Locally - Chinese English Bilingual LLM](https://www.youtube.com/watch?v=CVQvj4Wrh4w&t=476s) | 2023-11-05  |  Fahd Mirza  |
 </details>
@@ -964,9 +993,9 @@ Yi has a comprehensive ecosystem, offering a range of tools, services, and model
 ### 💦 Upstream
-The Yi series models follow the same model architecture as LLaMA. By choosing Yi, you can leverage existing tools, libraries, and resources within the LLaMA ecosystem, eliminating the need to create new tools and enhancing development efficiency.
-For example, the Yi series models are saved in the format of the LLaMA model. You can directly use `LLaMAForCausalLM` and `LLaMATokenizer` to load the model. For more information, see [Use the chat model](#31-use-the-chat-model).
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -976,6 +1005,10 @@ tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-34b", use_fast=False)
 model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-34b", device_map="auto")
 ```
 ### 🌊 Downstream
 > 💡 Tip
@@ -1072,6 +1105,10 @@ The Yi-34B and Yi-34B-200K models stand out as the top performers among open-sou
 - **Falcon-180B caveat**: Falcon-180B was not tested on QuAC and OBQA due to technical constraints. Its performance score is an average from other tasks, and considering the generally lower scores of these two tasks, Falcon-180B's capabilities are likely not underestimated.
 </details>
 # 🟢 Who can use Yi?
 Everyone! 🙌 ✅

 - 🙌 Targeted as a bilingual language model and trained on a 3T multilingual corpus, the Yi series models have become some of the strongest LLMs worldwide, showing promise in language understanding, commonsense reasoning, reading comprehension, and more. For example,
+  - For English language capability, the Yi series models ranked 2nd (just behind GPT-4), outperforming other LLMs (such as Llama2-chat-70B, Claude 2, and ChatGPT) on the [AlpacaEval Leaderboard](https://tatsu-lab.github.io/alpaca_eval/) in Dec 2023.
   - For Chinese language capability, the Yi series models landed in 2nd place (following GPT-4), surpassing other LLMs (such as Baidu ERNIE, Qwen, and Baichuan) on the [SuperCLUE](https://www.superclueai.com/) in Oct 2023.
+  - 🙏 (Credits to Llama) Thanks to the Transformer and Llama open-source communities, which reduce the effort required to build from scratch and enable the use of the same tools within the AI ecosystem.
+  <details style="display: inline;"><summary> If you're interested in Yi's adoption of Llama architecture and license usage policy, see  <span style="color:  green;">Yi's relation with Llama.</span> ⬇️</summary> <ul> <br>
 > 💡 TL;DR
 >
+> The Yi series models adopt the same model architecture as Llama but are **NOT** derivatives of Llama.
+- Both Yi and Llama are based on the Transformer structure, which has been the standard architecture for large language models since 2018.
+- Grounded in the Transformer architecture, Llama has become a new cornerstone for the majority of state-of-the-art open-source models due to its excellent stability, reliable convergence, and robust compatibility. This positions Llama as the recognized foundational framework for models including Yi.
+- Thanks to the Transformer and Llama architectures, other models can leverage their power, reducing the effort required to build from scratch and enabling the utilization of the same tools within their ecosystems.
+- However, the Yi series models are NOT derivatives of Llama, as they do not use Llama's weights.
+  - As Llama's structure is employed by the majority of open-source models, the key factors determining model performance are training datasets, training pipelines, and training infrastructure.
+  - Developing in a unique and proprietary way, Yi has independently created its own high-quality training datasets, efficient training pipelines, and robust training infrastructure entirely from the ground up. This effort has led to excellent performance, with Yi series models ranking just behind GPT-4 and surpassing Llama on the [Alpaca Leaderboard in Dec 2023](https://tatsu-lab.github.io/alpaca_eval/).
 </ul>
 </details>
     - [Docker](#quick-start---docker)
     - [conda-lock](#quick-start---conda-lock)
+  - 🙋‍♀️ and you have **limited** resources (for example, a MacBook Pro), you can use [llama.cpp](#quick-start---llamacpp).
 #### 🎯 Not to deploy Yi locally
 - [Yi-34B-Chat](https://platform.lingyiwanwu.com/) (Yi official beta)
   - Access is available through a whitelist. Welcome to apply (fill out a form in [English](https://cn.mikecrm.com/l91ODJf) or [Chinese](https://cn.mikecrm.com/gnEZjiQ)).
+<p align="right"> [
+  <a href="#top">Back to top ⬆️ </a>  ]
+</p>
 ### Quick start - pip
 This tutorial guides you through every step of running **Yi-34B-Chat locally on an A800 (80G)** and then performing inference.
 - Make sure Python 3.10 or a later version is installed.
+- If you want to run other Yi models, see [software and hardware requirements](#deployment).
 #### Step 1: Prepare your environment
 </details>
+<p align="right"> [
+  <a href="#top">Back to top ⬆️ </a>  ]
+</p>
 ### Quick start - Docker
 <details>
 <summary> Run Yi-34B-chat locally with Docker: a step-by-step guide. ⬇️</summary>
 2. Execute <code>micromamba install -y -n yi -f conda-lock.yml</code> to create a conda environment named <code>yi</code> and install the necessary dependencies.
 </details>
 ### Quick start - llama.cpp
 <details>
 <summary> Run Yi-chat-6B-2bits locally with llama.cpp: a step-by-step guide. ⬇️</summary>
 </ul>
 </details>
+<p align="right"> [
+  <a href="#top">Back to top ⬆️ </a>  ]
+</p>
 ### Web demo
 You can build a web UI demo for Yi **chat** models (note that Yi base models are not supported in this scenario).
  ![Quick start - web demo](https://github.com/01-ai/Yi/blob/main/assets/img/yi_34b_chat_web_demo.gif?raw=true)
+<p align="right"> [
+  <a href="#top">Back to top ⬆️ </a>  ]
+</p>
 ### Fine-tuning
 ```bash
 bash run_eval.sh
 ```
+Then you'll see the answer from both the base model and the finetuned model.
 </ul>
 </details>
+<p align="right"> [
+  <a href="#top">Back to top ⬆️ </a>  ]
+</p>
 ### Quantization
 #### GPT-Q
 | Yi-34B               | 72 GB         | 4 x RTX 4090 <br> A800 (80 GB)               |
 | Yi-34B-200K          | 200 GB        | 4 x A800 (80 GB)                        |
+<p align="right"> [
+  <a href="#top">Back to top ⬆️ </a>  ]
+</p>
 ### Learning hub
 <details>
 With all these resources at your fingertips, you're ready to start your exciting journey with Yi. Happy learning! 🥳
 #### Tutorials
+##### Chinese tutorials
 | Type        | Deliverable                                            |      Date      |     Author     |
 |-------------|--------------------------------------------------------|----------------|----------------|
 | Blog        | [实测零一万物Yi-VL多模态语言模型：能准确“识图吃瓜”](https://mp.weixin.qq.com/s/fu4O9XvJ03JhimsEyI-SsQ)              |  2024-02-02  |  [苏洋](https://github.com/soulteary)  |
 | Blog        | [本地运行零一万物 34B 大模型，使用 Llama.cpp & 21G 显存](https://zhuanlan.zhihu.com/p/668921042)                  |  2023-11-26  |  [苏洋](https://github.com/soulteary)  |
 | Blog        | [零一万物模型折腾笔记：官方 Yi-34B 模型基础使用](https://zhuanlan.zhihu.com/p/671387298)                           | 2023-12-10 |  [苏洋](https://github.com/soulteary)  |
 | Blog        | [CPU 混合推理，非常见大模型量化方案：“二三五六” 位量化方案](https://zhuanlan.zhihu.com/p/671698216)                  | 2023-12-12 |  [苏洋](https://github.com/soulteary)  |
+| Blog        | [单卡 3 小时训练 Yi-6B 大模型 Agent：基于 Llama Factory 实战](https://zhuanlan.zhihu.com/p/678989191)             | 2024-01-22 | [郑耀威](https://github.com/hiyouga) |
 | Blog        | [零一万物开源Yi-VL多模态大模型，魔搭社区推理&微调最佳实践来啦！](https://zhuanlan.zhihu.com/p/680098411) | 2024-01-26  |  [ModelScope](https://github.com/modelscope)  |
+| Video       | [只需 24G 显存，用 vllm 跑起来 Yi-34B 中英双语大模型](https://www.bilibili.com/video/BV17t4y1f7Ee/)               | 2023-12-28 |  [漆妮妮](https://space.bilibili.com/1262370256)  |
+##### English tutorials
+| Type        | Deliverable                                            |      Date      |     Author     |
+|-------------|--------------------------------------------------------|----------------|----------------|
+| Blog        | [Running Yi-34B-Chat locally using LlamaEdge](https://www.secondstate.io/articles/yi-34b/)                   |  2023-11-30  |  [Second State](https://github.com/second-state)  |
+| Video       | [Install Yi 34B Locally - Chinese English Bilingual LLM](https://www.youtube.com/watch?v=CVQvj4Wrh4w&t=476s) | 2023-11-05  |  [Fahd Mirza](https://www.youtube.com/watch?v=CVQvj4Wrh4w&t=476s)  |
 </details>
 ### 💦 Upstream
+The Yi series models follow the same model architecture as Llama. By choosing Yi, you can leverage existing tools, libraries, and resources within the Llama ecosystem, eliminating the need to create new tools and enhancing development efficiency.
+For example, the Yi series models are saved in the format of the Llama model. You can directly use `LlamaForCausalLM` and `LlamaTokenizer` to load the model. For more information, see [Use the chat model](#31-use-the-chat-model).
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-34b", device_map="auto")
 ```
+<p align="right"> [
+  <a href="#top">Back to top ⬆️ </a>  ]
+</p>
 ### 🌊 Downstream
 > 💡 Tip
 - **Falcon-180B caveat**: Falcon-180B was not tested on QuAC and OBQA due to technical constraints. Its performance score is an average from other tasks, and considering the generally lower scores of these two tasks, Falcon-180B's capabilities are likely not underestimated.
 </details>
+<p align="right"> [
+  <a href="#top">Back to top ⬆️ </a>  ]
+</p>
 # 🟢 Who can use Yi?
 Everyone! 🙌 ✅