Update README.md
README.md
CHANGED
@@ -52,22 +52,6 @@ and is comparable with Mistral-7B-Instruct-v0.1 on MMLU and MT-Bench in English.
 | Mistral-7B-v0.1 | 33.01 | 42.23 | 35.86 | 37.63 |
 
 
-## Inference Performance
-
-In this test, we use the first 1500 characters of one of the 201802最高法院民事裁判書 (Supreme Court civil judgments, February 2018) as input and ask the model to rewrite the article.
-`max_new_tokens` is set to 1000 (except for Qwen/Qwen-7B, which is set to 400). All models were run with `vllm` on two A6000 GPUs (TP=2).
-
-| Models                          | Speed (char/sec) | Estimated Max Input Length (TC chars) |
-|---------------------------------|------------------|---------------------------------------|
-| Yi-6B                           | 62.08            | 4.4k                                  |
-| **Breeze-7B-Base-v0.1**         | 59.57            | 10.1k                                 |
-| Qwen-7B                         | 55.00            | 9.7k                                  |
-| Qwen-14B                        | 51.12            | 9.7k                                  |
-| Mistral-7B-v0.1                 | 45.31            | 6.3k                                  |
-| Taiwan-LLM-13B-v2.0-base        | 19.61            | 2.6k                                  |
-| Taiwan-LLM-7B-v2.1-base         | 16.23            | 2.6k                                  |
-| Yi-34B                          | 15.18            | 4.4k                                  |
-
-
 ## Chat Model Performance
 
 | Models | | TMMLU+ (ACC) | TMMLU+ (ACC) | DRCD (EM) | MT-Bench-tw (Score) | MMLU (ACC) | MMLU (ACC) | MT-Bench (Score) |
@@ -100,6 +84,23 @@ The max_new_tokens is set to 1000 (except Qwen/Qwen-7B, which is set to 400). Al
 | Taiwan-LLM-7B-v2.1-chat | 25.58 | 31.76 | 27.36 | 27.61 |
 
 
+## Inference Performance
+
+In this test, we use the first 1500 characters of one of the 201802最高法院民事裁判書 (Supreme Court civil judgments, February 2018) as input and ask the model to rewrite the article.
+`max_new_tokens` is set to 1000 (except for Qwen/Qwen-7B, which is set to 400). All models were run with `vllm` on two A6000 GPUs (TP=2).
+
+| Models                          | Speed (char/sec) | Estimated Max Input Length (TC chars) |
+|---------------------------------|------------------|---------------------------------------|
+| Yi-6B                           | 62.08            | 4.4k                                  |
+| **Breeze-7B-Base-v0.1**         | 59.57            | 10.1k                                 |
+| **Breeze-7B-Instruct-64k-v0.1** |                  |                                       |
+| Qwen-7B                         | 55.00            | 9.7k                                  |
+| Qwen-14B                        | 51.12            | 9.7k                                  |
+| Mistral-7B-v0.1                 | 45.31            | 6.3k                                  |
+| Taiwan-LLM-13B-v2.0-base        | 19.61            | 2.6k                                  |
+| Taiwan-LLM-7B-v2.1-base         | 16.23            | 2.6k                                  |
+| Yi-34B                          | 15.18            | 4.4k                                  |
+
+
 ## Examples
 
 
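The "Speed (char/sec)" column in the Inference Performance table is simply generated characters divided by wall-clock decoding time. A minimal sketch of that arithmetic is below; the helper name `chars_per_second` and the placeholder numbers are assumptions for illustration, not the authors' actual benchmark script (which generated text via `vllm` with `max_new_tokens=1000` and tensor parallelism of 2).

```python
def chars_per_second(generated_text: str, elapsed_seconds: float) -> float:
    """Decoding speed: generated characters per wall-clock second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return len(generated_text) / elapsed_seconds

# In the real benchmark the text would come from a timed vllm generation run;
# here we only illustrate the metric with placeholder values.
output = "字" * 1000   # pretend the model produced 1000 Traditional Chinese characters
elapsed = 16.1          # pretend decoding took 16.1 seconds
speed = round(chars_per_second(output, elapsed), 2)
```

With these placeholder values the helper reports roughly 62 char/sec, the same order as the fastest rows in the table.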