cArlIcon committed
Commit bf79513
1 Parent(s): e2890c2

update README

Files changed (1)
  1. README.md +8 -5
README.md CHANGED
@@ -13,12 +13,14 @@ license_link: LICENSE
 
 The **Yi** series models are large language models trained from scratch by
 developers at [01.AI](https://01.ai/). The first public release contains two
-bilingual(English/Chinese) base models with the parameter sizes of 6B and 34B.
-Both of them are trained with 4K sequence length and can be extended to 32K
-during inference time.
+bilingual(English/Chinese) base models with the parameter sizes of 6B(`Yi-6B`)
+and 34B(`Yi-34B`). Both of them are trained with 4K sequence length and can be
+extended to 32K during inference time. The `Yi-6B-200K` and `Yi-34B-200K` are
+base model with 200K context length.
 
 ## News
 
+- 🎯 **2023/11/06**: The base model of `Yi-6B-200K` and `Yi-34B-200K` with 200K context length.
 - 🎯 **2023/11/02**: The base model of `Yi-6B` and `Yi-34B`.
 
 
@@ -36,8 +38,9 @@ during inference time.
 | Aquila-34B | 67.8 | 71.4 | 63.1 | - | - | - | - | - |
 | Falcon-180B | 70.4 | 58.0 | 57.8 | 59.0 | 54.0 | 77.3 | 68.8 | 34.0 |
 | Yi-6B | 63.2 | 75.5 | 72.0 | 72.2 | 42.8 | 72.3 | 68.7 | 19.8 |
-| **Yi-34B** | **76.3** | **83.7** | **81.4** | **82.8** | **54.3** | **80.1** | **76.4** | 37.1 |
-
+| Yi-6B-200K | 64.0 | 75.3 | 73.5 | 73.9 | 42.0 | 72.0 | 69.1 | 19.0 |
+| **Yi-34B** | **76.3** | **83.7** | 81.4 | 82.8 | **54.3** | **80.1** | **76.4** | 37.1 |
+| Yi-34B-200K | 76.1 | 83.6 | **81.9** | **83.4** | 52.7 | 79.7 | **76.6** | 36.3 |
 
 While benchmarking open-source models, we have observed a disparity between the
 results generated by our pipeline and those reported in public sources (e.g.