tainc committed on
Commit
af197ab
1 Parent(s): b73ee8d

Update README.md

Files changed (1)
  1. README.md +35 -35
README.md CHANGED
@@ -7,10 +7,10 @@ language:
 - vi
 license: llama3
 ---
-# Llama3 8B SEA-LIONv2
+# Llama3 8B CPT SEA-LIONv2

 SEA-LION is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
-This is the card for the Llama3 8B SEA-LIONv2 base model which has undergone continued pre-training from the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.
+This is the card for the Llama3 8B CPT SEA-LIONv2 base model which has undergone continued pre-training from the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.

 SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.

@@ -19,7 +19,7 @@ SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.

 ### Model Description

-The continued pre-training data for Llama3 8B SEA-LIONv2 base model encompasses approximately 48B tokens.
+The continued pre-training data for Llama3 8B CPT SEA-LIONv2 base model encompasses approximately 48B tokens.

 - **Developed by:** Products Pillar, AI Singapore
 - **Funded by:** Singapore NRF
@@ -30,7 +30,7 @@ The continued pre-training data for Llama3 8B SEA-LIONv2 base model encompasses
 For tokenization, the model employs the default tokenizer used in Meta-Llama-3-8B-Instruct.

 ### Benchmark Performance
-We evaluated Llama3 8B SEA-LIONv2 base model on general language capabilities.
+We evaluated Llama3 8B CPT SEA-LIONv2 base model on general language capabilities.

 #### General Language Capabilities
 For the evaluation of general language capabilities in SEA languages, we employed the [BHASA evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
@@ -60,7 +60,7 @@ We also evaluated the model on English capabilities using tasks from the Open LL

 ### Data

-Llama3 8B SEA-LIONv2 base model was continued pre-trained on 48B tokens of the following data:
+Llama3 8B CPT SEA-LIONv2 base model was continued pre-trained on 48B tokens of the following data:

 | Data Source | Unique Tokens (B) | Multiplier | Total Tokens (B) | Percentage (%) |
 |---------------------------|:-----------------:|:----------:|:----------------:|:--------------:|
@@ -87,10 +87,10 @@ Note:

 ### Infrastructure

-Llama3 8B SEA-LIONv2 was trained using [MosaicML Composer](https://github.com/mosaicml/composer)
+Llama3 8B CPT SEA-LIONv2 was trained using [MosaicML Composer](https://github.com/mosaicml/composer)
 on the following hardware:

-| Training Details | Llama3 8B SEA-LIONv2 |
+| Training Details | Llama3 8B CPT SEA-LIONv2 |
 |----------------------|:--------------------:|
 | AWS EC2 p5d.24xlarge | 8 instances |
 | Nvidia H100 80GB GPU | 64 |
@@ -99,7 +99,7 @@ on the following hardware:

 ### Configuration

-| HyperParameter | Llama3 8B SEA-LIONv2 |
+| HyperParameter | Llama3 8B CPT SEA-LIONv2 |
 |-------------------|:--------------------:|
 | Precision | bfloat16 |
 | Optimizer | decoupled_adamw |
@@ -111,33 +111,33 @@ on the following hardware:

 ## The Team

-Brandon Ong<br>
-Bryan Siow<br>
-Esther Choa<br>
-Huang Yuli<br>
-Lee Chwan Ren<br>
-Leong Wai Yi<br>
-Leong Wei Qi<br>
-Li Yier<br>
-Liu Bing Jie Darius<br>
-Lovenia Holy<br>
-Montalan Jann Railey<br>
-Ng Boon Cheong Raymond<br>
-Ngui Jian Gang<br>
-Nguyen Thanh Ngan<br>
-Nicholas Cheng<br>
-Ong Tat-Wee David<br>
-Ong Zhi Hao<br>
-Rengarajan Hamsawardhini<br>
-Susanto Yosephine<br>
-Tai Ngee Chia<br>
-Tan Choon Meng<br>
-Teo Eng Sipp Leslie<br>
-Teo Wei Yi<br>
-Tjhi William<br>
-Walter Teng<br>
-Wayne Lau<br>
-Yeo Yeow Tong<br>
+Choa Esther<br>
+Cheng Nicholas<br>
+Huang Yuli<br>
+Lau Wayne<br>
+Lee Chwan Ren<br>
+Leong Wai Yi<br>
+Leong Wei Qi<br>
+Li Yier<br>
+Liu Bing Jie Darius<br>
+Lovenia Holy<br>
+Montalan Jann Railey<br>
+Ng Boon Cheong Raymond<br>
+Ngui Jian Gang<br>
+Nguyen Thanh Ngan<br>
+Ong Brandon<br>
+Ong Tat-Wee David<br>
+Ong Zhi Hao<br>
+Rengarajan Hamsawardhini<br>
+Siow Bryan<br>
+Susanto Yosephine<br>
+Tai Ngee Chia<br>
+Tan Choon Meng<br>
+Teo Eng Sipp Leslie<br>
+Teo Wei Yi<br>
+Tjhi William<br>
+Teng Walter<br>
+Yeo Yeow Tong<br>
 Yong Xianbin<br>


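The card's tokenization note (the model reuses the default Meta-Llama-3-8B-Instruct tokenizer) implies the checkpoint loads like any other Llama 3 model with Hugging Face `transformers`. Below is a minimal sketch, not part of the commit itself; the repository id is a placeholder, since the diff does not name the final Hugging Face repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the actual Llama3 8B CPT SEA-LIONv2 repository.
repo_id = "aisingapore/llama3-8b-cpt-sea-lionv2-base"

# Per the card, the tokenizer is the default Meta-Llama-3-8B-Instruct tokenizer,
# so loading it from the model repo returns the same vocabulary and special tokens.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# This is a base (continued pre-trained) model, so prompt it for plain text
# completion rather than with a chat template.
inputs = tokenizer("Ibu kota Indonesia adalah", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```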