Update README.md
README.md
CHANGED
@@ -7,10 +7,10 @@ language:
 - vi
 license: llama3
 ---
-# Llama3 8B SEA-LIONv2
+# Llama3 8B CPT SEA-LIONv2

 SEA-LION is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
-This is the card for the Llama3 8B SEA-LIONv2 base model which has undergone continued pre-training from the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.
+This is the card for the Llama3 8B CPT SEA-LIONv2 base model which has undergone continued pre-training from the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.

 SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.

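The retitled intro describes a base model continued-pre-trained from Meta-Llama-3-8B-Instruct. A minimal usage sketch follows; the repository id is an assumption (this diff never states it), and the call pattern is the standard transformers API for a base, non-chat checkpoint:

```python
# Minimal sketch: load the base model with Hugging Face transformers.
# The repository id below is an assumption; substitute the actual id
# from the model card if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aisingapore/llama3-8b-cpt-sea-lionv2-base"  # assumed id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the card's bfloat16 training precision
    device_map="auto",
)

# Base (non-chat) model: plain text completion, no chat template.
inputs = tokenizer("Singapore is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```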
@@ -19,7 +19,7 @@ SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.

 ### Model Description

-The continued pre-training data for Llama3 8B SEA-LIONv2 base model encompasses approximately 48B tokens.
+The continued pre-training data for Llama3 8B CPT SEA-LIONv2 base model encompasses approximately 48B tokens.

 - **Developed by:** Products Pillar, AI Singapore
 - **Funded by:** Singapore NRF
@@ -30,7 +30,7 @@ The continued pre-training data for Llama3 8B SEA-LIONv2 base model encompasses
 For tokenization, the model employs the default tokenizer used in Meta-Llama-3-8B-Instruct.

 ### Benchmark Performance
-We evaluated Llama3 8B SEA-LIONv2 base model on general language capabilities.
+We evaluated Llama3 8B CPT SEA-LIONv2 base model on general language capabilities.

 #### General Language Capabilities
 For the evaluation of general language capabilities in SEA languages, we employed the [BHASA evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
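The card states that the model keeps the default Meta-Llama-3-8B-Instruct tokenizer. A short sketch of loading and inspecting it, again assuming the repository id used above:

```python
# Sketch: the card says the default Meta-Llama-3-8B-Instruct tokenizer is
# reused, so loading it from the continued-pre-trained repo should yield
# the same vocabulary. The repo id is an assumption, not stated in this diff.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("aisingapore/llama3-8b-cpt-sea-lionv2-base")

# Tokenize a Vietnamese sentence ("vi" appears in the card's language tags).
ids = tok("Xin chào thế giới")["input_ids"]
print(ids)
print(tok.convert_ids_to_tokens(ids))
print(tok.vocab_size)  # Llama 3 uses a roughly 128k-entry vocabulary
```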
@@ -60,7 +60,7 @@ We also evaluated the model on English capabilities using tasks from the Open LL

 ### Data

-Llama3 8B SEA-LIONv2 base model was continued pre-trained on 48B tokens of the following data:
+Llama3 8B CPT SEA-LIONv2 base model was continued pre-trained on 48B tokens of the following data:

 | Data Source                | Unique Tokens (B) | Multiplier | Total Tokens (B) | Percentage (%) |
 |----------------------------|:-----------------:|:----------:|:----------------:|:--------------:|
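The table header implies a simple relationship: each source's total token count is its unique tokens times its sampling multiplier, and the percentages are shares of the roughly 48B-token sum. A sketch of that arithmetic with hypothetical rows (the real rows fall outside this hunk):

```python
# Sketch of the data-mixture arithmetic implied by the table header:
# Total Tokens = Unique Tokens x Multiplier; Percentage = share of the sum.
# The rows below are hypothetical placeholders, not values from the card.
sources = {
    "source_a": (10.0, 2),  # (unique tokens in B, multiplier)
    "source_b": (28.0, 1),
}

totals = {name: unique * mult for name, (unique, mult) in sources.items()}
grand_total = sum(totals.values())

for name, total in totals.items():
    print(f"{name}: {total:.1f}B tokens, {100 * total / grand_total:.1f}%")
```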
@@ -87,10 +87,10 @@ Note:

 ### Infrastructure

-Llama3 8B SEA-LIONv2 was trained using [MosaicML Composer](https://github.com/mosaicml/composer)
+Llama3 8B CPT SEA-LIONv2 was trained using [MosaicML Composer](https://github.com/mosaicml/composer)
 on the following hardware:

-| Training Details     | Llama3 8B SEA-LIONv2 |
+| Training Details     | Llama3 8B CPT SEA-LIONv2 |
 |----------------------|:--------------------:|
 | AWS EC2 p5d.24xlarge | 8 instances          |
 | Nvidia H100 80GB GPU | 64                   |
@@ -99,7 +99,7 @@ on the following hardware:

 ### Configuration

-| HyperParameter    | Llama3 8B SEA-LIONv2 |
+| HyperParameter    | Llama3 8B CPT SEA-LIONv2 |
 |-------------------|:--------------------:|
 | Precision         | bfloat16             |
 | Optimizer         | decoupled_adamw      |
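Taken together, the Infrastructure and Configuration sections describe a MosaicML Composer run in bfloat16 with Composer's decoupled AdamW optimizer. A hedged sketch of how such a run could be wired up; the dataloader, learning rate, and duration are placeholders, not values from the card:

```python
# Hedged sketch of a Composer continued-pre-training setup matching the
# card's stated choices (bfloat16 precision, decoupled AdamW). Learning
# rate, duration, and data are placeholders: none appear in this hunk.
import torch
from torch.utils.data import DataLoader
from composer import Trainer
from composer.models import HuggingFaceModel
from composer.optim import DecoupledAdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # stated starting checkpoint
hf_model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

model = HuggingFaceModel(hf_model, tokenizer=tokenizer)
optimizer = DecoupledAdamW(model.parameters(), lr=1e-5)  # lr is a placeholder

# Tiny stand-in dataset so the sketch is self-contained; real training
# would stream the 48B-token mixture described in the Data section.
batch = tokenizer(["placeholder text"], return_tensors="pt")
dataset = [
    {**{k: v[0] for k, v in batch.items()}, "labels": batch["input_ids"][0]}
]
train_dataloader = DataLoader(dataset, batch_size=1)

trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    optimizers=optimizer,
    max_duration="1ep",    # placeholder duration
    precision="amp_bf16",  # bfloat16 mixed precision, as in the card
    device="gpu",
)
trainer.fit()
```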
@@ -111,33 +111,33 @@ on the following hardware:

 ## The Team

-
-
-
-
-Lee Chwan Ren<br>
-Leong Wai Yi<br>
-Leong Wei Qi<br>
-Li Yier<br>
-Liu Bing Jie Darius<br>
-Lovenia Holy<br>
-Montalan Jann Railey<br>
-Ng Boon Cheong Raymond<br>
-Ngui Jian Gang<br>
-Nguyen Thanh Ngan<br>
-
-Ong Tat-Wee David<br>
-Ong Zhi Hao<br>
-Rengarajan Hamsawardhini<br>
-
-
-
-
-Teo
-
-
-
-Yeo Yeow Tong<br>
+Choa Esther<br>
+Cheng Nicholas<br>
+Huang Yuli<br>
+Lau Wayne<br>
+Lee Chwan Ren<br>
+Leong Wai Yi<br>
+Leong Wei Qi<br>
+Li Yier<br>
+Liu Bing Jie Darius<br>
+Lovenia Holy<br>
+Montalan Jann Railey<br>
+Ng Boon Cheong Raymond<br>
+Ngui Jian Gang<br>
+Nguyen Thanh Ngan<br>
+Ong Brandon<br>
+Ong Tat-Wee David<br>
+Ong Zhi Hao<br>
+Rengarajan Hamsawardhini<br>
+Siow Bryan<br>
+Susanto Yosephine<br>
+Tai Ngee Chia<br>
+Tan Choon Meng<br>
+Teo Eng Sipp Leslie<br>
+Teo Wei Yi<br>
+Tjhi William<br>
+Teng Walter<br>
+Yeo Yeow Tong<br>
 Yong Xianbin<br>
