weiqipedia committed
Commit: b4220f3
Parent(s): 0685c07
Update README.md

README.md CHANGED
@@ -10,8 +10,7 @@ license: llama3
 # LLaMA3 8B SEA-LIONv2
 
 SEA-LION is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
-This model
-This is the card for the LLaMA3 8B SEA-LIONv2 base model.
+This is the card for the LLaMA3 8B SEA-LIONv2 base model which has undergone continued pre-training from the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.
 
 SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.
 
@@ -20,11 +19,6 @@ SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.
 
 ### Model Description
 
-The LLaMA3 8B SEA-LIONv model is a significant leap forward in the field of Natural Language Processing,
-specifically trained to understand the SEA regional context.
-
-For tokenization, the model employs the default tokenizer used in Meta-Llama-3-8B-Instruct.
-
 The continued pre-training data for LLaMA3 8B SEA-LIONv2 base model encompasses approximately 48B tokens.
 
 - **Developed by:** Products Pillar, AI Singapore
@@ -33,11 +27,13 @@ The continued pre-training data for LLaMA3 8B SEA-LIONv2 base model encompasses
 - **Languages:** English, Indonesian, Thai, Vietnamese, Tamil
 - **License:** [LLaMA3 Community License](https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENSE)
 
+For tokenization, the model employs the default tokenizer used in Meta-Llama-3-8B-Instruct.
+
 ### Benchmark Performance
 We evaluated LLaMA3 8B SEA-LIONv2 base model on general language capabilities.
 
 #### General Language Capabilities
-For the evaluation of general language capabilities, we employed the [BHASA evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
+For the evaluation of general language capabilities in SEA languages, we employed the [BHASA evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
 These tasks include Question Answering (QA), Sentiment Analysis (Sentiment), Toxicity Detection (Toxicity), Translation in both directions (Eng>Lang & Lang>Eng), Abstractive Summarization (Summ), Causal Reasoning (Causal) and Natural Language Inference (NLI).
 
 The evaluation was done **five-shot** with native prompts and only a sample of 100-1000 instances for each dataset was used as per the setting described in the paper.
@@ -46,6 +42,8 @@ The evaluation was done **five-shot** with native prompts and only a sample of 1
 
 To be released soon
 
+We also evaluated the model on English capabilities using tasks from the Open LLM Leaderboard.
+
 **English**
 
 | Model | ARC | BBH | HellaSwag | MMLU | GSM8k | Average |
@@ -85,7 +83,7 @@ LLaMA3 8B SEA-LIONv2 base model was continued pre-trained on 48B tokens of the f
 Note:
 - All token counts are counted using LLaMA3 tokenizer
 - wiki* sources includes Wikipedia, Wiki Books, Wiki Source and Wiki Voyage
--
+- Tamil news is sourced with permission from [Seithi](https://seithi.mediacorp.sg/)
 
 ### Infrastructure
 
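As context for the tokenizer and token-count notes added in the card above, here is a minimal sketch of how they can be exercised with the Hugging Face transformers library. The repo id `aisingapore/llama3-8b-cpt-sea-lionv2-base` is an assumption for illustration (the card does not state a repo id), and access to LLaMA3-based weights on the Hub is gated by the LLaMA3 Community License.

```python
# Minimal sketch; the repo id is an assumed placeholder, not taken from the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "aisingapore/llama3-8b-cpt-sea-lionv2-base"  # assumed repo id

# The card states the model keeps the default Meta-Llama-3-8B-Instruct tokenizer,
# so loading the tokenizer from the model repo should yield the LLaMA3 tokenizer.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Token counts in the data table are computed with the LLaMA3 tokenizer; the same
# counting can be reproduced for any text snippet:
text = "SEA-LION stands for Southeast Asian Languages In One Network."
n_tokens = len(tokenizer(text, add_special_tokens=False)["input_ids"])
print(f"{n_tokens} LLaMA3 tokens")

# Loading and sampling from the base model (license acceptance required on the Hub):
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since the card describes a base (not instruction-tuned) checkpoint, the sketch uses plain text continuation rather than a chat template.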