waiyiaisg committed
Commit 9f28f9b
1 Parent(s): e261300

update english base results

Files changed (1)
  1. README.md +22 -5
README.md CHANGED
@@ -33,13 +33,30 @@ The continued pre-training data for LLaMA3 8B SEA-LIONv2 base model encompasses
  - **Languages:** English, Indonesian, Thai, Vietnamese, Tamil
  - **License:** [LLaMA3 Community License](https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENSE)

- ### Performance Benchmarks
-
- LLaMA3 8B SEA-LIONv has a similar English performance with LLaMA3-8B-Base model:
-
- | Model                | ARC   | BBH   | HellaSwag | MMLU  | GSM8k  | Average |
- |----------------------|:-----:|:-----:|:---------:|:-----:|:------:|:-------:|
- | LLaMA3 8B SEA-LIONv2 | 58.87 | 47.70 | 81.14     | 63.11 | 50.49  | 60.26   |
-
+ ### Benchmark Performance
+ We evaluated the LLaMA3 8B SEA-LIONv2 base model on general language capabilities.
+
+ #### General Language Capabilities
+ For the evaluation of general language capabilities, we employed the [BHASA evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
+ These tasks include Question Answering (QA), Sentiment Analysis (Sentiment), Toxicity Detection (Toxicity), Translation in both directions (Eng>Lang & Lang>Eng), Abstractive Summarization (Summ), Causal Reasoning (Causal) and Natural Language Inference (NLI).
+
+ The evaluation was done **five-shot** with native prompts, and only a sample of 100-1000 instances per dataset was used, as per the setting described in the paper.
+
+ **BHASA**
+
+ **English**
+
+ | Model                                    | ARC   | BBH   | HellaSwag | MMLU  | GSM8k | Average |
+ | ---------------------------------------- | ----- | ----- | --------- | ----- | ----- | ------- |
+ | aisingapore/llama3-8b-cpt-sealionv2-base | 58.87 | 47.70 | 81.14     | 63.11 | 50.49 | 60.26   |
+ | google/gemma-2-9b                        | 68.00 | 53.53 | 82.73     | 70.26 | 63.53 | 67.61   |
+ | meta-llama/Meta-Llama-3-8B               | 57.85 | 46.09 | 81.89     | 65.10 | 45.34 | 59.25   |
+ | Qwen/Qwen2-7B                            | 61.86 | 53.10 | 80.63     | 70.45 | 78.09 | 68.83   |
+ | Sail/Sailor-7B                           | 50.34 | 35.65 | 76.11     | 52.80 | 33.81 | 49.74   |
+ | mistralai/Mistral-7B-v0.3                | 59.56 | 44.89 | 82.97     | 62.36 | 33.36 | 56.63   |

  ## Training Details
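The five-shot, sampled protocol described in the added lines can be sketched as follows. This is a minimal illustration only: the Q/A prompt format, the toy exemplar data, and the 1000-instance cap are assumptions, not the actual BHASA evaluation harness.

```python
import random

def build_five_shot_prompt(exemplars, query, n_shots=5):
    """Prepend n_shots solved Q/A exemplars to the test query.

    The Q:/A: format is an illustrative stand-in for BHASA's native prompts.
    """
    lines = [f"Q: {q}\nA: {a}" for q, a in exemplars[:n_shots]]
    lines.append(f"Q: {query}\nA:")  # unanswered query for the model to complete
    return "\n\n".join(lines)

def sample_eval_instances(dataset, cap=1000, seed=0):
    """Evaluate on a fixed-seed sample of each dataset.

    The paper's setting uses only 100-1000 instances per dataset; `cap` here
    is a hypothetical upper bound, not a value taken from the harness.
    """
    if len(dataset) <= cap:
        return list(dataset)
    return random.Random(seed).sample(dataset, cap)

# Toy stand-in data (hypothetical):
exemplars = [(f"question {i}", f"answer {i}") for i in range(5)]
prompt = build_five_shot_prompt(exemplars, "What is the capital of Vietnam?")
print(prompt.count("A:"))  # 6: five exemplar answers plus the unanswered query
```

The fixed seed keeps the sampled subset identical across models, so per-dataset scores stay comparable between the rows of the table.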
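The Average column of the new English table is the arithmetic mean of the five benchmark scores, rounded to two decimals. A quick check, with the values copied from the diff above:

```python
# Verify the Average column: mean of ARC, BBH, HellaSwag, MMLU and GSM8k.
scores = {
    "aisingapore/llama3-8b-cpt-sealionv2-base": [58.87, 47.70, 81.14, 63.11, 50.49],
    "google/gemma-2-9b":                        [68.00, 53.53, 82.73, 70.26, 63.53],
    "meta-llama/Meta-Llama-3-8B":               [57.85, 46.09, 81.89, 65.10, 45.34],
    "Qwen/Qwen2-7B":                            [61.86, 53.10, 80.63, 70.45, 78.09],
    "Sail/Sailor-7B":                           [50.34, 35.65, 76.11, 52.80, 33.81],
    "mistralai/Mistral-7B-v0.3":                [59.56, 44.89, 82.97, 62.36, 33.36],
}
averages = {model: round(sum(v) / len(v), 2) for model, v in scores.items()}
# Matches the table's Average column, e.g. 60.26 for the SEA-LIONv2 base model.
```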