RaymondAISG committed
Commit b178608 • 1 Parent(s): dfd0674
Update README.md

README.md CHANGED
@@ -38,22 +38,7 @@ These tasks include Question Answering (QA), Sentiment Analysis (Sentiment), Tox

 The evaluation was done **five-shot** with native prompts and only a sample of 100-1000 instances for each dataset was used as per the setting described in the paper.

-
-
-To be released soon
-
-We also evaluated the model on English capabilities using tasks from the Open LLM Leaderboard.
-
-**English**
-
-| Model | ARC | BBH | HellaSwag | MMLU | GSM8k | Average |
-| ----------------------------------------- |:-----:|:-----:|:---------:|:-----:|:-----:|:-------:|
-| Qwen/Qwen2-7B | 61.86 | 53.10 | 80.63 | 70.45 | 78.09 | 68.83 |
-| aisingapore/llama3-8b-cpt-sea-lionv2-base | 58.87 | 47.70 | 81.14 | 63.11 | 50.49 | 60.26 |
-| meta-llama/Meta-Llama-3-8B | 57.85 | 46.09 | 81.89 | 65.10 | 45.34 | 59.25 |
-| mistralai/Mistral-7B-v0.3 | 59.56 | 44.89 | 82.97 | 62.36 | 33.36 | 56.63 |
-| Sail/Sailor-7B | 50.34 | 35.65 | 76.11 | 52.80 | 33.81 | 49.74 |
-

 ## Training Details

@@ -69,7 +54,7 @@ Llama3 8B CPT SEA-LIONv2 base model was continued pre-trained on 48B tokens of t
 | Dolma Semantic Scholar | 0.959 | 1 | 2.9 | 2.79 |
 | Dolma arXiv | 0.469 | 1 | 5.3 | 1.99 |
 | Dolma StarCoder | 4.422 | 1 | 4.9 | 0.98 |
-| SEA-LION Pile - Indonesian| 3.4 |
 | Wiki* - Indonesian | 0.3 | 4 | 1.2 | 2.50 |
 | SEA-LION Pile - Tamil | 5.6 | 1 | 5.6 | 11.67 |
 | Wiki* + News - Tamil | 0.6 | 4 | 2.4 | 5.00 |
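The five-shot setting described in the evaluation line above can be sketched roughly as follows. All names here (`subsample`, `build_five_shot_prompt`, the Q/A template) are illustrative stand-ins, not the actual SEA-LION evaluation harness, which is outside this diff:

```python
import random

def subsample(dataset, k=1000, seed=0):
    """Score at most k instances per dataset (the README caps this at 100-1000)."""
    rng = random.Random(seed)
    return rng.sample(dataset, min(k, len(dataset)))

def build_five_shot_prompt(exemplars, question, template="Q: {q}\nA: {a}"):
    """Join five solved exemplars, then append the unanswered test question."""
    assert len(exemplars) == 5
    demos = "\n\n".join(template.format(q=e["q"], a=e["a"]) for e in exemplars)
    return demos + "\n\nQ: " + question + "\nA:"
```

In a real harness the template would be the dataset's "native prompt" in the target language rather than this generic Q/A form.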

 The evaluation was done **five-shot** with native prompts and only a sample of 100-1000 instances for each dataset was used as per the setting described in the paper.

+Please refer to the [SEA HELM](https://leaderboard.sea-lion.ai/) leaderboard for the evaluation scores.

 ## Training Details

 | Dolma Semantic Scholar | 0.959 | 1 | 2.9 | 2.79 |
 | Dolma arXiv | 0.469 | 1 | 5.3 | 1.99 |
 | Dolma StarCoder | 4.422 | 1 | 4.9 | 0.98 |
+| SEA-LION Pile - Indonesian| 3.4 | 2 | 6.8 | 14.17 |
 | Wiki* - Indonesian | 0.3 | 4 | 1.2 | 2.50 |
 | SEA-LION Pile - Tamil | 5.6 | 1 | 5.6 | 11.67 |
 | Wiki* + News - Tamil | 0.6 | 4 | 2.4 | 5.00 |
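Under one possible reading of the mixture table's unlabeled columns — base tokens (B), repetition count, total tokens (B), and share of the 48B-token continued pre-training budget mentioned in the hunk header — the corrected SEA-LION Pile Indonesian row and the other SEA-language rows are internally consistent. This column reading is an assumption, since the table's header row is outside this diff:

```python
# Sanity-check the SEA-language rows of the mixture table, assuming the
# columns are: base tokens (B), repetitions, total tokens (B), % of 48B budget.
BUDGET_B = 48.0  # 48B tokens, per the README sentence quoted in the hunk header

rows = [
    ("SEA-LION Pile - Indonesian", 3.4, 2, 6.8, 14.17),  # the corrected row
    ("Wiki* - Indonesian", 0.3, 4, 1.2, 2.50),
    ("SEA-LION Pile - Tamil", 5.6, 1, 5.6, 11.67),
    ("Wiki* + News - Tamil", 0.6, 4, 2.4, 5.00),
]
for name, base, mult, total, pct in rows:
    assert abs(base * mult - total) < 1e-9, name
    assert round(total / BUDGET_B * 100, 2) == pct, name
```

The Dolma rows do not fit this reading, so their columns may carry different units (e.g. raw size rather than tokens); the full table header would resolve that.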