RaymondAISG committed
Commit b178608 • 1 Parent(s): dfd0674
Update README.md

README.md CHANGED
@@ -38,22 +38,7 @@ These tasks include Question Answering (QA), Sentiment Analysis (Sentiment), Tox

 The evaluation was done **five-shot** with native prompts and only a sample of 100-1000 instances for each dataset was used as per the setting described in the paper.

-
-
-To be released soon
-
-We also evaluated the model on English capabilities using tasks from the Open LLM Leaderboard.
-
-**English**
-
-| Model | ARC | BBH | HellaSwag | MMLU | GSM8k | Average |
-| ----------------------------------------- |:-----:|:-----:|:---------:|:-----:|:-----:|:-------:|
-| Qwen/Qwen2-7B | 61.86 | 53.10 | 80.63 | 70.45 | 78.09 | 68.83 |
-| aisingapore/llama3-8b-cpt-sea-lionv2-base | 58.87 | 47.70 | 81.14 | 63.11 | 50.49 | 60.26 |
-| meta-llama/Meta-Llama-3-8B | 57.85 | 46.09 | 81.89 | 65.10 | 45.34 | 59.25 |
-| mistralai/Mistral-7B-v0.3 | 59.56 | 44.89 | 82.97 | 62.36 | 33.36 | 56.63 |
-| Sail/Sailor-7B | 50.34 | 35.65 | 76.11 | 52.80 | 33.81 | 49.74 |
-

 ## Training Details

@@ -69,7 +54,7 @@ Llama3 8B CPT SEA-LIONv2 base model was continued pre-trained on 48B tokens of t
 | Dolma Semantic Scholar | 0.959 | 1 | 2.9 | 2.79 |
 | Dolma arXiv | 0.469 | 1 | 5.3 | 1.99 |
 | Dolma StarCoder | 4.422 | 1 | 4.9 | 0.98 |
-| SEA-LION Pile - Indonesian| 3.4 |
 | Wiki* - Indonesian | 0.3 | 4 | 1.2 | 2.50 |
 | SEA-LION Pile - Tamil | 5.6 | 1 | 5.6 | 11.67 |
 | Wiki* + News - Tamil | 0.6 | 4 | 2.4 | 5.00 |
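The five-shot setting described in the evaluation line above can be sketched roughly as follows. All names here (`subsample`, `build_five_shot_prompt`, the Q/A template) are illustrative stand-ins, not the actual SEA-LION evaluation harness, which is outside this diff:

```python
import random

def subsample(dataset, k=1000, seed=0):
    """Score at most k instances per dataset (the README caps this at 100-1000)."""
    rng = random.Random(seed)
    return rng.sample(dataset, min(k, len(dataset)))

def build_five_shot_prompt(exemplars, question, template="Q: {q}\nA: {a}"):
    """Join five solved exemplars, then append the unanswered test question."""
    assert len(exemplars) == 5
    demos = "\n\n".join(template.format(q=e["q"], a=e["a"]) for e in exemplars)
    return demos + "\n\nQ: " + question + "\nA:"
```

In a real harness the template would be the dataset's "native prompt" in the target language rather than this generic Q/A form.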

 The evaluation was done **five-shot** with native prompts and only a sample of 100-1000 instances for each dataset was used as per the setting described in the paper.

+Please refer to the [SEA HELM](https://leaderboard.sea-lion.ai/) leaderboard for the evaluation scores.

 ## Training Details

 | Dolma Semantic Scholar | 0.959 | 1 | 2.9 | 2.79 |
 | Dolma arXiv | 0.469 | 1 | 5.3 | 1.99 |
 | Dolma StarCoder | 4.422 | 1 | 4.9 | 0.98 |
+| SEA-LION Pile - Indonesian| 3.4 | 2 | 6.8 | 14.17 |
 | Wiki* - Indonesian | 0.3 | 4 | 1.2 | 2.50 |
 | SEA-LION Pile - Tamil | 5.6 | 1 | 5.6 | 11.67 |
 | Wiki* + News - Tamil | 0.6 | 4 | 2.4 | 5.00 |
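Under one possible reading of the mixture table's unlabeled columns — base tokens (B), repetition count, total tokens (B), and share of the 48B-token continued pre-training budget mentioned in the hunk header — the corrected SEA-LION Pile Indonesian row and the other SEA-language rows are internally consistent. This column reading is an assumption, since the table's header row is outside this diff:

```python
# Sanity-check the SEA-language rows of the mixture table, assuming the
# columns are: base tokens (B), repetitions, total tokens (B), % of 48B budget.
BUDGET_B = 48.0  # 48B tokens, per the README sentence quoted in the hunk header

rows = [
    ("SEA-LION Pile - Indonesian", 3.4, 2, 6.8, 14.17),  # the corrected row
    ("Wiki* - Indonesian", 0.3, 4, 1.2, 2.50),
    ("SEA-LION Pile - Tamil", 5.6, 1, 5.6, 11.67),
    ("Wiki* + News - Tamil", 0.6, 4, 2.4, 5.00),
]
for name, base, mult, total, pct in rows:
    assert abs(base * mult - total) < 1e-9, name
    assert round(total / BUDGET_B * 100, 2) == pct, name
```

The Dolma rows do not fit this reading, so their columns may carry different units (e.g. raw size rather than tokens); the full table header would resolve that.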