Update README.md
Commit 8561bba, committed by RaymondAISG
1 Parent(s): d016c32
README.md CHANGED
@@ -10,7 +10,7 @@ language:
 # LLaMA3 8B SEA-LIONv2

 SEA-LION is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
-This model is continued pre-trained from the
+This model is continued pre-trained from the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.
 This is the card for the LLaMA3 8B SEA-LIONv2 base model.

 SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.
@@ -31,7 +31,7 @@ The continued pre-training data for LLaMA3 8B SEA-LIONv2 base model encompasses
 - **Funded by:** Singapore NRF
 - **Model type:** Decoder
 - **Languages:** English, Indonesian, Thai, Vietnamese, Tamil
-- **License:** LLaMA3 Community License
+- **License:** [LLaMA3 Community License](https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENSE)

 ### Performance Benchmarks

@@ -68,7 +68,7 @@ LLaMA3 8B SEA-LIONv2 base model was continued pre-trained on 48B tokens of the f
 Note:
 - All token counts are counted using LLaMA3 tokenizer
 - wiki* sources includes Wikipedia, Wiki Books, Wiki Source and Wiki Voyage
-- Source of Tamil news is source with permission from (
+- Source of Tamil news is source with permission from [Seithi](https://seithi.mediacorp.sg/)

 ### Infrastructure
