Commit 6b5c971 (parent: c158d24)
Update README.md
README.md
CHANGED
@@ -3,24 +3,44 @@ language:

Before (lines 3-26):

- vie
pipeline_tag: text-generation

Trained:
Config file: 2.7B
Data: Vietnamese Dataset 450GB (CulturaX) + Project (1.3B) + Crawled Vietnamese Wikipedia (630MB) + viwik18 (1.27GB)
---
# Model Card for Model ID

Pretrained GPT-NeoX model with 450GB+ Vietnamese dataset. Took about 17 hours to reach 80,000 iterations. Trained on A100 40GB GPU and 48 core CPU.

## Model Details

###
After (lines 3-46):

- vie
pipeline_tag: text-generation

Trained: Pre-train
Config file: 2.7B
---
# Model Card for Model ID

This model is pretrained on Vietnamese text. It is based on GPT-NeoX, a large language model developed by EleutherAI.

## Model Details

### Training Data
- **Pre-train:** CulturaX Vietnamese dataset (450GB) + AI-Hub Vietnamese dataset (1.3GB) + crawled Vietnamese Wikipedia dataset (630MB) + viwik18 dataset (1.27GB)

### Training Hardware
Trained on an A100 40GB GPU and a 48-core CPU. Training took about 17 hours to reach 80,000 steps.
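For scale, those figures imply a training rate of roughly 1.3 steps per second; a quick back-of-the-envelope check (assuming a steady rate, which the card does not state):

```python
# Throughput implied by the reported numbers (assumes a constant step rate).
steps = 80_000
hours = 17
print(steps / (hours * 3600))  # ~1.31 steps per second
```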
### Hyperparameters

<figure style="width:30em">

| Hyperparameter         | Value      |
| ---------------------- | ---------- |
| n<sub>parameters</sub> | 2670182400 |
| n<sub>layers</sub>     | 32         |
| d<sub>model</sub>      | 2560       |
| n<sub>heads</sub>      | 32         |
| d<sub>head</sub>       | 128        |
| n<sub>vocab</sub>      | 60000      |
| Sequence Length        | 2048       |
| Learning Rate          | 0.00016    |
| Positional Encoding    | [Rotary Position Embedding (RoPE)](https://arxiv.org/abs/2104.09864) |

</figure>
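The architecture values in the table should also be visible on the checkpoint's configuration. A minimal sketch of how to verify them, assuming the Hub config exposes the standard `GPTNeoXConfig` fields in `transformers`:

```python
from transformers import AutoConfig

# Load only the config (no weights) and compare against the table above.
config = AutoConfig.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
print(config.num_hidden_layers)        # n_layers, expected 32
print(config.hidden_size)              # d_model, expected 2560
print(config.num_attention_heads)      # n_heads, expected 32
print(config.vocab_size)               # n_vocab, expected 60000
print(config.max_position_embeddings)  # sequence length, expected 2048
```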
### How to use

The model can be loaded using the `AutoModelForCausalLM` functionality:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
```
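Continuing from the loading snippet above, the model can then be used for Vietnamese text generation. A minimal sketch; the prompt and sampling settings here are illustrative assumptions, not values from the model card:

```python
# Illustrative only: the prompt and sampling settings are assumptions.
inputs = tokenizer("Hà Nội là thủ đô của", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # avoids a warning when no pad token is set
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```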