Commit 6b5c971 (parent: c158d24)
Update README.md
README.md
CHANGED
@@ -3,24 +3,44 @@ language:

Before (lines 3-26):

- vie
pipeline_tag: text-generation

Trained:
Config file: 2.7B
Data: Vietnamese Dataset 450GB (CulturaX) + Project (1.3B) + Crawled Vietnamese Wikipedia (630MB) + viwik18 (1.27GB)
---
# Model Card for Model ID

Pretrained GPT-NeoX model with 450GB+ Vietnamese dataset. Took about 17 hours to reach 80,000 iterations. Trained on A100 40GB GPU and 48 core CPU.

## Model Details

###
After (lines 3-46):

- vie
pipeline_tag: text-generation

Trained: Pre-train
Config file: 2.7B
---
# Model Card for Model ID

This model is pretrained on Vietnamese text. It is based on GPT-NeoX, a large language model developed by EleutherAI.

## Model Details

### Training Data
- **Pre-train:** CulturaX Vietnamese dataset (450GB) + AI-Hub Vietnamese dataset (1.3GB) + crawled Vietnamese Wikipedia dataset (630MB) + viwik18 dataset (1.27GB)

### Training Hardware
Trained on an A100 40GB GPU and a 48-core CPU. Training took about 17 hours to reach 80,000 steps.
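For scale, those figures imply a training rate of roughly 1.3 steps per second; a quick back-of-the-envelope check (assuming a steady rate, which the card does not state):

```python
# Throughput implied by the reported numbers (assumes a constant step rate).
steps = 80_000
hours = 17
print(steps / (hours * 3600))  # ~1.31 steps per second
```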
### Hyperparameters

<figure style="width:30em">

| Hyperparameter         | Value      |
| ---------------------- | ---------- |
| n<sub>parameters</sub> | 2670182400 |
| n<sub>layers</sub>     | 32         |
| d<sub>model</sub>      | 2560       |
| n<sub>heads</sub>      | 32         |
| d<sub>head</sub>       | 128        |
| n<sub>vocab</sub>      | 60000      |
| Sequence Length        | 2048       |
| Learning Rate          | 0.00016    |
| Positional Encoding    | [Rotary Position Embedding (RoPE)](https://arxiv.org/abs/2104.09864) |

</figure>
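The architecture values in the table should also be visible on the checkpoint's configuration. A minimal sketch of how to verify them, assuming the Hub config exposes the standard `GPTNeoXConfig` fields in `transformers`:

```python
from transformers import AutoConfig

# Load only the config (no weights) and compare against the table above.
config = AutoConfig.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
print(config.num_hidden_layers)        # n_layers, expected 32
print(config.hidden_size)              # d_model, expected 2560
print(config.num_attention_heads)      # n_heads, expected 32
print(config.vocab_size)               # n_vocab, expected 60000
print(config.max_position_embeddings)  # sequence length, expected 2048
```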
### How to use

The model can be loaded using the `AutoModelForCausalLM` functionality:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
```
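Continuing from the loading snippet above, the model can then be used for Vietnamese text generation. A minimal sketch; the prompt and sampling settings here are illustrative assumptions, not values from the model card:

```python
# Illustrative only: the prompt and sampling settings are assumptions.
inputs = tokenizer("Hà Nội là thủ đô của", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # avoids a warning when no pad token is set
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```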