eunyounglee committed
Commit fccfcee
1 Parent(s): a788b47

Update README.md

Files changed (1)
  1. README.md +9 -15
README.md CHANGED
@@ -8,36 +8,30 @@ Config file: 2.7B
 ---
 # Model Card for Model ID
 
-This model is pretrained and fine-tuned with a Vietnamese dataset, based on GPT-NeoX, a large language model developed by EleutherAI.
-A GPT-NeoX model pretrained on a 450GB+ Vietnamese dataset and fine-tuned with a 12MB Vietnamese question-and-answer dataset. Trained on an A100 40GB GPU and a 48-core CPU; training took 18 hours to reach 10 epochs.
+This model is pretrained and fine-tuned on the Vietnamese language, based on GPT-NeoX, a large language model developed by EleutherAI.
+
 
 ## Model Details
 
 ### Training Data
 - **Pre-train:**
-Vietnamese CulturaX Dataset (450GB) + Project (1.3GB) + Crawled Vietnamese Wikipedia (630MB) + viwik18 (1.27GB)
+CulturaX Vietnamese Dataset (450GB) + AI-Hub Vietnamese Dataset (1.3GB) + Crawled Vietnamese Wikipedia Dataset (630MB) + viwik18 Dataset (1.27GB)
 - **Fine-tuning:**
 12MB Vietnamese Question & Answer dataset
 Vietnamese Alpaca (16,412 rows) + Vietnamese QA Dataset based on viwik18 (14,293 rows)
 
 ### Training Hardware
-- **Developed by:** Deeploading
-- **Model type:** GPT-NeoX
-- **Language(s) (NLP):** Vietnamese
+Trained on an A100 40GB GPU and a 48-core CPU. Training took 18 hours to reach 10 epochs.
 
 <figure style="width:30em">
 
 | Hyperparameter         | Value       |
 | ---------------------- | ----------- |
-| n<sub>parameters</sub> | 2670182400  |
-| n<sub>layers</sub>     | 32          |
-| d<sub>model</sub>      | 2560        |
-| n<sub>heads</sub>      | 32          |
-| d<sub>head</sub>       | 128         |
-| n<sub>vocab</sub>      | 60000       |
-| Sequence Length        | 2048        |
-| Learning Rate          | 0.00016     |
-| Positional Encoding    | [Rotary Position Embedding (RoPE)](https://arxiv.org/abs/2104.09864) |
+| num_train_epochs       | 10          |
+| train_batch_size       | 2           |
+| learning_rate          | 0.0001      |
+| warmup_steps           | 1000        |
+| weight_decay           | 0           |
 </figure>
 
 ### How to use
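
The fine-tuning hyperparameters in the updated table map directly onto the `transformers.TrainingArguments` API. A minimal sketch of that configuration, assuming the standard Hugging Face `Trainer` workflow (the `output_dir` is a placeholder, and `num_train_epochs=10` follows the "10 epochs" noted under Training Hardware):

```python
from transformers import TrainingArguments

# Sketch of the fine-tuning configuration from the hyperparameter table.
training_args = TrainingArguments(
    output_dir="gpt-neox-vi-qa",    # placeholder output path
    num_train_epochs=10,            # "took 18 hours to reach 10 epochs"
    per_device_train_batch_size=2,  # train_batch_size
    learning_rate=1e-4,             # learning_rate = 0.0001
    warmup_steps=1000,              # warmup_steps
    weight_decay=0.0,               # weight_decay
)
```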
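
Under "How to use", a GPT-NeoX checkpoint like this one loads with the standard `transformers` causal-LM pattern. A minimal sketch, assuming a hypothetical Hub repo id for this model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; substitute the actual Hub id of this checkpoint.
model_id = "eunyounglee/gpt-neox-vietnamese-2.7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short Vietnamese continuation from a prompt.
inputs = tokenizer("Xin chào, tôi là", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```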