Model Card for GPT-NeoX-2.7B-Vietnamese-pretrained
This model is pretrained on Vietnamese text. It is based on GPT-NeoX, a large language model architecture developed by EleutherAI.
Model Details
Training Data
- Pre-training: CulturaX Vietnamese Dataset (450GB) + AI-Hub Vietnamese Dataset (1.3GB) + Crawled Vietnamese Wikipedia Dataset (630MB) + viwik18 Dataset (1.27GB)
Training Hardware
The model was trained on A100 40GB GPU hardware with a 48-core CPU. Training took about 17 hours to reach 80,000 steps.
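The figures above imply an average step time, which can be worked out with a few lines of arithmetic. This is a rough sketch: the helper name `seconds_per_step` is made up for illustration, and since the card gives the duration only as "about 17 hours", the result is approximate.

```python
def seconds_per_step(total_hours: float, total_steps: int) -> float:
    """Average wall-clock seconds spent per training step."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return total_hours * 3600.0 / total_steps


# Figures taken from the card; the 17-hour duration is approximate.
avg = seconds_per_step(17, 80_000)
print(f"{avg:.3f} s/step, about {3600 / avg:.0f} steps/hour")
```

The card does not state the batch size, so token throughput cannot be derived from these numbers alone.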
Hyperparameters
| Hyperparameter | Value |
|---|---|
| n_parameters | 2,670,182,400 |
| n_layers | 32 |
| d_model | 2560 |
| n_heads | 32 |
| d_head | 128 |
| n_vocab | 60,000 |
| Sequence Length | 2048 |
| Learning Rate | 0.00016 |
| Positional Encoding | Rotary Position Embedding (RoPE) |
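The parameter count in the table above gives a quick estimate of how much memory the weights alone occupy at a given precision. The sketch below is illustrative only: the helper `weight_bytes` and the dtype table are assumptions introduced here, the card does not say which precision the checkpoint is stored in, and the estimate excludes activations, the attention cache, and optimizer state.

```python
# Bytes used by one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_bytes(n_parameters: int, dtype: str = "float16") -> int:
    """Memory needed to hold the weights only, in bytes."""
    return n_parameters * BYTES_PER_PARAM[dtype]


N_PARAMETERS = 2_670_182_400  # parameter count from the hyperparameter table

for dtype in ("float32", "float16"):
    gib = weight_bytes(N_PARAMETERS, dtype) / 2**30
    print(f"{dtype}: {gib:.2f} GiB")
```

At either precision the weights fit within the 40GB of a single A100, though inference and especially training need additional headroom.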
How to use
The model can be loaded with the AutoModelForCausalLM class:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
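Once loaded, the model can be used for text generation. The following is a minimal sketch, not a recommended configuration: the helper names `generation_kwargs` and `generate` are hypothetical, and the sampling values (`max_new_tokens=50`, `temperature=0.7`, `top_p=0.9`) are illustrative assumptions that do not come from this model card. It assumes `torch` and `transformers` are installed.

```python
MODEL_ID = "eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained"


def generation_kwargs(max_new_tokens=50, temperature=0.7, top_p=0.9):
    """Sampling settings; the defaults are illustrative, not tuned for this model."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate(prompt, **overrides):
    """Load the model and return the prompt plus a sampled continuation."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    kwargs = generation_kwargs()
    kwargs.update(overrides)

    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            **kwargs,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `print(generate("Việt Nam là"))` downloads the checkpoint on first use and prints a sampled continuation. Because this is a pretrained (not instruction-tuned) model, it continues text rather than following instructions.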