# Model Card for GPT-NeoX-2.7B-Vietnamese-pretrained

This model is pretrained on Vietnamese-language data. It is based on GPT-NeoX, a large language model architecture developed by EleutherAI.

## Model Details

### Training Data

- Pre-training: CulturaX Vietnamese dataset (450 GB) + AI-Hub Vietnamese dataset (1.3 GB) + crawled Vietnamese Wikipedia dataset (630 MB) + viwik18 dataset (1.27 GB)

### Training Hardware

Trained on an A100 40GB GPU with a 48-core CPU; training took about 17 hours to reach 80,000 steps.

### Hyperparameters

| Hyperparameter      | Value                             |
|---------------------|-----------------------------------|
| n_parameters        | 2,670,182,400                     |
| n_layers            | 32                                |
| d_model             | 2560                              |
| n_heads             | 32                                |
| d_head              | 128                               |
| n_vocab             | 60,000                            |
| Sequence length     | 2048                              |
| Learning rate       | 0.00016                           |
| Positional encoding | Rotary Position Embedding (RoPE)  |
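As a sanity check, the reported parameter count can be reproduced from the hyperparameters above with the usual back-of-the-envelope transformer estimate (roughly 12·d_model² weights per layer plus the token-embedding matrix; biases and layer norms are ignored in this approximation):

```python
# Rough parameter-count estimate from the listed hyperparameters.
# Per layer: ~4 * d_model^2 for attention + ~8 * d_model^2 for the MLP.
# Plus the token embedding matrix of shape (n_vocab, d_model).
d_model = 2560
n_layers = 32
n_vocab = 60000

per_layer = 12 * d_model ** 2          # attention + MLP weights per layer
total = n_layers * per_layer + n_vocab * d_model
print(total)  # 2670182400 — matches the reported n_parameters
```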

## How to use

The model can be loaded with the `AutoModelForCausalLM` class:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
```
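Once loaded, text can be generated with the standard `transformers` `generate` API. The sketch below is illustrative: the Vietnamese prompt and the sampling settings (`top_p`, `temperature`, `max_new_tokens`) are arbitrary choices, not values recommended by the model authors.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")

prompt = "Hà Nội là thủ đô của"  # illustrative prompt: "Hanoi is the capital of"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=50,   # illustrative sampling settings
        do_sample=True,
        top_p=0.9,
        temperature=0.8,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```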