Genesist-8B-EarlyPrototype-0.4

This is an early prototype of the Genesist-8B model, fine-tuned from Llama-3-8B-Instruct with Supervised Fine-Tuning (SFT). It is designed to better understand and follow instructions written in Indonesian.

Model Details

Base Model: Llama-3-8B-Instruct
Fine-tuning Method: Supervised Fine-Tuning (SFT)
Training Data: Approximately 45 million tokens of Indonesian instruction data, curated to improve instruction following.
Languages: Indonesian (id), English (en)
License: Meta Llama 3 Community License (llama3)

Training Hyperparameters

max_seq_length: 16385
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
warmup_steps: 5
num_train_epochs: 1
learning_rate: 5e-5
logging_steps: 1
optim: "adamw_8bit"
weight_decay: 0.01
lr_scheduler_type: "linear"
seed: 3407
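
As a rough illustration of what these settings imply, the sketch below computes the effective batch size and the learning rate at a given optimizer step under linear warmup followed by linear decay. The device count and the total number of optimizer steps are not stated in this card, so NUM_DEVICES and TOTAL_STEPS are hypothetical placeholders, and the helper functions are written for illustration only; they are not the training code used for this model.

```python
# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 5
LEARNING_RATE = 5e-5

# Hypothetical placeholders: neither value is given in this card.
NUM_DEVICES = 1
TOTAL_STEPS = 100


def effective_batch_size(per_device, grad_accum, num_devices=1):
    """Number of sequences contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    batch = effective_batch_size(
        PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
    )
    print("effective batch size:", batch)
    for step in (0, 1, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        lr = linear_lr(step, LEARNING_RATE, WARMUP_STEPS, TOTAL_STEPS)
        print(f"step {step:>3}: lr = {lr:.2e}")
```

With a single device, each optimizer step sees 2 x 4 = 8 sequences. Because warmup_steps is only 5, the learning rate reaches its peak of 5e-5 almost immediately and spends nearly the whole epoch in the decay phase.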