Genesist-8B-EarlyPrototype-0.4
This is an early prototype of Genesist-8B, fine-tuned from Llama-3-8B-Instruct with Supervised Fine-Tuning (SFT). It is designed to better understand and follow instructions written in Indonesian.
Model Details
- Base Model: Llama-3-8B-Instruct
- Fine-tuning Method: Supervised Fine-Tuning (SFT)
- Training Data: Approximately 45 million tokens of instruction data in Indonesian, specifically curated to improve the model's ability to follow instructions.
- Languages: Indonesian (id), English (en)
- License: Llama3
Training Hyperparameters
- max_seq_length: 16385
- per_device_train_batch_size: 2
- gradient_accumulation_steps: 4
- warmup_steps: 5
- num_train_epochs: 1
- learning_rate: 5e-5
- logging_steps: 1
- optim: "adamw_8bit"
- weight_decay: 0.01
- lr_scheduler_type: "linear"
- seed: 3407
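
To make the interaction between these settings concrete, here is a minimal sketch in plain Python. It copies the values listed above into a dictionary, computes the effective batch size per optimizer step, and reproduces the shape of a linear schedule with warmup. The helper names (`effective_batch_size`, `linear_lr`), the `num_devices` default of 1, and the `total_steps` value of 100 are assumptions made for illustration; the card does not state the number of devices or the total number of training steps.

```python
# Hyperparameters copied from the list above.
hparams = {
    "max_seq_length": 16385,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "warmup_steps": 5,
    "num_train_epochs": 1,
    "learning_rate": 5e-5,
    "logging_steps": 1,
    "optim": "adamw_8bit",
    "weight_decay": 0.01,
    "lr_scheduler_type": "linear",
    "seed": 3407,
}


def effective_batch_size(hp, num_devices=1):
    """Sequences consumed per optimizer step.

    num_devices is an assumption; the card does not say how many were used.
    """
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, hp):
    """Learning rate at a given step for linear warmup, then linear decay to 0.

    total_steps is hypothetical; it depends on dataset size and packing.
    """
    peak = hp["learning_rate"]
    warmup = hp["warmup_steps"]
    if step < warmup:
        # Ramp up from 0 to the peak learning rate over the warmup steps.
        return peak * step / max(1, warmup)
    # Decay linearly from the peak to 0 over the remaining steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup)
    return peak * max(0.0, remaining)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(hparams))
    for s in (0, 1, 5, 50, 100):
        print(s, linear_lr(s, total_steps=100, hp=hparams))
```

With a per-device batch size of 2 and 4 gradient accumulation steps, each optimizer step sees 8 sequences on a single device. The keys match field names used by Hugging Face `TrainingArguments`, but the exact training script is not given in this card.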