
Llama-2-7b-Ukrainian

Model Details

Model Description

Llama-2-7b-Ukrainian is a bilingual pre-trained model supporting Ukrainian and English. It was continually pre-trained from Llama-2-7b on 5B tokens, consisting of 75% Ukrainian documents and 25% English documents from CulturaX.

Paper: To Err Is Human, but Llamas Can Learn It Too (arXiv:2403.05493)
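
Usage

A minimal usage sketch (not part of the original card), assuming the checkpoint loads with the standard Hugging Face transformers causal-LM classes; bf16 matches the training precision listed below:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tartuNLP/Llama-2-7b-Ukrainian"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load in bfloat16, the precision the model was trained in.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Ukrainian for "The capital of Ukraine"; the model continues the prompt.
prompt = "Столиця України"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```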

Training Hyperparameters

| Hyperparameter | Value |
|----------------|-------|
| Training steps | 19,080 |
| Batch size | 256 |
| Weight decay | 0.1 |
| Context length | 1024 |
| Learning rate | 2e-5, linear decay to 2e-6 |
| Precision | bf16 |
| Optimizer | AdamW |
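
As a rough illustration (a sketch, not the authors' training code), the schedule in the table can be reproduced with PyTorch's built-in LinearLR scheduler, which scales the learning rate from 2e-5 (factor 1.0) down to 2e-6 (factor 0.1) over the 19,080 steps:

```python
import torch

# Placeholder parameters; in real training these would be the model weights.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=2e-5, weight_decay=0.1)

# Linear decay from 2e-5 to 2e-6 over the full training run.
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1.0, end_factor=0.1, total_iters=19080
)

for step in range(19080):
    # ... forward pass, loss.backward() would go here ...
    optimizer.step()
    scheduler.step()
```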

Citation

BibTeX:

```bibtex
@article{luhtaru2024err,
  title={To Err Is Human, but Llamas Can Learn It Too},
  author={Luhtaru, Agnes and Purason, Taido and Vainikko, Martin and Del, Maksym and Fishel, Mark},
  journal={arXiv preprint arXiv:2403.05493},
  year={2024}
}
```