---
datasets:
- TEAMGORANI/gorani-100k
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
- model = TEAMGORANI/gorani-100k-llama2-13b-instruct
- dataset_name = TEAMGORANI/gorani-100k
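A minimal usage sketch, assuming the checkpoint and dataset are consumed through the standard `transformers` / `datasets` APIs; the instruction-style prompt below is an assumption, not an official template from this card.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "TEAMGORANI/gorani-100k-llama2-13b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)   # resolves to LlamaTokenizerFast
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("### Instruction:\nExplain what instruction tuning is.\n\n### Response:\n",
                max_new_tokens=128)[0]["generated_text"])

# The training corpus referenced above
dataset = load_dataset("TEAMGORANI/gorani-100k", split="train")
```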
| Training Process | Status |
|---|---|
| Tokenizer Used | LlamaTokenizerFast |
| Training Progress | Epoch 3.15 / 16 |
| Step | 19740 / 100000 |
| Google Colab Resource Usage | 150 tokens used |
| System Information | Used | Total |
|---|---|---|
| System RAM | 5.8 GB | 83.5 GB |
| GPU RAM | 26.6 GB | 40.0 GB |
| Disk | 74.0 GB | 166.8 GB |
| Basic Training Settings | Value |
|---|---|
| local_rank | -1 |
| per_device_train_batch_size | 4 |
| per_device_eval_batch_size | 1 |
| gradient_accumulation_steps | 4 |
| learning_rate | 2e-4 |
| max_grad_norm | 0.3 |
| weight_decay | 0.001 |
| max_seq_length | 2048 |
| num_train_epochs | 1 |
| max_steps | 100000 |
| warmup_ratio | 0.03 |
| save_steps | 500000 |
| logging_steps | 10000 |
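A hedged sketch of how these values map onto `transformers.TrainingArguments`; `output_dir` is an assumption, and `max_seq_length` belongs to the SFT trainer rather than to `TrainingArguments`.

```python
from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="./results",          # assumption: not specified on the card
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # effective batch size 4 * 4 = 16 per device
    learning_rate=2e-4,
    max_grad_norm=0.3,
    weight_decay=0.001,
    num_train_epochs=1,
    max_steps=100000,                # when > 0, max_steps overrides num_train_epochs
    warmup_ratio=0.03,
    save_steps=500000,
    logging_steps=10000,
)
# local_rank=-1 means single-process training (no torch.distributed launcher);
# max_seq_length=2048 is passed to the SFT trainer (see below), not here.
```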
| 4-bit Precision Settings | Value |
|---|---|
| use_4bit | True |
| use_nested_quant | False |
| bnb_4bit_compute_dtype | "bfloat16" |
| bnb_4bit_quant_type | "nf4" |
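Assuming these settings were applied through `transformers`' `BitsAndBytesConfig` (the standard QLoRA path), they map roughly as follows.

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # use_4bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype for the dequantized matmuls
    bnb_4bit_use_double_quant=False,        # use_nested_quant
)
```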
| LoRA Settings | Value |
|---|---|
| lora_alpha | 16 |
| lora_dropout | 0.1 |
| lora_r | 64 |
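A hedged mapping onto `peft`'s `LoraConfig`; `bias`, `task_type`, and `target_modules` are assumptions, since the card does not list them.

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=64,               # lora_r: rank of the update matrices
    lora_alpha=16,      # scaling factor
    lora_dropout=0.1,
    bias="none",                 # assumption: common default in QLoRA recipes
    task_type="CAUSAL_LM",       # assumption
    # target_modules are not listed on the card; for Llama-2 they are often
    # ["q_proj", "k_proj", "v_proj", "o_proj"] (assumption).
)
```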
| Advanced Training Flags | Value |
|---|---|
| fp16 | False |
| bf16 | False |
| packing | False |
| gradient_checkpointing | True |
| optim | "paged_adamw_32bit" |
| lr_scheduler_type | "constant" |
| group_by_length | True |
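These flags extend the same `TrainingArguments` shown earlier (`fp16`, `bf16`, `gradient_checkpointing`, `optim`, `lr_scheduler_type`, `group_by_length` are all `TrainingArguments` keywords), while `packing` and `max_seq_length` go to the trainer. A sketch of the final assembly, assuming the widely used `trl` `SFTTrainer` QLoRA recipe and reusing the objects from the sketches above; the dataset column name is an assumption.

```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,                  # 4-bit base model (see GPU configuration below)
    train_dataset=dataset,        # TEAMGORANI/gorani-100k, loaded in the first sketch
    peft_config=peft_config,      # LoRA settings from the sketch above
    dataset_text_field="text",    # assumption: text column name in the dataset
    max_seq_length=2048,
    tokenizer=tokenizer,
    args=training_arguments,      # TrainingArguments with the flags listed above
    packing=False,
)
trainer.train()
```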
| GPU Configuration | Value |
|---|---|
| device_map | {"": 0} |