# Llama-2-7b-chat-hf-finetune_90_10_MIX_gold
This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.2993
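If the reported loss is the usual token-level cross-entropy, it corresponds to an evaluation perplexity of roughly exp(1.2993) ≈ 3.67.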
## Model description
More information needed
## Intended uses & limitations
More information needed
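No usage instructions were provided. As a starting point, here is a minimal loading sketch. It assumes this repository holds a PEFT adapter for meta-llama/Llama-2-7b-chat-hf (suggested by the framework versions listed below, but not confirmed by the card); the repository id in the snippet is a placeholder.

```python
# Hypothetical usage sketch -- assumes this repo hosts a PEFT adapter for
# meta-llama/Llama-2-7b-chat-hf. Access to the gated meta-llama base model
# is required. Replace ADAPTER_ID with the actual repository id.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

ADAPTER_ID = "Llama-2-7b-chat-hf-finetune_90_10_MIX_gold"  # assumed repo id

# AutoPeftModelForCausalLM reads the base model name from the adapter config,
# loads the base model, then attaches the fine-tuned adapter weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    ADAPTER_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

prompt = "[INST] What does this model do? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```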
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 3
- eval_batch_size: 3
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: reduce_lr_on_plateau
- num_epochs: 50
- mixed_precision_training: Native AMP
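The training script is not included in this card. The snippet below is one plausible way to express the hyperparameters above as `transformers.TrainingArguments` (Transformers 4.40.x); the output directory and evaluation cadence are assumptions, not values from the card.

```python
from transformers import TrainingArguments

# Plausible reconstruction of the listed hyperparameters; anything not in
# the list above (output_dir, evaluation cadence) is an assumption.
training_args = TrainingArguments(
    output_dir="Llama-2-7b-chat-hf-finetune_90_10_MIX_gold",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    seed=42,
    adam_beta1=0.9,                # Adam betas/epsilon as reported
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="reduce_lr_on_plateau",
    num_train_epochs=50,
    fp16=True,                     # "Native AMP" mixed precision
    evaluation_strategy="epoch",   # assumed: the results table logs one eval per epoch
)
```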
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
0.6572 | 0.9961 | 256 | 0.7896 |
0.3218 | 1.9922 | 512 | 0.8874 |
0.2205 | 2.9883 | 768 | 0.9900 |
0.1221 | 3.9844 | 1024 | 1.0245 |
0.0952 | 4.9805 | 1280 | 1.0542 |
0.0604 | 5.9767 | 1536 | 1.0905 |
0.0704 | 6.9728 | 1792 | 1.1199 |
0.0675 | 7.9689 | 2048 | 1.1321 |
0.0678 | 8.9650 | 2304 | 1.1694 |
0.0656 | 9.9611 | 2560 | 1.1842 |
0.0839 | 10.9572 | 2816 | 1.1984 |
0.0562 | 11.9533 | 3072 | 1.2080 |
0.0585 | 12.9494 | 3328 | 1.2126 |
0.0577 | 13.9455 | 3584 | 1.2232 |
0.0561 | 14.9416 | 3840 | 1.2333 |
0.0557 | 15.9377 | 4096 | 1.2406 |
0.0534 | 16.9339 | 4352 | 1.2476 |
0.0545 | 17.9300 | 4608 | 1.2522 |
0.0543 | 18.9261 | 4864 | 1.2583 |
0.0536 | 19.9222 | 5120 | 1.2626 |
0.0536 | 20.9183 | 5376 | 1.2697 |
0.0535 | 21.9144 | 5632 | 1.2717 |
0.0515 | 22.9105 | 5888 | 1.2792 |
0.0506 | 23.9066 | 6144 | 1.2807 |
0.0512 | 24.9027 | 6400 | 1.2825 |
0.0506 | 25.8988 | 6656 | 1.2848 |
0.0506 | 26.8949 | 6912 | 1.2864 |
0.0502 | 27.8911 | 7168 | 1.2880 |
0.0498 | 28.8872 | 7424 | 1.2901 |
0.0494 | 29.8833 | 7680 | 1.2914 |
0.0493 | 30.8794 | 7936 | 1.2935 |
0.0488 | 31.8755 | 8192 | 1.2948 |
0.0481 | 32.8716 | 8448 | 1.2960 |
0.0482 | 33.8677 | 8704 | 1.2970 |
0.0469 | 34.8638 | 8960 | 1.2972 |
0.0473 | 35.8599 | 9216 | 1.2974 |
0.0473 | 36.8560 | 9472 | 1.2976 |
0.0468 | 37.8521 | 9728 | 1.2978 |
0.0467 | 38.8482 | 9984 | 1.2981 |
0.046 | 39.8444 | 10240 | 1.2981 |
0.0472 | 40.8405 | 10496 | 1.2985 |
0.0461 | 41.8366 | 10752 | 1.2987 |
0.0456 | 42.8327 | 11008 | 1.2988 |
0.0458 | 43.8288 | 11264 | 1.2990 |
0.0445 | 44.8249 | 11520 | 1.2993 |
0.044 | 45.8210 | 11776 | 1.2993 |
0.0452 | 46.8171 | 12032 | 1.2993 |
0.0435 | 47.8132 | 12288 | 1.2994 |
0.0442 | 48.8093 | 12544 | 1.2994 |
0.0447 | 49.8054 | 12800 | 1.2993 |
### Framework versions
- PEFT 0.11.1
- Transformers 4.40.2
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1