cllm-0.0.2

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5767

Model description

More information needed

Intended uses & limitations

More information needed
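
The card gives no usage details. Assuming the checkpoint is a causal language model loadable through the standard transformers auto classes (an assumption; the card does not state the architecture or task), a minimal inference sketch could look like this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the repository ships a tokenizer and a causal-LM head.
model_id = "anilguleroglu/cllm-0.0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```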

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 1
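
For reference, here is a minimal sketch of how these settings map onto transformers.TrainingArguments; the output directory is a placeholder, and the 8-GPU launch would come from torchrun or accelerate rather than from the arguments themselves:

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; output_dir is hypothetical.
# Effective train batch size: 8 per device x 8 GPUs x 4 accumulation steps = 256.
training_args = TrainingArguments(
    output_dir="cllm-0.0.2",           # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=1,
    eval_strategy="steps",             # the table below logs eval every 500 steps
    eval_steps=500,
)
```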

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 4.8419        | 0.0214 | 500   | 4.7291          |
| 3.891         | 0.0429 | 1000  | 3.8792          |
| 3.5798        | 0.0643 | 1500  | 3.5656          |
| 3.3861        | 0.0858 | 2000  | 3.4057          |
| 3.2754        | 0.1072 | 2500  | 3.2925          |
| 3.2039        | 0.1286 | 3000  | 3.2109          |
| 3.1475        | 0.1501 | 3500  | 3.1513          |
| 3.0936        | 0.1715 | 4000  | 3.0991          |
| 3.0483        | 0.1930 | 4500  | 3.0603          |
| 3.0036        | 0.2144 | 5000  | 3.0180          |
| 2.9644        | 0.2358 | 5500  | 2.9900          |
| 2.9374        | 0.2573 | 6000  | 2.9599          |
| 2.901         | 0.2787 | 6500  | 2.9334          |
| 2.8968        | 0.3002 | 7000  | 2.9124          |
| 2.866         | 0.3216 | 7500  | 2.8889          |
| 2.8614        | 0.3430 | 8000  | 2.8672          |
| 2.8378        | 0.3645 | 8500  | 2.8489          |
| 2.8242        | 0.3859 | 9000  | 2.8290          |
| 2.7961        | 0.4074 | 9500  | 2.8133          |
| 2.769         | 0.4288 | 10000 | 2.7962          |
| 2.7619        | 0.4502 | 10500 | 2.7804          |
| 2.7527        | 0.4717 | 11000 | 2.7687          |
| 2.7457        | 0.4931 | 11500 | 2.7540          |
| 2.7119        | 0.5146 | 12000 | 2.7441          |
| 2.7089        | 0.5360 | 12500 | 2.7317          |
| 2.7236        | 0.5574 | 13000 | 2.7218          |
| 2.6984        | 0.5789 | 13500 | 2.7102          |
| 2.6791        | 0.6003 | 14000 | 2.6998          |
| 2.6764        | 0.6218 | 14500 | 2.6915          |
| 2.6663        | 0.6432 | 15000 | 2.6806          |
| 2.6424        | 0.6646 | 15500 | 2.6720          |
| 2.6384        | 0.6861 | 16000 | 2.6612          |
| 2.6343        | 0.7075 | 16500 | 2.6536          |
| 2.6303        | 0.7290 | 17000 | 2.6471          |
| 2.6115        | 0.7504 | 17500 | 2.6373          |
| 2.6125        | 0.7718 | 18000 | 2.6310          |
| 2.5983        | 0.7933 | 18500 | 2.6246          |
| 2.6043        | 0.8147 | 19000 | 2.6173          |
| 2.5876        | 0.8362 | 19500 | 2.6106          |
| 2.5824        | 0.8576 | 20000 | 2.6043          |
| 2.5802        | 0.8790 | 20500 | 2.5983          |
| 2.5772        | 0.9005 | 21000 | 2.5927          |
| 2.5584        | 0.9219 | 21500 | 2.5878          |
| 2.5652        | 0.9434 | 22000 | 2.5835          |
| 2.5593        | 0.9648 | 22500 | 2.5794          |
| 2.5547        | 0.9862 | 23000 | 2.5767          |
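
Assuming the reported losses are mean token-level cross-entropy (the usual Trainer metric for language modeling; the card does not say), the final validation loss corresponds to a perplexity of about exp(2.5767) ≈ 13.15:

```python
import math

final_eval_loss = 2.5767  # final validation loss from the table above
print(f"perplexity ≈ {math.exp(final_eval_loss):.2f}")  # ≈ 13.15
```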

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.1.0+cu118
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Checkpoint

  • Format: Safetensors
  • Model size: 401M params
  • Tensor type: F32

Model tree for anilguleroglu/cllm-0.0.2

  • Adapters: 1 model
  • Finetunes: 2 models