train-bioR-concat

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6559
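
Assuming this is the mean token-level cross-entropy loss reported by the Trainer, it corresponds to a perplexity of exp(1.6559) ≈ 5.24.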

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 96
  • total_eval_batch_size: 96
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-06; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 41803
  • mixed_precision_training: Native AMP
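
As a rough guide, the hyperparameters above map onto the Hugging Face TrainingArguments shown below. This is a reconstruction, not the original training script: output_dir is a placeholder, and only the values listed in the card are taken from the source.

```python
# Minimal sketch of TrainingArguments matching the card's hyperparameters.
# output_dir is a placeholder; the model and datasets are not specified here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train-bioR-concat",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=24,  # 24 per device x 4 GPUs = 96 total
    per_device_eval_batch_size=24,   # likewise 96 total for evaluation
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    max_steps=41803,
    fp16=True,                       # "Native AMP" mixed precision
)
```

The multi-GPU data parallelism over 4 devices would typically come from the launcher rather than these arguments, e.g. `torchrun --nproc_per_node=4 train.py` or `accelerate launch`.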

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.2076        | 0.0239 | 1000  | 1.8223          |
| 1.2901        | 0.0478 | 2000  | 1.8245          |
| 1.1528        | 0.0718 | 3000  | 1.8674          |
| 1.0056        | 0.0957 | 4000  | 1.9692          |
| 0.8399        | 0.1196 | 5000  | 2.0165          |
| 0.7892        | 0.1435 | 6000  | 1.9441          |
| 0.7658        | 0.1674 | 7000  | 1.8904          |
| 0.7284        | 0.1914 | 8000  | 1.8260          |
| 0.7217        | 0.2153 | 9000  | 1.8162          |
| 0.7122        | 0.2392 | 10000 | 1.7559          |
| 0.7055        | 0.2631 | 11000 | 1.7974          |
| 0.6943        | 0.2871 | 12000 | 1.7621          |
| 0.6942        | 0.3110 | 13000 | 1.7651          |
| 0.6868        | 0.3349 | 14000 | 1.7228          |
| 0.6817        | 0.3588 | 15000 | 1.7558          |
| 0.6911        | 0.3827 | 16000 | 1.7466          |
| 0.6889        | 0.4067 | 17000 | 1.7291          |
| 0.6798        | 0.4306 | 18000 | 1.6921          |
| 0.675         | 0.4545 | 19000 | 1.7139          |
| 0.6779        | 0.4784 | 20000 | 1.6933          |
| 0.6851        | 0.5023 | 21000 | 1.7136          |
| 0.675         | 0.5263 | 22000 | 1.6874          |
| 0.6747        | 0.5502 | 23000 | 1.6950          |
| 0.6724        | 0.5741 | 24000 | 1.6884          |
| 0.6631        | 0.5980 | 25000 | 1.6873          |
| 0.6671        | 0.6220 | 26000 | 1.6983          |
| 0.6645        | 0.6459 | 27000 | 1.6729          |
| 0.658         | 0.6698 | 28000 | 1.6809          |
| 0.6605        | 0.6937 | 29000 | 1.6656          |
| 0.6599        | 0.7176 | 30000 | 1.6704          |
| 0.6591        | 0.7416 | 31000 | 1.6679          |
| 0.6664        | 0.7655 | 32000 | 1.6555          |
| 0.6608        | 0.7894 | 33000 | 1.6487          |
| 0.6609        | 0.8133 | 34000 | 1.6522          |
| 0.6553        | 0.8372 | 35000 | 1.6502          |
| 0.6527        | 0.8612 | 36000 | 1.6568          |
| 0.6648        | 0.8851 | 37000 | 1.6587          |
| 0.6515        | 0.9090 | 38000 | 1.6471          |
| 0.65          | 0.9329 | 39000 | 1.6461          |
| 0.65          | 0.9568 | 40000 | 1.6499          |
| 0.6533        | 0.9808 | 41000 | 1.6559          |
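
The Trainer also logs these curves in each checkpoint's trainer_state.json. A small sketch for re-plotting them follows; the checkpoint path is an assumption about where the run saved its state, not something stated in the card.

```python
# Sketch: plot training vs. validation loss from a Trainer checkpoint's
# trainer_state.json (the path below is hypothetical).
import json
import matplotlib.pyplot as plt

with open("checkpoint-41803/trainer_state.json") as f:
    state = json.load(f)

# log_history interleaves training logs ("loss") and eval logs ("eval_loss").
train = [(e["step"], e["loss"]) for e in state["log_history"] if "loss" in e]
evals = [(e["step"], e["eval_loss"]) for e in state["log_history"] if "eval_loss" in e]

plt.plot(*zip(*train), label="training loss")
plt.plot(*zip(*evals), label="validation loss")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.show()
```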

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model details

  • Model size: 365M parameters (Safetensors)
  • Tensor type: F32
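
For local inference, a minimal sketch is shown below. It assumes this checkpoint is a causal language model usable with the text-generation pipeline, which the card does not confirm, and the repo id is a placeholder.

```python
# Sketch: local text generation with this checkpoint.
# "username/train-bioR-concat" is a placeholder repo id, and treating the
# model as a causal LM is an assumption, not confirmed by the card.
from transformers import pipeline

generator = pipeline("text-generation", model="username/train-bioR-concat")
print(generator("Once upon a time", max_new_tokens=40)[0]["generated_text"])
```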