train-bioR-concat

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6559
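
Assuming this is the mean token-level cross-entropy loss reported by the Trainer, it corresponds to a perplexity of exp(1.6559) ≈ 5.24.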

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 96
  • total_eval_batch_size: 96
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-06; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 41803
  • mixed_precision_training: Native AMP
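
As a rough guide, the hyperparameters above map onto the Hugging Face TrainingArguments shown below. This is a reconstruction, not the original training script: output_dir is a placeholder, and only the values listed in the card are taken from the source.

```python
# Minimal sketch of TrainingArguments matching the card's hyperparameters.
# output_dir is a placeholder; the model and datasets are not specified here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train-bioR-concat",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=24,  # 24 per device x 4 GPUs = 96 total
    per_device_eval_batch_size=24,   # likewise 96 total for evaluation
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    max_steps=41803,
    fp16=True,                       # "Native AMP" mixed precision
)
```

The multi-GPU data parallelism over 4 devices would typically come from the launcher rather than these arguments, e.g. `torchrun --nproc_per_node=4 train.py` or `accelerate launch`.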

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.2076        | 0.0239 | 1000  | 1.8223          |
| 1.2901        | 0.0478 | 2000  | 1.8245          |
| 1.1528        | 0.0718 | 3000  | 1.8674          |
| 1.0056        | 0.0957 | 4000  | 1.9692          |
| 0.8399        | 0.1196 | 5000  | 2.0165          |
| 0.7892        | 0.1435 | 6000  | 1.9441          |
| 0.7658        | 0.1674 | 7000  | 1.8904          |
| 0.7284        | 0.1914 | 8000  | 1.8260          |
| 0.7217        | 0.2153 | 9000  | 1.8162          |
| 0.7122        | 0.2392 | 10000 | 1.7559          |
| 0.7055        | 0.2631 | 11000 | 1.7974          |
| 0.6943        | 0.2871 | 12000 | 1.7621          |
| 0.6942        | 0.3110 | 13000 | 1.7651          |
| 0.6868        | 0.3349 | 14000 | 1.7228          |
| 0.6817        | 0.3588 | 15000 | 1.7558          |
| 0.6911        | 0.3827 | 16000 | 1.7466          |
| 0.6889        | 0.4067 | 17000 | 1.7291          |
| 0.6798        | 0.4306 | 18000 | 1.6921          |
| 0.675         | 0.4545 | 19000 | 1.7139          |
| 0.6779        | 0.4784 | 20000 | 1.6933          |
| 0.6851        | 0.5023 | 21000 | 1.7136          |
| 0.675         | 0.5263 | 22000 | 1.6874          |
| 0.6747        | 0.5502 | 23000 | 1.6950          |
| 0.6724        | 0.5741 | 24000 | 1.6884          |
| 0.6631        | 0.5980 | 25000 | 1.6873          |
| 0.6671        | 0.6220 | 26000 | 1.6983          |
| 0.6645        | 0.6459 | 27000 | 1.6729          |
| 0.658         | 0.6698 | 28000 | 1.6809          |
| 0.6605        | 0.6937 | 29000 | 1.6656          |
| 0.6599        | 0.7176 | 30000 | 1.6704          |
| 0.6591        | 0.7416 | 31000 | 1.6679          |
| 0.6664        | 0.7655 | 32000 | 1.6555          |
| 0.6608        | 0.7894 | 33000 | 1.6487          |
| 0.6609        | 0.8133 | 34000 | 1.6522          |
| 0.6553        | 0.8372 | 35000 | 1.6502          |
| 0.6527        | 0.8612 | 36000 | 1.6568          |
| 0.6648        | 0.8851 | 37000 | 1.6587          |
| 0.6515        | 0.9090 | 38000 | 1.6471          |
| 0.65          | 0.9329 | 39000 | 1.6461          |
| 0.65          | 0.9568 | 40000 | 1.6499          |
| 0.6533        | 0.9808 | 41000 | 1.6559          |
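
The Trainer also logs these curves in each checkpoint's trainer_state.json. A small sketch for re-plotting them follows; the checkpoint path is an assumption about where the run saved its state, not something stated in the card.

```python
# Sketch: plot training vs. validation loss from a Trainer checkpoint's
# trainer_state.json (the path below is hypothetical).
import json
import matplotlib.pyplot as plt

with open("checkpoint-41803/trainer_state.json") as f:
    state = json.load(f)

# log_history interleaves training logs ("loss") and eval logs ("eval_loss").
train = [(e["step"], e["loss"]) for e in state["log_history"] if "loss" in e]
evals = [(e["step"], e["eval_loss"]) for e in state["log_history"] if "eval_loss" in e]

plt.plot(*zip(*train), label="training loss")
plt.plot(*zip(*evals), label="validation loss")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.show()
```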

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model details

  • Model size: 365M parameters (Safetensors)
  • Tensor type: F32
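
For local inference, a minimal sketch is shown below. It assumes this checkpoint is a causal language model usable with the text-generation pipeline, which the card does not confirm, and the repo id is a placeholder.

```python
# Sketch: local text generation with this checkpoint.
# "username/train-bioR-concat" is a placeholder repo id, and treating the
# model as a causal LM is an assumption, not confirmed by the card.
from transformers import pipeline

generator = pipeline("text-generation", model="username/train-bioR-concat")
print(generator("Once upon a time", max_new_tokens=40)[0]["generated_text"])
```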