test_v7

This model is a fine-tuned version of ./models/distill-bge-retromae-step on the adalbertojunior/segmentacao dataset. It achieves the following results on the evaluation set, which match the step 1400 row (epoch 0.8919) of the training results table:

Loss: 0.0045
Precision: 0.6658
Recall: 0.6860
F1: 0.6757
Accuracy: 0.9991
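
As a quick consistency check on the figures above, the sketch below recomputes F1 from the reported Precision and Recall. It assumes F1 is the usual harmonic mean of the two; the card does not state which metric implementation was used, and the helper name `f1_score` is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported above for the evaluation set.
reported_precision = 0.6658
reported_recall = 0.6860
reported_f1 = 0.6757

f1 = f1_score(reported_precision, reported_recall)

# The inputs are already rounded to four decimals, so compare with a tolerance.
print(f"recomputed F1: {f1:.4f}, reported F1: {reported_f1:.4f}")
```

Under that assumption the recomputed value agrees with the reported F1 to within rounding.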

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 1
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3.0
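
The relationship between these values can be sketched as follows. This is a minimal illustration, not the training script: the device count is an assumption (the card does not state it), warmup is assumed to be zero because none is listed, and `total_steps = 4710` is an estimate extrapolated from the last logged row of the results table (step 4700 at epoch 2.9941), not a reported figure.

```python
train_batch_size = 4
gradient_accumulation_steps = 8
num_devices = 1  # assumption: the card does not state the device count

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 4710  # hypothetical: estimated from the logged steps and epochs

print(total_train_batch_size)
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```

With one device, 4 x 8 reproduces the listed total_train_batch_size of 32, and the learning rate falls from 5e-05 to zero over the run.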

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
No log	0.0637	100	0.0048	0.5339	0.5647	0.5489	0.9984
No log	0.1274	200	0.0048	0.5567	0.6226	0.5878	0.9987
No log	0.1911	300	0.0048	0.5745	0.5950	0.5846	0.9988
No log	0.2548	400	0.0048	0.5622	0.5978	0.5794	0.9988
0.0061	0.3185	500	0.0069	0.4800	0.5950	0.5314	0.9983
0.0061	0.3822	600	0.0061	0.5692	0.6116	0.5896	0.9987
0.0061	0.4459	700	0.0052	0.5736	0.6226	0.5971	0.9988
0.0061	0.5096	800	0.0055	0.5921	0.6198	0.6057	0.9988
0.0061	0.5733	900	0.0057	0.6126	0.6446	0.6282	0.9989
0.0008	0.6370	1000	0.0065	0.5635	0.6116	0.5865	0.9987
0.0008	0.7007	1100	0.0060	0.5725	0.6529	0.6100	0.9987
0.0008	0.7645	1200	0.0061	0.5704	0.6474	0.6065	0.9988
0.0008	0.8282	1300	0.0053	0.5813	0.6501	0.6138	0.9988
0.0008	0.8919	1400	0.0045	0.6658	0.6860	0.6757	0.9991
0.0004	0.9556	1500	0.0049	0.6497	0.6694	0.6594	0.9990
0.0004	1.0193	1600	0.0054	0.5707	0.6446	0.6054	0.9988
0.0004	1.0830	1700	0.0047	0.6376	0.6639	0.6505	0.9990
0.0004	1.1467	1800	0.0048	0.5922	0.6722	0.6297	0.9989
0.0004	1.2104	1900	0.0041	0.6455	0.6722	0.6586	0.9990
0.0002	1.2741	2000	0.0053	0.5686	0.6391	0.6018	0.9987
0.0002	1.3378	2100	0.0046	0.6495	0.6942	0.6711	0.9990
0.0002	1.4015	2200	0.0049	0.5947	0.6749	0.6323	0.9988
0.0002	1.4652	2300	0.0045	0.6125	0.6749	0.6422	0.9989
0.0002	1.5289	2400	0.0045	0.5701	0.6722	0.6169	0.9988
0.0002	1.5926	2500	0.0058	0.5321	0.6391	0.5807	0.9986
0.0002	1.6563	2600	0.0056	0.5110	0.6419	0.5690	0.9985
0.0002	1.7200	2700	0.0052	0.5792	0.6446	0.6102	0.9988
0.0002	1.7837	2800	0.0047	0.5941	0.6612	0.6258	0.9989
0.0002	1.8474	2900	0.0051	0.5655	0.6419	0.6013	0.9988
0.0001	1.9111	3000	0.0044	0.5866	0.6529	0.6180	0.9989
0.0001	1.9748	3100	0.0042	0.5792	0.6446	0.6102	0.9988
0.0001	2.0385	3200	0.0045	0.6015	0.6694	0.6336	0.9989
0.0001	2.1022	3300	0.0063	0.5409	0.6556	0.5928	0.9987
0.0001	2.1659	3400	0.0047	0.5887	0.6584	0.6216	0.9989
0.0001	2.2297	3500	0.0045	0.6131	0.6722	0.6413	0.9989
0.0001	2.2934	3600	0.0047	0.6193	0.6722	0.6446	0.9989
0.0001	2.3571	3700	0.0047	0.6091	0.6612	0.6341	0.9989
0.0001	2.4208	3800	0.0047	0.6205	0.6667	0.6428	0.9989
0.0001	2.4845	3900	0.0044	0.6070	0.6722	0.6379	0.9989
0.0001	2.5482	4000	0.0052	0.5355	0.6226	0.5758	0.9987
0.0001	2.6119	4100	0.0047	0.5871	0.6501	0.6170	0.9989
0.0001	2.6756	4200	0.0049	0.5739	0.6419	0.6060	0.9988
0.0001	2.7393	4300	0.0049	0.5634	0.6364	0.5977	0.9988
0.0001	2.8030	4400	0.0052	0.5634	0.6364	0.5977	0.9988
0.0	2.8667	4500	0.0049	0.5739	0.6419	0.6060	0.9988
0.0	2.9304	4600	0.0044	0.5796	0.6419	0.6092	0.9988
0.0	2.9941	4700	0.0047	0.5796	0.6419	0.6092	0.9988
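
One way to read the table is to compare checkpoints by different criteria. The sketch below uses a handful of rows copied from the table (step, validation loss, F1) and picks the best one by each criterion. How the summary checkpoint was actually selected is not stated in the card, so this is only an illustration.

```python
# (step, validation_loss, f1) copied from selected rows of the table above.
rows = [
    (900, 0.0057, 0.6282),
    (1400, 0.0045, 0.6757),
    (1500, 0.0049, 0.6594),
    (1900, 0.0041, 0.6586),
    (2100, 0.0046, 0.6711),
    (4700, 0.0047, 0.6092),
]

# Highest F1 and lowest validation loss do not have to coincide.
best_by_f1 = max(rows, key=lambda r: r[2])
best_by_loss = min(rows, key=lambda r: r[1])

print("best F1:", best_by_f1)
print("lowest validation loss:", best_by_loss)
```

Among these rows, the highest F1 is at step 1400 while the lowest validation loss is at step 1900, so the choice of selection metric changes which checkpoint looks best.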

Framework versions

Transformers 4.45.2
Pytorch 2.4.0+cu121
Datasets 3.0.1
Tokenizers 0.20.0
