bslm-entity-extraction-mt5-base-include-desc-normalized-tr243k

This model is a fine-tuned version of google/mt5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0123
  • Exact Match: 67.9183
  • F1 Score: 88.9881
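
The snippet below is a minimal inference sketch using the Transformers library. Note that the input format this model expects (for example, how the source text and any entity descriptions are serialized into the prompt) is not documented on this card, so the example input is a placeholder assumption.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "MittyN/bslm-entity-extraction-mt5-base-include-desc-normalized-tr243k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input: the prompt/serialization format used during training
# is not documented on this card.
text = "Example sentence to extract entities from."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```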

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent training arguments follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
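
For reproducibility, the list above maps onto Transformers Seq2SeqTrainingArguments roughly as sketched below. This is a reconstruction, not the original training script: the output directory, the 500-step evaluation interval (inferred from the results table below), and predict_with_generate are assumptions; the Adam betas and epsilon shown are the library defaults.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bslm-entity-extraction-mt5-base",  # assumed; not stated on the card
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,      # Adam betas/epsilon from the list above (library defaults)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    eval_strategy="steps",  # assumed from the 500-step eval interval in the table below
    eval_steps=500,
    predict_with_generate=True,  # assumed: needed to score Exact Match/F1 on generations
)
```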

Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match | F1 Score |
|:-------------:|:-----:|:----:|:---------------:|:-----------:|:--------:|
| 0.7787 | 0.0328 | 500 | 0.5270 | 0.0 | 0.0 |
| 0.0903 | 0.0656 | 1000 | 0.0410 | 40.0656 | 75.4663 |
| 0.0428 | 0.0984 | 1500 | 0.0276 | 49.4349 | 81.9724 |
| 0.0462 | 0.1312 | 2000 | 0.0276 | 50.4375 | 81.5147 |
| 0.0304 | 0.1640 | 2500 | 0.0232 | 54.2836 | 84.0088 |
| 0.0274 | 0.1968 | 3000 | 0.0218 | 55.7237 | 84.5312 |
| 0.0251 | 0.2296 | 3500 | 0.0205 | 56.7262 | 84.8972 |
| 0.0252 | 0.2624 | 4000 | 0.0209 | 55.0492 | 84.5563 |
| 0.0236 | 0.2953 | 4500 | 0.0185 | 60.2443 | 86.0929 |
| 0.0221 | 0.3281 | 5000 | 0.0194 | 57.6376 | 85.3742 |
| 0.0226 | 0.3609 | 5500 | 0.0179 | 61.3015 | 86.3940 |
| 0.025 | 0.3937 | 6000 | 0.0176 | 59.8979 | 86.1283 |
| 0.0211 | 0.4265 | 6500 | 0.0178 | 60.5177 | 86.3265 |
| 0.0206 | 0.4593 | 7000 | 0.0166 | 61.3380 | 86.6077 |
| 0.0194 | 0.4921 | 7500 | 0.0170 | 60.3536 | 86.3744 |
| 0.0184 | 0.5249 | 8000 | 0.0159 | 63.2155 | 87.2735 |
| 0.0192 | 0.5577 | 8500 | 0.0164 | 61.8848 | 86.8659 |
| 0.0181 | 0.5905 | 9000 | 0.0158 | 62.1035 | 86.9785 |
| 0.0186 | 0.6233 | 9500 | 0.0156 | 62.7598 | 87.2376 |
| 0.018 | 0.6561 | 10000 | 0.0151 | 64.2727 | 87.6065 |
| 0.0171 | 0.6889 | 10500 | 0.0154 | 62.6139 | 87.2134 |
| 0.0183 | 0.7217 | 11000 | 0.0145 | 64.7102 | 87.8215 |
| 0.0182 | 0.7545 | 11500 | 0.0150 | 62.9420 | 87.3372 |
| 0.017 | 0.7873 | 12000 | 0.0141 | 65.0747 | 87.9349 |
| 0.0183 | 0.8202 | 12500 | 0.0148 | 62.6504 | 87.2310 |
| 0.0179 | 0.8530 | 13000 | 0.0138 | 65.4575 | 87.9886 |
| 0.017 | 0.8858 | 13500 | 0.0136 | 65.8221 | 88.1741 |
| 0.0168 | 0.9186 | 14000 | 0.0140 | 64.6555 | 87.8573 |
| 0.017 | 0.9514 | 14500 | 0.0135 | 66.2778 | 88.3458 |
| 0.0174 | 0.9842 | 15000 | 0.0140 | 64.8195 | 88.0423 |
| 0.0154 | 1.0170 | 15500 | 0.0138 | 65.8039 | 88.2375 |
| 0.0154 | 1.0498 | 16000 | 0.0135 | 66.3507 | 88.3934 |
| 0.015 | 1.0826 | 16500 | 0.0135 | 65.9861 | 88.3272 |
| 0.0151 | 1.1154 | 17000 | 0.0139 | 65.5851 | 88.2204 |
| 0.0153 | 1.1482 | 17500 | 0.0131 | 67.5355 | 88.7772 |
| 0.0148 | 1.1810 | 18000 | 0.0136 | 66.3507 | 88.4478 |
| 0.015 | 1.2138 | 18500 | 0.0134 | 66.4054 | 88.5039 |
| 0.0154 | 1.2466 | 19000 | 0.0133 | 66.5877 | 88.5994 |
| 0.0139 | 1.2794 | 19500 | 0.0132 | 66.1502 | 88.4829 |
| 0.0156 | 1.3122 | 20000 | 0.0131 | 66.9705 | 88.6868 |
| 0.016 | 1.3451 | 20500 | 0.0127 | 67.0252 | 88.7032 |
| 0.0143 | 1.3779 | 21000 | 0.0130 | 67.0252 | 88.7021 |
| 0.0159 | 1.4107 | 21500 | 0.0128 | 67.2803 | 88.8236 |
| 0.0133 | 1.4435 | 22000 | 0.0129 | 67.3168 | 88.8505 |
| 0.0131 | 1.4763 | 22500 | 0.0127 | 67.1892 | 88.8617 |
| 0.0137 | 1.5091 | 23000 | 0.0130 | 67.0434 | 88.7488 |
| 0.0133 | 1.5419 | 23500 | 0.0126 | 67.6449 | 88.9151 |
| 0.0144 | 1.5747 | 24000 | 0.0127 | 67.3533 | 88.8633 |
| 0.0142 | 1.6075 | 24500 | 0.0125 | 67.4809 | 88.9516 |
| 0.0136 | 1.6403 | 25000 | 0.0128 | 66.8246 | 88.7465 |
| 0.0139 | 1.6731 | 25500 | 0.0132 | 66.0955 | 88.5128 |
| 0.0126 | 1.7059 | 26000 | 0.0127 | 67.8090 | 89.0277 |
| 0.0135 | 1.7387 | 26500 | 0.0126 | 67.5173 | 88.9308 |
| 0.0141 | 1.7715 | 27000 | 0.0124 | 67.6449 | 88.9314 |
| 0.0138 | 1.8043 | 27500 | 0.0123 | 68.0095 | 89.0386 |
| 0.0141 | 1.8371 | 28000 | 0.0123 | 68.0095 | 88.9919 |
| 0.0142 | 1.8700 | 28500 | 0.0121 | 68.4470 | 89.0863 |
| 0.0147 | 1.9028 | 29000 | 0.0124 | 67.8454 | 88.9933 |
| 0.0135 | 1.9356 | 29500 | 0.0124 | 67.6814 | 88.9077 |
| 0.014 | 1.9684 | 30000 | 0.0123 | 67.9183 | 88.9881 |
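
The card does not specify how Exact Match and F1 Score are computed. For generative extraction models, a common convention (assumed here, not confirmed by this card) is SQuAD-style scoring: Exact Match is string equality between the generated output and the reference, and F1 is token-overlap F1. A minimal sketch under that assumption:

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    # 1.0 if the generated string equals the reference exactly, else 0.0.
    return float(prediction.strip() == reference.strip())

def token_f1(prediction: str, reference: str) -> float:
    # SQuAD-style token-overlap F1 between prediction and reference.
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```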

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.1

Model size

  • 582M params (F32, Safetensors)

Model tree for MittyN/bslm-entity-extraction-mt5-base-include-desc-normalized-tr243k

  • Base model: google/mt5-base