deberta-v3-small_v1_no_entities_with_context

This model is a fine-tuned version of microsoft/deberta-v3-small on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.0315
Accuracy: 0.0062
F1: 0.0086
Precision: 0.0043
Recall: 0.9070
Learning Rate: 0.0
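
The reported F1 is consistent with the reported precision and recall, assuming the standard definition of F1 as the harmonic mean of the two. The card does not say how the metrics were computed, so that definition is an assumption. The sketch below recomputes F1 from the figures above; the function name `f1_from_pr` is illustrative.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the evaluation results listed above.
final_f1 = f1_from_pr(0.0043, 0.9070)
print(round(final_f1, 4))
```

With precision this low, F1 stays close to twice the precision no matter how high recall is, which is why F1 barely moves across the training run.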

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
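
The linear scheduler explains why the final learning rate is reported as 0.0. As a minimal sketch, assuming no warmup steps (none are listed above) and the 9550 total optimizer steps shown in the training results, the rate decays in a straight line from 2e-05 to zero. The function `linear_lr` is a hypothetical helper written for illustration, not the library's scheduler.

```python
def linear_lr(step: int,
              base_lr: float = 2e-05,
              total_steps: int = 9550,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero, with optional linear warmup.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 191 steps per epoch over 50 epochs gives 9550 steps in total.
for epoch in (25, 35, 50):
    print(epoch, linear_lr(epoch * 191))
```

Under these assumptions the rate is 1e-05 at epoch 25 and 6e-06 at epoch 35, which matches the two non-rounded entries in the training results table.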

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1	Precision	Recall	Learning Rate
No log	1.0	191	0.0385	0.0047	0.0094	0.0047	1.0	0.0000
No log	2.0	382	0.0282	0.0047	0.0094	0.0047	1.0	0.0000
0.1139	3.0	573	0.0274	0.0047	0.0094	0.0047	1.0	0.0000
0.1139	4.0	764	0.0270	0.0047	0.0094	0.0047	1.0	0.0000
0.1139	5.0	955	0.0271	0.0047	0.0094	0.0047	1.0	0.0000
0.0317	6.0	1146	0.0269	0.0047	0.0094	0.0047	1.0	0.0000
0.0317	7.0	1337	0.0271	0.0047	0.0094	0.0047	1.0	0.0000
0.0316	8.0	1528	0.0264	0.0047	0.0094	0.0047	1.0	0.0000
0.0316	9.0	1719	0.0261	0.0047	0.0094	0.0047	1.0	0.0000
0.0316	10.0	1910	0.0261	0.0047	0.0094	0.0047	1.0	0.0000
0.0299	11.0	2101	0.0263	0.0047	0.0094	0.0047	1.0	0.0000
0.0299	12.0	2292	0.0262	0.0047	0.0094	0.0047	1.0	0.0000
0.0299	13.0	2483	0.0260	0.0047	0.0094	0.0047	1.0	0.0000
0.0294	14.0	2674	0.0263	0.0047	0.0094	0.0047	1.0	0.0000
0.0294	15.0	2865	0.0259	0.0047	0.0094	0.0047	1.0	0.0000
0.026	16.0	3056	0.0262	0.0050	0.0094	0.0047	1.0	0.0000
0.026	17.0	3247	0.0265	0.0050	0.0092	0.0046	0.9767	0.0000
0.026	18.0	3438	0.0270	0.0048	0.0093	0.0047	0.9884	0.0000
0.0224	19.0	3629	0.0272	0.0056	0.0090	0.0045	0.9535	0.0000
0.0224	20.0	3820	0.0271	0.0055	0.0091	0.0046	0.9651	0.0000
0.0197	21.0	4011	0.0271	0.0052	0.0090	0.0045	0.9535	0.0000
0.0197	22.0	4202	0.0270	0.0050	0.0090	0.0045	0.9535	0.0000
0.0197	23.0	4393	0.0271	0.0056	0.0090	0.0045	0.9535	0.0000
0.0172	24.0	4584	0.0275	0.0053	0.0089	0.0045	0.9419	0.0000
0.0172	25.0	4775	0.0273	0.0053	0.0089	0.0045	0.9419	1e-05
0.0172	26.0	4966	0.0282	0.0061	0.0087	0.0044	0.9186	0.0000
0.0152	27.0	5157	0.0281	0.0060	0.0088	0.0044	0.9302	0.0000
0.0152	28.0	5348	0.0281	0.0058	0.0088	0.0044	0.9302	0.0000
0.0138	29.0	5539	0.0277	0.0059	0.0088	0.0044	0.9302	0.0000
0.0138	30.0	5730	0.0292	0.0056	0.0089	0.0045	0.9419	0.0000
0.0138	31.0	5921	0.0287	0.0061	0.0088	0.0044	0.9302	0.0000
0.0124	32.0	6112	0.0289	0.0059	0.0087	0.0044	0.9186	0.0000
0.0124	33.0	6303	0.0300	0.0062	0.0086	0.0043	0.9070	0.0000
0.0124	34.0	6494	0.0293	0.0057	0.0087	0.0044	0.9186	0.0000
0.0113	35.0	6685	0.0297	0.0059	0.0089	0.0045	0.9419	6e-06
0.0113	36.0	6876	0.0293	0.0060	0.0086	0.0043	0.9070	0.0000
0.0106	37.0	7067	0.0295	0.0060	0.0085	0.0043	0.8953	0.0000
0.0106	38.0	7258	0.0301	0.0063	0.0086	0.0043	0.9070	0.0000
0.0106	39.0	7449	0.0300	0.0063	0.0085	0.0043	0.8953	0.0000
0.0092	40.0	7640	0.0297	0.0057	0.0086	0.0043	0.9070	0.0000
0.0092	41.0	7831	0.0299	0.0061	0.0086	0.0043	0.9070	0.0000
0.0091	42.0	8022	0.0298	0.0064	0.0086	0.0043	0.9070	0.0000
0.0091	43.0	8213	0.0302	0.0061	0.0087	0.0044	0.9186	0.0000
0.0091	44.0	8404	0.0307	0.0062	0.0086	0.0043	0.9070	0.0000
0.0082	45.0	8595	0.0310	0.0062	0.0086	0.0043	0.9070	0.0000
0.0082	46.0	8786	0.0308	0.0062	0.0087	0.0044	0.9186	0.0000
0.0082	47.0	8977	0.0314	0.0062	0.0087	0.0044	0.9186	0.0000
0.0081	48.0	9168	0.0314	0.0064	0.0087	0.0044	0.9186	0.0000
0.0081	49.0	9359	0.0315	0.0062	0.0086	0.0043	0.9070	0.0000
0.0077	50.0	9550	0.0315	0.0062	0.0086	0.0043	0.9070	0.0
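
Validation loss reaches its minimum of 0.0259 at epoch 15 and rises afterwards while training loss keeps falling, so the final checkpoint is not the one with the lowest validation loss. The sketch below shows one way to find that row programmatically. It parses a four-row excerpt copied from the table above; the column names and the helper `parse_rows` are illustrative choices, not part of the original training script.

```python
# Excerpt of the training results table (epochs 14, 15, 16 and 50).
ROWS = """\
0.0294 14.0 2674 0.0263 0.0047 0.0094 0.0047 1.0 0.0000
0.0294 15.0 2865 0.0259 0.0047 0.0094 0.0047 1.0 0.0000
0.026 16.0 3056 0.0262 0.0050 0.0094 0.0047 1.0 0.0000
0.0077 50.0 9550 0.0315 0.0062 0.0086 0.0043 0.9070 0.0
"""

COLUMNS = ("train_loss", "epoch", "step", "val_loss",
           "accuracy", "f1", "precision", "recall", "learning_rate")


def parse_rows(text: str) -> list[dict]:
    """Turn whitespace-separated numeric rows into dictionaries keyed by column."""
    parsed = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        row = dict(zip(COLUMNS, values))
        row["step"] = int(row["step"])
        parsed.append(row)
    return parsed


rows = parse_rows(ROWS)
best = min(rows, key=lambda r: r["val_loss"])
print(best["epoch"], best["step"], best["val_loss"])
```

Rows whose training loss is logged as "No log" would need separate handling, since this sketch expects every field to be numeric.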

Framework versions

Transformers 4.40.1
Pytorch 2.2.1+cu121
Datasets 2.19.0
Tokenizers 0.19.1
