barthez-deft-linguistique

This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.

Note: this model is one of the preliminary experiments, and it underperforms the models published in the paper (which use MBartHez and HAL/Wiki pre-training plus copy mechanisms).

It achieves the following results on the evaluation set:

  • Loss: 1.7596
  • Rouge1: 41.989
  • Rouge2: 22.4524
  • RougeL: 32.7966
  • RougeLsum: 32.7953
  • Gen Len: 22.1549
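
Scores in this range are typically produced with the rouge metric from the datasets library, reported as F-measures scaled by 100. A minimal sketch of how such scores are computed (the example strings below are hypothetical, not taken from the evaluation set):

```python
# Sketch only: computing ROUGE the way summarization fine-tuning scripts
# of this era typically did, via the `datasets` library's rouge metric.
from datasets import load_metric

rouge = load_metric("rouge")

# Hypothetical prediction/reference pair, NOT from the actual evaluation set.
predictions = ["le modèle génère un résumé court du document"]
references = ["le modèle produit un court résumé du document"]

scores = rouge.compute(predictions=predictions, references=references)

# Each value is an AggregateScore; the mid F-measure (x100) matches the
# scale of the numbers reported above.
print({key: round(value.mid.fmeasure * 100, 4) for key, value in scores.items()})
```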

Model description

More information needed

Intended uses & limitations

More information needed
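
In the absence of official usage notes, the checkpoint can presumably be loaded like any other BARThez-based summarization model. The sketch below is illustrative only; the hub id is an assumption based on the card title, and the input text is hypothetical:

```python
# Illustrative sketch: loading the checkpoint for French summarization.
# ASSUMPTION: the hub id "moussaKam/barthez-deft-linguistique" is inferred
# from the card title and base model; verify it before use.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "moussaKam/barthez-deft-linguistique"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Texte d'exemple à résumer."  # hypothetical input document
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```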

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
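
As a minimal sketch (not the authors' actual training script), these hyperparameters map onto transformers' Seq2SeqTrainingArguments roughly as follows; the output directory and per-epoch evaluation strategy are assumptions inferred from the card:

```python
# Sketch under stated assumptions: reconstructing the listed hyperparameters
# as Seq2SeqTrainingArguments. Dataset loading and Trainer setup are omitted.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="barthez-deft-linguistique",  # assumed output directory
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,                          # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20.0,
    fp16=True,                               # "Native AMP" mixed precision
    evaluation_strategy="epoch",             # assumed: the table below evaluates once per epoch
    predict_with_generate=True,              # assumed: needed for ROUGE during evaluation
)
```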

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 3.0569        | 1.0   | 108  | 2.0282          | 31.6993 | 14.9483 | 25.5565 | 25.4379   | 18.3803 |
| 2.2892        | 2.0   | 216  | 1.8553          | 35.2563 | 18.019  | 28.3135 | 28.2927   | 18.507  |
| 1.9062        | 3.0   | 324  | 1.7696          | 37.4613 | 18.1488 | 28.9959 | 29.0134   | 19.5352 |
| 1.716         | 4.0   | 432  | 1.7641          | 37.6903 | 18.7496 | 30.1097 | 30.1027   | 18.9577 |
| 1.5722        | 5.0   | 540  | 1.7781          | 38.1013 | 19.8291 | 29.8142 | 29.802    | 19.169  |
| 1.4655        | 6.0   | 648  | 1.7661          | 38.3557 | 20.3309 | 30.5068 | 30.4728   | 19.3662 |
| 1.3507        | 7.0   | 756  | 1.7596          | 39.7409 | 20.2998 | 31.0849 | 31.1152   | 19.3944 |
| 1.2874        | 8.0   | 864  | 1.7706          | 37.7846 | 20.3457 | 30.6826 | 30.6321   | 19.4789 |
| 1.2641        | 9.0   | 972  | 1.7848          | 38.7421 | 19.5701 | 30.5798 | 30.6305   | 19.3944 |
| 1.1192        | 10.0  | 1080 | 1.8008          | 40.3313 | 20.3378 | 31.8325 | 31.8648   | 19.5493 |
| 1.0724        | 11.0  | 1188 | 1.8450          | 38.9612 | 20.5719 | 31.4496 | 31.3144   | 19.8592 |
| 1.0077        | 12.0  | 1296 | 1.8364          | 36.5997 | 18.46   | 29.1808 | 29.1705   | 19.7324 |
| 0.9362        | 13.0  | 1404 | 1.8677          | 38.0371 | 19.2321 | 30.3893 | 30.3926   | 19.6338 |
| 0.8868        | 14.0  | 1512 | 1.9154          | 36.4737 | 18.5314 | 29.325  | 29.3634   | 19.6479 |
| 0.8335        | 15.0  | 1620 | 1.9344          | 35.7583 | 18.0687 | 27.9666 | 27.8675   | 19.8028 |
| 0.8305        | 16.0  | 1728 | 1.9556          | 37.2137 | 18.2199 | 29.5959 | 29.5799   | 19.9577 |
| 0.8057        | 17.0  | 1836 | 1.9793          | 36.6834 | 17.8505 | 28.6701 | 28.7145   | 19.7324 |
| 0.7869        | 18.0  | 1944 | 1.9994          | 37.5918 | 19.1984 | 28.8569 | 28.8278   | 19.7606 |
| 0.7549        | 19.0  | 2052 | 2.0117          | 37.3278 | 18.5169 | 28.778  | 28.7737   | 19.8028 |
| 0.7497        | 20.0  | 2160 | 2.0189          | 37.7513 | 19.1813 | 29.3675 | 29.402    | 19.6901 |

Framework versions

  • Transformers 4.10.2
  • Pytorch 1.7.1+cu110
  • Datasets 1.11.0
  • Tokenizers 0.10.3