barthez-deft-archeologie

This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.

Note: this model is one of the preliminary experiments, and it underperforms the models published in the paper (which use MBartHez and HAL/Wiki pre-training plus copy mechanisms).

It achieves the following results on the evaluation set:

  • Loss: 2.0733
  • Rouge1: 37.1845
  • Rouge2: 16.9534
  • RougeL: 28.8416
  • RougeLsum: 29.077
  • Gen Len: 34.4028
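
The checkpoint can be loaded like any BARThez sequence-to-sequence model for French summarization. Below is a minimal usage sketch, assuming the repository id matches the card title under the moussaKam namespace; the input text is a placeholder.

```python
# Minimal usage sketch; the repository id and input text are assumptions,
# not taken from the card itself.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "moussaKam/barthez-deft-archeologie"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Texte de l'article d'archéologie à résumer ..."  # placeholder document
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```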

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
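
These settings map directly onto the Trainer API of the Transformers version listed below. The following is a sketch of the corresponding Seq2SeqTrainingArguments; the output directory and the per-epoch evaluation strategy are assumptions, not values reported on the card.

```python
# Sketch of the reported hyperparameters as Seq2SeqTrainingArguments
# (Transformers 4.10.x API); output_dir and evaluation_strategy are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="barthez-deft-archeologie",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20.0,
    fp16=True,                    # "Native AMP" mixed precision
    evaluation_strategy="epoch",  # matches the per-epoch results table (assumed)
    predict_with_generate=True,   # needed for ROUGE / Gen Len during evaluation
)
```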

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|---------|
| 3.4832 | 1.0 | 108 | 2.4237 | 22.6662 | 10.009 | 19.8729 | 19.8814 | 15.8333 |
| 2.557 | 2.0 | 216 | 2.2328 | 24.8102 | 11.9911 | 20.4773 | 20.696 | 19.0139 |
| 2.2702 | 3.0 | 324 | 2.2002 | 25.6482 | 11.6191 | 21.8383 | 21.9341 | 18.1944 |
| 2.1119 | 4.0 | 432 | 2.1266 | 25.5806 | 11.9765 | 21.3973 | 21.3503 | 19.4306 |
| 1.9582 | 5.0 | 540 | 2.1072 | 25.6578 | 12.2709 | 22.182 | 22.0548 | 19.1528 |
| 1.8137 | 6.0 | 648 | 2.1008 | 26.5272 | 11.4033 | 22.359 | 22.3259 | 19.4722 |
| 1.7725 | 7.0 | 756 | 2.1074 | 25.0405 | 11.1773 | 21.1369 | 21.1847 | 19.1806 |
| 1.6772 | 8.0 | 864 | 2.0959 | 26.5237 | 11.6028 | 22.5018 | 22.3931 | 19.3333 |
| 1.5798 | 9.0 | 972 | 2.0976 | 27.7443 | 11.9898 | 22.4052 | 22.2954 | 19.7222 |
| 1.4753 | 10.0 | 1080 | 2.0733 | 28.3502 | 12.9162 | 22.6352 | 22.6015 | 19.8194 |
| 1.4646 | 11.0 | 1188 | 2.1091 | 27.9198 | 12.8591 | 23.0718 | 23.0779 | 19.6111 |
| 1.4082 | 12.0 | 1296 | 2.1036 | 28.8509 | 13.0987 | 23.4189 | 23.5044 | 19.4861 |
| 1.2862 | 13.0 | 1404 | 2.1222 | 28.6641 | 12.8157 | 22.6799 | 22.7051 | 19.8611 |
| 1.2612 | 14.0 | 1512 | 2.1487 | 26.9709 | 11.6084 | 22.0312 | 22.0543 | 19.875 |
| 1.2327 | 15.0 | 1620 | 2.1808 | 28.218 | 12.6239 | 22.7372 | 22.7881 | 19.7361 |
| 1.2264 | 16.0 | 1728 | 2.1778 | 26.7393 | 11.4474 | 21.6057 | 21.555 | 19.7639 |
| 1.1848 | 17.0 | 1836 | 2.1995 | 27.6902 | 12.1082 | 22.0406 | 22.0101 | 19.6806 |
| 1.133 | 18.0 | 1944 | 2.2038 | 27.0402 | 12.1846 | 21.7793 | 21.7513 | 19.8056 |
| 1.168 | 19.0 | 2052 | 2.2116 | 27.5149 | 11.9876 | 22.1113 | 22.1527 | 19.7222 |
| 1.1206 | 20.0 | 2160 | 2.2133 | 28.2321 | 12.677 | 22.749 | 22.8485 | 19.5972 |
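
The ROUGE columns are F-measures reported as percentages. A hedged sketch of how such scores are typically computed with the Datasets 1.x metric API follows; the prediction and reference lists are placeholders, not data from this run.

```python
# Sketch of ROUGE scoring with the Datasets 1.x metric API
# (requires the rouge_score package); inputs below are placeholders.
from datasets import load_metric

rouge = load_metric("rouge")
predictions = ["résumé généré par le modèle ..."]
references = ["résumé de référence ..."]

result = rouge.compute(predictions=predictions, references=references)
# report mid F-measures scaled to percentages, matching the card's style
print({key: round(score.mid.fmeasure * 100, 4) for key, score in result.items()})
```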

Framework versions

  • Transformers 4.10.2
  • Pytorch 1.7.1+cu110
  • Datasets 1.11.0
  • Tokenizers 0.10.3