barthez-deft-linguistique

This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.

Note: this model is one of the preliminary experiments, and it underperforms the models published in the paper (which use MBartHez and HAL/Wiki pre-training plus copy mechanisms).

It achieves the following results on the evaluation set:

  • Loss: 1.7596
  • Rouge1: 41.989
  • Rouge2: 22.4524
  • RougeL: 32.7966
  • RougeLsum: 32.7953
  • Gen Len: 22.1549
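
Scores in this range are typically produced with the rouge metric from the datasets library, reported as F-measures scaled by 100. A minimal sketch of how such scores are computed (the example strings below are hypothetical, not taken from the evaluation set):

```python
# Sketch only: computing ROUGE the way summarization fine-tuning scripts
# of this era typically did, via the `datasets` library's rouge metric.
from datasets import load_metric

rouge = load_metric("rouge")

# Hypothetical prediction/reference pair, NOT from the actual evaluation set.
predictions = ["le modèle génère un résumé court du document"]
references = ["le modèle produit un court résumé du document"]

scores = rouge.compute(predictions=predictions, references=references)

# Each value is an AggregateScore; the mid F-measure (x100) matches the
# scale of the numbers reported above.
print({key: round(value.mid.fmeasure * 100, 4) for key, value in scores.items()})
```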

Model description

More information needed

Intended uses & limitations

More information needed
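
In the absence of official usage notes, the checkpoint can presumably be loaded like any other BARThez-based summarization model. The sketch below is illustrative only; the hub id is an assumption based on the card title, and the input text is hypothetical:

```python
# Illustrative sketch: loading the checkpoint for French summarization.
# ASSUMPTION: the hub id "moussaKam/barthez-deft-linguistique" is inferred
# from the card title and base model; verify it before use.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "moussaKam/barthez-deft-linguistique"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Texte d'exemple à résumer."  # hypothetical input document
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```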

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
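
As a minimal sketch (not the authors' actual training script), these hyperparameters map onto transformers' Seq2SeqTrainingArguments roughly as follows; the output directory and per-epoch evaluation strategy are assumptions inferred from the card:

```python
# Sketch under stated assumptions: reconstructing the listed hyperparameters
# as Seq2SeqTrainingArguments. Dataset loading and Trainer setup are omitted.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="barthez-deft-linguistique",  # assumed output directory
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,                          # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20.0,
    fp16=True,                               # "Native AMP" mixed precision
    evaluation_strategy="epoch",             # assumed: the table below evaluates once per epoch
    predict_with_generate=True,              # assumed: needed for ROUGE during evaluation
)
```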

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 3.0569        | 1.0   | 108  | 2.0282          | 31.6993 | 14.9483 | 25.5565 | 25.4379   | 18.3803 |
| 2.2892        | 2.0   | 216  | 1.8553          | 35.2563 | 18.019  | 28.3135 | 28.2927   | 18.507  |
| 1.9062        | 3.0   | 324  | 1.7696          | 37.4613 | 18.1488 | 28.9959 | 29.0134   | 19.5352 |
| 1.716         | 4.0   | 432  | 1.7641          | 37.6903 | 18.7496 | 30.1097 | 30.1027   | 18.9577 |
| 1.5722        | 5.0   | 540  | 1.7781          | 38.1013 | 19.8291 | 29.8142 | 29.802    | 19.169  |
| 1.4655        | 6.0   | 648  | 1.7661          | 38.3557 | 20.3309 | 30.5068 | 30.4728   | 19.3662 |
| 1.3507        | 7.0   | 756  | 1.7596          | 39.7409 | 20.2998 | 31.0849 | 31.1152   | 19.3944 |
| 1.2874        | 8.0   | 864  | 1.7706          | 37.7846 | 20.3457 | 30.6826 | 30.6321   | 19.4789 |
| 1.2641        | 9.0   | 972  | 1.7848          | 38.7421 | 19.5701 | 30.5798 | 30.6305   | 19.3944 |
| 1.1192        | 10.0  | 1080 | 1.8008          | 40.3313 | 20.3378 | 31.8325 | 31.8648   | 19.5493 |
| 1.0724        | 11.0  | 1188 | 1.8450          | 38.9612 | 20.5719 | 31.4496 | 31.3144   | 19.8592 |
| 1.0077        | 12.0  | 1296 | 1.8364          | 36.5997 | 18.46   | 29.1808 | 29.1705   | 19.7324 |
| 0.9362        | 13.0  | 1404 | 1.8677          | 38.0371 | 19.2321 | 30.3893 | 30.3926   | 19.6338 |
| 0.8868        | 14.0  | 1512 | 1.9154          | 36.4737 | 18.5314 | 29.325  | 29.3634   | 19.6479 |
| 0.8335        | 15.0  | 1620 | 1.9344          | 35.7583 | 18.0687 | 27.9666 | 27.8675   | 19.8028 |
| 0.8305        | 16.0  | 1728 | 1.9556          | 37.2137 | 18.2199 | 29.5959 | 29.5799   | 19.9577 |
| 0.8057        | 17.0  | 1836 | 1.9793          | 36.6834 | 17.8505 | 28.6701 | 28.7145   | 19.7324 |
| 0.7869        | 18.0  | 1944 | 1.9994          | 37.5918 | 19.1984 | 28.8569 | 28.8278   | 19.7606 |
| 0.7549        | 19.0  | 2052 | 2.0117          | 37.3278 | 18.5169 | 28.778  | 28.7737   | 19.8028 |
| 0.7497        | 20.0  | 2160 | 2.0189          | 37.7513 | 19.1813 | 29.3675 | 29.402    | 19.6901 |

Framework versions

  • Transformers 4.10.2
  • Pytorch 1.7.1+cu110
  • Datasets 1.11.0
  • Tokenizers 0.10.3