---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: barthez-deft-sciences_de_l_information
  results:
  - task:
      name: Summarization
      type: summarization
    metrics:
    - name: Rouge1
      type: rouge
      value: 34.5672
---
# barthez-deft-sciences_de_l_information

This model is a fine-tuned version of [moussaKam/barthez](https://huggingface.co/moussaKam/barthez) on an unknown dataset.

**Note**: this model is one of the preliminary experiments, and it underperforms the models published in the paper (which use [MBartHez](https://huggingface.co/moussaKam/mbarthez) and HAL/Wiki pre-training plus copy mechanisms).

It achieves the following results on the evaluation set:

- Loss: 2.0258
- Rouge1: 34.5672
- Rouge2: 16.7861
- Rougel: 27.5573
- Rougelsum: 27.6099
- Gen Len: 17.8857
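
For readers unfamiliar with the ROUGE figures above, here is a minimal sketch of ROUGE-1 F1 as clipped unigram overlap, scaled to 0-100 as in the list. It is a simplification: lowercasing and whitespace tokenization are assumptions, and the function name `rouge1_f1` and the French sentences are purely illustrative. The scores in this card came from the Trainer's evaluation pipeline, whose tokenization may differ, so this code is not expected to reproduce them exactly.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 (0-100) using lowercase whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical reference summary and model output, for illustration only.
reference = "le chat dort sur le canapé"
candidate = "le chat dort"
print(round(rouge1_f1(reference, candidate), 2))  # 66.67
```

In this example the candidate is fully contained in the reference (precision 1.0) but covers only half of it (recall 0.5), which yields an F1 of about 66.67.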
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20.0
- mixed_precision_training: Native AMP
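
To make the `lr_scheduler_type: linear` setting concrete, here is a small sketch of the learning rate it implies over the run. It assumes no warmup (none is listed above) and takes the total of 2120 optimizer steps from the training results table. `linear_lr` is an illustrative helper, not a Transformers function.

```python
BASE_LR = 3e-05      # learning_rate from the list above
TOTAL_STEPS = 2120   # final step in the training results table (20 epochs x 106 steps)
WARMUP_STEPS = 0     # assumption: the card does not list any warmup


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Learning rate at the start, the midpoint, and the end of training.
for step in (0, 1060, 2120):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 3e-05, is halved by the end of epoch 10, and reaches zero at step 2120. With a batch size of 4 and 106 steps per epoch, the training set would hold roughly 421 to 424 examples, assuming a single device and no gradient accumulation.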

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 3.3405 | 1.0 | 106 | 2.3682 | 31.3511 | 12.1973 | 25.6977 | 25.6851 | 14.9714 |
| 2.4219 | 2.0 | 212 | 2.1891 | 30.1154 | 13.3459 | 25.4854 | 25.5403 | 14.0429 |
| 2.0789 | 3.0 | 318 | 2.0994 | 32.153 | 15.3865 | 26.1859 | 26.1672 | 15.2 |
| 1.869 | 4.0 | 424 | 2.0258 | 34.5797 | 16.4194 | 27.6909 | 27.7201 | 16.9857 |
| 1.6569 | 5.0 | 530 | 2.0417 | 34.3854 | 16.5237 | 28.7036 | 28.8258 | 15.2429 |
| 1.5414 | 6.0 | 636 | 2.0503 | 33.1768 | 15.4851 | 27.2818 | 27.2884 | 16.0143 |
| 1.4461 | 7.0 | 742 | 2.0293 | 35.4273 | 16.118 | 27.3622 | 27.393 | 16.6857 |
| 1.3435 | 8.0 | 848 | 2.0336 | 35.3471 | 15.9695 | 27.668 | 27.6749 | 17.2 |
| 1.2624 | 9.0 | 954 | 2.0779 | 35.9201 | 17.2547 | 27.409 | 27.3293 | 17.1857 |
| 1.1807 | 10.0 | 1060 | 2.1301 | 35.7061 | 15.9138 | 27.3968 | 27.4716 | 17.1286 |
| 1.0972 | 11.0 | 1166 | 2.1726 | 34.3194 | 16.1313 | 27.0367 | 27.0737 | 17.1429 |
| 1.0224 | 12.0 | 1272 | 2.1704 | 34.9278 | 16.7958 | 27.8754 | 27.932 | 16.6571 |
| 1.0181 | 13.0 | 1378 | 2.2458 | 34.472 | 15.9111 | 28.2938 | 28.2946 | 16.7571 |
| 0.9769 | 14.0 | 1484 | 2.3405 | 35.1592 | 16.3135 | 29.0956 | 29.0858 | 16.5429 |
| 0.8866 | 15.0 | 1590 | 2.3303 | 34.8732 | 15.6709 | 27.5858 | 27.6169 | 16.2429 |
| 0.8888 | 16.0 | 1696 | 2.2976 | 35.3034 | 16.8011 | 27.7988 | 27.7569 | 17.5143 |
| 0.8358 | 17.0 | 1802 | 2.3349 | 35.505 | 16.8851 | 28.3651 | 28.413 | 16.8143 |
| 0.8026 | 18.0 | 1908 | 2.3738 | 35.2328 | 17.0358 | 28.544 | 28.6211 | 16.6143 |
| 0.7487 | 19.0 | 2014 | 2.4103 | 34.0793 | 15.4468 | 27.8057 | 27.8586 | 16.7286 |
| 0.7722 | 20.0 | 2120 | 2.3991 | 34.8116 | 15.8706 | 27.9173 | 27.983 | 16.9286 |
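
In the table, validation loss bottoms out early while training loss keeps falling, which suggests overfitting in the later epochs. The sketch below makes this explicit by locating the best epoch for two of the columns. The values are copied from the table; the tuple layout and variable names are illustrative.

```python
# (epoch, training_loss, validation_loss, rouge1), copied from the table above.
rows = [
    (1, 3.3405, 2.3682, 31.3511), (2, 2.4219, 2.1891, 30.1154),
    (3, 2.0789, 2.0994, 32.153), (4, 1.869, 2.0258, 34.5797),
    (5, 1.6569, 2.0417, 34.3854), (6, 1.5414, 2.0503, 33.1768),
    (7, 1.4461, 2.0293, 35.4273), (8, 1.3435, 2.0336, 35.3471),
    (9, 1.2624, 2.0779, 35.9201), (10, 1.1807, 2.1301, 35.7061),
    (11, 1.0972, 2.1726, 34.3194), (12, 1.0224, 2.1704, 34.9278),
    (13, 1.0181, 2.2458, 34.472), (14, 0.9769, 2.3405, 35.1592),
    (15, 0.8866, 2.3303, 34.8732), (16, 0.8888, 2.2976, 35.3034),
    (17, 0.8358, 2.3349, 35.505), (18, 0.8026, 2.3738, 35.2328),
    (19, 0.7487, 2.4103, 34.0793), (20, 0.7722, 2.3991, 34.8116),
]

# Epoch with the lowest validation loss and epoch with the highest Rouge1.
best_loss = min(rows, key=lambda r: r[2])
best_rouge1 = max(rows, key=lambda r: r[3])

print(f"lowest validation loss: {best_loss[2]} at epoch {best_loss[0]}")
print(f"highest Rouge1: {best_rouge1[3]} at epoch {best_rouge1[0]}")
```

The lowest validation loss (2.0258, epoch 4) equals the loss reported at the top of this card, while the highest Rouge1 (35.9201) occurs at epoch 9. The card does not state how the reported checkpoint was selected.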

### Framework versions

- Transformers 4.10.2
- PyTorch 1.7.1+cu110
- Datasets 1.11.0
- Tokenizers 0.10.3