---
license: mit
base_model: facebook/bart-large-cnn
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: bart-large-cnn-finetuned-laws_articles
    results: []
---

# bart-large-cnn-finetuned-laws_articles

This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 2.1271
- Rouge1: 36.7269
- Rouge2: 16.9683
- RougeL: 27.0421
- RougeLsum: 28.4193
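
For orientation, here is a minimal usage sketch with the `transformers` summarization pipeline, assuming the checkpoint is published on the Hub under the hypothetical id `viktor-shevchuk/bart-large-cnn-finetuned-laws_articles` (substitute the actual repository id or a local checkpoint path):

```python
from transformers import pipeline

# Hypothetical Hub id; replace with the real repository id or a local path.
summarizer = pipeline(
    "summarization",
    model="viktor-shevchuk/bart-large-cnn-finetuned-laws_articles",
)

article = (
    "Article 12. The parties to a contract shall perform their obligations "
    "in good faith and in accordance with the terms agreed upon, taking into "
    "account the customs of business practice and the requirements of law."
)
# Toy input; generation lengths should be tuned to the target summary style.
result = summarizer(article, max_length=60, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```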

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent training arguments follows the list):

- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
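
The training script itself is not included in the card; the following is a minimal sketch of `Seq2SeqTrainingArguments` matching the values above (`output_dir`, the per-epoch evaluation strategy, and `predict_with_generate` are assumptions not stated in the card):

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="bart-large-cnn-finetuned-laws_articles",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=8,  # effective train batch size: 2 * 8 = 16
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                      # "Native AMP" mixed-precision training
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
    evaluation_strategy="epoch",    # assumption: matches the per-epoch results table
    predict_with_generate=True,     # assumption: required to compute ROUGE during eval
)
```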

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 86   | 2.0321          | 36.6897 | 16.3485 | 27.3189 | 27.9101   |
| No log        | 1.99  | 172  | 1.9454          | 38.9231 | 18.8033 | 29.5893 | 30.6478   |
| No log        | 3.0   | 259  | 1.9194          | 39.8043 | 19.5213 | 29.9679 | 31.526    |
| No log        | 3.99  | 345  | 1.9581          | 38.7543 | 18.0651 | 28.0544 | 29.5525   |
| No log        | 4.99  | 431  | 2.0134          | 36.5099 | 17.177  | 27.2934 | 28.4522   |
| 1.5279        | 6.0   | 518  | 2.1271          | 36.7269 | 16.9683 | 27.0421 | 28.4193   |
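
The Rouge values in the table are on a 0-100 scale. The evaluation code is not part of the card; here is a minimal sketch of how such scores are typically computed with the `evaluate` library:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy prediction/reference pair; in practice the predictions come from
# model.generate() over the validation set, decoded back to text.
predictions = ["the court dismissed the appeal"]
references = ["the appeal was dismissed by the court"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# evaluate returns fractions in [0, 1]; scale by 100 to match the table.
print({k: round(v * 100, 4) for k, v in scores.items()})
```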

### Framework versions

- Transformers 4.35.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.7
- Tokenizers 0.14.1