---
license: mit
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: bart-base-cnn-xsum-swe
  results: []
---

# bart-base-cnn-xsum-swe

This model is a fine-tuned version of [Gabriel/bart-base-cnn-swe](https://huggingface.co/Gabriel/bart-base-cnn-swe) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 2.1140
- Rouge1: 30.7101
- Rouge2: 11.9532
- Rougel: 25.1864
- Rougelsum: 25.2227
- Gen Len: 19.7448
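
A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub as `Gabriel/bart-base-cnn-xsum-swe`; the generation arguments are illustrative, not the settings used for the evaluation above:

```python
from transformers import pipeline

# Summarization pipeline for this checkpoint.
summarizer = pipeline("summarization", model="Gabriel/bart-base-cnn-xsum-swe")

text = "..."  # Swedish news article to summarize

# max_length / min_length are illustrative generation settings.
summary = summarizer(text, max_length=60, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```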

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding `Seq2SeqTrainingArguments` follows this list):

- learning_rate: 3.75e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
- mixed_precision_training: Native AMP
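
A hedged sketch of how these values map onto `transformers.Seq2SeqTrainingArguments`; the output directory and evaluation strategy are assumptions, the remaining values follow from the list above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-cnn-xsum-swe",  # assumed output directory
    learning_rate=3.75e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,        # effective train batch size: 16 * 2 = 32
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    fp16=True,                            # mixed precision training (native AMP)
    evaluation_strategy="epoch",          # assumed; results below are reported per epoch
    predict_with_generate=True,           # generate summaries during evaluation for ROUGE
)
```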

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.3087        | 1.0   | 6375  | 2.1997          | 29.7666 | 11.0222 | 24.2659 | 24.2915   | 19.7172 |
| 2.0793        | 2.0   | 12750 | 2.1285          | 30.4447 | 11.7671 | 24.9238 | 24.9622   | 19.7051 |
| 1.9186        | 3.0   | 19125 | 2.1140          | 30.7101 | 11.9532 | 25.1864 | 25.2227   | 19.7448 |
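
A minimal sketch of how ROUGE scores like the ones above can be computed with the `evaluate` library (not listed under framework versions, so its use here is an assumption; the example strings are placeholders):

```python
import evaluate

rouge = evaluate.load("rouge")

predictions = ["en kort sammanfattning av artikeln"]       # model-generated summaries (placeholder)
references = ["en kort sammanfattning av nyhetsartikeln"]  # reference summaries (placeholder)

# Returns rouge1, rouge2, rougeL and rougeLsum F-measures.
scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```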

### Framework versions

- Transformers 4.22.1
- Pytorch 1.12.1+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1