---
license: mit
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: bart-base-cnn-xsum-swe
  results: []
---

# bart-base-cnn-xsum-swe

This model is a fine-tuned version of [Gabriel/bart-base-cnn-swe](https://huggingface.co/Gabriel/bart-base-cnn-swe) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 2.1140
- Rouge1: 30.7101
- Rouge2: 11.9532
- Rougel: 25.1864
- Rougelsum: 25.2227
- Gen Len: 19.7448
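
A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub as `Gabriel/bart-base-cnn-xsum-swe`; the generation arguments are illustrative, not the settings used for the evaluation above:

```python
from transformers import pipeline

# Summarization pipeline for this checkpoint.
summarizer = pipeline("summarization", model="Gabriel/bart-base-cnn-xsum-swe")

text = "..."  # Swedish news article to summarize

# max_length / min_length are illustrative generation settings.
summary = summarizer(text, max_length=60, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```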

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding `Seq2SeqTrainingArguments` follows this list):

- learning_rate: 3.75e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
- mixed_precision_training: Native AMP
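
A hedged sketch of how these values map onto `transformers.Seq2SeqTrainingArguments`; the output directory and evaluation strategy are assumptions, the remaining values follow from the list above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-cnn-xsum-swe",  # assumed output directory
    learning_rate=3.75e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,        # effective train batch size: 16 * 2 = 32
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    fp16=True,                            # mixed precision training (native AMP)
    evaluation_strategy="epoch",          # assumed; results below are reported per epoch
    predict_with_generate=True,           # generate summaries during evaluation for ROUGE
)
```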

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.3087        | 1.0   | 6375  | 2.1997          | 29.7666 | 11.0222 | 24.2659 | 24.2915   | 19.7172 |
| 2.0793        | 2.0   | 12750 | 2.1285          | 30.4447 | 11.7671 | 24.9238 | 24.9622   | 19.7051 |
| 1.9186        | 3.0   | 19125 | 2.1140          | 30.7101 | 11.9532 | 25.1864 | 25.2227   | 19.7448 |
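
A minimal sketch of how ROUGE scores like the ones above can be computed with the `evaluate` library (not listed under framework versions, so its use here is an assumption; the example strings are placeholders):

```python
import evaluate

rouge = evaluate.load("rouge")

predictions = ["en kort sammanfattning av artikeln"]       # model-generated summaries (placeholder)
references = ["en kort sammanfattning av nyhetsartikeln"]  # reference summaries (placeholder)

# Returns rouge1, rouge2, rougeL and rougeLsum F-measures.
scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```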

### Framework versions

- Transformers 4.22.1
- Pytorch 1.12.1+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1