---
datasets:
- multi_news
metrics:
- bleu
- rouge
pipeline_tag: summarization
---

# Hyperparameters

- learning_rate=2e-5
- per_device_train_batch_size=14
- per_device_eval_batch_size=14
- weight_decay=0.01
- save_total_limit=3
- num_train_epochs=3
- predict_with_generate=True
- fp16=True

# Training Output

- global_step=7710
- training_loss=2.1297076629757417
- metrics={'train_runtime': 6059.0418, 'train_samples_per_second': 17.813, 'train_steps_per_second': 1.272, 'total_flos': 2.3389776681055027e+17, 'train_loss': 2.1297076629757417, 'epoch': 3.0}

# Training Results

| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
|:------|:--------------|:----------------|:-------|:-------|:-------|:----------|:-----|:--------|
| 1 | 2.223100 | 2.038599 | 0.147400 | 0.054800 | 0.113500 | 0.113500 | 0.001400 | 20.000000 |
| 2 | 2.078100 | 2.009619 | 0.152900 | 0.057800 | 0.117000 | 0.117000 | 0.001600 | 20.000000 |
| 3 | 1.989000 | 2.006006 | 0.152900 | 0.057300 | 0.116700 | 0.116700 | 0.001700 | 20.000000 |
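# Training Configuration Sketch

The hyperparameters above map directly onto `Seq2SeqTrainingArguments` from Hugging Face Transformers. The snippet below is a minimal sketch, assuming the model was fine-tuned with `Seq2SeqTrainer` on the `multi_news` dataset; the base checkpoint, `output_dir`, and `evaluation_strategy` are not stated in this card and are illustrative assumptions only.

```python
from datasets import load_dataset
from transformers import Seq2SeqTrainingArguments

# Dataset named in the card metadata (columns: "document", "summary").
dataset = load_dataset("multi_news")

# Training arguments as reported in the card; anything marked "assumed"
# is an illustrative placeholder, not taken from the original run.
training_args = Seq2SeqTrainingArguments(
    output_dir="multi_news-summarization",  # assumed output path
    evaluation_strategy="epoch",            # assumed; per-epoch validation metrics are reported
    learning_rate=2e-5,
    per_device_train_batch_size=14,
    per_device_eval_batch_size=14,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=3,
    predict_with_generate=True,
    fp16=True,
)
```

With these arguments, the remaining steps follow the standard summarization fine-tuning recipe: tokenize the `document`/`summary` pairs, build a `DataCollatorForSeq2Seq`, and pass model, datasets, and arguments to `Seq2SeqTrainer` before calling `train()`.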