---
datasets:
- multi_news
language:
- en
metrics:
- bleu
- rouge
library_name: transformers
pipeline_tag: summarization
---

# Hyperparameters

- learning_rate = 2e-5
- per_device_train_batch_size = 14
- per_device_eval_batch_size = 14
- weight_decay = 0.01
- save_total_limit = 3
- num_train_epochs = 3
- predict_with_generate = True
- fp16 = True

# Training Output

- global_step = 7710
- train_loss = 2.8554
- train_runtime = 21,924.76 s
- train_samples_per_second = 4.923
- train_steps_per_second = 0.352
- total_flos = 2.38e17
- epoch = 3.0

# Training Results

| Epoch | Training Loss | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | BLEU | Gen Len |
|:-----:|:-------------:|:---------------:|:-------:|:-------:|:-------:|:----------:|:------:|:--------:|
| 1 | 2.9812 | 2.8316 | 0.4145 | 0.1470 | 0.2307 | 0.2306 | 0.5128 | 140.73 |
| 2 | 2.8009 | 2.7894 | 0.4173 | 0.1484 | 0.2318 | 0.2317 | 0.5160 | 141.16 |
| 3 | 2.6803 | 2.7809 | 0.4183 | 0.1484 | 0.2322 | 0.2321 | 0.5168 | 140.87 |
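As a quick sanity check, the reported training metrics are mutually consistent with the listed hyperparameters. The sketch below redoes the arithmetic; it assumes a single training device and no gradient accumulation (neither is stated in this card), so `samples_seen` is an estimate under those assumptions.

```python
# Hyperparameters and metrics copied from the card above.
args = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "num_train_epochs": 3,
}

metrics = {
    "global_step": 7710,
    "train_runtime": 21924.7566,          # seconds
    "train_samples_per_second": 4.923,
    "train_steps_per_second": 0.352,
}

# 7710 steps over 3 epochs -> 2570 optimizer steps per epoch.
steps_per_epoch = metrics["global_step"] / args["num_train_epochs"]

# Assumes one device, no gradient accumulation (not stated in the card).
samples_seen = metrics["global_step"] * args["per_device_train_batch_size"]

# Throughput implied by the runtime should match the reported rates.
implied_steps_per_second = metrics["global_step"] / metrics["train_runtime"]
implied_samples_per_second = samples_seen / metrics["train_runtime"]
```

Both implied rates land within rounding distance of the reported `train_steps_per_second` and `train_samples_per_second`, which supports the single-device, no-accumulation reading.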