---
license: apache-2.0
tags:
- summarization
- generated_from_trainer
datasets:
- multi_news
metrics:
- rouge
model-index:
- name: bart-base-multi-news
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: multi_news
      type: multi_news
      config: default
      split: validation
      args: default
    metrics:
    - name: Rouge1
      type: rouge
      value: 26.31
    - name: Rouge2
      type: rouge
      value: 9.6
    - name: Rougel
      type: rouge
      value: 20.87
    - name: Rougelsum
      type: rouge
      value: 21.54
language:
- en
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# bart-base-multi-news

This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the multi_news dataset.
It achieves the following results on the evaluation set:
- Loss: 2.4147
- Rouge1: 26.31
- Rouge2: 9.6
- Rougel: 20.87
- Rougelsum: 21.54
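
To give a feel for what the ROUGE numbers measure, here is a minimal sketch of a ROUGE-1 style score. It assumes plain whitespace tokenization, no stemming, and an F-measure scaled to 0-100. The `rouge` metric behind the values above tokenizes and normalizes differently, so this simplified function will not reproduce them exactly.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified unigram-overlap F1 between two texts, scaled to 0-100."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat is on the mat")
print(round(score, 2))  # 83.33
```

Five of the six tokens match on each side, so precision and recall are both 5/6.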

## Intended uses & limitations

This model is intended for text summarization.
It was fine-tuned for a single epoch on a 10,000-sample subset, so it needs additional training to perform well at summarization.
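
A minimal inference sketch using the `pipeline` API from transformers is shown below. The repository id `your-username/bart-base-multi-news` is a hypothetical placeholder, since this card does not state the hosting namespace, and the generation settings are illustrative rather than values used by the author.

```python
# Hypothetical repository id: replace with the namespace that hosts this model.
MODEL_ID = "your-username/bart-base-multi-news"

# Illustrative generation settings, not values taken from this card.
# truncation=True cuts inputs that exceed the model's maximum input length.
GENERATION_KWARGS = {"max_length": 256, "min_length": 32, "truncation": True}


def summarize(text: str, model_id: str = MODEL_ID) -> str:
    """Return a summary of `text` produced by the summarization pipeline."""
    # Imported here so the constants above can be inspected without transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    result = summarizer(text, **GENERATION_KWARGS)
    return result[0]["summary_text"]
```

Calling `summarize(article_text)` downloads the checkpoint on first use and returns the summary as a string.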

## Training and evaluation data

The training data were 10,000 samples from the multi_news training split,
and the evaluation data were 500 samples from the multi_news validation split.
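
One way to draw subsets of these sizes is the split-slicing syntax of the datasets library, sketched below. Taking the first N examples is an assumption: this card does not say how the samples were selected, and the helper names are hypothetical.

```python
def subset_spec(split: str, n_samples: int) -> str:
    """Build a split string that selects the first `n_samples` examples."""
    return f"{split}[:{n_samples}]"


TRAIN_SPEC = subset_spec("train", 10_000)
EVAL_SPEC = subset_spec("validation", 500)


def load_subsets():
    """Load the two subsets; requires the datasets library and network access."""
    # Imported here so the split strings can be built without datasets installed.
    from datasets import load_dataset

    train = load_dataset("multi_news", split=TRAIN_SPEC)
    evaluation = load_dataset("multi_news", split=EVAL_SPEC)
    return train, evaluation
```

A random subset could be drawn instead by shuffling with a fixed seed before selecting, which would change which documents the model sees.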

## Training procedure

The model was trained with the `Seq2SeqTrainer` class from the transformers library.

### Training hyperparameters

The hyperparameters were passed to the `Seq2SeqTrainingArguments` class from the transformers library.

The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
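
The list above maps onto `Seq2SeqTrainingArguments` roughly as sketched below. The `output_dir` value is an assumption, and the batch sizes are treated as per-device values on a single device. Any argument not listed in this card is left at its library default.

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
TRAINING_KWARGS = {
    "output_dir": "bart-base-multi-news",  # assumed; not stated in the card
    "learning_rate": 5.6e-5,
    "per_device_train_batch_size": 8,  # assumes a single training device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
}


def build_training_args():
    """Create the arguments object; requires transformers to be installed."""
    # Imported here so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

The returned object can be passed as `args` when constructing a `Seq2SeqTrainer`.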

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 2.4041        | 1.0   | 1250 | 2.4147          | 26.31  | 9.6    | 20.87  | 21.54     |
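
The step count in the table is consistent with the data size and batch size reported in this card: 10,000 samples at batch size 8 for one epoch gives 1250 optimizer steps. The sketch below checks that arithmetic and shows how a linear schedule would decay the learning rate over those steps. It assumes a single device, no gradient accumulation, and zero warmup steps, none of which this card states.

```python
import math


def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    """Optimizer steps in one epoch, assuming no gradient accumulation."""
    return math.ceil(n_samples / batch_size)


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = steps_per_epoch(10_000, 8) * 1  # one epoch
print(total_steps)  # 1250
print(linear_lr(0, total_steps, 5.6e-5))    # starts at the base learning rate
print(linear_lr(625, total_steps, 5.6e-5))  # about half the base rate at the midpoint
```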


### Framework versions

- Transformers 4.30.0
- Pytorch 2.0.1+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3