---
language: en
license: mit
library_name: transformers
tags:
- summarization
- bart
datasets: ccdv/arxiv-summarization
model-index:
- name: BARTxiv
  results:
  - task:
      type: summarization
    dataset:
      name: arxiv-summarization
      type: ccdv/arxiv-summarization
      split: validation
    metrics:
    - type: rouge1
      value: 41.70204016592095
    - type: rouge2
      value: 15.134827404979639
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: cnn_dailymail
      type: cnn_dailymail
      config: 3.0.0
      split: test
    metrics:
    - type: rouge
      value: 42.6935
      name: ROUGE-1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjllYzUzNWNjZWQyMDdjNTYxYTFhNmM5MWZlNzljZWVmNTE0N2E1ZWQxNDUzZTkwNTY5OWY2YzViNDIyMDg3MiIsInZlcnNpb24iOjF9.ehl1eTGu4x9i_8rpVUvzqK6y89N0AvVHHUc_Z_A35TpR1_6hhxnxpB67RWaPd5cYhUKVvwryxHfaoLH0WHlfDg
    - type: rouge
      value: 19.9458
      name: ROUGE-2
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTdkODkyMjBlMGZlOTgzOTQ5OGE2ZmEzMjM3NDRiOTBlYzU0YTU5YmQzMDBmZTMwOWQ4Nzc3NGM4ZWZkODZhOCIsInZlcnNpb24iOjF9.ChzOw3oJ2CKdqnJr8GyRcpbhoMdmhvVelOEOZ9l9OoPS8dGF2dsZhz6pPmuIcVLuap6uPryFLJyM3s_doXEFCA
    - type: rouge
      value: 28.7611
      name: ROUGE-L
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNzg2OGZjOWNlNjYyZTQxZjRkYTMzMjE4MGI4YjI2NTRjMTRmMDYyNjBkNzk2ODdlZjVhOWY1Zjc3OTAyMTk4MyIsInZlcnNpb24iOjF9.QUE_vKtGAnZf3Dd3cM9boIZba5DPLxUtQb8I5TQgwWy6pcJ8PKNvewR5uscU6aNmIY_gcfNtyE6c-7xIxHBFAQ
    - type: rouge
      value: 39.0496
      name: ROUGE-LSUM
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjdkM2E4YTYwMWU3MTRkMDc5ODI0N2JhMWU4ODdhZTY0NDg1ZDQxMjRiYzQ4Y2Q2Y2RmZmZjZGY1YzEwNmE0NyIsInZlcnNpb24iOjF9.OhIZWxf5COw52hqK-Kan73Tsr3C3lIXS72SRYNH9Ph81JxQ1D12QeSlN6JaAtFmOWLxs_xs60H0Icbo9-letDg
    - type: loss
      value: 2.429295539855957
      name: loss
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTI5MzUxMjNhODM3ZDQ4NDk3ZTgzYTQyYjBlZTExYzI3MmJjZjdhNjhkODMyMzA0M2Q5Nzk3MTViM2QxOGJkYSIsInZlcnNpb24iOjF9.2iOkGmRyyVxJdc9oQukeKWCxu0V-5zudxIg4msELcHvks3hQwHcO8QKSZ2A7Io_QC0F999maTIqCTvPcJTvxBQ
    - type: gen_len
      value: 97.3349
      name: gen_len
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjZiZmZkNzEyYTlhOGI3YWVlMjdjOWMzYWQ4YmU5ZjI0Yzk2NDE0OTkwYjFkNTNmMWM3MDk1OWU1ZDA0NTYyOCIsInZlcnNpb24iOjF9.oE6OwT5oO8xJak7HN4L0OHzmoaSghLZqiFy24KygS21jNVpbwXj793rV5RcPkJNWJb6agRktxXqtCZyxAzqMBw
---


# BARTxiv

See the model implementation [here](https://interrsect.web.app).

This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on the [arxiv-summarization](https://huggingface.co/datasets/ccdv/arxiv-summarization) dataset.
It achieves the following results on the validation set:
- Loss: 0.87
- Rouge1: 41.70
- Rouge2: 15.13
- Rougel: 22.85
- Rougelsum: 37.77
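
As a minimal sketch of how a checkpoint like this might be used for inference, the snippet below wraps the `transformers` summarization pipeline. The repository id `your-namespace/BARTxiv` is a hypothetical placeholder (this card does not state where the weights are hosted), and the `max_length` and `min_length` defaults are illustrative assumptions rather than settings taken from training.

```python
# Hypothetical repo id: replace with the actual location of the checkpoint.
MODEL_ID = "your-namespace/BARTxiv"


def summarize(text, model_id=MODEL_ID, max_length=256, min_length=64):
    """Return an abstractive summary of `text` as a string.

    max_length / min_length bound the generated summary in tokens;
    the defaults here are illustrative, not values from this card.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # BART-large accepts at most 1024 input tokens, so longer papers
    # are truncated to the model's limit instead of raising an error.
    result = summarizer(
        text,
        max_length=max_length,
        min_length=min_length,
        truncation=True,
    )
    return result[0]["summary_text"]
```

Calling `summarize(paper_text)` downloads the checkpoint on first use. Because the input is truncated, only the opening portion of a long paper influences the summary in this sketch.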

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-6
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adafactor
- num_epochs: 9
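
The values above could be expressed with the `transformers` Trainer API roughly as follows. This is a sketch under assumptions, not the original training script: the mapping from the listed names to `Seq2SeqTrainingArguments` fields is inferred, and `build_training_args` and its `output_dir` argument are hypothetical.

```python
# Hyperparameters copied from the list above, keyed by the
# Seq2SeqTrainingArguments field each one is assumed to map to.
HYPERPARAMS = {
    "learning_rate": 1e-6,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "optim": "adafactor",
    "num_train_epochs": 9,
}


def build_training_args(output_dir):
    """Build training arguments from HYPERPARAMS (hypothetical helper)."""
    # Imported lazily so the dictionary can be inspected without
    # transformers or PyTorch installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Settings not listed on this card (evaluation schedule, generation during evaluation, gradient accumulation, and so on) are left at library defaults here and may differ from the original run.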

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 1.24          | 1.0   | 1073 | 1.24            | 38.32   | 12.80   | 20.55   | 34.50     |
| 1.04          | 2.0   | 2146 | 1.04            | 39.65   | 13.74   | 21.28   | 35.83     |
| 0.979         | 3.0   | 3219 | 0.98            | 40.19   | 14.30   | 21.87   | 36.38     |
| 0.970         | 4.0   | 4292 | 0.97            | 40.87   | 14.44   | 22.14   | 36.89     |
| 0.918         | 5.0   | 5365 | 0.92            | 41.17   | 14.94   | 22.54   | 37.40     |
| 0.901         | 6.0   | 6438 | 0.90            | 41.02   | 14.65   | 22.46   | 37.05     |
| 0.889         | 7.0   | 7511 | 0.89            | 41.32   | 15.09   | 22.64   | 37.42     |
| 0.900         | 8.0   | 8584 | 0.90            | 41.23   | 15.02   | 22.67   | 37.28     |
| 0.869         | 9.0   | 9657 | 0.87            | 41.70   | 15.13   | 22.85   | 37.77     |
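
To make the Rouge1 and Rouge2 columns concrete, here is a simplified illustration of ROUGE-N as an n-gram overlap F-measure on a 0 to 100 scale. It is not the scorer used to produce the numbers above: it assumes lowercased whitespace tokenization with no stemming, and the function names are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F-measure, scaled to 0-100."""
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
rouge1 = rouge_n(reference, candidate, n=1)  # 5 of 6 unigrams overlap
rouge2 = rouge_n(reference, candidate, n=2)  # 3 of 5 bigrams overlap
```

Under this definition, Rouge2 is lower than Rouge1 for the same pair of texts whenever a single word change breaks the bigrams on both sides of it, which matches the gap between the two columns in the table.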

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1