---
language: en
license: mit
library_name: transformers
tags:
- summarization
- bart
datasets:
- ccdv/arxiv-summarization
model-index:
- name: BARTxiv
results:
- task:
type: summarization
dataset:
name: arxiv-summarization
type: ccdv/arxiv-summarization
split: validation
metrics:
- type: rouge1
value: 41.70204016592095
- type: rouge2
value: 15.134827404979639
- task:
type: summarization
name: Summarization
dataset:
name: cnn_dailymail
type: cnn_dailymail
config: 3.0.0
split: test
metrics:
- type: rouge
value: 42.6935
name: ROUGE-1
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjllYzUzNWNjZWQyMDdjNTYxYTFhNmM5MWZlNzljZWVmNTE0N2E1ZWQxNDUzZTkwNTY5OWY2YzViNDIyMDg3MiIsInZlcnNpb24iOjF9.ehl1eTGu4x9i_8rpVUvzqK6y89N0AvVHHUc_Z_A35TpR1_6hhxnxpB67RWaPd5cYhUKVvwryxHfaoLH0WHlfDg
- type: rouge
value: 19.9458
name: ROUGE-2
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTdkODkyMjBlMGZlOTgzOTQ5OGE2ZmEzMjM3NDRiOTBlYzU0YTU5YmQzMDBmZTMwOWQ4Nzc3NGM4ZWZkODZhOCIsInZlcnNpb24iOjF9.ChzOw3oJ2CKdqnJr8GyRcpbhoMdmhvVelOEOZ9l9OoPS8dGF2dsZhz6pPmuIcVLuap6uPryFLJyM3s_doXEFCA
- type: rouge
value: 28.7611
name: ROUGE-L
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNzg2OGZjOWNlNjYyZTQxZjRkYTMzMjE4MGI4YjI2NTRjMTRmMDYyNjBkNzk2ODdlZjVhOWY1Zjc3OTAyMTk4MyIsInZlcnNpb24iOjF9.QUE_vKtGAnZf3Dd3cM9boIZba5DPLxUtQb8I5TQgwWy6pcJ8PKNvewR5uscU6aNmIY_gcfNtyE6c-7xIxHBFAQ
- type: rouge
value: 39.0496
name: ROUGE-LSUM
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjdkM2E4YTYwMWU3MTRkMDc5ODI0N2JhMWU4ODdhZTY0NDg1ZDQxMjRiYzQ4Y2Q2Y2RmZmZjZGY1YzEwNmE0NyIsInZlcnNpb24iOjF9.OhIZWxf5COw52hqK-Kan73Tsr3C3lIXS72SRYNH9Ph81JxQ1D12QeSlN6JaAtFmOWLxs_xs60H0Icbo9-letDg
- type: loss
value: 2.429295539855957
name: loss
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTI5MzUxMjNhODM3ZDQ4NDk3ZTgzYTQyYjBlZTExYzI3MmJjZjdhNjhkODMyMzA0M2Q5Nzk3MTViM2QxOGJkYSIsInZlcnNpb24iOjF9.2iOkGmRyyVxJdc9oQukeKWCxu0V-5zudxIg4msELcHvks3hQwHcO8QKSZ2A7Io_QC0F999maTIqCTvPcJTvxBQ
- type: gen_len
value: 97.3349
name: gen_len
verified: true
verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjZiZmZkNzEyYTlhOGI3YWVlMjdjOWMzYWQ4YmU5ZjI0Yzk2NDE0OTkwYjFkNTNmMWM3MDk1OWU1ZDA0NTYyOCIsInZlcnNpb24iOjF9.oE6OwT5oO8xJak7HN4L0OHzmoaSghLZqiFy24KygS21jNVpbwXj793rV5RcPkJNWJb6agRktxXqtCZyxAzqMBw
---
# BARTxiv
See the model implementation [here](https://interrsect.web.app).
This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on the [arxiv-summarization](https://huggingface.co/datasets/ccdv/arxiv-summarization) dataset.
It achieves the following results on the validation set:
- Loss: 0.86
- Rouge1: 41.70
- Rouge2: 15.13
- Rougel: 22.85
- Rougelsum: 37.77
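BART accepts at most 1,024 input tokens, while arXiv articles routinely exceed that, so long inputs must be truncated or chunked before summarization. The helper below is a hypothetical sketch, not part of this repository; the 700-word limit is a rough heuristic (BART's BPE tokenizer usually yields under 1,024 tokens for ~700 English words), and the repo id in the commented usage is a placeholder.

```python
def chunk_words(text: str, max_words: int = 700) -> list[str]:
    """Split text into word-level chunks that fit the model's input window.

    700 words is a heuristic chosen so the BPE-tokenized chunk stays
    under BART's 1,024-token limit for typical English prose.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

# Usage with transformers (requires downloading the checkpoint; repo id is a placeholder):
# from transformers import pipeline
# summarizer = pipeline("summarization", model="<this-repo-id>")
# summary = " ".join(s["summary_text"] for s in summarizer(chunk_words(article)))
```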
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-6
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adafactor
- num_epochs: 9
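The step counts in the results table below are cumulative optimizer steps, at 1,073 steps per epoch; a quick arithmetic check (with batch size 1, this implies 1,073 examples per epoch, which would be a subsample of the full arXiv train split or the effect of gradient accumulation — an assumption, as the preprocessing is not documented here):

```python
# Reconstruct the cumulative step column of the "Training results" table.
steps_per_epoch = 1073  # steps per epoch at batch size 1
epochs = 9
cumulative_steps = [steps_per_epoch * e for e in range(1, epochs + 1)]
print(cumulative_steps)  # last entry should match the final step count, 9657
```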
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 1.24 | 1.0 | 1073 | 1.24 | 38.32 | 12.80 | 20.55 | 34.50 |
| 1.04 | 2.0 | 2146 | 1.04 | 39.65 | 13.74 | 21.28 | 35.83 |
| 0.979 | 3.0 | 3219 | 0.98 | 40.19 | 14.30 | 21.87 | 36.38 |
| 0.970 | 4.0 | 4292 | 0.97 | 40.87 | 14.44 | 22.14 | 36.89 |
| 0.918 | 5.0 | 5365 | 0.92 | 41.17 | 14.94 | 22.54 | 37.40 |
| 0.901 | 6.0 | 6438 | 0.90 | 41.02 | 14.65 | 22.46 | 37.05 |
| 0.889 | 7.0 | 7511 | 0.89 | 41.32 | 15.09 | 22.64 | 37.42 |
| 0.900 | 8.0 | 8584 | 0.90 | 41.23 | 15.02 | 22.67 | 37.28 |
| 0.869 | 9.0 | 9657 | 0.87 | 41.70 | 15.13 | 22.85 | 37.77 |
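The Rouge1 and Rouge2 columns above are unigram and bigram overlap F-scores. A minimal pure-Python sketch of ROUGE-1 F1 (simplified: single reference, no stemming or stopword handling, unlike the full metric used to produce these numbers):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram-overlap precision/recall, harmonically combined."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```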
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1 |