Ezi committed on
Commit
1853dce
·
1 Parent(s): d39c816

Delete README.md

  1. README.md +0 -201
README.md DELETED
@@ -1,201 +0,0 @@
1
- ---
2
- tags:
3
- - pegasus
4
-
5
- ---
6
- # Model Card for brio-xsum-cased
7
-
8
-
9
- # Model Details
10
-
11
- ## Model Description
12
-
13
- BRIO: Bringing Order to Abstractive Summarization
14
-
15
- - **Developed by:** Yale LILY Lab
16
- - **Shared by [Optional]:** Hugging Face
17
- - **Model type:** PEGASUS
18
- - **Language(s) (NLP):** English
19
- - **License:** More information needed
20
- - **Related Models:**
21
- - **Parent Model:** PEGASUS
22
- - **Resources for more information:**
23
- - [Github Repo](https://github.com/Yale-LILY/BRIO)
24
- - [Associated Paper](https://arxiv.org/abs/2203.16804)
25
- - [Associated Space](https://huggingface.co/spaces/darveen/text_summarizer)
26
-
27
-
28
- # Uses
29
-
30
- ## Direct Use
31
- This model can be used for abstractive summarization, a Text2Text Generation task.
32
-
33
- ## Downstream Use [Optional]
34
-
35
- The model creators note in the [associated paper](https://arxiv.org/abs/2203.16804):
36
- > It is possible to apply our method in a reinforcement learning setting, where the candidate summaries are dynamically generated.
37
-
38
- ## Out-of-Scope Use
39
-
40
-
41
- The model should not be used to intentionally create hostile or alienating environments for people.
42
-
43
- # Bias, Risks, and Limitations
44
-
45
-
46
- Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
47
-
48
-
49
- ## Recommendations
50
-
51
-
52
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
53
-
54
-
55
- # Training Details
56
-
57
- ## Training Data
58
- The model creators note in the [associated paper](https://arxiv.org/abs/2203.16804):
59
- > CNNDM is a large scale news dataset. Following Nallapati et al., we treat the news articles as the source documents and the associated highlights as the summaries.
60
- > XSum is a highly abstractive dataset of articles from the British Broadcasting Corporation (BBC).
61
- > NYT contains articles from the New York Times and the associated summaries.
62
-
63
- ## Training Procedure
64
-
65
-
66
- ### Preprocessing
67
- The model creators note in the [associated paper](https://arxiv.org/abs/2203.16804), with reference to the NYT dataset:
68
- > We follow Kedzie et al. (2018) for data preprocessing and splitting, and use the associated archival abstracts as the summaries.
69
-
70
- ### Speeds, Sizes, Times
71
-
72
- More information needed
73
-
74
- # Evaluation
75
-
76
-
77
- ## Testing Data, Factors & Metrics
78
-
79
- ### Testing Data
80
-
81
- More information needed
82
-
83
- ### Factors
84
-
85
- More information needed
86
-
87
- ### Metrics
88
-
89
- More information needed
90
-
91
- ## Results
92
-
93
-
94
-
95
- ### CNNDM
96
- | | ROUGE-1 | ROUGE-2 | ROUGE-L |
97
- |----------|---------|---------|---------|
98
- | BART | 44.16 | 21.28 | 40.90 |
99
- | Ours | 47.78 | 23.55 | 44.57 |
100
-
101
-
102
- ### XSum
103
- | | ROUGE-1 | ROUGE-2 | ROUGE-L |
104
- |----------|---------|---------|---------|
105
- | Pegasus | 47.21 | 24.56 | 39.25 |
106
- | Ours | 49.07 | 25.59 | 40.40 |
107
-
108
-
109
- ### NYT
110
- | | ROUGE-1 | ROUGE-2 | ROUGE-L |
111
- |----------|---------|---------|---------|
112
- | BART | 55.78 | 36.61 | 52.60 |
113
- | Ours | 57.75 | 38.64 | 54.54 |
114
-
115
-
116
-
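The tables above report ROUGE-1, ROUGE-2, and ROUGE-L. As a rough illustration of what these metrics measure, the sketch below computes F1 variants from n-gram overlap and longest common subsequence. It is a simplified approximation, not the evaluation script behind the reported numbers: it assumes lowercased whitespace tokenization with no stemming, and the helper names (`rouge_n`, `rouge_l`) are hypothetical. Published scores are typically these values multiplied by 100.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, candidate_total, reference_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (simplified tokenization)."""
    cand = _ngrams(candidate.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter intersection clips counts
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def _lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_length(cand, ref), len(cand), len(ref))
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)` counts three shared bigrams out of five on each side, giving an F1 of 0.6.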
117
- # Model Examination
118
- The model creators note in the [associated paper](https://arxiv.org/abs/2203.16804):
119
- > We attribute BRIO-Ctr’s superior performance to its use of the same model architecture (BART) for both candidate generation and scoring, while SimCLS uses RoBERTa as the evaluation model. As a result, BRIO-Ctr maximizes the parameter sharing between the two stages, and preserves the power of the Seq2Seq model pre-trained on the same dataset.
120
-
121
- # Environmental Impact
122
-
123
-
124
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
125
-
126
- - **Hardware Type:** More information needed
127
- - **Hours used:** More information needed
128
- - **Cloud Provider:** More information needed
129
- - **Compute Region:** More information needed
130
- - **Carbon Emitted:** More information needed
131
-
132
- # Technical Specifications [optional]
133
-
134
- ## Model Architecture and Objective
135
- The model creators note in the [associated paper](https://arxiv.org/abs/2203.16804):
136
-
137
- > Formulate summarization as a sequence-to-sequence (Seq2Seq) problem
138
-
139
- ## Compute Infrastructure
140
-
141
- More information needed
142
-
143
- ### Hardware
144
-
145
- More information needed
146
-
147
- ### Software
148
-
149
- More information needed
150
-
151
- # Citation
152
-
153
-
154
- **BibTeX:**
155
- ```
156
- @misc{https://doi.org/10.48550/arxiv.2203.16804,
157
- doi = {10.48550/ARXIV.2203.16804},
158
-
159
- url = {https://arxiv.org/abs/2203.16804},
160
-
161
- author = {Liu, Yixin and Liu, Pengfei and Radev, Dragomir and Neubig, Graham},
162
-
163
- keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
164
-
165
- title = {BRIO: Bringing Order to Abstractive Summarization} }
166
- ```
167
-
168
-
169
-
170
- # Glossary [optional]
171
-
172
- More information needed
173
-
174
- # More Information [optional]
175
-
176
- More information needed
177
-
178
- # Model Card Authors [optional]
179
-
180
- Yale LILY Lab in collaboration with Ezi Ozoani and the Hugging Face team
181
-
182
- # Model Card Contact
183
-
184
- More information needed
185
-
186
- # How to Get Started with the Model
187
-
188
- Use the code below to get started with the model.
189
-
190
- <details>
191
- <summary> Click to expand </summary>
192
-
193
- ```python
194
- from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
195
-
196
- tokenizer = AutoTokenizer.from_pretrained("Yale-LILY/brio-xsum-cased")
197
-
198
- model = AutoModelForSeq2SeqLM.from_pretrained("Yale-LILY/brio-xsum-cased")
199
-
200
- ```
201
- </details>
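The snippet in the deleted card only loads the tokenizer and model. A minimal sketch of how a summary might then be generated is given below. The function names (`summarize`, `load_and_summarize`) and the generation settings (`max_input_tokens`, `num_beams`, `max_summary_tokens`) are illustrative assumptions, not values taken from the model card or the paper.

```python
def summarize(article, tokenizer, model,
              max_input_tokens=512, num_beams=8, max_summary_tokens=64):
    """Generate one summary for `article` with a loaded seq2seq model.

    The default lengths and beam size are assumed values for illustration.
    """
    # Tokenize a single-article batch, truncating overly long inputs.
    inputs = tokenizer([article], max_length=max_input_tokens,
                       truncation=True, return_tensors="pt")
    # Beam search decoding; returns a batch of token-id sequences.
    summary_ids = model.generate(**inputs, num_beams=num_beams,
                                 max_new_tokens=max_summary_tokens)
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]


def load_and_summarize(article, model_id="Yale-LILY/brio-xsum-cased"):
    """Load the checkpoint named in the card and summarize `article`.

    The import is deferred so that `summarize` can be used or tested
    without transformers installed; calling this downloads the weights.
    """
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return summarize(article, tokenizer, model)
```

Calling `load_and_summarize("...article text...")` would return a single summary string, assuming the checkpoint is still available under that identifier.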