autoevaluator HF staff committed on
Commit eb755ed · 1 Parent(s): 78d0bd9

Add verifyToken field to verify evaluation results are produced by Hugging Face's automatic model evaluator


Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋! We've added a new `verifyToken` field to your evaluation results to verify that they are produced by the model evaluator. Accept this PR to ensure that your results remain listed as **verified** on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards).
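
For readers curious about what a `verifyToken` value contains, here is a minimal sketch, not the evaluator's own tooling. It assumes the token is a JWT-style string of three base64url segments (header, payload, signature), which is what the values in this diff look like. The helper names `b64url_decode` and `inspect_verify_token` are hypothetical. The sketch only decodes the segments; it does not verify the signature, because the evaluator's public key is not part of this PR, and the PR does not say what the `hash` field is computed over.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWT-style tokens omit."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def inspect_verify_token(token: str) -> dict:
    """Split a token into its three segments and decode them WITHOUT verifying the signature."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    return {
        "header": json.loads(b64url_decode(header_b64)),
        "payload": json.loads(b64url_decode(payload_b64)),
        # Only the signature length is reported; checking it would need the signer's public key.
        "signature_bytes": len(b64url_decode(signature_b64)),
    }


# Token copied from the first ROUGE-1 entry added in this diff.
token = (
    "eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9"
    ".eyJoYXNoIjoiYjQ1NjY1OGVjYWEwMzBjMzk3ZmMyZDA0ZTcxOTdmZTUxNTc0OGYxYmY3MzJkMzFmYTVjNzU2ZTk4MzE0NWMzMSIsInZlcnNpb24iOjF9"
    ".PSHB6DMF6tkwSw5nsFE57a2ApRAy_tkS6ziKA6PSTWddEdaqfca4pfig6_olmRmcS4KxN6HHcsmioHzv4LJQBw"
)

info = inspect_verify_token(token)
print(info["header"])              # algorithm and token type declared by the signer
print(info["payload"]["version"])  # payload carries a "hash" string and a "version" number
print(info["signature_bytes"])     # length of the raw signature in bytes
```

Because the signed payload carries a hash, hand-editing a metric `value` while keeping the old token would presumably no longer match what the evaluator signed; that reading is an inference from the token's shape, not something this PR states.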

Files changed (1)
  1. README.md +200 -157
README.md CHANGED
@@ -1,18 +1,18 @@
1
  ---
2
- languages: en
3
  license:
4
  - apache-2.0
5
  - bsd-3-clause
6
- datasets:
7
- - kmfoda/booksum
8
  tags:
9
  - summarization
10
  - summary
11
  - booksum
12
  - long-document
13
  - long-form
 
 
14
  metrics:
15
  - rouge
 
16
  widget:
17
  - text: large earthquakes along a given fault segment do not occur at random intervals
18
  because it takes time to accumulate the strain energy for the rupture. The rates
@@ -27,39 +27,38 @@ widget:
27
  deviation of the average recurrence interval, the more specific could be the long
28
  term prediction of a future mainshock.
29
  example_title: earthquakes
30
- - text: " A typical feed-forward neural field algorithm. Spatiotemporal coordinates\
31
- \ are fed into a neural network that predicts values in the reconstructed domain.\
32
- \ Then, this domain is mapped to the sensor domain where sensor measurements are\
33
- \ available as supervision. Class and Section Problems Addressed Generalization\
34
- \ (Section 2) Inverse problems, ill-posed problems, editability; symmetries. Hybrid\
35
- \ Representations (Section 3) Computation & memory efficiency, representation\
36
- \ capacity, editability: Forward Maps (Section 4) Inverse problems Network Architecture\
37
- \ (Section 5) Spectral bias, integration & derivatives. Manipulating Neural Fields\
38
- \ (Section 6) Edit ability, constraints, regularization. Table 2: The five classes\
39
- \ of techniques in the neural field toolbox each addresses problems that arise\
40
- \ in learning, inference, and control. (Section 3). We can supervise reconstruction\
41
- \ via differentiable forward maps that transform Or project our domain (e.g, 3D\
42
- \ reconstruction via 2D images; Section 4) With appropriate network architecture\
43
- \ choices, we can overcome neural network spectral biases (blurriness) and efficiently\
44
- \ compute derivatives and integrals (Section 5). Finally, we can manipulate neural\
45
- \ fields to add constraints and regularizations, and to achieve editable representations\
46
- \ (Section 6). Collectively, these classes constitute a 'toolbox' of techniques\
47
- \ to help solve problems with neural fields There are three components in a conditional\
48
- \ neural field: (1) An encoder or inference function \u20AC that outputs the conditioning\
49
- \ latent variable 2 given an observation 0 E(0) =2. 2 is typically a low-dimensional\
50
- \ vector, and is often referred to aS a latent code Or feature code_ (2) A mapping\
51
- \ function 4 between Z and neural field parameters O: Y(z) = O; (3) The neural\
52
- \ field itself $. The encoder \u20AC finds the most probable z given the observations\
53
- \ O: argmaxz P(2/0). The decoder maximizes the inverse conditional probability\
54
- \ to find the most probable 0 given Z: arg- max P(Olz). We discuss different encoding\
55
- \ schemes with different optimality guarantees (Section 2.1.1), both global and\
56
- \ local conditioning (Section 2.1.2), and different mapping functions Y (Section\
57
- \ 2.1.3) 2. Generalization Suppose we wish to estimate a plausible 3D surface\
58
- \ shape given a partial or noisy point cloud. We need a suitable prior over the\
59
- \ sur- face in its reconstruction domain to generalize to the partial observations.\
60
- \ A neural network expresses a prior via the function space of its architecture\
61
- \ and parameters 0, and generalization is influenced by the inductive bias of\
62
- \ this function space (Section 5)."
63
  example_title: scientific paper
64
  - text: 'Is a else or outside the cob and tree written being of early client rope
65
  and you have is for good reasons. On to the ocean in Orange for time. By''s the
@@ -111,68 +110,82 @@ widget:
111
  the point of you of your model. This hidden data is complete by unseen. In other
112
  words, we solve our problem of validation.'
113
  example_title: transcribed audio - lecture
114
- - text: "Transformer-based models have shown to be very useful for many NLP tasks.\
115
- \ However, a major limitation of transformers-based models is its O(n^2)O(n 2)\
116
- \ time & memory complexity (where nn is sequence length). Hence, it's computationally\
117
- \ very expensive to apply transformer-based models on long sequences n > 512n>512.\
118
- \ Several recent papers, e.g. Longformer, Performer, Reformer, Clustered attention\
119
- \ try to remedy this problem by approximating the full attention matrix. You can\
120
- \ checkout \U0001F917's recent blog post in case you are unfamiliar with these\
121
- \ models.\nBigBird (introduced in paper) is one of such recent models to address\
122
- \ this issue. BigBird relies on block sparse attention instead of normal attention\
123
- \ (i.e. BERT's attention) and can handle sequences up to a length of 4096 at a\
124
- \ much lower computational cost compared to BERT. It has achieved SOTA on various\
125
- \ tasks involving very long sequences such as long documents summarization, question-answering\
126
- \ with long contexts.\nBigBird RoBERTa-like model is now available in \U0001F917\
127
- Transformers. The goal of this post is to give the reader an in-depth understanding\
128
- \ of big bird implementation & ease one's life in using BigBird with \U0001F917\
129
- Transformers. But, before going into more depth, it is important to remember that\
130
- \ the BigBird's attention is an approximation of BERT's full attention and therefore\
131
- \ does not strive to be better than BERT's full attention, but rather to be more\
132
- \ efficient. It simply allows to apply transformer-based models to much longer\
133
- \ sequences since BERT's quadratic memory requirement quickly becomes unbearable.\
134
- \ Simply put, if we would have \u221E compute & \u221E time, BERT's attention\
135
- \ would be preferred over block sparse attention (which we are going to discuss\
136
- \ in this post).\nIf you wonder why we need more compute when working with longer\
137
- \ sequences, this blog post is just right for you!\nSome of the main questions\
138
- \ one might have when working with standard BERT-like attention include:\nDo all\
139
- \ tokens really have to attend to all other tokens? Why not compute attention\
140
- \ only over important tokens? How to decide what tokens are important? How to\
141
- \ attend to just a few tokens in a very efficient way? In this blog post, we will\
142
- \ try to answer those questions.\nWhat tokens should be attended to? We will give\
143
- \ a practical example of how attention works by considering the sentence 'BigBird\
144
- \ is now available in HuggingFace for extractive question answering'. In BERT-like\
145
- \ attention, every word would simply attend to all other tokens.\nLet's think\
146
- \ about a sensible choice of key tokens that a queried token actually only should\
147
- \ attend to by writing some pseudo-code. Will will assume that the token available\
148
- \ is queried and build a sensible list of key tokens to attend to.\n>>> # let's\
149
- \ consider following sentence as an example >>> example = ['BigBird', 'is', 'now',\
150
- \ 'available', 'in', 'HuggingFace', 'for', 'extractive', 'question', 'answering']\n\
151
- >>> # further let's assume, we're trying to understand the representation of 'available'\
152
- \ i.e. >>> query_token = 'available' >>> # We will initialize an empty `set` and\
153
- \ fill up the tokens of our interest as we proceed in this section. >>> key_tokens\
154
- \ = [] # => currently 'available' token doesn't have anything to attend Nearby\
155
- \ tokens should be important because, in a sentence (sequence of words), the current\
156
- \ word is highly dependent on neighboring past & future tokens. This intuition\
157
- \ is the idea behind the concept of sliding attention."
 
 
 
 
 
 
 
 
 
 
 
 
158
  example_title: bigbird blog intro
159
- - text: "To be fair, you have to have a very high IQ to understand Rick and Morty.\
160
- \ The humour is extremely subtle, and without a solid grasp of theoretical physics\
161
- \ most of the jokes will go over a typical viewer's head. There's also Rick's\
162
- \ nihilistic outlook, which is deftly woven into his characterisation- his personal\
163
- \ philosophy draws heavily from Narodnaya Volya literature, for instance. The\
164
- \ fans understand this stuff; they have the intellectual capacity to truly appreciate\
165
- \ the depths of these jokes, to realise that they're not just funny- they say\
166
- \ something deep about LIFE. As a consequence people who dislike Rick & Morty\
167
- \ truly ARE idiots- of course they wouldn't appreciate, for instance, the humour\
168
- \ in Rick's existential catchphrase 'Wubba Lubba Dub Dub,' which itself is a cryptic\
169
- \ reference to Turgenev's Russian epic Fathers and Sons. I'm smirking right now\
170
- \ just imagining one of those addlepated simpletons scratching their heads in\
171
- \ confusion as Dan Harmon's genius wit unfolds itself on their television screens.\
172
- \ What fools.. how I pity them. \U0001F602\nAnd yes, by the way, i DO have a Rick\
173
- \ & Morty tattoo. And no, you cannot see it. It's for the ladies' eyes only- and\
174
- \ even then they have to demonstrate that they're within 5 IQ points of my own\
175
- \ (preferably lower) beforehand. Nothin personnel kid \U0001F60E"
 
 
176
  example_title: Richard & Mortimer
177
  parameters:
178
  max_length: 48
@@ -194,30 +207,36 @@ model-index:
194
  config: samsum
195
  split: test
196
  metrics:
197
- - name: ROUGE-1
198
- type: rouge
199
  value: 33.1401
 
200
  verified: true
201
- - name: ROUGE-2
202
- type: rouge
203
  value: 9.3095
 
204
  verified: true
205
- - name: ROUGE-L
206
- type: rouge
207
  value: 24.8552
 
208
  verified: true
209
- - name: ROUGE-LSUM
210
- type: rouge
211
  value: 29.0391
 
212
  verified: true
213
- - name: loss
214
- type: loss
215
  value: 2.288182497024536
 
216
  verified: true
217
- - name: gen_len
218
- type: gen_len
219
  value: 45.2173
 
220
  verified: true
 
221
  - task:
222
  type: summarization
223
  name: Summarization
@@ -227,30 +246,36 @@ model-index:
227
  config: plain_text
228
  split: test
229
  metrics:
230
- - name: ROUGE-1
231
- type: rouge
232
  value: 39.7279
 
233
  verified: true
234
- - name: ROUGE-2
235
- type: rouge
236
  value: 10.8944
 
237
  verified: true
238
- - name: ROUGE-L
239
- type: rouge
240
  value: 19.7018
 
241
  verified: true
242
- - name: ROUGE-LSUM
243
- type: rouge
244
  value: 36.5634
 
245
  verified: true
246
- - name: loss
247
- type: loss
248
  value: 2.473011016845703
 
249
  verified: true
250
- - name: gen_len
251
- type: gen_len
252
  value: 212.8243
 
253
  verified: true
 
254
  - task:
255
  type: summarization
256
  name: Summarization
@@ -260,30 +285,36 @@ model-index:
260
  config: default
261
  split: test
262
  metrics:
263
- - name: ROUGE-1
264
- type: rouge
265
  value: 42.1065
 
266
  verified: true
267
- - name: ROUGE-2
268
- type: rouge
269
  value: 15.4079
 
270
  verified: true
271
- - name: ROUGE-L
272
- type: rouge
273
  value: 24.8814
 
274
  verified: true
275
- - name: ROUGE-LSUM
276
- type: rouge
277
  value: 36.0375
 
278
  verified: true
279
- - name: loss
280
- type: loss
281
  value: 1.9130958318710327
 
282
  verified: true
283
- - name: gen_len
284
- type: gen_len
285
  value: 179.2184
 
286
  verified: true
 
287
  - task:
288
  type: summarization
289
  name: Summarization
@@ -293,30 +324,36 @@ model-index:
293
  config: kmfoda--booksum
294
  split: test
295
  metrics:
296
- - name: ROUGE-1
297
- type: rouge
298
  value: 35.2154
 
299
  verified: true
300
- - name: ROUGE-2
301
- type: rouge
302
  value: 6.8702
 
303
  verified: true
304
- - name: ROUGE-L
305
- type: rouge
306
  value: 17.6693
 
307
  verified: true
308
- - name: ROUGE-LSUM
309
- type: rouge
310
  value: 32.8365
 
311
  verified: true
312
- - name: loss
313
- type: loss
314
  value: 2.9878039360046387
 
315
  verified: true
316
- - name: gen_len
317
- type: gen_len
318
  value: 200.6785
 
319
  verified: true
 
320
  - task:
321
  type: summarization
322
  name: Summarization
@@ -326,30 +363,36 @@ model-index:
326
  config: y
327
  split: test
328
  metrics:
329
- - name: ROUGE-1
330
- type: rouge
331
  value: 37.376
 
332
  verified: true
333
- - name: ROUGE-2
334
- type: rouge
335
  value: 11.4432
 
336
  verified: true
337
- - name: ROUGE-L
338
- type: rouge
339
  value: 22.2754
 
340
  verified: true
341
- - name: ROUGE-LSUM
342
- type: rouge
343
  value: 32.5087
 
344
  verified: true
345
- - name: loss
346
- type: loss
347
  value: 2.9867310523986816
 
348
  verified: true
349
- - name: gen_len
350
- type: gen_len
351
  value: 172.7776
 
352
  verified: true
 
353
  ---
354
 
355
  # pszemraj/pegasus-x-large-book-summary
 
1
  ---
 
2
  license:
3
  - apache-2.0
4
  - bsd-3-clause
 
 
5
  tags:
6
  - summarization
7
  - summary
8
  - booksum
9
  - long-document
10
  - long-form
11
+ datasets:
12
+ - kmfoda/booksum
13
  metrics:
14
  - rouge
15
+ languages: en
16
  widget:
17
  - text: large earthquakes along a given fault segment do not occur at random intervals
18
  because it takes time to accumulate the strain energy for the rupture. The rates
 
27
  deviation of the average recurrence interval, the more specific could be the long
28
  term prediction of a future mainshock.
29
  example_title: earthquakes
30
+ - text: ' A typical feed-forward neural field algorithm. Spatiotemporal coordinates
31
+ are fed into a neural network that predicts values in the reconstructed domain.
32
+ Then, this domain is mapped to the sensor domain where sensor measurements are
33
+ available as supervision. Class and Section Problems Addressed Generalization
34
+ (Section 2) Inverse problems, ill-posed problems, editability; symmetries. Hybrid
35
+ Representations (Section 3) Computation & memory efficiency, representation capacity,
36
+ editability: Forward Maps (Section 4) Inverse problems Network Architecture (Section
37
+ 5) Spectral bias, integration & derivatives. Manipulating Neural Fields (Section
38
+ 6) Edit ability, constraints, regularization. Table 2: The five classes of techniques
39
+ in the neural field toolbox each addresses problems that arise in learning, inference,
40
+ and control. (Section 3). We can supervise reconstruction via differentiable forward
41
+ maps that transform Or project our domain (e.g, 3D reconstruction via 2D images;
42
+ Section 4) With appropriate network architecture choices, we can overcome neural
43
+ network spectral biases (blurriness) and efficiently compute derivatives and integrals
44
+ (Section 5). Finally, we can manipulate neural fields to add constraints and regularizations,
45
+ and to achieve editable representations (Section 6). Collectively, these classes
46
+ constitute a ''toolbox'' of techniques to help solve problems with neural fields
47
+ There are three components in a conditional neural field: (1) An encoder or inference
48
+ function that outputs the conditioning latent variable 2 given an observation
49
+ 0 E(0) =2. 2 is typically a low-dimensional vector, and is often referred to aS
50
+ a latent code Or feature code_ (2) A mapping function 4 between Z and neural field
51
+ parameters O: Y(z) = O; (3) The neural field itself $. The encoder € finds the
52
+ most probable z given the observations O: argmaxz P(2/0). The decoder maximizes
53
+ the inverse conditional probability to find the most probable 0 given Z: arg-
54
+ max P(Olz). We discuss different encoding schemes with different optimality guarantees
55
+ (Section 2.1.1), both global and local conditioning (Section 2.1.2), and different
56
+ mapping functions Y (Section 2.1.3) 2. Generalization Suppose we wish to estimate
57
+ a plausible 3D surface shape given a partial or noisy point cloud. We need a suitable
58
+ prior over the sur- face in its reconstruction domain to generalize to the partial
59
+ observations. A neural network expresses a prior via the function space of its
60
+ architecture and parameters 0, and generalization is influenced by the inductive
61
+ bias of this function space (Section 5).'
 
62
  example_title: scientific paper
63
  - text: 'Is a else or outside the cob and tree written being of early client rope
64
  and you have is for good reasons. On to the ocean in Orange for time. By''s the
 
110
  the point of you of your model. This hidden data is complete by unseen. In other
111
  words, we solve our problem of validation.'
112
  example_title: transcribed audio - lecture
113
+ - text: 'Transformer-based models have shown to be very useful for many NLP tasks.
114
+ However, a major limitation of transformers-based models is its O(n^2)O(n 2) time
115
+ & memory complexity (where nn is sequence length). Hence, it''s computationally
116
+ very expensive to apply transformer-based models on long sequences n > 512n>512.
117
+ Several recent papers, e.g. Longformer, Performer, Reformer, Clustered attention
118
+ try to remedy this problem by approximating the full attention matrix. You can
119
+ checkout 🤗''s recent blog post in case you are unfamiliar with these models.
120
+
121
+ BigBird (introduced in paper) is one of such recent models to address this issue.
122
+ BigBird relies on block sparse attention instead of normal attention (i.e. BERT''s
123
+ attention) and can handle sequences up to a length of 4096 at a much lower computational
124
+ cost compared to BERT. It has achieved SOTA on various tasks involving very long
125
+ sequences such as long documents summarization, question-answering with long contexts.
126
+
127
+ BigBird RoBERTa-like model is now available in 🤗Transformers. The goal of this
128
+ post is to give the reader an in-depth understanding of big bird implementation
129
+ & ease one''s life in using BigBird with 🤗Transformers. But, before going into
130
+ more depth, it is important to remember that the BigBird''s attention is an approximation
131
+ of BERT''s full attention and therefore does not strive to be better than BERT''s
132
+ full attention, but rather to be more efficient. It simply allows to apply transformer-based
133
+ models to much longer sequences since BERT''s quadratic memory requirement quickly
134
+ becomes unbearable. Simply put, if we would have compute & time, BERT''s attention
135
+ would be preferred over block sparse attention (which we are going to discuss
136
+ in this post).
137
+
138
+ If you wonder why we need more compute when working with longer sequences, this
139
+ blog post is just right for you!
140
+
141
+ Some of the main questions one might have when working with standard BERT-like
142
+ attention include:
143
+
144
+ Do all tokens really have to attend to all other tokens? Why not compute attention
145
+ only over important tokens? How to decide what tokens are important? How to attend
146
+ to just a few tokens in a very efficient way? In this blog post, we will try to
147
+ answer those questions.
148
+
149
+ What tokens should be attended to? We will give a practical example of how attention
150
+ works by considering the sentence ''BigBird is now available in HuggingFace for
151
+ extractive question answering''. In BERT-like attention, every word would simply
152
+ attend to all other tokens.
153
+
154
+ Let''s think about a sensible choice of key tokens that a queried token actually
155
+ only should attend to by writing some pseudo-code. Will will assume that the token
156
+ available is queried and build a sensible list of key tokens to attend to.
157
+
158
+ >>> # let''s consider following sentence as an example >>> example = [''BigBird'',
159
+ ''is'', ''now'', ''available'', ''in'', ''HuggingFace'', ''for'', ''extractive'',
160
+ ''question'', ''answering'']
161
+
162
+ >>> # further let''s assume, we''re trying to understand the representation of
163
+ ''available'' i.e. >>> query_token = ''available'' >>> # We will initialize an
164
+ empty `set` and fill up the tokens of our interest as we proceed in this section.
165
+ >>> key_tokens = [] # => currently ''available'' token doesn''t have anything
166
+ to attend Nearby tokens should be important because, in a sentence (sequence of
167
+ words), the current word is highly dependent on neighboring past & future tokens.
168
+ This intuition is the idea behind the concept of sliding attention.'
169
  example_title: bigbird blog intro
170
+ - text: 'To be fair, you have to have a very high IQ to understand Rick and Morty.
171
+ The humour is extremely subtle, and without a solid grasp of theoretical physics
172
+ most of the jokes will go over a typical viewer''s head. There''s also Rick''s
173
+ nihilistic outlook, which is deftly woven into his characterisation- his personal
174
+ philosophy draws heavily from Narodnaya Volya literature, for instance. The fans
175
+ understand this stuff; they have the intellectual capacity to truly appreciate
176
+ the depths of these jokes, to realise that they''re not just funny- they say something
177
+ deep about LIFE. As a consequence people who dislike Rick & Morty truly ARE idiots-
178
+ of course they wouldn''t appreciate, for instance, the humour in Rick''s existential
179
+ catchphrase ''Wubba Lubba Dub Dub,'' which itself is a cryptic reference to Turgenev''s
180
+ Russian epic Fathers and Sons. I''m smirking right now just imagining one of those
181
+ addlepated simpletons scratching their heads in confusion as Dan Harmon''s genius
182
+ wit unfolds itself on their television screens. What fools.. how I pity them.
183
+ 😂
184
+
185
+ And yes, by the way, i DO have a Rick & Morty tattoo. And no, you cannot see it.
186
+ It''s for the ladies'' eyes only- and even then they have to demonstrate that
187
+ they''re within 5 IQ points of my own (preferably lower) beforehand. Nothin personnel
188
+ kid 😎'
189
  example_title: Richard & Mortimer
190
  parameters:
191
  max_length: 48
 
207
  config: samsum
208
  split: test
209
  metrics:
210
+ - type: rouge
 
211
  value: 33.1401
212
+ name: ROUGE-1
213
  verified: true
214
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjQ1NjY1OGVjYWEwMzBjMzk3ZmMyZDA0ZTcxOTdmZTUxNTc0OGYxYmY3MzJkMzFmYTVjNzU2ZTk4MzE0NWMzMSIsInZlcnNpb24iOjF9.PSHB6DMF6tkwSw5nsFE57a2ApRAy_tkS6ziKA6PSTWddEdaqfca4pfig6_olmRmcS4KxN6HHcsmioHzv4LJQBw
215
+ - type: rouge
216
  value: 9.3095
217
+ name: ROUGE-2
218
  verified: true
219
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzk3MTA3NmY1OGE3MzFjZTJhYWYzNGU4NTUzMTgwM2Y1NWZjMmEyNDNmNmEzYmQzZThjOGExMjc2ZjAyZjMzZCIsInZlcnNpb24iOjF9.tfgp8p-WlkVrfducTSg4zs-byeZMCmdZw1aizPQHXm_qRAwGtKcuVkZcmza5Y3o3VqsAEmGzg5HQD1vnZvWIDA
220
+ - type: rouge
221
  value: 24.8552
222
+ name: ROUGE-L
223
  verified: true
224
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTVmMTIwNDQwNTI4MmI2MmY1ODc1Mjk0NGQ5ZWE4ZTYzOGNkMjY2ZmJhMjg2MTZlNTdhYTA2ZDAxNTFjMjA2MSIsInZlcnNpb24iOjF9.9HLgy9842oIDm6ABb3L94R1P4zAqTI0QN8aP62xzIyDxUXTbWw68PEDufYLiBJbTgZ8ElopZ9I7aou2zCgXeAA
225
+ - type: rouge
226
  value: 29.0391
227
+ name: ROUGE-LSUM
228
  verified: true
229
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMmNhYWJjYjdjMzMxMmE4ZTE4NGEzMDdmZDZjODI5ZWRjZWJmYTEyZGIzYWQ2NjM3YzQ4MjI4ZTM4MmU5MzRjZSIsInZlcnNpb24iOjF9.d2yoVdmxjVJnsgIYFiLuaBO5Krgw4Axl5yeOSTKrvHygrAxoqT1nl4anzQiyoR3PwYBXwBkwmgpJUfZ7RNXtDQ
230
+ - type: loss
231
  value: 2.288182497024536
232
+ name: loss
233
  verified: true
234
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYzM5NGIwODMxOTA3MTY3ODc2ZDczYTNmMTMwM2QyZmNlZjFmZDJjMGY3NWNkMDEyYzA4OTA2ZDRiODY3Zjg4OCIsInZlcnNpb24iOjF9.8k9mC050OS7mQSR9oA8liDRDQvEx1VxmTXGLmDYJVYYtTh2HYJFGP8Vy_krocFRIYDxh-IHPEOOSr5NrLMWHBA
235
+ - type: gen_len
236
  value: 45.2173
237
+ name: gen_len
238
  verified: true
239
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNWZhNzQ5OTQ5Yjg5YjhlOTZiZmJhZjZiODNmY2E2OTg4YTg4NWVhYzRkNzM2Mzk4NzdlMDgxM2M4NjY2YzhhYSIsInZlcnNpb24iOjF9.tDEEsPUclZDygAdGhNrBGrF24vR8ao08Nw7hmtUt5lmSZZZK_u-8rpz97QgVS6MCJdjFVnbYC4bkFnlQWI_FAA
240
  - task:
241
  type: summarization
242
  name: Summarization
 
246
  config: plain_text
247
  split: test
248
  metrics:
249
+ - type: rouge
 
250
  value: 39.7279
251
+ name: ROUGE-1
252
  verified: true
253
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTAxODk3OTUwMTIzODU3NzU2YzAzZjE2NTM3MzBjNDA0ZWRmZGU3NWUzNTg1YThhNDQ1NjQ5ZmM3OWI2YzBhNSIsInZlcnNpb24iOjF9.vnNKucBNt2-nIyODj9P2HeaWPX5AQR8L-DL8QzrO7kj58-vZnjT6hsAGmepRNzdZ1TLF-3j2J2plcNJ8lUO8Dg
254
+ - type: rouge
255
  value: 10.8944
256
+ name: ROUGE-2
257
  verified: true
258
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNjYzMmIxOTJmZjkxOGI5N2U0NTRmMmQwOGJhMzMxYWIzMWMzYzUwMDEyMDdiZDQ2YTUzOWU0OTViMTI2YTAwYiIsInZlcnNpb24iOjF9.De0PaAikWqfWpoIXTCYP-mSFu3PUATLX08Qq74OHXM8784heFVDX1E1sXlh_QbbKJbuMuZtTKM4qr7oLUizOAw
259
+ - type: rouge
260
  value: 19.7018
261
+ name: ROUGE-L
262
  verified: true
263
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYzI3MjQzOGQ3MGE3NDNkZTEyMWRkYjUyYTYzNDEwOWVjMGFmNTBiZjE4ZTBhMGYzMmI1Yzk0YjBmYmIzMWMxZSIsInZlcnNpb24iOjF9.FVikJ5Ma0gUgM-tpbomWXnC4jtmvhxqikPqCk84t4IbIdU0CIYGTQEONiz-VqI0fJeNrnTS6lxpBv7XxKoq3BQ
264
+ - type: rouge
265
  value: 36.5634
266
+ name: ROUGE-LSUM
267
  verified: true
268
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTI2OTVmNDZiZWE5ZjNkODIwZjJiNTU2ZjJjYjczODUwM2JiNDEzYmE3N2U5YWM5NzJjOWEzMmYzZjdlYWJmYyIsInZlcnNpb24iOjF9.poR4zcqRvdaierfWFdTa53Cv6ZbNbnRwyRTi9HukHF5AWAQgc6zpBLkwOYFYoWjuSH83ohWeMM3MoIdw3zypBw
269
+ - type: loss
270
  value: 2.473011016845703
271
+ name: loss
272
  verified: true
273
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDFmMjg3NWQ2YTMxMTc1OGZiYWYzNjg5NDY3MWE4MjY5ZDQxZDZhZGI1OTc5MzZkZGEzYmVlNWFiMzZjNDdhNCIsInZlcnNpb24iOjF9.05nKB3SmEfFKSduJqlleF4Fd2_IhwJS8eTOrnzZYCQQfLCfpJAZLhp3eLQCuBY4htd-FNrZftrThL66zVxyrCQ
274
+ - type: gen_len
275
  value: 212.8243
276
+ name: gen_len
277
  verified: true
278
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOGNjMTg4ZDZlZjAxZGNhN2M0NWI0ZTA0OWEzNDkzNDAzOTJhODA2MmVkODI4YjYzN2FiOTU1ZDMwM2VlNWMyYyIsInZlcnNpb24iOjF9.WYx6XJFKokY2heoN-jpAMp1Z1gsyJus3zpktQgNd0FOYJxOUqW40A0kkHtd15y4dUhsbccLpuJGY1fNJgHOiDw
279
  - task:
280
  type: summarization
281
  name: Summarization
 
285
  config: default
286
  split: test
287
  metrics:
288
+ - type: rouge
 
289
  value: 42.1065
290
+ name: ROUGE-1
291
  verified: true
292
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZDJhNDM2MWEwMjJlYjRmZTVkYzljODcwMzlmMGUxMDA4ZmRjNjM0NmY3ZWJlMmZjNGI3NDQ3NTQyOTQ3MjBkNSIsInZlcnNpb24iOjF9.l1MiZbXyFyXAcsfFChMrTvSaBhzBR6AuDnBuII8zY3Csz3ShWK0vo09MkQdZ1epe8PKWV9wwUBuJyKk3wL7MDw
293
+ - type: rouge
294
  value: 15.4079
295
+ name: ROUGE-2
296
  verified: true
297
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTY3NDBkYTVkNjdhY2I0ZmY0NTA4YzVkMGE5YWE5ODdjOGE1MDhkOTJhOWY3NmI2ZWI1MGU2MGI1NDRlYjI3MSIsInZlcnNpb24iOjF9.VN-5eK2SzFDCJnFTHHu7XCU_lynaxW_JEDc3llmcNo_ffDgRmISHHGaqV7fPFymBBMXpPly7XblO_sukyqj1Cg
298
+ - type: rouge
299
  value: 24.8814
300
+ name: ROUGE-L
301
  verified: true
302
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZDYyNGZmNDY3MTY4YzI4ZjZhODE0NGIyN2ZkOGEyYzM3MWZjM2QzZTg5ZjNmZmYzZDE5NzhiZDQ4OGM1YjNiMyIsInZlcnNpb24iOjF9.L73M1M5XdMQkf8zSdfLN0MUrxtO0r6UiLjoOkHfrIGbWNsNJ8tU5lciYFNIhJrICUL8LchCsFqR9LAClKS4bCg
303
+ - type: rouge
304
  value: 36.0375
305
+ name: ROUGE-LSUM
306
  verified: true
307
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMTBlMTQ5OTQxNTA3ZmFiMGYyZWQ0MGM0ODY2YWI3MzgyNjkwNzQyM2FmNGRjMzc3MjJmZDZkOWY4M2RhZTg2MSIsInZlcnNpb24iOjF9.IiMSSVahBgH8n34bGCC_DDGpujDXQbIvGhlcpVV2EBVQLLWUqcCy5WwBdbRrxPC-asBRCNERQxj8Uii4FvPsDQ
308
+ - type: loss
309
  value: 1.9130958318710327
310
+ name: loss
311
  verified: true
312
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTg2NTMxZDE3MDg3MDFkMTYxNjY1OTc5YjQ4ODcyMGUxMTFiZjJiNDgyYWZhN2NjZmE1MDQ1NTRmZGY0NjQzZSIsInZlcnNpb24iOjF9.kADUBMO8i6-oGDDt1cOiGMrGcMkF_Qc1jSpS2NSFyksDRusQa_YuuShefF4DuHVEr3CS0hNjjRH9_JBeX9ZQDg
313
+ - type: gen_len
314
  value: 179.2184
315
+ name: gen_len
316
  verified: true
317
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNjM4NGNiMTY3YzZjMzg4MTRiMDdiZDFiMzA1ZDIyMDM2MDk1OWRhYWQzN2UxZDNlODIxOWVhY2JlYjk4Mjk5YyIsInZlcnNpb24iOjF9.nU8ImMNWgjg9BKjUBJQLFaJOBq3kyIne8ldlpL0OV0e4888wOntIAcJP0dCCYfRSLVmZuXQ1M8cpDuTf50hNCw
318
  - task:
319
  type: summarization
320
  name: Summarization
 
324
  config: kmfoda--booksum
325
  split: test
326
  metrics:
327
+ - type: rouge
 
328
  value: 35.2154
329
+ name: ROUGE-1
330
  verified: true
331
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZWQ5MGMzNDc4MDBiNmRiNDY5ZDM4N2QzYTJlYTNiYTcwNDBlMzdlM2I4N2VmM2ZjMmQ3NGU3OTRlMTMzMTg3NyIsInZlcnNpb24iOjF9.E55gu7HvMwc4HejF3YOD6yqQJj7_6GCoCMWm78sY5_w2glR-oM98tu9IsG27VaPva7UklxsspzT2DIVaVKY0CQ
332
+ - type: rouge
333
  value: 6.8702
334
+ name: ROUGE-2
335
  verified: true
336
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjFhN2JlYzlmMGZmYzkwYjBlNjY4YzhlYzNmMTdmZWYyYmU3NWI0ZTRkMTgxNmRiM2EyZWMyMWFjY2JkNzg1MCIsInZlcnNpb24iOjF9.I9BoHbGt8LLNtLAssIXm9tQ4lHqFCMt0zJS_zTezzxGRMS5On71c3jnlzrDtwEm6wjmZEwYIJK8qqJh-Qa5YAA
337
+ - type: rouge
338
  value: 17.6693
339
+ name: ROUGE-L
340
  verified: true
341
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOGZlZjcwOTZjMmNjZWFkM2M5Zjg1OTgzMzcxOTM2Y2RkMzY4NGU2NDE2MTVjMjcyMWIwNWI4ODc0YTY3YTA2MSIsInZlcnNpb24iOjF9.Ou1C6U6PrOtXPxlk9PMucdJ_vlnVnSk94QrLJL4b_g2pcY3D80Xrw09iz4BTOPzZ2UTNBLyn8YdLY3m2vHpiAQ
342
+ - type: rouge
343
  value: 32.8365
344
+ name: ROUGE-LSUM
345
  verified: true
346
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMmIzMGQ5MzQ1MjI4MTU0ZGZkZTRhODllNWQyOTQ4ZjA5YWE4ZTJjMzQ2ZWQzOGFiMWUzZDMxOTU5NzkxYjliZiIsInZlcnNpb24iOjF9.2mYURQZYo7e3AY0tfkpqFMNhoHvrysvBXza-XYYrX_xLpruMU9Gzrwc3jvpi2wtp4eeyhzIiZJvH0O6la6zxCg
347
+ - type: loss
348
  value: 2.9878039360046387
349
+ name: loss
350
  verified: true
351
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGU0ODBmN2I3OGFkNTFiM2I3YWQyNmUzNzUwYzEwNzczZWEwZjIxYTAwZDE2ZTIwMGE3ZGNmMDQzNTFmNjEwYyIsInZlcnNpb24iOjF9.0IKWIImKTXqysQUb2IMPk2eeHlOcBjndiPcU42nfFBMhRTqeXdBqOCP6cidlho7pVN4hsC-77ArJ9pZlbTFuBg
352
+ - type: gen_len
353
  value: 200.6785
354
+ name: gen_len
355
  verified: true
356
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMDUzYTE3MmIxZGM3MWI1MjNhMTU3MTdkMjJjNjY5Y2UzYTdjYWRiY2I4MmUxMDY4NTA5NWZjYWU0NzliODdkYiIsInZlcnNpb24iOjF9.BqmCaWzbCMNUied6zNO744Dl-0LC47FCIv-l8kDjkhSkwQcb_hi93VYts5PTsrFY_MmM8j7AsY1PiFr6nNFMBQ
357
  - task:
358
  type: summarization
359
  name: Summarization
 
363
  config: y
364
  split: test
365
  metrics:
366
+ - type: rouge
 
367
  value: 37.376
368
+ name: ROUGE-1
369
  verified: true
370
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWI4ZjMxODcxMThiMzE3NjQ3Zjg0NzhmZjlhY2ZmYjQwMGY5ZjlkZGY1MzZmY2M5YTU4NmY1Y2NhZDA3YWFkOCIsInZlcnNpb24iOjF9.sYh4IynXgOpVetYYSWUp0v5QZWvXC1x7_uJR0LZUxaeYKEc4yfICNmDOPzNzoroaV4ELeOaPjHQpYVm-lpAHBA
371
+ - type: rouge
372
  value: 11.4432
373
+ name: ROUGE-2
374
  verified: true
375
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTZkOGIyYzU3YTQ5ZTFmMDU3MjQ5ZWM2NGQ1MzgwMDYyZDkxN2Q2YjgyZTkzMTEyYjczMGJiYmNkZmU5MTQ3NSIsInZlcnNpb24iOjF9.Qk38acpjPjU64Z1nXEuqMXjKZrGvdC9oY586EjuCPeEAJCSzKimp8FsB-1QrjMH73q6rN2CdumJUxih6HF-KAA
376
+ - type: rouge
377
  value: 22.2754
378
+ name: ROUGE-L
379
  verified: true
380
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNzlmOTUxYmEzYzYyYmVjNGZlNzNiZWIwZmQ5OWVlY2U3NTBiZDExYWUwODQ0Y2ZjMmQyMTNmMTlmNjdmZWUwNCIsInZlcnNpb24iOjF9.bUVhxaepySyaityby71j6h4YO_l4x8OSeZoblagwUMYGXRc0Ej286QzEtZFeRGygMJ5sjUN_loWCtOmAnHY2BA
381
+ - type: rouge
382
  value: 32.5087
383
+ name: ROUGE-LSUM
384
  verified: true
385
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDEyNjM5NjAzYTNjN2MwZTY4MWY2Y2U5YWUyM2Y1YjAyNjBhZTM0YTAyZjM5N2M1ZDkxOWUxNzE2OWZkYTBmMSIsInZlcnNpb24iOjF9.QfMHkcoAR3xqzsgL1xjHk3Lui1xhE12pJKvYujQ_h5o6PBXT79dsENsrqDGGBjiKdTKNwWqADgaviy1VrWMDCQ
386
+ - type: loss
387
  value: 2.9867310523986816
388
+ name: loss
389
  verified: true
390
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTUzM2Q5MmE5MzU4YmFlMjFiMmUzZGU2NDAzMTQ1Y2NjZDVlYWI3NGE5MjM0NmMxMjdiOWI3MTU0NDk3NmNkZiIsInZlcnNpb24iOjF9.VoQqu6ZU3AR_cji82UkpvbLnTmZ17fZmR2E4DeonjCyTZpyyfvUsQ2nbKDovQf34DBkYXENk42EUsUF1mBZNBg
391
+ - type: gen_len
392
  value: 172.7776
393
+ name: gen_len
394
  verified: true
395
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTEzNTMyMDY1N2Q5ZTMxNjNlMTI0Nzk5ZDc1ZWQ5Y2IwZWM0NWNhNWY2MTk3YTRkYzUwMTI4NjZiOWVhOGQwYSIsInZlcnNpb24iOjF9.-Rek2VFmGqIEgqeFoxU_0aCWdFbGYi9BV5c7x-izm9_4vtZdYQ4ITXm4T8C3UlpOax60veJQt2Uax5vyiFc9Ag
396
  ---
397
 
398
  # pszemraj/pegasus-x-large-book-summary