bobox committed on
Commit 03f6297 · verified · 1 Parent(s): 20bb9fe

Create README.md

Files changed (1)
  1. README.md +633 -0
README.md ADDED
---
base_model: microsoft/deberta-v3-base
datasets:
- tals/vitaminc
- allenai/scitail
- allenai/sciq
- allenai/qasc
- sentence-transformers/msmarco-msmarco-distilbert-base-v3
- sentence-transformers/natural-questions
- sentence-transformers/trivia-qa
- sentence-transformers/gooaq
- google-research-datasets/paws
language:
- en
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- manhattan_accuracy
- manhattan_accuracy_threshold
- manhattan_f1
- manhattan_f1_threshold
- manhattan_precision
- manhattan_recall
- manhattan_ap
- euclidean_accuracy
- euclidean_accuracy_threshold
- euclidean_f1
- euclidean_f1_threshold
- euclidean_precision
- euclidean_recall
- euclidean_ap
- max_accuracy
- max_accuracy_threshold
- max_f1
- max_f1_threshold
- max_precision
- max_recall
- max_ap
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:123245
- loss:CachedGISTEmbedLoss
widget:
- source_sentence: what type of inheritance does haemochromatosis
  sentences:
  - Nestled on the tranquil banks of the Pamlico River, Moss Landing is a vibrant
    new community of thoughtfully conceived, meticulously crafted single-family homes
    in Washington, North Carolina. Washington is renowned for its historic architecture
    and natural beauty.
  - '1 Microwave on high for 8 to 10 minutes or until tender, turning the yams once.
    2 To microwave sliced yams: Wash, peel, and cut off the woody portions and ends.
    3 Cut yams into quarters. 4 Place the yams and 1/2 cup water in a microwave-safe
    casserole.ake the Yams. 1 Place half the yams in a 1-quart casserole. 2 Layer
    with half the brown sugar and half the margarine. 3 Repeat the layers. 4 Bake,
    uncovered, in a 375 degree F oven for 30 to 35 minutes or until the yams are glazed,
    spooning the liquid over the yams once or twice during cooking.'
  - Types 1, 2, and 3 hemochromatosis are inherited in an autosomal recessive pattern,
    which means both copies of the gene in each cell have mutations. Most often, the
    parents of an individual with an autosomal recessive condition each carry one
    copy of the mutated gene but do not show signs and symptoms of the condition.Type
    4 hemochromatosis is distinguished by its autosomal dominant inheritance pattern.With
    this type of inheritance, one copy of the altered gene in each cell is sufficient
    to cause the disorder. In most cases, an affected person has one parent with the
    condition.ype 1, the most common form of the disorder, and type 4 (also called
    ferroportin disease) begin in adulthood. Men with type 1 or type 4 hemochromatosis
    typically develop symptoms between the ages of 40 and 60, and women usually develop
    symptoms after menopause. Type 2 hemochromatosis is a juvenile-onset disorder.
- source_sentence: More than 273 people have died from the 2019-20 coronavirus outside
    mainland China .
  sentences:
  - 'More than 3,700 people have died : around 3,100 in mainland China and around
    550 in all other countries combined .'
  - 'More than 3,200 people have died : almost 3,000 in mainland China and around
    275 in other countries .'
  - more than 4,900 deaths have been attributed to COVID-19 .
- source_sentence: The male reproductive system consists of structures that produce
    sperm and secrete testosterone.
  sentences:
  - What does the male reproductive system consist of?
  - What facilitates the diffusion of ions across a membrane?
  - Autoimmunity can develop with time, and its causes may be rooted in this?
- source_sentence: Nitrogen gas comprises about three-fourths of earth's atmosphere.
  sentences:
  - What do all cells have in common?
  - What gas comprises about three-fourths of earth's atmosphere?
  - What do you call an animal in which the embryo, often termed a joey, is born immature
    and must complete its development outside the mother's body?
- source_sentence: What device is used to regulate a person's heart rate?
  sentences:
  - 'Marie Antoinette and the French Revolution . Famous Faces . Mad Max:
    Maximilien Robespierre | PBS Extended Interviews > Resources > For Educators
    > Mad Max: Maximilien Robespierre Maximilien Robespierre was born May 6, 1758
    in Arras, France. Educated at the Lycée Louis-le-Grand in Paris as a lawyer, Robespierre
    became a disciple of philosopher Jean-Jacques Rousseau and a passionate advocate
    for the poor. Called "the Incorruptible" because of his unwavering dedication
    to the Revolution, Robespierre joined the Jacobin Club and earned a loyal following.
    In contrast to the more republican Girondins and Marie Antoinette, Robespierre
    fiercely opposed declaring war on Austria, feeling it would distract from revolutionary
    progress in France. Robespierre''s exemplary oratory skills influenced the National
    Convention in 1792 to avoid seeking public opinion about the Convention’s decision
    to execute King Louis XVI. In 1793, the Convention elected Robespierre to the
    Committee of Public Defense. He was a highly controversial member, developing
    radical policies, warning of conspiracies, and suggesting restructuring the Convention.
    This behavior eventually led to his downfall, and he was guillotined without trial
    on 10th Thermidor An II (July 28, 1794), marking the end of the Reign of Terror.
    Famous Faces'
  - Devices for Arrhythmia Devices for Arrhythmia Updated:Dec 21,2016 In a medical
    emergency, life-threatening arrhythmias may be stopped by giving the heart an
    electric shock (as with a defibrillator ). For people with recurrent arrhythmias,
    medical devices such as a pacemaker and implantable cardioverter defibrillator
    (ICD) can help by continuously monitoring the heart's electrical system and providing
    automatic correction when an arrhythmia starts to occur. This section covers everything
    you need to know about these devices. Implantable Cardioverter Defibrillator (ICD)
  - 'vintage cleats | eBay vintage cleats: 1 2 3 4 5 eBay determines this price through
    a machine learned model of the product''s sale prices within the last 90 days.
    eBay determines trending price through a machine learned model of the product’s
    sale prices within the last 90 days. "New" refers to a brand-new, unused, unopened,
    undamaged item, and "Used" refers to an item that has been used previously. Top
    Rated Plus Sellers with highest buyer ratings Returns, money back Sellers with
    highest buyer ratings Returns, money back'
model-index:
- name: SentenceTransformer based on microsoft/deberta-v3-base
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test
      type: sts-test
    metrics:
    - type: pearson_cosine
      value: 0.8253431554642914
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.870857890879963
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.8653068915625914
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.8667110599943904
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.8671346646296434
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.8681442638917114
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.7826717704847901
      name: Pearson Dot
    - type: spearman_dot
      value: 0.7685403521338614
      name: Spearman Dot
    - type: pearson_max
      value: 0.8671346646296434
      name: Pearson Max
    - type: spearman_max
      value: 0.870857890879963
      name: Spearman Max
  - task:
      type: binary-classification
      name: Binary Classification
    dataset:
      name: allNLI dev
      type: allNLI-dev
    metrics:
    - type: cosine_accuracy
      value: 0.71875
      name: Cosine Accuracy
    - type: cosine_accuracy_threshold
      value: 0.8745474815368652
      name: Cosine Accuracy Threshold
    - type: cosine_f1
      value: 0.617169373549884
      name: Cosine F1
    - type: cosine_f1_threshold
      value: 0.7519949674606323
      name: Cosine F1 Threshold
    - type: cosine_precision
      value: 0.5155038759689923
      name: Cosine Precision
    - type: cosine_recall
      value: 0.7687861271676301
      name: Cosine Recall
    - type: cosine_ap
      value: 0.6116004689391709
      name: Cosine Ap
    - type: dot_accuracy
      value: 0.693359375
      name: Dot Accuracy
    - type: dot_accuracy_threshold
      value: 401.3755187988281
      name: Dot Accuracy Threshold
    - type: dot_f1
      value: 0.566735112936345
      name: Dot F1
    - type: dot_f1_threshold
      value: 295.2575988769531
      name: Dot F1 Threshold
    - type: dot_precision
      value: 0.4394904458598726
      name: Dot Precision
    - type: dot_recall
      value: 0.7976878612716763
      name: Dot Recall
    - type: dot_ap
      value: 0.5243551756921989
      name: Dot Ap
    - type: manhattan_accuracy
      value: 0.724609375
      name: Manhattan Accuracy
    - type: manhattan_accuracy_threshold
      value: 228.3092498779297
      name: Manhattan Accuracy Threshold
    - type: manhattan_f1
      value: 0.6267281105990783
      name: Manhattan F1
    - type: manhattan_f1_threshold
      value: 266.0207824707031
      name: Manhattan F1 Threshold
    - type: manhattan_precision
      value: 0.5210727969348659
      name: Manhattan Precision
    - type: manhattan_recall
      value: 0.7861271676300579
      name: Manhattan Recall
    - type: manhattan_ap
      value: 0.6101425904568746
      name: Manhattan Ap
    - type: euclidean_accuracy
      value: 0.720703125
      name: Euclidean Accuracy
    - type: euclidean_accuracy_threshold
      value: 9.726119041442871
      name: Euclidean Accuracy Threshold
    - type: euclidean_f1
      value: 0.6303854875283447
      name: Euclidean F1
    - type: euclidean_f1_threshold
      value: 14.837699890136719
      name: Euclidean F1 Threshold
    - type: euclidean_precision
      value: 0.5186567164179104
      name: Euclidean Precision
    - type: euclidean_recall
      value: 0.8034682080924855
      name: Euclidean Recall
    - type: euclidean_ap
      value: 0.6172110045723997
      name: Euclidean Ap
    - type: max_accuracy
      value: 0.724609375
      name: Max Accuracy
    - type: max_accuracy_threshold
      value: 401.3755187988281
      name: Max Accuracy Threshold
    - type: max_f1
      value: 0.6303854875283447
      name: Max F1
    - type: max_f1_threshold
      value: 295.2575988769531
      name: Max F1 Threshold
    - type: max_precision
      value: 0.5210727969348659
      name: Max Precision
    - type: max_recall
      value: 0.8034682080924855
      name: Max Recall
    - type: max_ap
      value: 0.6172110045723997
      name: Max Ap
  - task:
      type: binary-classification
      name: Binary Classification
    dataset:
      name: Qnli dev
      type: Qnli-dev
    metrics:
    - type: cosine_accuracy
      value: 0.673828125
      name: Cosine Accuracy
    - type: cosine_accuracy_threshold
      value: 0.7472400069236755
      name: Cosine Accuracy Threshold
    - type: cosine_f1
      value: 0.6863468634686347
      name: Cosine F1
    - type: cosine_f1_threshold
      value: 0.7334084510803223
      name: Cosine F1 Threshold
    - type: cosine_precision
      value: 0.6078431372549019
      name: Cosine Precision
    - type: cosine_recall
      value: 0.788135593220339
      name: Cosine Recall
    - type: cosine_ap
      value: 0.7293502303398447
      name: Cosine Ap
    - type: dot_accuracy
      value: 0.6484375
      name: Dot Accuracy
    - type: dot_accuracy_threshold
      value: 392.88726806640625
      name: Dot Accuracy Threshold
    - type: dot_f1
      value: 0.6634920634920635
      name: Dot F1
    - type: dot_f1_threshold
      value: 310.97833251953125
      name: Dot F1 Threshold
    - type: dot_precision
      value: 0.5304568527918782
      name: Dot Precision
    - type: dot_recall
      value: 0.885593220338983
      name: Dot Recall
    - type: dot_ap
      value: 0.6331200610041253
      name: Dot Ap
    - type: manhattan_accuracy
      value: 0.671875
      name: Manhattan Accuracy
    - type: manhattan_accuracy_threshold
      value: 277.69342041015625
      name: Manhattan Accuracy Threshold
    - type: manhattan_f1
      value: 0.6830122591943958
      name: Manhattan F1
    - type: manhattan_f1_threshold
      value: 301.36639404296875
      name: Manhattan F1 Threshold
    - type: manhattan_precision
      value: 0.582089552238806
      name: Manhattan Precision
    - type: manhattan_recall
      value: 0.826271186440678
      name: Manhattan Recall
    - type: manhattan_ap
      value: 0.7276384343706648
      name: Manhattan Ap
    - type: euclidean_accuracy
      value: 0.68359375
      name: Euclidean Accuracy
    - type: euclidean_accuracy_threshold
      value: 15.343950271606445
      name: Euclidean Accuracy Threshold
    - type: euclidean_f1
      value: 0.6895238095238095
      name: Euclidean F1
    - type: euclidean_f1_threshold
      value: 15.738676071166992
      name: Euclidean F1 Threshold
    - type: euclidean_precision
      value: 0.6262975778546713
      name: Euclidean Precision
    - type: euclidean_recall
      value: 0.7669491525423728
      name: Euclidean Recall
    - type: euclidean_ap
      value: 0.7307379367367225
      name: Euclidean Ap
    - type: max_accuracy
      value: 0.68359375
      name: Max Accuracy
    - type: max_accuracy_threshold
      value: 392.88726806640625
      name: Max Accuracy Threshold
    - type: max_f1
      value: 0.6895238095238095
      name: Max F1
    - type: max_f1_threshold
      value: 310.97833251953125
      name: Max F1 Threshold
    - type: max_precision
      value: 0.6262975778546713
      name: Max Precision
    - type: max_recall
      value: 0.885593220338983
      name: Max Recall
    - type: max_ap
      value: 0.7307379367367225
      name: Max Ap
---

# SentenceTransformer based on microsoft/deberta-v3-base

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on the following datasets: negation-triplets, [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc), [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail), [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail), xsum-pairs, [sciq_pairs](https://huggingface.co/datasets/allenai/sciq), [qasc_pairs](https://huggingface.co/datasets/allenai/qasc), openbookqa_pairs, [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3), [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions), [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa), [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq), [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws), and global_dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) <!-- at revision 8ccc9b6f36199bec6961081d44eb72fb3f7353f3 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Datasets:**
    - negation-triplets
    - [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc)
    - [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail)
    - [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail)
    - xsum-pairs
    - [sciq_pairs](https://huggingface.co/datasets/allenai/sciq)
    - [qasc_pairs](https://huggingface.co/datasets/allenai/qasc)
    - openbookqa_pairs
    - [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3)
    - [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions)
    - [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa)
    - [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq)
    - [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws)
    - global_dataset
- **Language:** en
<!-- - **License:** Unknown -->
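
The card lists cosine similarity as the model's similarity function, and the metadata also reports dot-product, Manhattan, and Euclidean variants. The following is a minimal pure-Python sketch of those four scores. The vectors `query`, `close`, and `far` are made-up 4-dimensional stand-ins for the model's 768-dimensional embeddings, not real model output.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical stand-ins for embeddings (illustrative values only).
query = [0.2, 0.1, 0.9, 0.4]
close = [0.4, 0.2, 1.8, 0.8]  # same direction as `query`, twice the length
far = [0.9, -0.3, 0.1, 0.0]

print("cosine(query, close):", cosine_similarity(query, close))
print("cosine(query, far):  ", cosine_similarity(query, far))
print("dot(query, close):   ", dot(query, close))
print("manhattan:           ", manhattan_distance(query, close))
print("euclidean:           ", euclidean_distance(query, close))
```

Cosine similarity ignores vector length, so `query` and `close` score as identical in direction even though their dot product and distances are affected by the difference in magnitude.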
## Evaluation

### Metrics

#### Semantic Similarity
* Dataset: `sts-test`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.8253     |
| **spearman_cosine** | **0.8709** |
| pearson_manhattan   | 0.8653     |
| spearman_manhattan  | 0.8667     |
| pearson_euclidean   | 0.8671     |
| spearman_euclidean  | 0.8681     |
| pearson_dot         | 0.7827     |
| spearman_dot        | 0.7685     |
| pearson_max         | 0.8671     |
| spearman_max        | 0.8709     |
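
The headline metric, `spearman_cosine`, is the Spearman rank correlation between gold similarity labels and the cosine similarity of each sentence pair's embeddings. A minimal pure-Python sketch of that computation follows. The `gold` and `cosine_scores` values are invented for illustration and are not taken from `sts-test`; the evaluator itself relies on library routines rather than this code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gold labels and model cosine scores for five sentence pairs.
gold = [0.0, 1.5, 2.5, 4.0, 5.0]
cosine_scores = [0.12, 0.35, 0.30, 0.80, 0.95]

print(spearman(gold, cosine_scores))  # approximately 0.9: one adjacent pair is swapped
```

Because only the ordering matters, Spearman can be higher than Pearson when scores are monotonically but not linearly related to the labels, which is consistent with `spearman_cosine` exceeding `pearson_cosine` in the table above.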

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 96
- `per_device_eval_batch_size`: 68
- `learning_rate`: 3.5e-05
- `weight_decay`: 0.0005
- `num_train_epochs`: 2
- `lr_scheduler_type`: cosine_with_min_lr
- `lr_scheduler_kwargs`: {'num_cycles': 3.5, 'min_lr': 1.5e-05}
- `warmup_ratio`: 0.33
- `save_safetensors`: False
- `fp16`: True
- `push_to_hub`: True
- `hub_model_id`: bobox/DeBERTa3-base-STr-CosineWaves-checkpoints-tmp
- `hub_strategy`: all_checkpoints
- `batch_sampler`: no_duplicates
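
The scheduler settings above (`cosine_with_min_lr`, `num_cycles` 3.5, `min_lr` 1.5e-05, `warmup_ratio` 0.33) describe a learning rate that warms up linearly and then oscillates between the peak and the floor. The sketch below reproduces that shape in plain Python, based on my understanding of how the Transformers `cosine_with_min_lr` schedule behaves; it is not the library code. The helper name `lr_at_step` is hypothetical, and the step count assumes a single device and one pass over all 123245 samples per epoch, which the no-duplicates and multi-dataset samplers may not match exactly.

```python
import math


def lr_at_step(step, total_steps, base_lr=3.5e-05, min_lr=1.5e-05,
               warmup_ratio=0.33, num_cycles=3.5):
    """Learning rate at `step`: linear warmup, then cosine waves floored at min_lr."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Cosine factor in [0, 1]; num_cycles full periods over the post-warmup phase.
    factor = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    min_rate = min_lr / base_lr
    # Rescale the factor so it ranges from min_rate to 1 instead of 0 to 1.
    return base_lr * (factor * (1.0 - min_rate) + min_rate)


# Assumption: one device, so about ceil(123245 / 96) optimizer steps per epoch, 2 epochs.
total_steps = 2 * math.ceil(123245 / 96)
warmup_steps = math.ceil(total_steps * 0.33)

for step in (0, warmup_steps // 2, warmup_steps, total_steps):
    print(step, lr_at_step(step, total_steps))
```

With 3.5 cycles the final half-cycle ends at the bottom of a wave, so under these assumptions training finishes at the floor of 1.5e-05 rather than at zero.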

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 96
- `per_device_eval_batch_size`: 68
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3.5e-05
- `weight_decay`: 0.0005
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: cosine_with_min_lr
- `lr_scheduler_kwargs`: {'num_cycles': 3.5, 'min_lr': 1.5e-05}
- `warmup_ratio`: 0.33
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: False
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: None
- `hub_model_id`: bobox/DeBERTa3-base-STr-CosineWaves-checkpoints-tmp
- `hub_strategy`: all_checkpoints
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
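
The `batch_sampler` value `no_duplicates` matters for losses that treat other items in the batch as negatives: if the same text appears twice in a batch, a true match would be scored as a negative. The sketch below illustrates the idea with a simple greedy pass. It is an assumption-laden illustration, not the Sentence Transformers sampler; the function `no_duplicate_batches` and the `pairs` data are hypothetical.

```python
def no_duplicate_batches(samples, batch_size):
    """Greedy batching: defer a sample if any of its texts is already in the batch."""
    remaining = list(samples)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for sample in remaining:
            if len(batch) < batch_size and seen.isdisjoint(sample):
                batch.append(sample)
                seen.update(sample)
            else:
                # Either the batch is full or a text would repeat; try again later.
                deferred.append(sample)
        batches.append(batch)
        remaining = deferred
    return batches


# Hypothetical (question, answer) pairs; two questions share the answer "a".
pairs = [("q1", "a"), ("q2", "a"), ("q3", "b")]

for batch in no_duplicate_batches(pairs, batch_size=2):
    print(batch)
```

In this toy run the two pairs that share an answer end up in different batches, so neither can act as a false negative for the other.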

### Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.0.1
- Transformers: 4.44.0
- PyTorch: 2.4.0
- Accelerate: 0.33.0
- Datasets: 2.21.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```