tomaarsen committed e4bf7b2 (1 parent: ea3b3dd)

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,667 @@
+ ---
+ base_model: google-bert/bert-base-uncased
+ datasets:
+ - sentence-transformers/gooaq
+ language:
+ - en
+ library_name: sentence-transformers
+ license: apache-2.0
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ - dot_accuracy@1
+ - dot_accuracy@3
+ - dot_accuracy@5
+ - dot_accuracy@10
+ - dot_precision@1
+ - dot_precision@3
+ - dot_precision@5
+ - dot_precision@10
+ - dot_recall@1
+ - dot_recall@3
+ - dot_recall@5
+ - dot_recall@10
+ - dot_ndcg@10
+ - dot_mrr@10
+ - dot_map@100
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:3002496
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: how to change date format in ms project 2007?
+   sentences:
+   - '[''Choose File > Options.'', ''Select General.'', ''Under Project view, pick an option from the Date format list.'']'
+   - Cats can be very affectionate and bonded with each other and still bond well and show affection to their human. Getting two kittens from the same litter, regardless of gender, can make it easier for them to befriend each other and play—but any two kittens generally tend to get on well after introductions.
+   - 'Treat your permed hair like silk or another delicate fabric: washing it once a week is enough to keep it clean and help maintain its beauty. Wash your hair with warm water. Hot water can strip your hair of oils that help keep it moisturized and looking lustrous. Hot water can also ruin the curls.'
+ - source_sentence: is the mother in vinegar good for you?
+   sentences:
+   - Some people say the “mother,” the cloud of yeast and bacteria you might see in a bottle of apple cider vinegar, is what makes it healthy. These things are probiotic, meaning they might give your digestive system a boost, but there isn't enough research to back up the other claims.
+   - It is normal for vaginal discharge to increase in amount and become “stringy” (like egg whites) during the middle of your menstrual cycle when you're ovulating. If you find that your normal discharge is annoying, you can wear panty liners/shields on your underwear.
+   - State law protects cypress trees along Florida's waterways, but it has been up to the courts to enforce the regulations. ... Landowners can cut down cypress trees on their land, but trees below the high-water mark are considered state property and are protected.
+ - source_sentence: if you're blocked on whatsapp can you see last seen?
+   sentences:
+   - Jaguars aren't going to London this year, releases new plan for season tickets. The Jaguars will no longer be playing two games in London, and will instead play both games at TIAA Bank Field.
+   - Typically, most drugs are absorbed within 20-30 minutes after given by mouth. Vomiting after this amount of time is not related to the drug in the stomach as the vast majority, if not all, has already been absorbed.
+   - You can no longer see a contact's last seen or online in the chat window. Learn more here. You do not see updates to a contact's profile photo. Any messages sent to a contact who has blocked you will always show one check mark (message sent), and never show a second check mark (message delivered).
+ - source_sentence: how many enchantments can you put on armor?
+   sentences:
+   - 4 Answers. You can in theory add every enchantment that is compatible with a tool/weapon/armor onto the same item. The bow can have these 7 enchantments, though mending and infinity are mutually exclusive.
+   - The sleeve length will make or break a jacket. If too long, it will make the jacket look too big, and if too short, like you have outgrown your jacket. ... This is when you need an experienced tailor, who will be able to shorten the sleeves from the shoulders, so the details on the cuffs are not disturbed.
+   - Grace period of 60 days granted after the expiration of license for purpose of renewal, and license is valid during this period. Renewal of license may occur from 60 days (effective August 1, 2016, 180 days) prior to expiration to 3 years after date; afterwards, applicant required to take and pass examination.
+ - source_sentence: what is the best drugstore shampoo for volume?
+   sentences:
+   - '[''#8. ... '', ''#7. ... '', ''#6. Hask Biotin Boost Shampoo. ... '', ''#5. Pantene Pro-V Sheer Volume Shampoo. ... '', ''#4. John Frieda Luxurious Volume Touchably Full Shampoo. ... '', ''#3. Acure Vivacious Volume Peppermint Shampoo. ... '', ''#2. OGX Thick & Full Biotin & Collagen Shampoo. ... '', "#1. L''Oréal Paris EverPure Sulfate Free Volume Shampoo."]'
+   - Genes can't control an organism on their own; rather, they must interact with and respond to the organism's environment. Some genes are constitutive, or always "on," regardless of environmental conditions.
+   - In electricity, the phase refers to the distribution of a load. What is the difference between single-phase and three-phase power supplies? Single-phase power is a two-wire alternating current (ac) power circuit. ... Three-phase power is a three-wire ac power circuit with each phase ac signal 120 electrical degrees apart.
+ co2_eq_emissions:
+   emissions: 523.8395173647017
+   energy_consumed: 1.3476635503925931
+   source: codecarbon
+   training_type: fine-tuning
+   on_cloud: false
+   cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
+   ram_total_size: 31.777088165283203
+   hours_used: 3.544
+   hardware_used: 1 x NVIDIA GeForce RTX 3090
+ model-index:
+ - name: BERT base uncased trained on GooAQ triplets
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: gooaq dev
+       type: gooaq-dev
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7001
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8712
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.9219
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9629
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7001
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.2904
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.18438000000000002
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09629000000000001
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.7001
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8712
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.9219
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.9629
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8358567622290791
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.7945682142857085
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.796615366916047
+       name: Cosine Map@100
+     - type: dot_accuracy@1
+       value: 0.6709
+       name: Dot Accuracy@1
+     - type: dot_accuracy@3
+       value: 0.8558
+       name: Dot Accuracy@3
+     - type: dot_accuracy@5
+       value: 0.9096
+       name: Dot Accuracy@5
+     - type: dot_accuracy@10
+       value: 0.9567
+       name: Dot Accuracy@10
+     - type: dot_precision@1
+       value: 0.6709
+       name: Dot Precision@1
+     - type: dot_precision@3
+       value: 0.28526666666666667
+       name: Dot Precision@3
+     - type: dot_precision@5
+       value: 0.18192000000000003
+       name: Dot Precision@5
+     - type: dot_precision@10
+       value: 0.09567
+       name: Dot Precision@10
+     - type: dot_recall@1
+       value: 0.6709
+       name: Dot Recall@1
+     - type: dot_recall@3
+       value: 0.8558
+       name: Dot Recall@3
+     - type: dot_recall@5
+       value: 0.9096
+       name: Dot Recall@5
+     - type: dot_recall@10
+       value: 0.9567
+       name: Dot Recall@10
+     - type: dot_ndcg@10
+       value: 0.8177950307933399
+       name: Dot Ndcg@10
+     - type: dot_mrr@10
+       value: 0.772776468253962
+       name: Dot Mrr@10
+     - type: dot_map@100
+       value: 0.7751231358698718
+       name: Dot Map@100
+ ---
+
+ # BERT base uncased trained on GooAQ triplets
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) on the [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) <!-- at revision 86b5e0934494bd15c9632b12f734a8a67f723594 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
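+ The module stack is simply a BERT encoder followed by mean pooling over the token embeddings. For illustration, a minimal sketch of equivalent inference with plain `transformers` (the usual mean-pooling recipe, not code shipped in this repository):
+
+ ```python
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("tomaarsen/bert-base-uncased-gooaq")
+ encoder = AutoModel.from_pretrained("tomaarsen/bert-base-uncased-gooaq")
+
+ inputs = tokenizer(
+     ["how many enchantments can you put on armor?"],
+     padding=True, truncation=True, max_length=512, return_tensors="pt",
+ )
+ with torch.no_grad():
+     token_embeddings = encoder(**inputs).last_hidden_state  # (batch, seq_len, 768)
+
+ # Mean pooling: average the token embeddings, ignoring padding positions
+ mask = inputs["attention_mask"].unsqueeze(-1).float()
+ embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
+ print(embeddings.shape)  # torch.Size([1, 768])
+ ```
+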
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("tomaarsen/bert-base-uncased-gooaq")
+ # Run inference
+ sentences = [
+     'what is the best drugstore shampoo for volume?',
+     '[\'#8. ... \', \'#7. ... \', \'#6. Hask Biotin Boost Shampoo. ... \', \'#5. Pantene Pro-V Sheer Volume Shampoo. ... \', \'#4. John Frieda Luxurious Volume Touchably Full Shampoo. ... \', \'#3. Acure Vivacious Volume Peppermint Shampoo. ... \', \'#2. OGX Thick & Full Biotin & Collagen Shampoo. ... \', "#1. L\'Oréal Paris EverPure Sulfate Free Volume Shampoo."]',
+     'In electricity, the phase refers to the distribution of a load. What is the difference between single-phase and three-phase power supplies? Single-phase power is a two-wire alternating current (ac) power circuit. ... Three-phase power is a three-wire ac power circuit with each phase ac signal 120 electrical degrees apart.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
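+ Because the model was trained on question-answer pairs, a natural use is ranking candidate answers for a question. A small retrieval sketch, reusing the `model` loaded above (the answer strings are made up for illustration):
+
+ ```python
+ query = model.encode(["is the mother in vinegar good for you?"])
+ answers = model.encode([
+     "The 'mother' in apple cider vinegar is a culture of yeast and bacteria.",
+     "Single-phase power is a two-wire alternating current power circuit.",
+ ])
+ scores = model.similarity(query, answers)  # tensor of shape [1, 2]
+ print(scores.argmax(dim=1))  # index of the best-matching answer
+ ```
+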
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+ * Dataset: `gooaq-dev`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.7001     |
+ | cosine_accuracy@3   | 0.8712     |
+ | cosine_accuracy@5   | 0.9219     |
+ | cosine_accuracy@10  | 0.9629     |
+ | cosine_precision@1  | 0.7001     |
+ | cosine_precision@3  | 0.2904     |
+ | cosine_precision@5  | 0.1844     |
+ | cosine_precision@10 | 0.0963     |
+ | cosine_recall@1     | 0.7001     |
+ | cosine_recall@3     | 0.8712     |
+ | cosine_recall@5     | 0.9219     |
+ | cosine_recall@10    | 0.9629     |
+ | cosine_ndcg@10      | 0.8359     |
+ | cosine_mrr@10       | 0.7946     |
+ | **cosine_map@100**  | **0.7966** |
+ | dot_accuracy@1      | 0.6709     |
+ | dot_accuracy@3      | 0.8558     |
+ | dot_accuracy@5      | 0.9096     |
+ | dot_accuracy@10     | 0.9567     |
+ | dot_precision@1     | 0.6709     |
+ | dot_precision@3     | 0.2853     |
+ | dot_precision@5     | 0.1819     |
+ | dot_precision@10    | 0.0957     |
+ | dot_recall@1        | 0.6709     |
+ | dot_recall@3        | 0.8558     |
+ | dot_recall@5        | 0.9096     |
+ | dot_recall@10       | 0.9567     |
+ | dot_ndcg@10         | 0.8178     |
+ | dot_mrr@10          | 0.7728     |
+ | dot_map@100         | 0.7751     |
+
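+ The evaluator above can be run directly. A hedged sketch with a toy two-document corpus (the real run uses the 10,000-sample `gooaq-dev` split mapped into the evaluator's dictionary format; `model` is a loaded `SentenceTransformer`):
+
+ ```python
+ from sentence_transformers.evaluation import InformationRetrievalEvaluator
+
+ queries = {"q1": "is the mother in vinegar good for you?"}
+ corpus = {
+     "d1": "Some people say the 'mother' in apple cider vinegar is what makes it healthy.",
+     "d2": "Single-phase power is a two-wire alternating current power circuit.",
+ }
+ relevant_docs = {"q1": {"d1"}}  # which corpus ids are relevant for each query
+
+ evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="gooaq-dev")
+ results = evaluator(model)  # dict of accuracy@k, precision@k, recall@k, NDCG, MRR, MAP
+ ```
+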
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### sentence-transformers/gooaq
+
+ * Dataset: [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
+ * Size: 3,002,496 training samples
+ * Columns: <code>question</code> and <code>answer</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | question                                                                           | answer                                                                              |
+   |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+   | type    | string                                                                             | string                                                                              |
+   | details | <ul><li>min: 8 tokens</li><li>mean: 11.95 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 60.83 tokens</li><li>max: 130 tokens</li></ul> |
+ * Samples:
+   | question | answer |
+   |:---------|:-------|
+   | <code>what are the differences between internet and web?</code> | <code>The Internet is a global network of networks while the Web, also referred formally as World Wide Web (www) is collection of information which is accessed via the Internet. Another way to look at this difference is; the Internet is infrastructure while the Web is service on top of that infrastructure.</code> |
+   | <code>who is the most important person in a first aid situation?</code> | <code>Subscribe to New First Aid For Free The main principle of incident management is that you are the most important person and your safety comes first! Your first actions when coming across the scene of an incident should be: Check for any dangers to yourself or bystanders. Manage any dangers found (if safe to do so)</code> |
+   | <code>why is jibjab not working?</code> | <code>Usually disabling your ad blockers for JibJab will resolve this issue. If you're still having issues loading the card after your ad blockers are disabled, you can try clearing your cache/cookies or updating and restarting your browser. As a last resort, you can try opening JibJab from a different browser.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+
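+ Conceptually, this loss treats every other answer in a batch as a negative for each question: the scaled cosine-similarity matrix is fed to a cross-entropy loss whose target is the matching answer. A schematic sketch (not the library's exact implementation):
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+
+ def mnrl(questions: torch.Tensor, answers: torch.Tensor, scale: float = 20.0) -> torch.Tensor:
+     # questions, answers: (batch, dim) embeddings of paired texts
+     q = F.normalize(questions, dim=-1)
+     a = F.normalize(answers, dim=-1)
+     scores = q @ a.T * scale  # (batch, batch) scaled cosine similarities
+     labels = torch.arange(len(scores), device=scores.device)  # answer i belongs to question i
+     return F.cross_entropy(scores, labels)
+ ```
+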
+ ### Evaluation Dataset
+
+ #### sentence-transformers/gooaq
+
+ * Dataset: [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
+ * Size: 10,000 evaluation samples
+ * Columns: <code>question</code> and <code>answer</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | question                                                                           | answer                                                                              |
+   |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+   | type    | string                                                                             | string                                                                              |
+   | details | <ul><li>min: 8 tokens</li><li>mean: 12.01 tokens</li><li>max: 34 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 59.81 tokens</li><li>max: 145 tokens</li></ul> |
+ * Samples:
+   | question | answer |
+   |:---------|:-------|
+   | <code>what are some common attributes/characteristics between animal and human?</code> | <code>['Culture.', 'Emotions.', 'Language.', 'Humour.', 'Tool Use.', 'Memory.', 'Self-Awareness.', 'Intelligence.']</code> |
+   | <code>is folic acid the same as vitamin b?</code> | <code>Vitamin B9, also called folate or folic acid, is one of 8 B vitamins. All B vitamins help the body convert food (carbohydrates) into fuel (glucose), which is used to produce energy. These B vitamins, often referred to as B-complex vitamins, also help the body use fats and protein.</code> |
+   | <code>are bendy buses still in london?</code> | <code>Bendy bus makes final journey for Transport for London. The last of London's bendy buses was taken off the roads on Friday night. ... The final route to be operated with bendy buses has been the 207 between Hayes and White City, and the last of the long vehicles was to run late on Friday.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 128
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `batch_sampler`: no_duplicates
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 128
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
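+ Taken together, the run can be approximated with the `SentenceTransformerTrainer` API. A sketch under the hyperparameters above; the output path and the train/eval split handling are illustrative assumptions, not taken from the original training script:
+
+ ```python
+ from datasets import load_dataset
+ from sentence_transformers import (SentenceTransformer, SentenceTransformerTrainer,
+                                    SentenceTransformerTrainingArguments)
+ from sentence_transformers.losses import MultipleNegativesRankingLoss
+ from sentence_transformers.training_args import BatchSamplers
+
+ model = SentenceTransformer("google-bert/bert-base-uncased")
+ dataset = load_dataset("sentence-transformers/gooaq", split="train")
+ dataset = dataset.train_test_split(test_size=10_000, seed=42)  # assumed split
+
+ args = SentenceTransformerTrainingArguments(
+     output_dir="models/bert-base-uncased-gooaq",  # illustrative path
+     num_train_epochs=1,
+     per_device_train_batch_size=128,
+     per_device_eval_batch_size=128,
+     learning_rate=2e-5,
+     warmup_ratio=0.1,
+     bf16=True,
+     eval_strategy="steps",
+     batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate in-batch negatives
+ )
+ trainer = SentenceTransformerTrainer(
+     model=model,
+     args=args,
+     train_dataset=dataset["train"],
+     eval_dataset=dataset["test"],
+     loss=MultipleNegativesRankingLoss(model),
+ )
+ trainer.train()
+ ```
+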
+ ### Training Logs
+ | Epoch  | Step  | Training Loss | Validation Loss | gooaq-dev_cosine_map@100 |
+ |:------:|:-----:|:-------------:|:---------------:|:------------------------:|
+ | 0      | 0     | -             | -               | 0.2018                   |
+ | 0.0000 | 1     | 2.6207        | -               | -                        |
+ | 0.0213 | 500   | 0.9092        | -               | -                        |
+ | 0.0426 | 1000  | 0.2051        | -               | -                        |
+ | 0.0639 | 1500  | 0.1354        | -               | -                        |
+ | 0.0853 | 2000  | 0.1089        | 0.0719          | 0.7124                   |
+ | 0.1066 | 2500  | 0.0916        | -               | -                        |
+ | 0.1279 | 3000  | 0.0812        | -               | -                        |
+ | 0.1492 | 3500  | 0.0716        | -               | -                        |
+ | 0.1705 | 4000  | 0.0658        | 0.0517          | 0.7432                   |
+ | 0.1918 | 4500  | 0.0623        | -               | -                        |
+ | 0.2132 | 5000  | 0.0596        | -               | -                        |
+ | 0.2345 | 5500  | 0.0554        | -               | -                        |
+ | 0.2558 | 6000  | 0.0504        | 0.0401          | 0.7580                   |
+ | 0.2771 | 6500  | 0.0498        | -               | -                        |
+ | 0.2984 | 7000  | 0.0483        | -               | -                        |
+ | 0.3197 | 7500  | 0.0487        | -               | -                        |
+ | 0.3410 | 8000  | 0.0458        | 0.0359          | 0.7652                   |
+ | 0.3624 | 8500  | 0.0435        | -               | -                        |
+ | 0.3837 | 9000  | 0.0421        | -               | -                        |
+ | 0.4050 | 9500  | 0.0421        | -               | -                        |
+ | 0.4263 | 10000 | 0.0405        | 0.0329          | 0.7738                   |
+ | 0.4476 | 10500 | 0.0392        | -               | -                        |
+ | 0.4689 | 11000 | 0.0388        | -               | -                        |
+ | 0.4903 | 11500 | 0.0388        | -               | -                        |
+ | 0.5116 | 12000 | 0.0361        | 0.0290          | 0.7810                   |
+ | 0.5329 | 12500 | 0.0362        | -               | -                        |
+ | 0.5542 | 13000 | 0.0356        | -               | -                        |
+ | 0.5755 | 13500 | 0.0352        | -               | -                        |
+ | 0.5968 | 14000 | 0.0349        | 0.0267          | 0.7866                   |
+ | 0.6182 | 14500 | 0.0334        | -               | -                        |
+ | 0.6395 | 15000 | 0.0323        | -               | -                        |
+ | 0.6608 | 15500 | 0.0325        | -               | -                        |
+ | 0.6821 | 16000 | 0.0316        | 0.0256          | 0.7879                   |
+ | 0.7034 | 16500 | 0.0313        | -               | -                        |
+ | 0.7247 | 17000 | 0.0306        | -               | -                        |
+ | 0.7460 | 17500 | 0.0328        | -               | -                        |
+ | 0.7674 | 18000 | 0.0303        | 0.0238          | 0.7928                   |
+ | 0.7887 | 18500 | 0.0301        | -               | -                        |
+ | 0.8100 | 19000 | 0.0291        | -               | -                        |
+ | 0.8313 | 19500 | 0.0286        | -               | -                        |
+ | 0.8526 | 20000 | 0.0295        | 0.0218          | 0.7952                   |
+ | 0.8739 | 20500 | 0.0288        | -               | -                        |
+ | 0.8953 | 21000 | 0.0277        | -               | -                        |
+ | 0.9166 | 21500 | 0.0266        | -               | -                        |
+ | 0.9379 | 22000 | 0.0289        | 0.0218          | 0.7971                   |
+ | 0.9592 | 22500 | 0.0286        | -               | -                        |
+ | 0.9805 | 23000 | 0.0275        | -               | -                        |
+ | 1.0    | 23457 | -             | -               | 0.7966                   |
+
+ ### Environmental Impact
+ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
+ - **Energy Consumed**: 1.348 kWh
+ - **Carbon Emitted**: 0.524 kg of CO2
+ - **Hours Used**: 3.544 hours
+
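+ Emissions tracking of this kind takes only a few lines of CodeCarbon (a generic sketch, not the exact setup used for this run):
+
+ ```python
+ from codecarbon import EmissionsTracker
+
+ tracker = EmissionsTracker()
+ tracker.start()
+ # ... run training here ...
+ emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent emitted
+ print(emissions_kg)
+ ```
+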
+ ### Training Hardware
+ - **On Cloud**: No
+ - **GPU Model**: 1 x NVIDIA GeForce RTX 3090
+ - **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
+ - **RAM Size**: 31.78 GB
+
+ ### Framework Versions
+ - Python: 3.11.6
+ - Sentence Transformers: 3.1.0.dev0
+ - Transformers: 4.41.2
+ - PyTorch: 2.3.0+cu121
+ - Accelerate: 0.31.0
+ - Datasets: 2.20.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "bert-base-uncased",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.1.0.dev0",
+     "transformers": "4.41.2",
+     "pytorch": "2.3.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:47c330f9c990abacdaf39710703d1768f0ba89e237540a528675e762b8aeecbf
+ size 437951328
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
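These two modules are what `SentenceTransformer` assembles at load time. For illustration, a minimal sketch of the equivalent manual construction (the composition mirrors modules.json; the module classes are part of `sentence_transformers.models`):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Pooling, Transformer

transformer = Transformer("google-bert/bert-base-uncased", max_seq_length=512)
pooling = Pooling(transformer.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[transformer, pooling])
```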
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,55 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff