sylvain471 committed · Commit 721e873 · verified · 1 Parent(s): 6b9e49c

Add new SentenceTransformer model.
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
    "word_embedding_dimension": 1024,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,900 @@
---
base_model: actualdata/bilingual-embedding-large
datasets: []
language:
- en
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:4885
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: ' Le CO2, le CH4, le N2O, le SF6, le NF3 ainsi que les groupes
    de gaz HFC et PFC.'
  sentences:
  - ' Qui a initié l''élaboration du guide sectoriel de réalisation d''un bilan des
    émissions de gaz à effet de serre pour la filière cosmétique ?'
  - ' Quel est l''objectif premier du Guide sectoriel de réalisation d''un bilan des
    émissions de gaz à effet de serre pour la filière des sites de loisirs et culturels
    ?'
  - ' Quel est le gaz contribuant à l''augmentation de l''effet de serre qui doit
    être pris en compte dans la réalisation des bilans ?'
- source_sentence: ' Il est conseillé d''implémenter d''abord les leviers déjà matures
    et « sans regret » (efficacité énergétique, efficacité matières, décarbonation
    du mix énergétique) avant d''envisager des technologies moins matures.'
  sentences:
  - ' Quel est le recommandé ordre d''implémentation des leviers de décarbonation
    ?'
  - ' Quels sont les types de connexions utilisés pour relier un utilisateur à une
    ressource distante dans un réseau de communication ?'
  - ' Comment peut-on utiliser le Bilan Carbone pour tenir compte de processus de
    valorisation mis en œuvre par les entreprises du secteur agricole et agro-alimentaire
    ?'
- source_sentence: ' Les échanges ont permis de décrire des exemples par poste d''émissions.'
  sentences:
  - ' Quel était l''objectif des échanges sur les bonnes pratiques utilisées dans
    le secteur ?'
  - Existe-t-il une méthode rigoureuse pour déterminer l'incertitude de ces facteurs
    d'émissions monétaires?
  - ' Quels sont les modes de transport pris en compte dans cette fiche ?'
- source_sentence: ' La variation du périmètre organisationnel par la vente d''une
    usine, la variation du périmètre opérationnel par l''achat d''une nouvelle ligne
    de production, le changement de valeur de facteurs d''émission, le changement
    du mix des produits des usines et la dégradation des outils de production.'
  sentences:
  - ' Quel type de repas a un total de quantité (g) de 83229,6 ? '
  - Quel est l'objectif principal de la collecte des données pour la réalisation d'un
    bilan GES ?
  - ' Quels sont les facteurs qui ont influencé l''évolution des émissions de GES
    de l''entreprise ?'
- source_sentence: ' Le PCS intègre l''énergie libérée par la condensation de l''eau
    après la combustion, tandis que le PCI ne l''intègre pas.'
  sentences:
  - ' La proportion d''énergie utilisée dans l''eau chaude sanitaire pour les résidences
    principales (métropole uniquement) est-elle supérieure à 1 % ?'
  - ' Qu''est-ce qui distingue le Pouvoir Calorifique Supérieur (PCS) du Pouvoir Calorifique
    Inférieur (PCI) ?'
  - ' Quelle méthode de mesure directe par suivi de la consommation des véhicules
    de transport sera privilégiée si le matériel de transport est contrôlé ?'
model-index:
- name: test qwen2 Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 1024
      type: dim_1024
    metrics:
    - type: cosine_accuracy@1
      value: 0.31675874769797424
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.425414364640884
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.47697974217311234
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.5561694290976059
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.31675874769797424
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.141804788213628
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.09539594843462246
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.05561694290976059
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.31675874769797424
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.425414364640884
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.47697974217311234
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.5561694290976059
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.42756869844177203
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.38761729369464176
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.399364505533715
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 896
      type: dim_896
    metrics:
    - type: cosine_accuracy@1
      value: 0.32228360957642727
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.42357274401473294
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.4732965009208103
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.5488029465930019
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.32228360957642727
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.14119091467157763
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.09465930018416206
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.05488029465930018
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.32228360957642727
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.42357274401473294
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.4732965009208103
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.5488029465930019
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.4272124343988002
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.3893734105060072
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.40183454050045436
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.3314917127071823
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.42357274401473294
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.47513812154696133
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.5488029465930019
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.3314917127071823
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.14119091467157763
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.09502762430939225
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.05488029465930018
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.3314917127071823
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.42357274401473294
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.47513812154696133
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.5488029465930019
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.43088591845526986
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.39430705369931895
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.4065191633235482
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.30755064456721914
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4125230202578269
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.4677716390423573
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.5395948434622467
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.30755064456721914
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.1375076734192756
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.09355432780847145
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.053959484346224676
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.30755064456721914
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4125230202578269
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.4677716390423573
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.5395948434622467
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.41562425407928066
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.3769351632611302
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.3895577962122803
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.2965009208103131
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.40515653775322286
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.44751381215469616
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.5395948434622467
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.2965009208103131
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.13505217925107427
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.08950276243093921
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.053959484346224676
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.2965009208103131
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.40515653775322286
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.44751381215469616
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.5395948434622467
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.40786326501955955
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.367228653278377
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.3789438619494699
      name: Cosine Map@100
---

# test qwen2 Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [actualdata/bilingual-embedding-large](https://huggingface.co/actualdata/bilingual-embedding-large). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [actualdata/bilingual-embedding-large](https://huggingface.co/actualdata/bilingual-embedding-large) <!-- at revision b595d8ed97b05e847230c8bd2432ea248c2afe2d -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BilingualModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sylvain471/bl_ademe_large")
# Run inference
sentences = [
    " Le PCS intègre l'énergie libérée par la condensation de l'eau après la combustion, tandis que le PCI ne l'intègre pas.",
    " Qu'est-ce qui distingue le Pouvoir Calorifique Supérieur (PCS) du Pouvoir Calorifique Inférieur (PCI) ?",
    " La proportion d'énergie utilisée dans l'eau chaude sanitaire pour les résidences principales (métropole uniquement) est-elle supérieure à 1 % ?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
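Because the model ends in a `Normalize()` module, its embeddings are unit-norm, so the cosine similarity computed by `model.similarity` reduces to a plain matrix product. A minimal numpy sketch of that equivalence, using random unit-norm vectors as stand-ins for real model outputs:

```python
import numpy as np

# Stand-in embeddings: random vectors, L2-normalized like the model's outputs.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(3, 1024))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# For unit-norm vectors, the dot product IS the cosine similarity.
similarities = embeddings @ embeddings.T
print(similarities.shape)                        # (3, 3)
print(np.allclose(np.diag(similarities), 1.0))   # True: self-similarity is 1
```

With real embeddings from `model.encode(...)`, the same product should match `model.similarity(embeddings, embeddings)` up to floating-point error.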

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_1024`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.3168     |
| cosine_accuracy@3   | 0.4254     |
| cosine_accuracy@5   | 0.477      |
| cosine_accuracy@10  | 0.5562     |
| cosine_precision@1  | 0.3168     |
| cosine_precision@3  | 0.1418     |
| cosine_precision@5  | 0.0954     |
| cosine_precision@10 | 0.0556     |
| cosine_recall@1     | 0.3168     |
| cosine_recall@3     | 0.4254     |
| cosine_recall@5     | 0.477      |
| cosine_recall@10    | 0.5562     |
| cosine_ndcg@10      | 0.4276     |
| cosine_mrr@10       | 0.3876     |
| **cosine_map@100**  | **0.3994** |

#### Information Retrieval
* Dataset: `dim_896`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.3223     |
| cosine_accuracy@3   | 0.4236     |
| cosine_accuracy@5   | 0.4733     |
| cosine_accuracy@10  | 0.5488     |
| cosine_precision@1  | 0.3223     |
| cosine_precision@3  | 0.1412     |
| cosine_precision@5  | 0.0947     |
| cosine_precision@10 | 0.0549     |
| cosine_recall@1     | 0.3223     |
| cosine_recall@3     | 0.4236     |
| cosine_recall@5     | 0.4733     |
| cosine_recall@10    | 0.5488     |
| cosine_ndcg@10      | 0.4272     |
| cosine_mrr@10       | 0.3894     |
| **cosine_map@100**  | **0.4018** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.3315     |
| cosine_accuracy@3   | 0.4236     |
| cosine_accuracy@5   | 0.4751     |
| cosine_accuracy@10  | 0.5488     |
| cosine_precision@1  | 0.3315     |
| cosine_precision@3  | 0.1412     |
| cosine_precision@5  | 0.095      |
| cosine_precision@10 | 0.0549     |
| cosine_recall@1     | 0.3315     |
| cosine_recall@3     | 0.4236     |
| cosine_recall@5     | 0.4751     |
| cosine_recall@10    | 0.5488     |
| cosine_ndcg@10      | 0.4309     |
| cosine_mrr@10       | 0.3943     |
| **cosine_map@100**  | **0.4065** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.3076     |
| cosine_accuracy@3   | 0.4125     |
| cosine_accuracy@5   | 0.4678     |
| cosine_accuracy@10  | 0.5396     |
| cosine_precision@1  | 0.3076     |
| cosine_precision@3  | 0.1375     |
| cosine_precision@5  | 0.0936     |
| cosine_precision@10 | 0.054      |
| cosine_recall@1     | 0.3076     |
| cosine_recall@3     | 0.4125     |
| cosine_recall@5     | 0.4678     |
| cosine_recall@10    | 0.5396     |
| cosine_ndcg@10      | 0.4156     |
| cosine_mrr@10       | 0.3769     |
| **cosine_map@100**  | **0.3896** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.2965     |
| cosine_accuracy@3   | 0.4052     |
| cosine_accuracy@5   | 0.4475     |
| cosine_accuracy@10  | 0.5396     |
| cosine_precision@1  | 0.2965     |
| cosine_precision@3  | 0.1351     |
| cosine_precision@5  | 0.0895     |
| cosine_precision@10 | 0.054      |
| cosine_recall@1     | 0.2965     |
| cosine_recall@3     | 0.4052     |
| cosine_recall@5     | 0.4475     |
| cosine_recall@10    | 0.5396     |
| cosine_ndcg@10      | 0.4079     |
| cosine_mrr@10       | 0.3672     |
| **cosine_map@100**  | **0.3789** |

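In these tables each query has a single relevant passage, which is why `cosine_accuracy@k` and `cosine_recall@k` coincide and `cosine_precision@k` is exactly `accuracy@k / k` (e.g. 0.5562 / 10 = 0.0556 for `dim_1024`). A small sketch of how such numbers derive from the rank of each query's relevant passage; the helper and the example ranks are hypothetical, not part of the evaluator:

```python
def ir_metrics_at_k(ranks, k=10):
    """Compute accuracy@k, precision@k and MRR@k from the 1-based rank of the
    single relevant document per query (None if it was not retrieved)."""
    hits = [r for r in ranks if r is not None and r <= k]
    n = len(ranks)
    accuracy = len(hits) / n            # = recall@k when one doc is relevant
    precision = len(hits) / (n * k)     # at most one hit among k retrieved
    mrr = sum(1.0 / r for r in hits) / n
    return accuracy, precision, mrr

# Illustrative ranks for five queries
print(ir_metrics_at_k([1, 3, None, 2, 12]))
```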
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 4,885 training samples
* Columns: <code>positive</code> and <code>anchor</code>
* Approximate statistics based on the first 1000 samples:
  |         | positive | anchor |
  |:--------|:---------|:-------|
  | type    | string   | string |
  | details | <ul><li>min: 3 tokens</li><li>mean: 32.82 tokens</li><li>max: 185 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 26.77 tokens</li><li>max: 71 tokens</li></ul> |
* Samples:
  | positive | anchor |
  |:---------|:-------|
  | <code> Lorsque le traitement spécifique par catégorie de déchets produits par la Personne Morale est inconnu, le taux moyen local ou sectoriel de traitement en fin de vie (incinération, mise en décharge, recyclage, compostage, etc.) est utilisé. Le transport est également un paramètre à intégrer au calcul.</code> | <code> Quels sont les paramètres clés par type de traitement à prendre en compte pour réaliser un bilan d'émissions de gaz à effet de serre ?</code> |
  | <code> Une analyse de cycle de vie fournit un moyen efficace et systémique pour évaluer les impacts environnementaux d’un produit, d’un service, d’une entreprise ou d’un procédé.</code> | <code> Qu'est-ce qu'une évaluation de cycle de vie (ACV) ?</code> |
  | <code> 1 469,2 t CO2e.</code> | <code> Quel est le total des émissions annuelles de l'entreprise GAMMA ?</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          1024,
          896,
          512,
          256,
          128
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
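The `matryoshka_dims` above mean the model was trained so that prefixes of each embedding stay useful on their own: you can keep only the first 896, 512, 256 or 128 dimensions and re-normalize, trading a little quality (see the per-dimension tables in the Evaluation section) for smaller indexes. A sketch of that truncation, using random unit-norm vectors as stand-ins for real model outputs:

```python
import numpy as np

def truncate(emb, dim):
    """Keep the first `dim` dimensions and re-normalize (Matryoshka truncation)."""
    t = emb[..., :dim]
    return t / np.linalg.norm(t, axis=-1, keepdims=True)

# Stand-in unit-norm 1024-d embeddings instead of real model outputs
rng = np.random.default_rng(0)
a, b = rng.normal(size=(2, 1024))
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

for dim in (1024, 896, 512, 256, 128):  # the trained Matryoshka dimensions
    cos = float(truncate(a, dim) @ truncate(b, dim))
    print(dim, round(cos, 4))
```

Sufficiently recent sentence-transformers releases expose the same idea at load time, e.g. `SentenceTransformer("sylvain471/bl_ademe_large", truncate_dim=256)`.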
601
+
602
+ ### Training Hyperparameters
603
+ #### Non-Default Hyperparameters
604
+
605
+ - `eval_strategy`: epoch
606
+ - `per_device_train_batch_size`: 16
607
+ - `gradient_accumulation_steps`: 8
608
+ - `learning_rate`: 2e-05
609
+ - `num_train_epochs`: 20
610
+ - `lr_scheduler_type`: cosine
611
+ - `warmup_ratio`: 0.1
612
+ - `bf16`: True
613
+ - `tf32`: True
614
+ - `load_best_model_at_end`: True
615
+ - `optim`: adamw_torch_fused
616
+ - `batch_sampler`: no_duplicates
617
+
618
+ #### All Hyperparameters
619
+ <details><summary>Click to expand</summary>
620
+
621
+ - `overwrite_output_dir`: False
622
+ - `do_predict`: False
623
+ - `eval_strategy`: epoch
624
+ - `prediction_loss_only`: True
625
+ - `per_device_train_batch_size`: 16
626
+ - `per_device_eval_batch_size`: 8
627
+ - `per_gpu_train_batch_size`: None
628
+ - `per_gpu_eval_batch_size`: None
629
+ - `gradient_accumulation_steps`: 8
630
+ - `eval_accumulation_steps`: None
631
+ - `torch_empty_cache_steps`: None
632
+ - `learning_rate`: 2e-05
633
+ - `weight_decay`: 0.0
634
+ - `adam_beta1`: 0.9
635
+ - `adam_beta2`: 0.999
636
+ - `adam_epsilon`: 1e-08
637
+ - `max_grad_norm`: 1.0
638
+ - `num_train_epochs`: 20
639
+ - `max_steps`: -1
640
+ - `lr_scheduler_type`: cosine
641
+ - `lr_scheduler_kwargs`: {}
642
+ - `warmup_ratio`: 0.1
643
+ - `warmup_steps`: 0
644
+ - `log_level`: passive
645
+ - `log_level_replica`: warning
646
+ - `log_on_each_node`: True
647
+ - `logging_nan_inf_filter`: True
648
+ - `save_safetensors`: True
649
+ - `save_on_each_node`: False
650
+ - `save_only_model`: False
651
+ - `restore_callback_states_from_checkpoint`: False
652
+ - `no_cuda`: False
653
+ - `use_cpu`: False
654
+ - `use_mps_device`: False
655
+ - `seed`: 42
656
+ - `data_seed`: None
657
+ - `jit_mode_eval`: False
658
+ - `use_ipex`: False
659
+ - `bf16`: True
660
+ - `fp16`: False
661
+ - `fp16_opt_level`: O1
662
+ - `half_precision_backend`: auto
663
+ - `bf16_full_eval`: False
664
+ - `fp16_full_eval`: False
665
+ - `tf32`: True
666
+ - `local_rank`: 0
667
+ - `ddp_backend`: None
668
+ - `tpu_num_cores`: None
669
+ - `tpu_metrics_debug`: False
670
+ - `debug`: []
671
+ - `dataloader_drop_last`: False
672
+ - `dataloader_num_workers`: 0
673
+ - `dataloader_prefetch_factor`: None
674
+ - `past_index`: -1
675
+ - `disable_tqdm`: False
676
+ - `remove_unused_columns`: True
677
+ - `label_names`: None
678
+ - `load_best_model_at_end`: True
679
+ - `ignore_data_skip`: False
680
+ - `fsdp`: []
681
+ - `fsdp_min_num_params`: 0
682
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
683
+ - `fsdp_transformer_layer_cls_to_wrap`: None
684
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
685
+ - `deepspeed`: None
686
+ - `label_smoothing_factor`: 0.0
687
+ - `optim`: adamw_torch_fused
688
+ - `optim_args`: None
689
+ - `adafactor`: False
690
+ - `group_by_length`: False
691
+ - `length_column_name`: length
692
+ - `ddp_find_unused_parameters`: None
693
+ - `ddp_bucket_cap_mb`: None
694
+ - `ddp_broadcast_buffers`: False
695
+ - `dataloader_pin_memory`: True
696
+ - `dataloader_persistent_workers`: False
697
+ - `skip_memory_metrics`: True
698
+ - `use_legacy_prediction_loop`: False
699
+ - `push_to_hub`: False
700
+ - `resume_from_checkpoint`: None
701
+ - `hub_model_id`: None
702
+ - `hub_strategy`: every_save
703
+ - `hub_private_repo`: False
704
+ - `hub_always_push`: False
705
+ - `gradient_checkpointing`: False
706
+ - `gradient_checkpointing_kwargs`: None
707
+ - `include_inputs_for_metrics`: False
708
+ - `eval_do_concat_batches`: True
709
+ - `fp16_backend`: auto
710
+ - `push_to_hub_model_id`: None
711
+ - `push_to_hub_organization`: None
712
+ - `mp_parameters`:
713
+ - `auto_find_batch_size`: False
714
+ - `full_determinism`: False
715
+ - `torchdynamo`: None
716
+ - `ray_scope`: last
717
+ - `ddp_timeout`: 1800
718
+ - `torch_compile`: False
719
+ - `torch_compile_backend`: None
720
+ - `torch_compile_mode`: None
721
+ - `dispatch_batches`: None
722
+ - `split_batches`: None
723
+ - `include_tokens_per_second`: False
724
+ - `include_num_input_tokens_seen`: False
725
+ - `neftune_noise_alpha`: None
726
+ - `optim_target_modules`: None
727
+ - `batch_eval_metrics`: False
728
+ - `eval_on_start`: False
729
+ - `eval_use_gather_object`: False
730
+ - `batch_sampler`: no_duplicates
731
+ - `multi_dataset_batch_sampler`: proportional
732
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | dim_1024_cosine_map@100 | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_896_cosine_map@100 |
+ |:-----------:|:-------:|:-------------:|:-----------------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|
+ | 0.2614 | 10 | 5.4141 | - | - | - | - | - |
+ | 0.5229 | 20 | 4.2823 | - | - | - | - | - |
+ | 0.7843 | 30 | 3.0162 | - | - | - | - | - |
+ | 0.9935 | 38 | - | 0.3636 | 0.3170 | 0.3407 | 0.3566 | 0.3668 |
+ | 1.0458 | 40 | 2.5846 | - | - | - | - | - |
+ | 1.3072 | 50 | 2.2069 | - | - | - | - | - |
+ | 1.5686 | 60 | 1.7585 | - | - | - | - | - |
+ | 1.8301 | 70 | 1.3099 | - | - | - | - | - |
+ | 1.9869 | 76 | - | 0.3979 | 0.3353 | 0.3726 | 0.3895 | 0.3983 |
+ | 2.0915 | 80 | 1.1449 | - | - | - | - | - |
+ | 2.3529 | 90 | 1.0137 | - | - | - | - | - |
+ | 2.6144 | 100 | 0.6402 | - | - | - | - | - |
+ | 2.8758 | 110 | 0.4931 | - | - | - | - | - |
+ | 2.9804 | 114 | - | 0.4026 | 0.3568 | 0.3808 | 0.3882 | 0.3992 |
+ | 3.1373 | 120 | 0.4662 | - | - | - | - | - |
+ | 3.3987 | 130 | 0.3782 | - | - | - | - | - |
+ | 3.6601 | 140 | 0.2696 | - | - | - | - | - |
+ | 3.9216 | 150 | 0.2478 | - | - | - | - | - |
+ | 4.0 | 153 | - | 0.3805 | 0.3460 | 0.3613 | 0.3680 | 0.3850 |
+ | 4.1830 | 160 | 0.2655 | - | - | - | - | - |
+ | 4.4444 | 170 | 0.1952 | - | - | - | - | - |
+ | 4.7059 | 180 | 0.1494 | - | - | - | - | - |
+ | 4.9673 | 190 | 0.1482 | - | - | - | - | - |
+ | 4.9935 | 191 | - | 0.3806 | 0.3619 | 0.3702 | 0.3799 | 0.3814 |
+ | 5.2288 | 200 | 0.161 | - | - | - | - | - |
+ | 5.4902 | 210 | 0.1282 | - | - | - | - | - |
+ | 5.7516 | 220 | 0.0888 | - | - | - | - | - |
+ | 5.9869 | 229 | - | 0.3936 | 0.3685 | 0.3758 | 0.3870 | 0.3916 |
+ | 6.0131 | 230 | 0.1042 | - | - | - | - | - |
+ | 6.2745 | 240 | 0.126 | - | - | - | - | - |
+ | 6.5359 | 250 | 0.103 | - | - | - | - | - |
+ | 6.7974 | 260 | 0.0467 | - | - | - | - | - |
+ | 6.9804 | 267 | - | 0.4022 | 0.3689 | 0.3897 | 0.3950 | 0.4022 |
+ | 7.0588 | 270 | 0.0581 | - | - | - | - | - |
+ | 7.3203 | 280 | 0.0728 | - | - | - | - | - |
+ | 7.5817 | 290 | 0.064 | - | - | - | - | - |
+ | 7.8431 | 300 | 0.0271 | - | - | - | - | - |
+ | 8.0 | 306 | - | 0.4010 | 0.3756 | 0.3872 | 0.3988 | 0.4021 |
+ | 8.1046 | 310 | 0.0452 | - | - | - | - | - |
+ | 8.3660 | 320 | 0.0613 | - | - | - | - | - |
+ | 8.6275 | 330 | 0.0294 | - | - | - | - | - |
+ | 8.8889 | 340 | 0.0396 | - | - | - | - | - |
+ | 8.9935 | 344 | - | 0.3914 | 0.3722 | 0.3801 | 0.3916 | 0.3939 |
+ | 9.1503 | 350 | 0.024 | - | - | - | - | - |
+ | 9.4118 | 360 | 0.0253 | - | - | - | - | - |
+ | 9.6732 | 370 | 0.017 | - | - | - | - | - |
+ | 9.9346 | 380 | 0.0163 | - | - | - | - | - |
+ | 9.9869 | 382 | - | 0.3901 | 0.3660 | 0.3796 | 0.3892 | 0.3904 |
+ | 10.1961 | 390 | 0.0191 | - | - | - | - | - |
+ | 10.4575 | 400 | 0.017 | - | - | - | - | - |
+ | 10.7190 | 410 | 0.0108 | - | - | - | - | - |
+ | **10.9804** | **420** | **0.0118** | **0.3994** | **0.3789** | **0.3896** | **0.4065** | **0.4018** |
+ | 11.2418 | 430 | 0.0111 | - | - | - | - | - |
+ | 11.5033 | 440 | 0.011 | - | - | - | - | - |
+ | 11.7647 | 450 | 0.0052 | - | - | - | - | - |
+ | 12.0 | 459 | - | 0.4030 | 0.3772 | 0.3986 | 0.4034 | 0.3999 |
+ | 12.0261 | 460 | 0.0144 | - | - | - | - | - |
+ | 12.2876 | 470 | 0.0068 | - | - | - | - | - |
+ | 12.5490 | 480 | 0.0061 | - | - | - | - | - |
+ | 12.8105 | 490 | 0.0039 | - | - | - | - | - |
+ | 12.9935 | 497 | - | 0.4022 | 0.3733 | 0.3869 | 0.3995 | 0.3983 |
+ | 13.0719 | 500 | 0.0074 | - | - | - | - | - |
+ | 13.3333 | 510 | 0.005 | - | - | - | - | - |
+ | 13.5948 | 520 | 0.0045 | - | - | - | - | - |
+ | 13.8562 | 530 | 0.0035 | - | - | - | - | - |
+ | 13.9869 | 535 | - | 0.4027 | 0.3779 | 0.3891 | 0.4015 | 0.3999 |
+ | 14.1176 | 540 | 0.0047 | - | - | - | - | - |
+ | 14.3791 | 550 | 0.0043 | - | - | - | - | - |
+ | 14.6405 | 560 | 0.0038 | - | - | - | - | - |
+ | 14.9020 | 570 | 0.0034 | - | - | - | - | - |
+ | 14.9804 | 573 | - | 0.3954 | 0.3734 | 0.3875 | 0.3982 | 0.3962 |
+ | 15.1634 | 580 | 0.0037 | - | - | - | - | - |
+ | 15.4248 | 590 | 0.0039 | - | - | - | - | - |
+ | 15.6863 | 600 | 0.0034 | - | - | - | - | - |
+ | 15.9477 | 610 | 0.0033 | - | - | - | - | - |
+ | 16.0 | 612 | - | 0.3966 | 0.3720 | 0.3852 | 0.3948 | 0.3936 |
+ | 16.2092 | 620 | 0.0038 | - | - | - | - | - |
+ | 16.4706 | 630 | 0.0034 | - | - | - | - | - |
+ | 16.7320 | 640 | 0.0029 | - | - | - | - | - |
+ | 16.9935 | 650 | 0.0033 | 0.3968 | 0.3723 | 0.3844 | 0.3977 | 0.3966 |
+ | 17.2549 | 660 | 0.0034 | - | - | - | - | - |
+ | 17.5163 | 670 | 0.0033 | - | - | - | - | - |
+ | 17.7778 | 680 | 0.0028 | - | - | - | - | - |
+ | 17.9869 | 688 | - | 0.3965 | 0.3695 | 0.3861 | 0.3960 | 0.3969 |
+ | 18.0392 | 690 | 0.0033 | - | - | - | - | - |
+ | 18.3007 | 700 | 0.0033 | - | - | - | - | - |
+ | 18.5621 | 710 | 0.0036 | - | - | - | - | - |
+ | 18.8235 | 720 | 0.0026 | - | - | - | - | - |
+ | 18.9804 | 726 | - | 0.3962 | 0.3701 | 0.3819 | 0.3951 | 0.3964 |
+ | 19.0850 | 730 | 0.003 | - | - | - | - | - |
+ | 19.3464 | 740 | 0.0036 | - | - | - | - | - |
+ | 19.6078 | 750 | 0.0033 | - | - | - | - | - |
+ | 19.8693 | 760 | 0.0031 | 0.3994 | 0.3789 | 0.3896 | 0.4065 | 0.4018 |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.44.2
+ - PyTorch: 2.4.1+cu121
+ - Accelerate: 0.34.2
+ - Datasets: 2.21.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+ title={Matryoshka Representation Learning},
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+ year={2024},
+ eprint={2205.13147},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+ year={2017},
+ eprint={1705.00652},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
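The `dim_*` columns in the Training Logs table above report cosine MAP@100 after truncating the 1024-dimensional embeddings to the smaller Matryoshka sizes (896, 512, 256, 128). A minimal sketch of that truncate-and-renormalize step, assuming the usual Matryoshka evaluation convention; the helper name and the stand-in random vector are illustrative, not part of this repository:

```python
import numpy as np

def truncate_embedding(emb: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` Matryoshka components and L2-renormalize,
    so cosine similarity remains well-defined at the reduced size."""
    t = np.asarray(emb, dtype=float)[:dim]
    return t / np.linalg.norm(t)

full = np.random.default_rng(0).normal(size=1024)  # stand-in for a model embedding
small = truncate_embedding(full, 256)
print(small.shape)  # (256,)
```

Because the loss nests the smaller representations inside the larger one, a single 1024-dim encode can serve all five evaluation dimensions by slicing.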
config.json ADDED
@@ -0,0 +1,37 @@
+ {
+ "_name_or_path": "actualdata/bilingual-embedding-large",
+ "architectures": [
+ "BilingualModel"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "auto_map": {
+ "AutoConfig": "dangvantuan/bilingual_impl--config.BilingualConfig",
+ "AutoModel": "dangvantuan/bilingual_impl--modeling.BilingualModel",
+ "AutoModelForMaskedLM": "dangvantuan/bilingual_impl--modeling.BilingualForMaskedLM",
+ "AutoModelForMultipleChoice": "dangvantuan/bilingual_impl--modeling.BilingualForMultipleChoice",
+ "AutoModelForQuestionAnswering": "dangvantuan/bilingual_impl--modeling.BilingualForQuestionAnswering",
+ "AutoModelForSequenceClassification": "dangvantuan/bilingual_impl--modeling.BilingualForSequenceClassification",
+ "AutoModelForTokenClassification": "dangvantuan/bilingual_impl--modeling.BilingualForTokenClassification"
+ },
+ "bos_token_id": 0,
+ "classifier_dropout": null,
+ "eos_token_id": 2,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 1024,
+ "initializer_range": 0.02,
+ "intermediate_size": 4096,
+ "layer_norm_eps": 1e-05,
+ "max_position_embeddings": 514,
+ "model_type": "bilingual",
+ "num_attention_heads": 16,
+ "num_hidden_layers": 24,
+ "output_past": true,
+ "pad_token_id": 1,
+ "position_embedding_type": "absolute",
+ "torch_dtype": "float32",
+ "transformers_version": "4.44.2",
+ "type_vocab_size": 1,
+ "use_cache": true,
+ "vocab_size": 250002
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "__version__": {
+ "sentence_transformers": "3.0.1",
+ "transformers": "4.44.2",
+ "pytorch": "2.4.1+cu121"
+ },
+ "prompts": {},
+ "default_prompt_name": null,
+ "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ccfd77375ca214703e8cf9c1947bcc811ec50d43231232fd7033faf1ed4f7eee
+ size 2239607176
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.models.Transformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_Pooling",
+ "type": "sentence_transformers.models.Pooling"
+ },
+ {
+ "idx": 2,
+ "name": "2",
+ "path": "2_Normalize",
+ "type": "sentence_transformers.models.Normalize"
+ }
+ ]
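`modules.json` above chains three modules: a Transformer encoder, mean pooling (`pooling_mode_mean_tokens` in `1_Pooling/config.json`), and a final `Normalize` step. A minimal numpy sketch of the last two stages, with a hypothetical helper name and toy tensor shapes:

```python
import numpy as np

def pool_and_normalize(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Masked mean over the token axis (mean pooling), followed by
    L2 normalization as performed by the 2_Normalize module."""
    mask = np.asarray(attention_mask)[..., None].astype(token_embeddings.dtype)
    pooled = (token_embeddings * mask).sum(axis=1) / np.clip(mask.sum(axis=1), 1e-9, None)
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

tokens = np.ones((2, 4, 8))                    # (batch, seq_len, hidden) toy values
mask = np.array([[1, 1, 0, 0], [1, 1, 1, 1]])  # padded positions excluded from the mean
out = pool_and_normalize(tokens, mask)
print(out.shape)  # (2, 8)
```

Masking before averaging matters: without it, padding tokens would dilute the sentence embedding for shorter inputs.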
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 512,
+ "do_lower_case": false
+ }
sentencepiece.bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cfc8146abe2a0488e9e2a0c56de7952f7c11ab059eca145a0a727afce0db2865
+ size 5069051
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "cls_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "<mask>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:883b037111086fd4dfebbbc9b7cee11e1517b5e0c0514879478661440f137085
+ size 17082987
tokenizer_config.json ADDED
@@ -0,0 +1,62 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "3": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "250001": {
+ "content": "<mask>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [],
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "<s>",
+ "eos_token": "</s>",
+ "mask_token": "<mask>",
+ "max_length": 512,
+ "model_max_length": 512,
+ "pad_to_multiple_of": null,
+ "pad_token": "<pad>",
+ "pad_token_type_id": 0,
+ "padding_side": "right",
+ "sep_token": "</s>",
+ "stride": 0,
+ "tokenizer_class": "XLMRobertaTokenizer",
+ "truncation_side": "right",
+ "truncation_strategy": "longest_first",
+ "unk_token": "<unk>"
+ }