LeoChiuu committed
Commit 15104f0 · verified · 1 Parent(s): e8920db

Add new SentenceTransformer model.

README.md CHANGED
@@ -46,31 +46,31 @@ tags:
  - dataset_size:680
  - loss:ContrastiveLoss
  widget:
- - source_sentence: 他の選択肢は?
  sentences:
- - どこを探す?
- - 物の姿を変える魔法が使える村人を知っている?
- - 村長選で忙しいから
- - source_sentence: ジャックについて教えて
  sentences:
- - 井戸へ訪れた?
- - 青いオーブがどこにあるか知ってる?
- - それは物の見た目を変える魔法
- - source_sentence: 物の姿を変える魔法が使える村人を知っている?
  sentences:
- - タイマツが欲しい
- - それは何?
- - どっちがいいと思う?
- - source_sentence: リリアンはどんな魔法が使えるの?
  sentences:
- - どうしてキャンドルなの?
- - 物の姿を変える魔法が使える村人を知っている?
- - 物体を変える
- - source_sentence: なにするんだっけ?
  sentences:
- - 魔法使い
- - なにすればいい?
- - どっちをさがせばいい?
  model-index:
  - name: SentenceTransformer based on colorfulscoop/sbert-base-ja
  results:
@@ -82,109 +82,109 @@ model-index:
  type: custom-arc-semantics-data-jp
  metrics:
  - type: cosine_accuracy
- value: 0.875
  name: Cosine Accuracy
  - type: cosine_accuracy_threshold
- value: 0.7639791965484619
  name: Cosine Accuracy Threshold
  - type: cosine_f1
- value: 0.896969696969697
  name: Cosine F1
  - type: cosine_f1_threshold
- value: 0.7639791965484619
  name: Cosine F1 Threshold
  - type: cosine_precision
- value: 0.8705882352941177
  name: Cosine Precision
  - type: cosine_recall
- value: 0.925
  name: Cosine Recall
  - type: cosine_ap
- value: 0.852066796474829
  name: Cosine Ap
  - type: dot_accuracy
- value: 0.875
  name: Dot Accuracy
  - type: dot_accuracy_threshold
- value: 398.1038513183594
  name: Dot Accuracy Threshold
  - type: dot_f1
- value: 0.9017341040462428
  name: Dot F1
  - type: dot_f1_threshold
- value: 398.1038513183594
  name: Dot F1 Threshold
  - type: dot_precision
- value: 0.8387096774193549
  name: Dot Precision
  - type: dot_recall
- value: 0.975
  name: Dot Recall
  - type: dot_ap
- value: 0.8574534537645885
  name: Dot Ap
  - type: manhattan_accuracy
- value: 0.875
  name: Manhattan Accuracy
  - type: manhattan_accuracy_threshold
- value: 349.35498046875
  name: Manhattan Accuracy Threshold
  - type: manhattan_f1
- value: 0.896969696969697
  name: Manhattan F1
  - type: manhattan_f1_threshold
- value: 363.05401611328125
  name: Manhattan F1 Threshold
  - type: manhattan_precision
- value: 0.8705882352941177
  name: Manhattan Precision
  - type: manhattan_recall
- value: 0.925
  name: Manhattan Recall
  - type: manhattan_ap
- value: 0.8514114774274522
  name: Manhattan Ap
  - type: euclidean_accuracy
- value: 0.875
  name: Euclidean Accuracy
  - type: euclidean_accuracy_threshold
- value: 15.954280853271484
  name: Euclidean Accuracy Threshold
  - type: euclidean_f1
- value: 0.896969696969697
  name: Euclidean F1
  - type: euclidean_f1_threshold
- value: 16.386924743652344
  name: Euclidean F1 Threshold
  - type: euclidean_precision
- value: 0.8705882352941177
  name: Euclidean Precision
  - type: euclidean_recall
- value: 0.925
  name: Euclidean Recall
  - type: euclidean_ap
- value: 0.851318148268234
  name: Euclidean Ap
  - type: max_accuracy
- value: 0.875
  name: Max Accuracy
  - type: max_accuracy_threshold
- value: 398.1038513183594
  name: Max Accuracy Threshold
  - type: max_f1
- value: 0.9017341040462428
  name: Max F1
  - type: max_f1_threshold
- value: 398.1038513183594
  name: Max F1 Threshold
  - type: max_precision
- value: 0.8705882352941177
  name: Max Precision
  - type: max_recall
- value: 0.975
  name: Max Recall
  - type: max_ap
- value: 0.8574534537645885
  name: Max Ap
  ---

@@ -238,9 +238,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("sentence_transformers_model_id")
  # Run inference
  sentences = [
- 'なにするんだっけ?',
- 'なにすればいい?',
- '魔法使い',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -286,41 +286,41 @@ You can finetune this model on your own dataset.

  | Metric | Value |
  |:-----------------------------|:-----------|
- | cosine_accuracy | 0.875 |
- | cosine_accuracy_threshold | 0.764 |
- | cosine_f1 | 0.897 |
- | cosine_f1_threshold | 0.764 |
- | cosine_precision | 0.8706 |
- | cosine_recall | 0.925 |
- | cosine_ap | 0.8521 |
- | dot_accuracy | 0.875 |
- | dot_accuracy_threshold | 398.1039 |
- | dot_f1 | 0.9017 |
- | dot_f1_threshold | 398.1039 |
- | dot_precision | 0.8387 |
- | dot_recall | 0.975 |
- | dot_ap | 0.8575 |
- | manhattan_accuracy | 0.875 |
- | manhattan_accuracy_threshold | 349.355 |
- | manhattan_f1 | 0.897 |
- | manhattan_f1_threshold | 363.054 |
- | manhattan_precision | 0.8706 |
- | manhattan_recall | 0.925 |
- | manhattan_ap | 0.8514 |
- | euclidean_accuracy | 0.875 |
- | euclidean_accuracy_threshold | 15.9543 |
- | euclidean_f1 | 0.897 |
- | euclidean_f1_threshold | 16.3869 |
- | euclidean_precision | 0.8706 |
- | euclidean_recall | 0.925 |
- | euclidean_ap | 0.8513 |
- | max_accuracy | 0.875 |
- | max_accuracy_threshold | 398.1039 |
- | max_f1 | 0.9017 |
- | max_f1_threshold | 398.1039 |
- | max_precision | 0.8706 |
- | max_recall | 0.975 |
- | **max_ap** | **0.8575** |

  <!--
  ## Bias, Risks and Limitations
@@ -347,13 +347,13 @@ You can finetune this model on your own dataset.
  | | text1 | text2 | label |
  |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
  | type | string | string | int |
- | details | <ul><li>min: 4 tokens</li><li>mean: 8.34 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.06 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>0: ~41.36%</li><li>1: ~58.64%</li></ul> |
  * Samples:
- | text1 | text2 | label |
- |:-------------------------|:----------------------------|:---------------|
- | <code>夕ご飯は何を食べたの?</code> | <code>昨晩何を食べたの?</code> | <code>1</code> |
- | <code>キャンドルがいいな</code> | <code>タイマツ</code> | <code>0</code> |
- | <code>当番表を見た</code> | <code>木にスカーフがひっかかってる</code> | <code>0</code> |
  * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
  ```json
  {
@@ -371,16 +371,16 @@ You can finetune this model on your own dataset.
  * Size: 680 evaluation samples
  * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
  * Approximate statistics based on the first 680 samples:
- | | text1 | text2 | label |
- |:--------|:--------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
- | type | string | string | int |
- | details | <ul><li>min: 4 tokens</li><li>mean: 8.1 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 7.76 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>0: ~41.18%</li><li>1: ~58.82%</li></ul> |
  * Samples:
- | text1 | text2 | label |
- |:--------------------------|:-------------------------|:---------------|
- | <code>何を思い出せるかな?</code> | <code>井戸</code> | <code>0</code> |
- | <code>自分で探せ</code> | <code>いらない</code> | <code>1</code> |
- | <code>カーテンが揺れていたから</code> | <code>辛いスープがあったから</code> | <code>0</code> |
  * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
  ```json
  {
@@ -520,12 +520,12 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | Training Loss | loss | custom-arc-semantics-data-jp_max_ap |
  |:------:|:----:|:-------------:|:------:|:-----------------------------------:|
- | None | 0 | - | - | 0.8251 |
- | 1.0147 | 69 | 0.0212 | 0.0175 | 0.8337 |
- | 2.0147 | 138 | 0.015 | 0.0156 | 0.8460 |
- | 3.0147 | 207 | 0.0123 | 0.0149 | 0.8538 |
- | 4.0147 | 276 | 0.0106 | 0.0146 | 0.8574 |
- | 4.9412 | 340 | 0.0096 | 0.0145 | 0.8575 |


  ### Framework Versions
 
  - dataset_size:680
  - loss:ContrastiveLoss
  widget:
+ - source_sentence: 木材の山の中にスカーフはある?
  sentences:
+ - 巻き割をした?
+ - どっちが欲しい?
+ - おすすめは?
+ - source_sentence: ' 君は猫なの?'
  sentences:
+ - どこ探すんだっけ?
+ - 足元よりも更に深くってなに?
+ - キミって猫?
+ - source_sentence: 欲しくない
  sentences:
+ - 物体を変化できる人
+ - どっちも欲しくない
+ - スカーフがキャンプファイヤーで燃えてる
+ - source_sentence: 外を見てみよう
  sentences:
+ - 誰かが魔法の呪文で花をぬいぐるみに変えた
+ - キミって猫?
+ - 長老
+ - source_sentence: 他には選べないの?
  sentences:
+ - お鍋から匂いがしたから
+ - どっちがおすすめ?
+ - なにするんだっけ?
  model-index:
  - name: SentenceTransformer based on colorfulscoop/sbert-base-ja
  results:
 
  type: custom-arc-semantics-data-jp
  metrics:
  - type: cosine_accuracy
+ value: 0.8235294117647058
  name: Cosine Accuracy
  - type: cosine_accuracy_threshold
+ value: 0.6800776720046997
  name: Cosine Accuracy Threshold
  - type: cosine_f1
+ value: 0.8571428571428572
  name: Cosine F1
  - type: cosine_f1_threshold
+ value: 0.6610503196716309
  name: Cosine F1 Threshold
  - type: cosine_precision
+ value: 0.7912087912087912
  name: Cosine Precision
  - type: cosine_recall
+ value: 0.935064935064935
  name: Cosine Recall
  - type: cosine_ap
+ value: 0.8465974769503343
  name: Cosine Ap
  - type: dot_accuracy
+ value: 0.8161764705882353
  name: Dot Accuracy
  - type: dot_accuracy_threshold
+ value: 441.6131591796875
  name: Dot Accuracy Threshold
  - type: dot_f1
+ value: 0.8520710059171598
  name: Dot F1
  - type: dot_f1_threshold
+ value: 379.92266845703125
  name: Dot F1 Threshold
  - type: dot_precision
+ value: 0.782608695652174
  name: Dot Precision
  - type: dot_recall
+ value: 0.935064935064935
  name: Dot Recall
  - type: dot_ap
+ value: 0.8509292792079832
  name: Dot Ap
  - type: manhattan_accuracy
+ value: 0.8308823529411765
  name: Manhattan Accuracy
  - type: manhattan_accuracy_threshold
+ value: 420.1961975097656
  name: Manhattan Accuracy Threshold
  - type: manhattan_f1
+ value: 0.8622754491017963
  name: Manhattan F1
  - type: manhattan_f1_threshold
+ value: 430.6374206542969
  name: Manhattan F1 Threshold
  - type: manhattan_precision
+ value: 0.8
  name: Manhattan Precision
  - type: manhattan_recall
+ value: 0.935064935064935
  name: Manhattan Recall
  - type: manhattan_ap
+ value: 0.848438229073751
  name: Manhattan Ap
  - type: euclidean_accuracy
+ value: 0.8308823529411765
  name: Euclidean Accuracy
  - type: euclidean_accuracy_threshold
+ value: 18.93894386291504
  name: Euclidean Accuracy Threshold
  - type: euclidean_f1
+ value: 0.8588957055214723
  name: Euclidean F1
  - type: euclidean_f1_threshold
+ value: 18.93894386291504
  name: Euclidean F1 Threshold
  - type: euclidean_precision
+ value: 0.813953488372093
  name: Euclidean Precision
  - type: euclidean_recall
+ value: 0.9090909090909091
  name: Euclidean Recall
  - type: euclidean_ap
+ value: 0.8470258990606743
  name: Euclidean Ap
  - type: max_accuracy
+ value: 0.8308823529411765
  name: Max Accuracy
  - type: max_accuracy_threshold
+ value: 441.6131591796875
  name: Max Accuracy Threshold
  - type: max_f1
+ value: 0.8622754491017963
  name: Max F1
  - type: max_f1_threshold
+ value: 430.6374206542969
  name: Max F1 Threshold
  - type: max_precision
+ value: 0.813953488372093
  name: Max Precision
  - type: max_recall
+ value: 0.935064935064935
  name: Max Recall
  - type: max_ap
+ value: 0.8509292792079832
  name: Max Ap
  ---

 
  model = SentenceTransformer("sentence_transformers_model_id")
  # Run inference
  sentences = [
+ '他には選べないの?',
+ 'どっちがおすすめ?',
+ 'お鍋から匂いがしたから',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
 

  | Metric | Value |
  |:-----------------------------|:-----------|
+ | cosine_accuracy | 0.8235 |
+ | cosine_accuracy_threshold | 0.6801 |
+ | cosine_f1 | 0.8571 |
+ | cosine_f1_threshold | 0.6611 |
+ | cosine_precision | 0.7912 |
+ | cosine_recall | 0.9351 |
+ | cosine_ap | 0.8466 |
+ | dot_accuracy | 0.8162 |
+ | dot_accuracy_threshold | 441.6132 |
+ | dot_f1 | 0.8521 |
+ | dot_f1_threshold | 379.9227 |
+ | dot_precision | 0.7826 |
+ | dot_recall | 0.9351 |
+ | dot_ap | 0.8509 |
+ | manhattan_accuracy | 0.8309 |
+ | manhattan_accuracy_threshold | 420.1962 |
+ | manhattan_f1 | 0.8623 |
+ | manhattan_f1_threshold | 430.6374 |
+ | manhattan_precision | 0.8 |
+ | manhattan_recall | 0.9351 |
+ | manhattan_ap | 0.8484 |
+ | euclidean_accuracy | 0.8309 |
+ | euclidean_accuracy_threshold | 18.9389 |
+ | euclidean_f1 | 0.8589 |
+ | euclidean_f1_threshold | 18.9389 |
+ | euclidean_precision | 0.814 |
+ | euclidean_recall | 0.9091 |
+ | euclidean_ap | 0.847 |
+ | max_accuracy | 0.8309 |
+ | max_accuracy_threshold | 441.6132 |
+ | max_f1 | 0.8623 |
+ | max_f1_threshold | 430.6374 |
+ | max_precision | 0.814 |
+ | max_recall | 0.9351 |
+ | **max_ap** | **0.8509** |
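
The F1 rows in this table are the harmonic mean of the matching precision and recall rows, and each `*_threshold` row is the score cutoff at which the metric was measured. The sketch below recomputes `cosine_f1` from the new `cosine_precision` and `cosine_recall` values and shows how a cosine threshold could be applied to a pair of embeddings. The helper names are hypothetical, and treating a pair as similar when its score is at or above the cutoff is an assumption, not a statement of how the evaluator is implemented.

```python
import math


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def is_similar(u, v, threshold: float = 0.6801) -> bool:
    """Assumed decision rule: label 1 when the score reaches the cutoff.

    The default is the rounded cosine_accuracy_threshold from the table.
    """
    return cosine_similarity(u, v) >= threshold


# Values copied from the updated model-index metadata in this commit.
cosine_f1 = f1_from_precision_recall(0.7912087912087912, 0.935064935064935)
print(round(cosine_f1, 4))  # 0.8571
```

The recomputed value agrees with the reported `cosine_f1` of 0.8571, which is a quick way to check that a metrics table was regenerated consistently.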

  <!--
  ## Bias, Risks and Limitations
 
  | | text1 | text2 | label |
  |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
  | type | string | string | int |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 8.31 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.03 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>0: ~40.81%</li><li>1: ~59.19%</li></ul> |
  * Samples:
+ | text1 | text2 | label |
+ |:-------------------------|:-------------------------------|:---------------|
+ | <code>姿かたちを変える魔法</code> | <code>物の姿を変えられる魔法</code> | <code>1</code> |
+ | <code>青いオーブを見かけた?</code> | <code>青いオーブがどこにあるか知ってる?</code> | <code>1</code> |
+ | <code>猫のぬいぐるみを見たよ</code> | <code>猫のぬいぐるみを失くさなかった?</code> | <code>1</code> |
  * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
  ```json
  {
 
  * Size: 680 evaluation samples
  * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
  * Approximate statistics based on the first 680 samples:
+ | | text1 | text2 | label |
+ |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 8.24 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 7.88 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>0: ~43.38%</li><li>1: ~56.62%</li></ul> |
  * Samples:
+ | text1 | text2 | label |
+ |:------------------------|:-------------------------|:---------------|
+ | <code>調子はどう?</code> | <code>最近どう?</code> | <code>1</code> |
+ | <code>なにも要らない</code> | <code>家の中</code> | <code>0</code> |
+ | <code>昨日は何を作ったの?</code> | <code>ビーフシチュー食べた?</code> | <code>0</code> |
  * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
  ```json
  {
 
  ### Training Logs
  | Epoch | Step | Training Loss | loss | custom-arc-semantics-data-jp_max_ap |
  |:------:|:----:|:-------------:|:------:|:-----------------------------------:|
+ | None | 0 | - | - | 0.7957 |
+ | 1.0147 | 69 | 0.0205 | 0.0199 | 0.8294 |
+ | 2.0147 | 138 | 0.0148 | 0.0180 | 0.8410 |
+ | 3.0147 | 207 | 0.0118 | 0.0173 | 0.8455 |
+ | 4.0147 | 276 | 0.0104 | 0.0170 | 0.8489 |
+ | 4.9412 | 340 | 0.0098 | 0.0168 | 0.8509 |


  ### Framework Versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8fe7d723f6f2967a407c28454804bf0b0d4f54133d6c4cc0cf1969aa91b6e299
+ oid sha256:df532cbf8c515730b079cb46df0f8397bd0412de5a0619608c49d03caf8e7902
  size 442491744
runs/Sep12_19-08-06_default/events.out.tfevents.1726168095.default.1847.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:32eddd6cc79cf3f51591dc64c1ddf3bba17c026db5c91ff0bfdf16cb3f7d029c
+ size 22423
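
Both `model.safetensors` and the TensorBoard event file are stored as Git LFS pointers: small text files with one `key value` pair per line (`version`, `oid`, `size`) in place of the binary itself. A minimal sketch of reading such a pointer follows, assuming exactly the layout shown in this diff; `parse_lfs_pointer` is a hypothetical helper, not part of Git LFS or the Hugging Face tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid is written as "<algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer text copied from the added events.out.tfevents file in this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:32eddd6cc79cf3f51591dc64c1ddf3bba17c026db5c91ff0bfdf16cb3f7d029c\n"
    "size 22423\n"
)
print(pointer["hash_algorithm"], pointer["size_bytes"])  # sha256 22423
```

In the `model.safetensors` hunk only the `oid` line changes while `size` stays at 442491744, which is consistent with retrained weights of the same shape being written over the previous file.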