thegenerativegeneration committed on
Commit
fe0c514
1 Parent(s): 37bcb01

Upload 13 files

Files changed (4)
  1. README.md +67 -36
  2. config_setfit.json +2 -2
  3. model.safetensors +1 -1
  4. model_head.pkl +1 -1
README.md CHANGED
@@ -9,12 +9,11 @@ base_model: intfloat/multilingual-e5-small
9
  metrics:
10
  - accuracy
11
  widget:
12
- - text: 'query: Interessant. Hast du das schon mal ausprobiert?'
13
- - text: 'query: はい、持っていますよ。すぐにメールで送りますね。'
14
- - text: 'query: Va bene ci sentiamo dopo Marco buona giornata'
15
- - text: 'query: Ζητώ συγγνώμη, πρέπει να αποχωρήσω τώρα.'
16
- - text: 'query: Guten Morgen, Maria! Hast du die Präsentation für das Meeting heute
17
- fertig?'
18
  pipeline_tag: text-classification
19
  inference: true
20
  model-index:
@@ -61,10 +60,10 @@ The model has been trained using an efficient few-shot learning technique that i
61
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
62
 
63
  ### Model Labels
64
- | Label | Examples |
65
- |:------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
66
- | 0 | <ul><li>'query: สวัสดีค่ะ วันนี้เป็นอย่างไรบ้าง?'</li><li>'query: Jag förstår. Vad tycker du att vi ska göra nu?'</li><li>'query: Hej, wszystko w porządku. Właśnie dostałam nową pracę.'</li></ul> |
67
- | 1 | <ul><li>'query: Чудесно, доскоро!'</li><li>'query: Mama cheamă, trebuie să mă întorc acasă, pa.'</li><li>'query: Perdó, ja he de marxar.'</li></ul> |
68
 
69
  ## Evaluation
70
 
@@ -91,7 +90,7 @@ from setfit import SetFitModel
91
  # Download from the 🤗 Hub
92
  model = SetFitModel.from_pretrained("setfit_model_id")
93
  # Run inference
94
- preds = model("query: はい、持っていますよ。すぐにメールで送りますね。")
95
  ```
96
 
97
  <!--
@@ -123,23 +122,23 @@ preds = model("query: はい、持っていますよ。すぐにメールで送
123
  ### Training Set Metrics
124
  | Training set | Min | Median | Max |
125
  |:-------------|:----|:-------|:----|
126
- | Word count | 2 | 7.3663 | 21 |
127
 
128
  | Label | Training Sample Count |
129
  |:------|:----------------------|
130
- | 0 | 286 |
131
- | 1 | 290 |
132
 
133
  ### Training Hyperparameters
134
  - batch_size: (16, 2)
135
  - num_epochs: (1, 16)
136
- - max_steps: 900
137
  - sampling_strategy: undersampling
138
- - body_learning_rate: (1e-05, 1e-05)
139
  - head_learning_rate: 0.001
140
  - loss: CosineSimilarityLoss
141
  - distance_metric: cosine_distance
142
- - margin: 0.1
143
  - end_to_end: False
144
  - use_amp: False
145
  - warmup_proportion: 0.1
@@ -151,25 +150,57 @@ preds = model("query: はい、持っていますよ。すぐにメールで送
151
  ### Training Results
152
  | Epoch | Step | Training Loss | Validation Loss |
153
  |:------:|:----:|:-------------:|:---------------:|
154
- | 0.0006 | 1 | 0.3683 | - |
155
- | 0.0278 | 50 | 0.2855 | - |
156
- | 0.0555 | 100 | 0.1691 | 0.1598 |
157
- | 0.0833 | 150 | 0.0339 | - |
158
- | 0.1110 | 200 | 0.0134 | 0.0745 |
159
- | 0.1388 | 250 | 0.0309 | - |
160
- | 0.1666 | 300 | 0.0076 | 0.0344 |
161
- | 0.1943 | 350 | 0.0023 | - |
162
- | 0.2221 | 400 | 0.0012 | 0.0849 |
163
- | 0.2499 | 450 | 0.0007 | - |
164
- | 0.2776 | 500 | 0.0008 | 0.0932 |
165
- | 0.3054 | 550 | 0.0005 | - |
166
- | 0.3331 | 600 | 0.0005 | 0.0805 |
167
- | 0.3609 | 650 | 0.0004 | - |
168
- | 0.3887 | 700 | 0.0006 | 0.0951 |
169
- | 0.4164 | 750 | 0.0006 | - |
170
- | 0.4442 | 800 | 0.0016 | 0.0983 |
171
- | 0.4720 | 850 | 0.0008 | - |
172
- | 0.4997 | 900 | 0.0005 | 0.092 |
173
 
174
  ### Framework Versions
175
  - Python: 3.10.11
 
9
  metrics:
10
  - accuracy
11
  widget:
12
+ - text: 'query: Baiklah, kita cakap lagi nanti, Mark. Selamat hari!'
13
+ - text: 'query: Tôi xin lỗi nhưng tôi phải đi'
14
+ - text: 'query: 次回行くときは、私を連れて行ってください。もっと自然の中で活動したいと思っています。'
15
+ - text: 'query: Entschuldigung, ich muss jetzt gehen.'
16
+ - text: 'query: Buenos días, ¿cómo están ustedes?'
 
17
  pipeline_tag: text-classification
18
  inference: true
19
  model-index:
 
60
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
61
 
62
  ### Model Labels
63
+ | Label | Examples |
64
+ |:------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
65
+ | 0 | <ul><li>'query: Értem. Mit csinálunk most?'</li><li>'query: Ola Luca, que tal? Rematache o traballo?'</li><li>'query: Lijepo je. Hvala.'</li></ul> |
66
+ | 1 | <ul><li>'query: Жөнейін, кейін кездесеміз.'</li><li>'query: Така, ќе се видиме повторно.'</li><li>'query: ठीक है बाद में बात करते हैं मार्क अच्छा दिन'</li></ul> |
67
 
68
  ## Evaluation
69
 
 
90
  # Download from the 🤗 Hub
91
  model = SetFitModel.from_pretrained("setfit_model_id")
92
  # Run inference
93
+ preds = model("query: Tôi xin lỗi nhưng tôi phải đi")
94
  ```
95
 
96
  <!--
 
122
  ### Training Set Metrics
123
  | Training set | Min | Median | Max |
124
  |:-------------|:----|:-------|:----|
125
+ | Word count | 2 | 7.2168 | 25 |
126
 
127
  | Label | Training Sample Count |
128
  |:------|:----------------------|
129
+ | 0 | 346 |
130
+ | 1 | 346 |
131
 
132
  ### Training Hyperparameters
133
  - batch_size: (16, 2)
134
  - num_epochs: (1, 16)
135
+ - max_steps: 2500
136
  - sampling_strategy: undersampling
137
+ - body_learning_rate: (1e-06, 1e-06)
138
  - head_learning_rate: 0.001
139
  - loss: CosineSimilarityLoss
140
  - distance_metric: cosine_distance
141
+ - margin: 0.25
142
  - end_to_end: False
143
  - use_amp: False
144
  - warmup_proportion: 0.1
 
150
  ### Training Results
151
  | Epoch | Step | Training Loss | Validation Loss |
152
  |:------:|:----:|:-------------:|:---------------:|
153
+ | 0.0002 | 1 | 0.3607 | - |
154
+ | 0.0100 | 50 | 0.3634 | 0.3452 |
155
+ | 0.0200 | 100 | 0.3493 | 0.3377 |
156
+ | 0.0300 | 150 | 0.3244 | 0.3234 |
157
+ | 0.0400 | 200 | 0.3244 | 0.3034 |
158
+ | 0.0500 | 250 | 0.2931 | 0.2731 |
159
+ | 0.0600 | 300 | 0.2471 | 0.2398 |
160
+ | 0.0700 | 350 | 0.237 | 0.2168 |
161
+ | 0.0800 | 400 | 0.1964 | 0.2082 |
162
+ | 0.0900 | 450 | 0.2319 | 0.198 |
163
+ | 0.1000 | 500 | 0.2003 | 0.1968 |
164
+ | 0.1100 | 550 | 0.2014 | 0.1968 |
165
+ | 0.1200 | 600 | 0.1617 | 0.1879 |
166
+ | 0.1300 | 650 | 0.2214 | 0.1798 |
167
+ | 0.1400 | 700 | 0.2498 | 0.1768 |
168
+ | 0.1500 | 750 | 0.1527 | 0.1764 |
169
+ | 0.1600 | 800 | 0.1134 | 0.1733 |
170
+ | 0.1700 | 850 | 0.1393 | 0.1614 |
171
+ | 0.1800 | 900 | 0.1052 | 0.1549 |
172
+ | 0.1900 | 950 | 0.1772 | 0.149 |
173
+ | 0.2000 | 1000 | 0.1065 | 0.1504 |
174
+ | 0.2100 | 1050 | 0.087 | 0.1392 |
175
+ | 0.2200 | 1100 | 0.1416 | 0.1333 |
176
+ | 0.2300 | 1150 | 0.0767 | 0.1279 |
177
+ | 0.2400 | 1200 | 0.1228 | 0.1243 |
178
+ | 0.2500 | 1250 | 0.099 | 0.1128 |
179
+ | 0.2599 | 1300 | 0.1125 | 0.1106 |
180
+ | 0.2699 | 1350 | 0.1012 | 0.1156 |
181
+ | 0.2799 | 1400 | 0.0343 | 0.1022 |
182
+ | 0.2899 | 1450 | 0.0814 | 0.1012 |
183
+ | 0.2999 | 1500 | 0.0947 | 0.0965 |
184
+ | 0.3099 | 1550 | 0.0799 | 0.0964 |
185
+ | 0.3199 | 1600 | 0.113 | 0.0942 |
186
+ | 0.3299 | 1650 | 0.1125 | 0.0917 |
187
+ | 0.3399 | 1700 | 0.0507 | 0.0899 |
188
+ | 0.3499 | 1750 | 0.0986 | 0.0938 |
189
+ | 0.3599 | 1800 | 0.0885 | 0.0913 |
190
+ | 0.3699 | 1850 | 0.0712 | 0.0841 |
191
+ | 0.3799 | 1900 | 0.1131 | 0.0851 |
192
+ | 0.3899 | 1950 | 0.0701 | 0.0852 |
193
+ | 0.3999 | 2000 | 0.0805 | 0.0878 |
194
+ | 0.4099 | 2050 | 0.0375 | 0.0814 |
195
+ | 0.4199 | 2100 | 0.1236 | 0.0797 |
196
+ | 0.4299 | 2150 | 0.0532 | 0.0881 |
197
+ | 0.4399 | 2200 | 0.0265 | 0.0806 |
198
+ | 0.4499 | 2250 | 0.1268 | 0.0801 |
199
+ | 0.4599 | 2300 | 0.0557 | 0.0797 |
200
+ | 0.4699 | 2350 | 0.0956 | 0.0832 |
201
+ | 0.4799 | 2400 | 0.0671 | 0.081 |
202
+ | 0.4899 | 2450 | 0.1394 | 0.0794 |
203
+ | 0.4999 | 2500 | 0.1165 | 0.0798 |
204
 
205
  ### Framework Versions
206
  - Python: 3.10.11
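
The two Training Results tables above can be compared by looking at where validation loss bottoms out in each run. The following is a minimal sketch, not part of the commit: the rows are copied from the tables (all validation rows of the old run, and only the last eleven rows of the new run), and the helper names `best_checkpoint` and `drift_after_best` are hypothetical.

```python
# (step, validation loss) pairs copied from the Training Results tables above.
# OLD_RUN holds every validation row of the removed table; NEW_RUN_TAIL holds
# only the last eleven rows of the added table.
OLD_RUN = [
    (100, 0.1598), (200, 0.0745), (300, 0.0344), (400, 0.0849), (500, 0.0932),
    (600, 0.0805), (700, 0.0951), (800, 0.0983), (900, 0.092),
]
NEW_RUN_TAIL = [
    (2000, 0.0878), (2050, 0.0814), (2100, 0.0797), (2150, 0.0881),
    (2200, 0.0806), (2250, 0.0801), (2300, 0.0797), (2350, 0.0832),
    (2400, 0.081), (2450, 0.0794), (2500, 0.0798),
]


def best_checkpoint(rows):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def drift_after_best(rows):
    """Final validation loss minus the best one; 0 means the run ended at its best."""
    return rows[-1][1] - best_checkpoint(rows)[1]


for name, rows in (("old", OLD_RUN), ("new", NEW_RUN_TAIL)):
    step, loss = best_checkpoint(rows)
    print(f"{name}: best step {step} (loss {loss}), drift {drift_after_best(rows):.4f}")
```

In the old run the lowest validation loss is logged at step 300 and the final value is noticeably higher, while the new run ends close to its lowest logged value. This may be related to the lower `body_learning_rate` and the longer `max_steps`, but the diff itself does not state the reason.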
config_setfit.json CHANGED
@@ -1,4 +1,4 @@
1
  {
2
- "normalize_embeddings": false,
3
- "labels": null
4
  }
 
1
  {
2
+ "labels": null,
3
+ "normalize_embeddings": false
4
  }
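
The `config_setfit.json` hunk only swaps the order of the two keys. As a small sketch (the variable and function names are hypothetical, and the strings are condensed copies of the two sides of the hunk), parsing both versions shows that they carry the same settings:

```python
import json

# Condensed copies of the old and new file contents from the hunk above.
old_config = '{"normalize_embeddings": false, "labels": null}'
new_config = '{"labels": null, "normalize_embeddings": false}'


def same_settings(a: str, b: str) -> bool:
    """Compare two JSON documents by parsed value, ignoring key order."""
    return json.loads(a) == json.loads(b)


print(same_settings(old_config, new_config))
```

The text differs, which is why the file is listed as changed, but the parsed objects are equal.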
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ee129ffbb039e468d217b93891bd4d7fb59fc6cb127dba8a76b3a0c9ca261203
3
  size 470637416
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:000d4f7a4721ae6d6796555a7183878a989fb298bc234cbeaa835da89987fcd1
3
  size 470637416
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:df99fc1c0b63c98daf8f7d2ba317bcf6e7a91658fd8942d6f7028ae034ecb4d0
3
  size 4608
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:44141b4942a9da733aab94b65ce12c7ea66843c559dfd02e68618bf43a2f2997
3
  size 4608
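
Both binary files are stored as Git LFS pointers, so their diffs show only a changed `oid` with an unchanged `size`. The sketch below parses the two `model_head.pkl` pointers from the hunk above; `parse_lfs_pointer` is a hypothetical helper written for illustration, not a Git LFS API.

```python
# Pointer contents copied from the model_head.pkl hunk above.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:df99fc1c0b63c98daf8f7d2ba317bcf6e7a91658fd8942d6f7028ae034ecb4d0
size 4608
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:44141b4942a9da733aab94b65ce12c7ea66843c559dfd02e68618bf43a2f2997
size 4608
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old, new = parse_lfs_pointer(OLD_POINTER), parse_lfs_pointer(NEW_POINTER)
print("content changed:", old["digest"] != new["digest"])
print("size unchanged:", old["size"] == new["size"])
```

An identical size with a different digest is consistent with the same architecture being saved with retrained weights, although the pointer alone cannot confirm that.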