edwsiew committed on
Commit
d1d8dcf
1 Parent(s): f38355a

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
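The pooling configuration above enables only `pooling_mode_mean_tokens`, so the 768-dimensional sentence embedding is an average of the token embeddings. The following is a minimal sketch of masked mean pooling in plain Python, not the library's implementation: the function name `mean_pool` and the toy 3-dimensional vectors are illustrative assumptions, and the real module works on batched tensors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input (hypothetical edge case).
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: two real tokens followed by one padding position.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
```

The padding vector is excluded from both the sum and the divisor, so padded and unpadded versions of the same text pool to the same embedding.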
README.md ADDED
@@ -0,0 +1,261 @@
+ ---
+ base_model: sentence-transformers/paraphrase-mpnet-base-v2
+ library_name: setfit
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: category generator refuel fixed diesel special access no vendor acas problem
+     description rbs generator fuel low
+ - text: troubleshooting generator has been running non stop need emergency refuel
+     alarms are as follows rbs commercial power fail rbs generator fuel low rbs generator
+     running rbs generator shut down rbs gen transfer sw operated test results site
+     will go off the air without urgent refuel trouble description per ticket tt000080377504
+     sfp_as cvl06424 5 external alarm fieldreplaceableunit=sau 1 alarmport=2 rbs commercial
+     power fail history of trouble none vendor acas problem description fixed gen special
+     access none
+ - text: troubleshooting triage category generator shut down oss netcool alarms dxl05963
+     rbs generator shut down fieldreplaceableunit=sau alarmport=22 2024 08 21 02 11
+     15 smart alarm rbs generator shut down mdat verification y no repeats no open
+     related tckt no active eim no intrusion knowledge judgement sending to vendor
+     to investigate and resolve gen shut down condition dispatch strategy vendor test
+     results triage category generator shut down oss netcool alarms dxl05963 rbs generator
+     shut down fieldreplaceableunit=sau alarmport=22 2024 08 21 02 11 15 smart alarm
+     rbs generator shut down mdat verification y no repeats no open related tckt no
+     active eim no intrusion knowledge judgement sending to vendor to investigate and
+     resolve gen shut down condition dispatch strategy vendor trouble description triage
+     category generator shut down oss netcool alarms dxl05963 rbs generator shut down
+     fieldreplaceableunit=sau alarmport=22 2024 08 21 02 11 15 smart alarm rbs generator
+     shut down mdat verification y no repeats no open related tckt no active eim no
+     intrusion knowledge judgement sending to vendor to investigate and resolve gen
+     shut down condition dispatch strategy vendor history of trouble triage category
+     generator shut down oss netcool alarms dxl05963 rbs generator shut down fieldreplaceableunit=sau
+     alarmport=22 2024 08 21 02 11 15 smart alarm rbs generator shut down mdat verification
+     y no repeats no open related tckt no active eim no intrusion knowledge judgement
+     sending to vendor to investigate and resolve gen shut down condition dispatch
+     strategy vendor vendor acas problem description triage category generator shut
+     down oss netcool alarms dxl05963 rbs generator shut down fieldreplaceableunit=sau
+     alarmport=22 2024 08 21 02 11 15 smart alarm rbs generator shut down mdat verification
+     y no repeats no open related tckt no active eim no intrusion knowledge judgement
+     sending to vendor to investigate and resolve gen shut down condition dispatch
+     strategy vendor special access triage category generator shut down oss netcool
+     alarms dxl05963 rbs generator shut down fieldreplaceableunit=sau alarmport=22
+     2024 08 21 02 11 15 smart alarm rbs generator shut down mdat verification y no
+     repeats no open related tckt no active eim no intrusion knowledge judgement sending
+     to vendor to investigate and resolve gen shut down condition dispatch strategy
+     vendor
+ - text: troubleshooting triage category gen fail oss netcool alarms rbs generator
+     fail fieldreplaceableunit=sau alarmport=22 rbs generator fail ca placerville cell
+     site lotus carlsen 2024 08 20 06 27 41 smart alarm rbs generator fail fieldreplaceableunit=sau
+     alarmport=22 2024 08 20 06 27 36 mdat verification fixed gen history no repeats
+     tab no open related tickets in aots knowledge judgement sending to vendor to check
+     gen fail dispatch strategy vendor test results triage category gen fail oss netcool
+     alarms rbs generator fail fieldreplaceableunit=sau alarmport=22 rbs generator
+     fail ca placerville cell site lotus carlsen 2024 08 20 06 27 41 smart alarm rbs
+     generator fail fieldreplaceableunit=sau alarmport=22 2024 08 20 06 27 36 mdat
+     verification fixed gen history no repeats tab no open related tickets in aots
+     knowledge judgement sending to vendor to check gen fail dispatch strategy vendor
+     trouble description triage category gen fail oss netcool alarms rbs generator
+     fail fieldreplaceableunit=sau alarmport=22 rbs generator fail ca placerville cell
+     site lotus carlsen 2024 08 20 06 27 41 smart alarm rbs generator fail fieldreplaceableunit=sau
+     alarmport=22 2024 08 20 06 27 36 mdat verification fixed gen history no repeats
+     tab no open related tickets in aots knowledge judgement sending to vendor to check
+     gen fail dispatch strategy vendor history of trouble na vendor acas problem description
+     triage category gen fail oss netcool alarms rbs generator fail fieldreplaceableunit=sau
+     alarmport=22 rbs generator fail ca placerville cell site lotus carlsen 2024 08
+     20 06 27 41 smart alarm rbs generator fail fieldreplaceableunit=sau alarmport=22
+     2024 08 20 06 27 36 mdat verification fixed gen history no repeats tab no open
+     related tickets in aots knowledge judgement sending to vendor to check gen fail
+     dispatch strategy vendor special access na
+ - text: investigate gen fail requestor olivarez zachary requestor email zachary olivarez
+     verizonwireless com requestor phone 760 927 0406
+ inference: true
+ model-index:
+ - name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.625
+       name: Accuracy
+ ---
+
94
+ # SetFit with sentence-transformers/paraphrase-mpnet-base-v2
95
+
96
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
97
+
98
+ The model has been trained using an efficient few-shot learning technique that involves:
99
+
100
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
101
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
102
+
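Step 1 trains on sentence pairs rather than on single labeled examples. The sketch below shows one simple way such pairs can be built: same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0. The helper `make_pairs`, its sampling scheme, and the toy texts are assumptions for illustration only; SetFit's own sampler may differ in detail.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str, float]


def make_pairs(texts: List[str], labels: List[int], num_iterations: int, seed: int = 42) -> List[Pair]:
    """Hypothetical helper: build (text_a, text_b, target_similarity) pairs."""
    rng = random.Random(seed)
    pairs: List[Pair] = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates sharing the label (excluding the text itself) and candidates that do not.
            same = [t for j, (t, l) in enumerate(zip(texts, labels)) if l == label and j != i]
            other = [t for t, l in zip(texts, labels) if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))  # positive pair
            if other:
                pairs.append((text, rng.choice(other), 0.0))  # negative pair
    return pairs


# Toy data, not taken from the training set.
texts = ["gen fail", "gen shut down", "refuel request", "fuel low"]
labels = [1, 1, 0, 0]
pairs = make_pairs(texts, labels, num_iterations=2)
```

Under this scheme, a handful of labeled texts expands into many training pairs, which is what makes contrastive fine-tuning workable in a few-shot setting.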
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 2 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:------|:---------|
+ | 1 | <ul><li>'troubleshooting triage category generator shut down oss netcool alarms ccl04246 rbs generator shut down fieldreplaceableunit=sau alarmport=5 rbs generator shut down 2024 07 04 16 31 52 smart alarm ccl04246 rbs generator shut down fieldreplaceableunit=sau alarmport=5 2024 07 04 16 31 36 mdat oremis verification generac baldor magnum repeats open related tckt active eim intrusionknowledge judgement sending to vendor to investigate and resolve gen shut down condition dispatch strategy vendor test results triage category generator shut down oss netcool alarms ccl04246 rbs generator shut down fieldreplaceableunit=sau alarmport=5 rbs generator shut down 2024 07 04 16 31 52 smart alarm ccl04246 rbs generator shut down fieldreplaceableunit=sau alarmport=5 2024 07 04 16 31 36 mdat oremis verification generac baldor magnum repeats open related tckt active eim intrusionknowledge judgement sending to vendor to investigate and resolve gen shut down condition dispatch strategy vendor trouble description triage category generator shut down oss netcool alarms ccl04246 rbs generator shut down fieldreplaceableunit=sau alarmport=5 rbs generator shut down 2024 07 04 16 31 52 smart alarm ccl04246 rbs generator shut down fieldreplaceableunit=sau alarmport=5 2024 07 04 16 31 36 mdat oremis verification generac baldor magnum repeats open related tckt active eim intrusionknowledge judgement sending to vendor to investigate and resolve gen shut down condition dispatch strategy vendor history of trouble na vendor acas problem description triage category generator shut down oss netcool alarms ccl04246 rbs generator shut down fieldreplaceableunit=sau alarmport=5 rbs generator shut down 2024 07 04 16 31 52 smart alarm ccl04246 rbs generator shut down fieldreplaceableunit=sau alarmport=5 2024 07 04 16 31 36 mdat oremis verification generac baldor magnum repeats open related tckt active eim intrusionknowledge judgement sending to vendor to investigate and resolve gen shut down condition dispatch strategy vendor special access na'</li><li>'troubleshooting triage category rbs generator fuel leak alarm cvl08526 cvl08526 rbs generator fuel leak fieldreplaceableunit=sau 1 alarmport=23 2024 07 10 13 07 38 cvl08526 cvl08526 rbs rbs generator fuel leak fieldreplaceableunit=sau 1 alarmport=20 2024 07 10 13 05 04 mdat oremis verification generator generac baldor magnum sd30 manufacturer generac baldor magnum model sd30 status in use serial 3008406953 kw 30 prime power source no still on site yes engine perkins engine co ltd 404d 22ta manufacturer perkins engine co ltd model 404d 22ta serial gr84695u9967000g max engine kw 36 manufacturered date 2021 02 01 engine type diesel max brake hp 49 in service date 2022 07 13 fuel type ultra low sulfur diesel ulsd owner cell no repeats open related tckt active eim intrusion knowledge judgement sending to vendor to investigate and resolve gen rbs generator fuel leak condition dispatch strategy vendor test results triage category generator rbs generator fuel leak alarm cvl08526 cvl08526 rbs generator fuel leak fieldreplaceableunit=sau 1 alarmport=23 2024 07 10 13 07 38 cvl08526 cvl08526 rbs generator rbs generator fuel leak fieldreplaceableunit=sau 1 alarmport=20 2024 07 10 13 05 04 mdat oremis verification generator generac baldor magnum sd30 manufacturer generac baldor magnum model sd30 status in use serial 3008406953 kw 30 prime power source no still on site yes engine perkins engine co ltd 404d 22ta manufacturer perkins engine co ltd model 404d 22ta serial gr84695u9967000g max engine kw 36 manufacturered date 2021 02 01 engine type diesel max brake hp 49 in service date 2022 07 13 fuel type ultra low sulfur diesel ulsd owner cell no repeats open related tckt active eim intrusion knowledge judgement sending to vendor to investigate and resolve gen rbs generator fuel leak condition dispatch strategy vendor trouble description smart rbs generator fuel leak history of trouble na vendor acas problem description smart rbs generator fuel leak special access na'</li><li>'troubleshooting performed blackout test test results transfer switch display goes dark when commercial power is shut off generator does not start trouble description please create a work order for generator contractor to troubleshoot fixed generator transfer switch transfer switch failed blackout test and needs battery replaced will result in full site outage if commercial power is lost history of trouble unknown vendor acas problem description transfer switch display goes dark when commercial power is shut off generator does not start special access none'</li></ul> |
+ | 0 | <ul><li>'generaro will not start also please check the timer for tuesdays starting 9am alarm on the display oil pressure shutdown try to reset an start it but no luick poc davila rosalio requestor email rosalio davila verizonwireless com requestor phone 602 689 5506'</li><li>'troubleshooting triage category gen fail cli alarms rbs generator shut down fieldreplaceableunit=sau 1 alarmport=9 2024 08 28 19 24 42 mdat verification y generator generac baldor magnum sd30 3008240427 fixed knowledge judgement sending to vendor to check generator dispatch strategy vendor test results triage category gen fail cli alarms rbs generator shut down fieldreplaceableunit=sau 1 alarmport=9 2024 08 28 19 24 42 mdat verification y generator generac baldor magnum sd30 3008240427 fixed knowledge judgement sending to vendor to check generator dispatch strategy vendor trouble description triage category gen fail cli alarms rbs generator shut down fieldreplaceableunit=sau 1 alarmport=9 2024 08 28 19 24 42 mdat verification y generator generac baldor magnum sd30 3008240427 fixed knowledge judgement sending to vendor to check generator dispatch strategy vendor history of trouble n a special access n a vendor acas problem description triage category gen fail cli alarms rbs generator shut down fieldreplaceableunit=sau 1 alarmport=9 2024 08 28 19 24 42 mdat verification y generator generac baldor magnum sd30 3008240427 fixed knowledge judgement sending to vendor to check generator dispatch strategy vendor wo po 2024ai0643'</li><li>'category generator refuel fixed diesel special access n a vendor acas problem description dxl02686 rbs generator fuel low fieldreplaceableunit=sau alarmport=13 rbs generator fuel low tx hawkins cell site us 80 hawkins 2024 08 08 01 55 32 2024 08 08 01 58 24 04 32 33'</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.625 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("edwsiew/phantom-dispatch-02")
+ # Run inference
+ preds = model("category generator refuel fixed diesel special access no vendor acas problem description rbs generator fuel low")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:---------|:----|
+ | Word count | 16 | 168.2540 | 915 |
+
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 0 | 14 |
+ | 1 | 49 |
+
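The two classes are imbalanced in the training set, which is worth keeping in mind when reading the reported test accuracy. The short calculation below uses only the counts from the table above. The majority-class share it computes describes the training set; the class balance of the test split is not stated in this card, so any comparison with the 0.625 accuracy is only indicative.

```python
# Training sample counts per label, copied from the table above.
counts = {0: 14, 1: 49}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
majority_label = max(counts, key=counts.get)

# Share of the training set covered by always predicting the majority label.
majority_share = shares[majority_label]
print(total, majority_label, round(majority_share, 3))
```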
191
+ ### Training Hyperparameters
192
+ - batch_size: (8, 8)
193
+ - num_epochs: (1, 1)
194
+ - max_steps: -1
195
+ - sampling_strategy: oversampling
196
+ - num_iterations: 20
197
+ - body_learning_rate: (2e-05, 2e-05)
198
+ - head_learning_rate: 2e-05
199
+ - loss: CosineSimilarityLoss
200
+ - distance_metric: cosine_distance
201
+ - margin: 0.25
202
+ - end_to_end: False
203
+ - use_amp: False
204
+ - warmup_proportion: 0.1
205
+ - seed: 42
206
+ - eval_max_steps: -1
207
+ - load_best_model_at_end: False
208
+
209
+ ### Training Results
210
+ | Epoch | Step | Training Loss | Validation Loss |
211
+ |:------:|:----:|:-------------:|:---------------:|
212
+ | 0.0032 | 1 | 0.31 | - |
213
+ | 0.1587 | 50 | 0.0308 | - |
214
+ | 0.3175 | 100 | 0.0131 | - |
215
+ | 0.4762 | 150 | 0.0023 | - |
216
+ | 0.6349 | 200 | 0.0056 | - |
217
+ | 0.7937 | 250 | 0.0009 | - |
218
+ | 0.9524 | 300 | 0.0003 | - |
219
+
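The Epoch and Step columns are consistent with the hyperparameters and sample counts listed earlier, under the assumption that each of the 63 training samples contributes one positive and one negative pair per iteration. The variable names below are illustrative, and the calculation is a consistency check rather than a description of the trainer's internals.

```python
num_samples = 14 + 49   # label 0 + label 1 training samples
num_iterations = 20     # from the hyperparameter list
batch_size = 8          # embedding-phase batch size

# Assumption: one positive and one negative pair per sample per iteration.
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = num_pairs // batch_size

# Epoch fraction reached at each logged step.
logged_steps = (1, 50, 100, 150, 200, 250, 300)
epochs = {step: round(step / steps_per_epoch, 4) for step in logged_steps}
print(num_pairs, steps_per_epoch, epochs)
```

With these numbers one epoch is 315 steps, which reproduces the epoch fractions in the table (for example, step 50 falls at epoch 0.1587).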
+ ### Framework Versions
+ - Python: 3.12.0
+ - SetFit: 1.0.3
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.39.0
+ - PyTorch: 2.4.0+cu121
+ - Datasets: 2.21.0
+ - Tokenizers: 0.15.2
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/paraphrase-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.39.0",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.39.0",
+     "pytorch": "2.4.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "labels": null,
+   "normalize_embeddings": false
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a5529254b17f26ef8d1855ff5dddb6a40be7030e2012befa3bfe4aa3a164da5f
+ size 437967672
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ce5e91949d6d511c6ba82d45e3e0857db0ad44b0ee073597d9122e3beaba2647
+ size 7007
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff