tomaarsen HF staff committed on
Commit
75a2859
1 Parent(s): 1d4b830

Add SetFit ABSA model

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
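The pooling config enables only `pooling_mode_cls_token`, so the sentence embedding is the hidden state of the first (`[CLS]`) token rather than an average over all tokens. A minimal pure-Python sketch of the difference, using made-up 3-dimensional token vectors in place of the real 384-dimensional ones and ignoring padding masks; the function names are illustrative and are not the Sentence Transformers API:

```python
def cls_pooling(token_embeddings):
    """Return the embedding of the first token ([CLS])."""
    return list(token_embeddings[0])


def mean_pooling(token_embeddings):
    """Return the element-wise mean over all token embeddings."""
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[d] for tok in token_embeddings) / n for d in range(dim)]


# Toy "sentence" of three tokens; row 0 plays the role of [CLS].
toy = [[1.0, 0.0, 2.0], [3.0, 4.0, 0.0], [5.0, 2.0, 1.0]]
print(cls_pooling(toy))   # [1.0, 0.0, 2.0]
print(mean_pooling(toy))  # [3.0, 2.0, 1.0]
```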
README.md ADDED
@@ -0,0 +1,250 @@
+ ---
+ library_name: setfit
+ tags:
+ - setfit
+ - absa
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ metrics:
+ - accuracy
+ widget:
+ - text: and very good prices.:Very good service and very good prices.
+ - text: 'very particular about sushi and were both:We are very particular about sushi
+     and were both please with every choice which included: ceviche mix (special),
+     crab dumplings, assorted sashimi, sushi and rolls, two types of sake, and the
+     banana tempura.'
+ - text: good and the waiters are friendly.:It's really also the service, is good and
+     the waiters are friendly.
+ - text: Our food was great too:Our food was great too!
+ - text: The food was pretty good:The food was pretty good, but a little flavorless
+     and the portions very small, including dessert.
+ pipeline_tag: text-classification
+ inference: false
+ co2_eq_emissions:
+   emissions: 5.960609724371976
+   source: codecarbon
+   training_type: fine-tuning
+   on_cloud: false
+   cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
+   ram_total_size: 31.777088165283203
+   hours_used: 0.073
+   hardware_used: 1 x NVIDIA GeForce RTX 3090
+ base_model: BAAI/bge-small-en-v1.5
+ model-index:
+ - name: SetFit Polarity Model with BAAI/bge-small-en-v1.5
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.7260223048327138
+       name: Accuracy
+ ---
+
+ # SetFit Polarity Model with BAAI/bge-small-en-v1.5
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Aspect Based Sentiment Analysis (ABSA). This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification. In particular, this model is responsible for classifying aspect polarities.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ This model was trained within the context of a larger system for ABSA, which works as follows:
+
+ 1. Use a spaCy model to select possible aspect span candidates.
+ 2. Use a SetFit model to filter these possible aspect span candidates.
+ 3. **Use this SetFit model to classify the filtered aspect span candidates.**
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **SetFitABSA Aspect Model:** [tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-aspect](https://huggingface.co/tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-aspect)
+ - **SetFitABSA Polarity Model:** [tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-polarity](https://huggingface.co/tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-polarity)
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 4 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label    | Examples |
+ |:---------|:---------|
+ | negative | <ul><li>'But the staff was so horrible:But the staff was so horrible to us.'</li><li>', forgot our toast, left out:They did not have mayonnaise, forgot our toast, left out ingredients (ie cheese in an omelet), below hot temperatures and the bacon was so over cooked it crumbled on the plate when you touched it.'</li><li>'did not have mayonnaise, forgot our:They did not have mayonnaise, forgot our toast, left out ingredients (ie cheese in an omelet), below hot temperatures and the bacon was so over cooked it crumbled on the plate when you touched it.'</li></ul> |
+ | positive | <ul><li>"factor was the food, which was:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."</li><li>"The food is uniformly exceptional:The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not."</li><li>"a very capable kitchen which will proudly:The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not."</li></ul> |
+ | neutral  | <ul><li>"'s on the menu or not.:The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not."</li><li>'to sample both meats).:Our agreed favorite is the orrechiete with sausage and chicken (usually the waiters are kind enough to split the dish in half so you get to sample both meats).'</li><li>'to split the dish in half so:Our agreed favorite is the orrechiete with sausage and chicken (usually the waiters are kind enough to split the dish in half so you get to sample both meats).'</li></ul> |
+ | conflict | <ul><li>'The food was delicious but:The food was delicious but do not come here on a empty stomach.'</li><li>"The service varys from day:The service varys from day to day- sometimes they're very nice, and sometimes not."</li><li>'Though the Spider Roll may look like:Though the Spider Roll may look like a challenge to eat, with soft shell crab hanging out of the roll, it is well worth the price you pay for them.'</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label   | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.7260   |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import AbsaModel
+
+ # Download from the 🤗 Hub
+ model = AbsaModel.from_pretrained(
+     "tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-aspect",
+     "tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-polarity",
+ )
+ # Run inference
+ preds = model("The food was great, but the venue is just way too busy.")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median  | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count   | 6   | 22.4902 | 51  |
+
+ | Label    | Training Sample Count |
+ |:---------|:----------------------|
+ | conflict | 6                     |
+ | negative | 37                    |
+ | neutral  | 30                    |
+ | positive | 131                   |
+
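The label counts are heavily skewed toward `positive`. A small sketch, using only the counts from the table, computes each label's share of the training samples; the variable names are illustrative.

```python
# Training sample counts copied from the table.
counts = {"conflict": 6, "negative": 37, "neutral": 30, "positive": 131}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# Print labels from most to least frequent.
for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label:<9}{share:6.1%}")
```

On a set with this distribution, always predicting `positive` would be correct roughly 64% of the time. That is useful context for the reported 0.7260 test accuracy, although the label distribution of the test split is not given here.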
+ ### Training Hyperparameters
+ - batch_size: (256, 256)
+ - num_epochs: (5, 5)
+ - max_steps: 5000
+ - sampling_strategy: oversampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: True
+ - warmup_proportion: 0.1
+ - seed: 42
+ - load_best_model_at_end: True
+
+ ### Training Results
+ | Epoch      | Step    | Training Loss | Validation Loss |
+ |:----------:|:-------:|:-------------:|:---------------:|
+ | 0.0115     | 1       | 0.2334        | -               |
+ | 0.5747     | 50      | 0.2242        | -               |
+ | **1.1494** | **100** | **0.1609**    | **0.1859**      |
+ | 1.7241     | 150     | 0.0932        | -               |
+ | 2.2989     | 200     | 0.0302        | 0.2054          |
+ | 2.8736     | 250     | 0.0206        | -               |
+ | 3.4483     | 300     | 0.0071        | 0.2427          |
+ | 4.0230     | 350     | 0.003         | -               |
+ | 4.5977     | 400     | 0.0025        | 0.2654          |
+
+ * The bold row denotes the saved checkpoint.
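With `load_best_model_at_end: True`, the saved checkpoint is presumably the evaluated step with the lowest validation loss. The sketch below replays that selection over the four validation losses reported in the table. It illustrates the selection rule only and is not SetFit's checkpointing code.

```python
# (step, validation loss) pairs for the rows that report a validation loss.
evaluated = [(100, 0.1859), (200, 0.2054), (300, 0.2427), (400, 0.2654)]

# Pick the step whose validation loss is smallest.
best_step, best_loss = min(evaluated, key=lambda pair: pair[1])
print(best_step, best_loss)  # 100 0.1859
```

This matches the bold row. The training loss keeps falling to 0.0025 while the validation loss rises after step 100, the usual sign of overfitting on a small training set.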
+ ### Environmental Impact
+ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
+ - **Carbon Emitted**: 0.006 kg of CO2
+ - **Hours Used**: 0.073 hours
+
+ ### Training Hardware
+ - **On Cloud**: No
+ - **GPU Model**: 1 x NVIDIA GeForce RTX 3090
+ - **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
+ - **RAM Size**: 31.78 GB
+
+ ### Framework Versions
+ - Python: 3.9.16
+ - SetFit: 1.0.0.dev0
+ - Sentence Transformers: 2.2.2
+ - Transformers: 4.29.0
+ - PyTorch: 1.13.1+cu117
+ - Datasets: 2.15.0
+ - Tokenizers: 0.13.3
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+   doi = {10.48550/ARXIV.2209.11055},
+   url = {https://arxiv.org/abs/2209.11055},
+   author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+   keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+   title = {Efficient Few-Shot Learning Without Prompts},
+   publisher = {arXiv},
+   year = {2022},
+   copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "models\\step_100\\",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.29.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
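As a sanity check on these values, the sketch below counts the parameters they imply. It assumes the standard BERT layout (word, position, and token-type embeddings; per-layer self-attention and feed-forward blocks, each followed by a layer norm; a final pooler layer). The helper `linear` is an illustrative name, not a library function.

```python
# Values copied from config.json.
vocab_size = 30522
max_positions = 512
type_vocab_size = 2
hidden = 384
intermediate = 1536
layers = 12
heads = 12


def linear(n_in, n_out):
    """Parameter count of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


layer_norm = 2 * hidden  # scale and shift vectors

embeddings = (vocab_size + max_positions + type_vocab_size) * hidden + layer_norm
per_layer = (
    4 * linear(hidden, hidden)      # query, key, value, attention output
    + layer_norm
    + linear(hidden, intermediate)  # feed-forward up-projection
    + linear(intermediate, hidden)  # feed-forward down-projection
    + layer_norm
)
pooler = linear(hidden, hidden)

total_params = embeddings + layers * per_layer + pooler
print(hidden // heads)   # per-head dimension: 32
print(total_params)      # 33360000
print(total_params * 4)  # bytes at float32: 133440000
```

At 4 bytes per float32 parameter this comes to 133440000 bytes, about 71 KB short of the 133511213 bytes recorded for `pytorch_model.bin` in this commit. The difference is presumably serialization overhead and non-parameter buffers.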
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.2.2",
+     "transformers": "4.28.1",
+     "pytorch": "1.13.0+cu117"
+   }
+ }
config_setfit.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "labels": null,
+   "span_context": 3,
+   "normalize_embeddings": false
+ }
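The widget examples in the README have the form `<aspect with context>:<full text>`, and `span_context: 3` appears to be the number of tokens kept on each side of the aspect. The sketch below reproduces one widget string under two assumptions that the commit does not state: that the aspect in that example is "food", and that tokens can be rejoined with single spaces. The function `polarity_input` is hypothetical and is not part of the SetFit API.

```python
def polarity_input(text, tokens, start, end, span_context=3):
    """Build '<aspect with context>:<full text>'.

    tokens is the pre-tokenized text; start/end are the aspect's token
    indices (end exclusive). Up to span_context tokens are kept on each side.
    """
    lo = max(0, start - span_context)
    hi = min(len(tokens), end + span_context)
    return " ".join(tokens[lo:hi]) + ":" + text


text = "Our food was great too!"
tokens = ["Our", "food", "was", "great", "too", "!"]  # assumed tokenization
print(polarity_input(text, tokens, 1, 2))
# Our food was great too:Our food was great too!
```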
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:328d505003cc0ab45d534a2f8bf5051f278c35e282e4291b394cda9ae107fe04
+ size 13271
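The three added lines are a Git LFS pointer: the repository stores this small text stub, while the pickled classification head itself (13271 bytes) lives in LFS storage. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is an illustrative helper, not part of any Git tooling.

```python
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:328d505003cc0ab45d534a2f8bf5051f278c35e282e4291b394cda9ae107fe04
size 13271
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into fields."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(pointer_text)
print(info["hash_algo"], info["size"])  # sha256 13271
```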
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
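`modules.json` lists the stages of the Sentence Transformer body, applied in `idx` order: the transformer produces token embeddings, pooling reduces them to one vector, and the final module L2-normalizes that vector. A short sketch that reads the same structure and prints the order, with the JSON inlined here rather than loaded from disk:

```python
import json

modules_json = """[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"},
  {"idx": 2, "name": "2", "path": "2_Normalize", "type": "sentence_transformers.models.Normalize"}
]"""

# Sort by idx, then keep only the class name at the end of each dotted path.
modules = sorted(json.loads(modules_json), key=lambda m: m["idx"])
pipeline = [m["type"].rsplit(".", 1)[-1] for m in modules]
print(" -> ".join(pipeline))  # Transformer -> Pooling -> Normalize
```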
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e8f7b15a77ed76e6167761443ddbb79a90ef913589f55a2f5afa90d3e61a670
+ size 133511213
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff