yaniseuranova
committed on
Commit
•
9c3c65e
1
Parent(s):
423904b
Add SetFit model
- README.md +74 -37
- config.json +1 -1
- config_setfit.json +3 -1
- model.safetensors +1 -1
- model_head.pkl +2 -2
README.md
CHANGED
@@ -10,14 +10,15 @@ tags:
|
|
10 |
- text-classification
|
11 |
- generated_from_setfit_trainer
|
12 |
widget:
|
13 |
-
- text:
|
14 |
-
|
15 |
-
- text:
|
16 |
-
|
17 |
-
|
18 |
-
|
19 |
-
|
20 |
-
- text:
|
21 |
inference: true
|
22 |
model-index:
|
23 |
- name: SetFit with sentence-transformers/all-MiniLM-L6-v2
|
@@ -31,7 +32,7 @@ model-index:
|
|
31 |
split: test
|
32 |
metrics:
|
33 |
- type: accuracy
|
34 |
-
value: 0.
|
35 |
name: Accuracy
|
36 |
---
|
37 |
|
@@ -51,7 +52,7 @@ The model has been trained using an efficient few-shot learning technique that i
|
|
51 |
- **Sentence Transformer body:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
|
52 |
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
|
53 |
- **Maximum Sequence Length:** 256 tokens
|
54 |
-
- **Number of Classes:**
|
55 |
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
|
56 |
<!-- - **Language:** Unknown -->
|
57 |
<!-- - **License:** Unknown -->
|
@@ -63,17 +64,19 @@ The model has been trained using an efficient few-shot learning technique that i
|
|
63 |
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
|
64 |
|
65 |
### Model Labels
|
66 |
-
| Label
|
67 |
-
|
68 |
-
| lexical
|
69 |
-
| semantic
|
70 |
|
71 |
## Evaluation
|
72 |
|
73 |
### Metrics
|
74 |
| Label | Accuracy |
|
75 |
|:--------|:---------|
|
76 |
-
| **all** | 0.
|
77 |
|
78 |
## Uses
|
79 |
|
@@ -93,7 +96,7 @@ from setfit import SetFitModel
|
|
93 |
# Download from the 🤗 Hub
|
94 |
model = SetFitModel.from_pretrained("yaniseuranova/setfit-rag-hybrid-search-query-router-test")
|
95 |
# Run inference
|
96 |
-
preds = model("What is the purpose of
|
97 |
```
|
98 |
|
99 |
<!--
|
@@ -125,16 +128,18 @@ preds = model("What is the purpose of setting up a CUPS on a server?")
|
|
125 |
### Training Set Metrics
|
126 |
| Training set | Min | Median | Max |
|
127 |
|:-------------|:----|:--------|:----|
|
128 |
-
| Word count |
|
129 |
|
130 |
-
| Label
|
131 |
-
|
132 |
-
| lexical
|
133 |
-
| semantic
|
|
|
|
|
134 |
|
135 |
### Training Hyperparameters
|
136 |
-
- batch_size: (
|
137 |
-
- num_epochs: (
|
138 |
- max_steps: -1
|
139 |
- sampling_strategy: oversampling
|
140 |
- body_learning_rate: (2e-05, 1e-05)
|
@@ -150,20 +155,52 @@ preds = model("What is the purpose of setting up a CUPS on a server?")
|
|
150 |
- load_best_model_at_end: True
|
151 |
|
152 |
### Training Results
|
153 |
-
| Epoch | Step
|
154 |
-
|
155 |
-
| 0.
|
156 |
-
| 0.
|
157 |
-
| 0.
|
158 |
-
| 0.
|
159 |
-
| 0.
|
160 |
-
| 0.
|
161 |
-
| 0.
|
162 |
-
| 0.
|
163 |
-
| 0.
|
164 |
-
| 0.
|
165 |
-
| 0.
|
166 |
-
|
|
167 |
|
168 |
* The bold row denotes the saved checkpoint.
|
169 |
### Framework Versions
|
|
|
10 |
- text-classification
|
11 |
- generated_from_setfit_trainer
|
12 |
widget:
|
13 |
+
- text: What are the key components involved in developing a deep learning model for
|
14 |
+
handwritten digit recognition?
|
15 |
+
- text: What is the purpose of the message posted by the CR?
|
16 |
+
- text: How can researchers create and maintain public repositories for reproducible
|
17 |
+
research?
|
18 |
+
- text: What are the key components involved in developing a deep learning model for
|
19 |
+
handwritten digit recognition?
|
20 |
+
- text: How do you prioritize and delegate tasks to ensure efficient collaboration
|
21 |
+
and feedback?
|
22 |
inference: true
|
23 |
model-index:
|
24 |
- name: SetFit with sentence-transformers/all-MiniLM-L6-v2
|
|
|
32 |
split: test
|
33 |
metrics:
|
34 |
- type: accuracy
|
35 |
+
value: 0.5
|
36 |
name: Accuracy
|
37 |
---
|
38 |
|
|
|
52 |
- **Sentence Transformer body:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
|
53 |
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
|
54 |
- **Maximum Sequence Length:** 256 tokens
|
55 |
+
- **Number of Classes:** 4 classes
|
56 |
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
|
57 |
<!-- - **Language:** Unknown -->
|
58 |
<!-- - **License:** Unknown -->
|
|
|
64 |
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
|
65 |
|
66 |
### Model Labels
|
67 |
+
| Label | Examples |
|
68 |
+
|:--------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
69 |
+
| lexical | <ul><li>'What are the key considerations when choosing an optimization method for a complex problem?'</li><li>'What are the challenges of being a remote mentor or sponsor?'</li><li>'How do researchers typically obtain information on the ranking of machine learning conferences?'</li></ul> |
|
70 |
+
| semantic | <ul><li>'What are common issues that users may encounter when accessing a platform that uses JumpCloud for authentication?'</li><li>'What are the key components involved in developing a deep learning model for handwritten digit recognition?'</li><li>'How can machine learning and data enrichment be used to improve business outcomes in various industries?'</li></ul> |
|
71 |
+
| very_semantic | <ul><li>"What are people's opinions on a particular topic?"</li><li>'What are the key considerations when proposing names for a project or initiative?'</li><li>'What are the key considerations for successful collaboration between industry and academia in research and development projects?'</li></ul> |
|
72 |
+
| very_lexical | <ul><li>'How can one track and store keys in a Flink operator?'</li><li>'What role do companies like Solvay play in addressing key societal challenges through their business strategies and operations?'</li><li>'What is the purpose of the scoring methodology in determining RAI maturity?'</li></ul> |
|
73 |
|
74 |
## Evaluation
|
75 |
|
76 |
### Metrics
|
77 |
| Label | Accuracy |
|
78 |
|:--------|:---------|
|
79 |
+
| **all** | 0.5 |
|
80 |
|
81 |
## Uses
|
82 |
|
|
|
96 |
# Download from the 🤗 Hub
|
97 |
model = SetFitModel.from_pretrained("yaniseuranova/setfit-rag-hybrid-search-query-router-test")
|
98 |
# Run inference
|
99 |
+
preds = model("What is the purpose of the message posted by the CR?")
|
100 |
```
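The four labels (`lexical`, `semantic`, `very_lexical`, `very_semantic`) and the repository name suggest that predictions are meant to steer a hybrid search between keyword and vector retrieval. The commit does not show how predictions are consumed, so the following is only a sketch under that assumption: the `route_query` helper and every weight value are hypothetical and would need tuning against a real retrieval setup.

```python
# Hypothetical mapping from a predicted label to the share of the final score
# given to vector (semantic) retrieval; the rest goes to keyword search.
# These numbers are illustrative assumptions, not values from the model card.
SEMANTIC_WEIGHT = {
    "very_lexical": 0.1,
    "lexical": 0.3,
    "semantic": 0.7,
    "very_semantic": 0.9,
}


def route_query(label: str, default: float = 0.5) -> dict:
    """Return keyword/vector weights for a predicted label.

    Unknown labels fall back to an even split (`default`).
    """
    alpha = SEMANTIC_WEIGHT.get(label, default)
    return {"vector": alpha, "keyword": round(1.0 - alpha, 2)}


if __name__ == "__main__":
    # With the real model, the label would be the output of the model call
    # shown in the README snippet; a literal is used here to stay offline.
    print(route_query("very_lexical"))
```

Keeping the mapping in a plain dictionary makes it easy to adjust the weights without retraining the classifier.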
|
101 |
|
102 |
<!--
|
|
|
128 |
### Training Set Metrics
|
129 |
| Training set | Min | Median | Max |
|
130 |
|:-------------|:----|:--------|:----|
|
131 |
+
| Word count | 8 | 14.4138 | 24 |
|
132 |
|
+| Label | Training Sample Count |
+|:--------------|:----------------------|
+| lexical | 32 |
+| semantic | 21 |
+| very_lexical | 10 |
+| very_semantic | 24 |
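The sample counts above are uneven, which may matter when reading the single accuracy figure. As a quick check, this sketch computes each label's share of the training set and the ratio between the largest and smallest class. The counts are copied from the table; nothing here says anything about the test split, whose distribution the commit does not report.

```python
# Training sample counts copied from the table above.
counts = {"lexical": 32, "semantic": 21, "very_lexical": 10, "very_semantic": 24}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
# Ratio between the most and least frequent class.
imbalance = max(counts.values()) / min(counts.values())

if __name__ == "__main__":
    for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{label:<14} {counts[label]:>3}  {share:.1%}")
    print(f"total={total}, max/min ratio={imbalance:.1f}")
```

The `sampling_strategy: oversampling` setting listed under the hyperparameters applies to contrastive pair generation during training; it does not change these raw counts.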
|
139 |
|
140 |
### Training Hyperparameters
|
141 |
+
- batch_size: (8, 8)
|
142 |
+
- num_epochs: (3, 3)
|
143 |
- max_steps: -1
|
144 |
- sampling_strategy: oversampling
|
145 |
- body_learning_rate: (2e-05, 1e-05)
|
|
|
155 |
- load_best_model_at_end: True
|
156 |
|
157 |
### Training Results
|
+| Epoch | Step | Training Loss | Validation Loss |
+|:-------:|:--------:|:-------------:|:---------------:|
+| 0.0015 | 1 | 0.268 | - |
+| 0.0736 | 50 | 0.2649 | - |
+| 0.1473 | 100 | 0.3352 | - |
+| 0.2209 | 150 | 0.2516 | - |
+| 0.2946 | 200 | 0.2438 | - |
+| 0.3682 | 250 | 0.1808 | - |
+| 0.4418 | 300 | 0.2365 | - |
+| 0.5155 | 350 | 0.1337 | - |
+| 0.5891 | 400 | 0.2263 | - |
+| 0.6627 | 450 | 0.1936 | - |
+| 0.7364 | 500 | 0.0612 | - |
+| 0.8100 | 550 | 0.1664 | - |
+| 0.8837 | 600 | 0.0987 | - |
+| 0.9573 | 650 | 0.0736 | - |
+| 1.0 | 679 | - | 0.2288 |
+| 1.0309 | 700 | 0.0568 | - |
+| 1.1046 | 750 | 0.0765 | - |
+| 1.1782 | 800 | 0.1193 | - |
+| 1.2518 | 850 | 0.199 | - |
+| 1.3255 | 900 | 0.2734 | - |
+| 1.3991 | 950 | 0.194 | - |
+| 1.4728 | 1000 | 0.1085 | - |
+| 1.5464 | 1050 | 0.1496 | - |
+| 1.6200 | 1100 | 0.1673 | - |
+| 1.6937 | 1150 | 0.2225 | - |
+| 1.7673 | 1200 | 0.0503 | - |
+| 1.8409 | 1250 | 0.1531 | - |
+| 1.9146 | 1300 | 0.2287 | - |
+| 1.9882 | 1350 | 0.1187 | - |
+| **2.0** | **1358** | **-** | **0.2055** |
+| 2.0619 | 1400 | 0.0546 | - |
+| 2.1355 | 1450 | 0.2072 | - |
+| 2.2091 | 1500 | 0.1208 | - |
+| 2.2828 | 1550 | 0.0837 | - |
+| 2.3564 | 1600 | 0.0405 | - |
+| 2.4300 | 1650 | 0.1334 | - |
+| 2.5037 | 1700 | 0.1458 | - |
+| 2.5773 | 1750 | 0.2189 | - |
+| 2.6510 | 1800 | 0.0561 | - |
+| 2.7246 | 1850 | 0.1656 | - |
+| 2.7982 | 1900 | 0.1351 | - |
+| 2.8719 | 1950 | 0.1826 | - |
+| 2.9455 | 2000 | 0.1905 | - |
+| 3.0 | 2037 | - | 0.2273 |
|
204 |
|
205 |
* The bold row denotes the saved checkpoint.
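The bold row is the epoch with the lowest validation loss, which is consistent with `load_best_model_at_end: True` and with the `checkpoints/step_1358` path recorded in `config.json`. The sketch below reproduces that selection from the three evaluation rows in the table. It assumes the criterion is minimum validation loss, which the commit does not state explicitly.

```python
# (epoch, step, validation_loss) for the rows that report a validation loss,
# copied from the Training Results table.
evals = [
    (1.0, 679, 0.2288),
    (2.0, 1358, 0.2055),
    (3.0, 2037, 0.2273),
]

# Assumption: the best checkpoint is the one with the lowest validation loss.
best_epoch, best_step, best_loss = min(evals, key=lambda row: row[2])
checkpoint = f"checkpoints/step_{best_step}"

if __name__ == "__main__":
    print(f"best epoch={best_epoch}, path={checkpoint}, val_loss={best_loss}")
```

The validation loss rises again at epoch 3 (0.2273), so training past the second epoch did not improve the held-out loss in this run.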
|
206 |
### Framework Versions
|
config.json
CHANGED
@@ -1,5 +1,5 @@
|
|
1 |
{
|
2 |
-
"_name_or_path": "checkpoints/
|
3 |
"architectures": [
|
4 |
"BertModel"
|
5 |
],
|
|
|
1 |
{
|
2 |
+
"_name_or_path": "checkpoints/step_1358",
|
3 |
"architectures": [
|
4 |
"BertModel"
|
5 |
],
|
config_setfit.json
CHANGED
@@ -2,6 +2,8 @@
|
|
2 |
"normalize_embeddings": false,
|
3 |
"labels": [
|
4 |
"lexical",
|
5 |
-
"semantic"
|
6 |
]
|
7 |
}
|
|
|
2 |
"normalize_embeddings": false,
|
3 |
"labels": [
|
4 |
"lexical",
|
5 |
+
"semantic",
|
6 |
+
"very_lexical",
|
7 |
+
"very_semantic"
|
8 |
]
|
9 |
}
|
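The `labels` list in `config_setfit.json` grows from two entries to four in this commit, matching the "4 classes" line added to the README. A minimal sketch of turning that list into index-to-name lookups follows. It embeds only the two fields visible in the diff, and it assumes that list position corresponds to the class index, which the diff itself does not confirm.

```python
import json

# Only the fields visible in the diff are reproduced; the real file may
# contain others.
config_text = """
{
  "normalize_embeddings": false,
  "labels": ["lexical", "semantic", "very_lexical", "very_semantic"]
}
"""

config = json.loads(config_text)

# Assumption: list position is the integer class id.
id2label = dict(enumerate(config["labels"]))
label2id = {name: idx for idx, name in id2label.items()}

if __name__ == "__main__":
    print(id2label)
```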
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 90864192
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:7fac62744a83855a95a3e80c70bf8a4648a3c5a1cd0053760fa1ff330790c771
|
3 |
size 90864192
|
model_head.pkl
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a5a2800b0ffabd217138abf7b9e4a3321ce002b79f4c83251f28a4f0a7a58788
|
3 |
+
size 13367
|
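`model.safetensors` and `model_head.pkl` are tracked with Git LFS, so the diff shows pointer files (`version`, `oid`, `size`) rather than the binaries. As an illustration, the sketch below parses such a pointer and checks a byte string against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is the new `model_head.pkl` pointer from this commit.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, info: dict) -> bool:
    """Check that `data` has the size and SHA-256 digest the pointer records."""
    return (
        len(data) == info["size"]
        and hashlib.sha256(data).hexdigest() == info["digest"]
    )


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a5a2800b0ffabd217138abf7b9e4a3321ce002b79f4c83251f28a4f0a7a58788
size 13367
"""
info = parse_lfs_pointer(pointer)

if __name__ == "__main__":
    print(info["hash_algo"], info["size"])
```

A check like `matches_pointer` could be run on a downloaded file's bytes to confirm it is the version this commit refers to.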