metadata

base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: setfit
metrics:
  - accuracy
pipeline_tag: text-classification
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
widget:
  - text: >-
      What are the key situations that require the preparation of a mission
      order?
  - text: >-
      How can audio data be used to improve speaker identification using neural
      networks?
  - text: >-
      How can organizations balance the need for data privacy with the benefits
      of involving interns in data-related projects?
  - text: What is the purpose of the message posted by the CR?
  - text: >-
      What are the consequences of adopting a 'if not broken, don't fix'
      attitude towards data monitoring?
inference: true
model-index:
  - name: SetFit with sentence-transformers/all-MiniLM-L6-v2
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.3076923076923077
            name: Accuracy

SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
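
The first step can be illustrated with a minimal sketch of how contrastive training pairs might be built from a few labeled queries. The `make_pairs` helper is hypothetical and is not part of the SetFit API; the texts and labels are taken from the Model Labels table of this card. Pairs that share a label get a target of 1.0, all others 0.0.

```python
from itertools import combinations

def make_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    Target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; not part of the SetFit API.
    """
    examples = list(zip(texts, labels))
    return [
        (a, b, 1.0 if la == lb else 0.0)
        for (a, la), (b, lb) in combinations(examples, 2)
    ]

texts = [
    "What is being asked for or sought in this conversation?",
    "What are the key aspects of team life and events in a company?",
    "What is the focus of Eurobiomed's partnership with Digital113?",
]
labels = ["very_semantic", "very_semantic", "very_lexical"]

pairs = make_pairs(texts, labels)
for a, b, target in pairs:
    print(f"{target:.1f}  {a[:30]!r} / {b[:30]!r}")
```

Three labeled texts already yield three pairs, which is why this approach can extract a useful training signal from very small datasets.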

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/all-MiniLM-L6-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 256 tokens
Number of Classes: 4

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
very_semantic	'What are the key considerations when proposing names for a project or initiative?' 'What are the key aspects of team life and events in a company?' 'What is being asked for or sought in this conversation?'
lexical	'Who is responsible for reviewing and signing documents related to conference submissions?' 'How do data architecture and management systems enable digital transformation and address its associated challenges?' 'How do keys or access credentials get shared or transferred among team members in a workplace?'
very_lexical	'What are some of the key challenges associated with handling and storing large amounts of genomic data?' "What is the focus of Eurobiomed's partnership with Digital113?" 'What are the key considerations for generating well-formatted JSON instances that conform to a given schema?'
semantic	'How can visualizations be used to enhance documentation and collaboration in software development?' 'What are the key considerations when choosing a distance metric for a vector database?' 'How can AI be leveraged to support HR departments in detecting and addressing gender bias?'

Evaluation

Metrics

Label	Accuracy
all	0.3077
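
To put this score in context, the arithmetic below compares it with the uniform-guess baseline for four classes. It also checks that the full-precision value in the metadata, 0.3076923076923077, matches 4/13. The test set size is not stated in this card, so reading the score as 4 correct out of 13 is an assumption.

```python
reported = 0.3076923076923077   # accuracy from the card metadata
num_classes = 4                 # from the Model Description

chance = 1 / num_classes        # expected accuracy of uniform random guessing
print(f"chance baseline: {chance:.4f}")
print(f"reported:        {reported:.4f}")
print(f"margin:          {reported - chance:+.4f}")

# Assumption: 13 test examples with 4 correct would reproduce the value.
matches_4_of_13 = abs(4 / 13 - reported) < 1e-12
print(f"consistent with 4/13: {matches_4_of_13}")
```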

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("yaniseuranova/setfit-rag-hybrid-search-query-router-test")
# Run inference on a single query and show the predicted label
preds = model("What is the purpose of the message posted by the CR?")
print(preds)
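
The repository name suggests that the predicted label is meant to route a query between lexical and semantic retrieval in a hybrid search setup. The card does not describe that step, so the sketch below is purely illustrative: the `ALPHA_BY_LABEL` weights and the `route` helper are assumptions, not values or functions shipped with the model.

```python
# Hypothetical mapping from predicted label to the semantic weight (alpha)
# of a hybrid search: 0.0 = purely lexical, 1.0 = purely semantic.
# These numbers are illustrative assumptions, not part of the model.
ALPHA_BY_LABEL = {
    "very_lexical": 0.0,
    "lexical": 0.25,
    "semantic": 0.75,
    "very_semantic": 1.0,
}

def route(label: str, default: float = 0.5) -> float:
    """Return the semantic weight for a predicted label.

    Unknown labels fall back to an even lexical/semantic blend.
    """
    return ALPHA_BY_LABEL.get(label, default)

# In practice `label` would come from `model(query)`; literals are used here
# so the sketch runs without downloading the model.
print(route("very_lexical"))
print(route("semantic"))
```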

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	7	14.1913	24

Label	Training Sample Count
lexical	41
semantic	24
very_lexical	17
very_semantic	33
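
The counts above can be summarized with a few lines of arithmetic. The sketch prints the class shares and a majority-class baseline; the baseline assumes the test set follows the same label distribution, which the card does not state. It also shows that the 14.1913 figure in the word-count row, multiplied by the number of samples, is almost exactly an integer, which is consistent with it being an arithmetic mean of integer word counts.

```python
counts = {"lexical": 41, "semantic": 24, "very_lexical": 17, "very_semantic": 33}

total = sum(counts.values())
print(f"total training samples: {total}")

# Class shares, largest first
for label, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{label:14s} {n:3d}  {n / total:.1%}")

# Accuracy of always predicting the most frequent training label, assuming
# (not stated in the card) the test set has the same label distribution.
majority_label = max(counts, key=counts.get)
majority_share = counts[majority_label] / total
print(f"majority baseline: {majority_label} -> {majority_share:.4f}")

# Implied total word count if 14.1913 is a per-sample average
implied_words = 14.1913 * total
print(f"implied total words: {implied_words:.2f}")
```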

Training Hyperparameters

batch_size: (4, 4)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
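
The `loss: CosineSimilarityLoss` setting refers to the Sentence Transformers loss that, for each pair, compares the cosine similarity of the two embeddings with the pair's target and penalizes the squared difference. The following is a minimal pure-Python sketch of that computation for single pairs, not the library implementation; the toy 3-dimensional vectors are made up for illustration (all-MiniLM-L6-v2 produces 384-dimensional embeddings).

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pair_loss(u, v, target):
    """Squared error between the pair's cosine similarity and its target."""
    return (cosine_similarity(u, v) - target) ** 2

# Same direction, target 1.0: similarity is 1, so no penalty
same_direction = pair_loss([1.0, 0.0, 0.0], [2.0, 0.0, 0.0], target=1.0)
# Orthogonal vectors labeled as similar: maximal penalty for this target
orthogonal_pos = pair_loss([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], target=1.0)
# Orthogonal vectors labeled as dissimilar: no penalty
orthogonal_neg = pair_loss([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], target=0.0)
print(same_direction, orthogonal_pos, orthogonal_neg)
```

As far as the SetFit documentation describes them, `distance_metric` and `margin` apply to triplet-style losses, so they likely have no effect with this loss.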

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0004	1	0.4883	-
0.0209	50	0.3738	-
0.0417	100	0.2192	-
0.0626	150	0.1503	-
0.0834	200	0.1514	-
0.1043	250	0.1829	-
0.1251	300	0.4191	-
0.1460	350	0.2136	-
0.1668	400	0.1847	-
0.1877	450	0.1681	-
0.2085	500	0.222	-
0.2294	550	0.0397	-
0.2502	600	0.2626	-
0.2711	650	0.1343	-
0.2919	700	0.1769	-
0.3128	750	0.1704	-
0.3336	800	0.401	-
0.3545	850	0.1405	-
0.3753	900	0.1892	-
0.3962	950	0.1444	-
0.4170	1000	0.2337	-
0.4379	1050	0.1848	-
0.4587	1100	0.0601	-
0.4796	1150	0.2467	-
0.5004	1200	0.1829	-
0.5213	1250	0.1695	-
0.5421	1300	0.3892	-
0.5630	1350	0.1408	-
0.5838	1400	0.0506	-
0.6047	1450	0.1835	-
0.6255	1500	0.3284	-
0.6464	1550	0.1797	-
0.6672	1600	0.1118	-
0.6881	1650	0.1502	-
0.7089	1700	0.112	-
0.7298	1750	0.0401	-
0.7506	1800	0.117	-
0.7715	1850	0.1287	-
0.7923	1900	0.0623	-
0.8132	1950	0.2128	-
0.8340	2000	0.1542	-
0.8549	2050	0.1774	-
0.8757	2100	0.3252	-
0.8966	2150	0.0152	-
0.9174	2200	0.0539	-
0.9383	2250	0.0047	-
0.9591	2300	0.1232	-
0.9800	2350	0.3466	-
1.0	2398	-	0.3644
1.0008	2400	0.0296	-
1.0217	2450	0.3459	-
1.0425	2500	0.0867	-
1.0634	2550	0.1343	-
1.0842	2600	0.2074	-
1.1051	2650	0.0052	-
1.1259	2700	0.0548	-
1.1468	2750	0.0441	-
1.1676	2800	0.0821	-
1.1885	2850	0.0546	-
1.2093	2900	0.1286	-
1.2302	2950	0.1222	-
1.2510	3000	0.0227	-
1.2719	3050	0.3011	-
1.2927	3100	0.018	-
1.3136	3150	0.0581	-
1.3344	3200	0.0485	-
1.3553	3250	0.2369	-
1.3761	3300	0.1681	-
1.3970	3350	0.1289	-
1.4178	3400	0.1664	-
1.4387	3450	0.1467	-
1.4595	3500	0.1399	-
1.4804	3550	0.3045	-
1.5013	3600	0.2155	-
1.5221	3650	0.061	-
1.5430	3700	0.0787	-
1.5638	3750	0.3649	-
1.5847	3800	0.1202	-
1.6055	3850	0.1004	-
1.6264	3900	0.154	-
1.6472	3950	0.0944	-
1.6681	4000	0.0004	-
1.6889	4050	0.1843	-
1.7098	4100	0.2233	-
1.7306	4150	0.2203	-
1.7515	4200	0.0986	-
1.7723	4250	0.2295	-
1.7932	4300	0.1763	-
1.8140	4350	0.3487	-
1.8349	4400	0.3285	-
1.8557	4450	0.0152	-
1.8766	4500	0.1108	-
1.8974	4550	0.2416	-
1.9183	4600	0.0476	-
1.9391	4650	0.2929	-
1.9600	4700	0.1006	-
1.9808	4750	0.0925	-
2.0	4796	-	0.3669

The bold row denotes the saved checkpoint.
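
The step counts in the table can be cross-checked against the training set. The sketch below assumes that contrastive pairs are formed from all distinct sample pairs and that `sampling_strategy: oversampling` repeats the rarer pair type (here, same-label pairs) until it matches the more common one. This is a reading of SetFit's behavior rather than something the card states, but under that assumption the result agrees with the 2398 steps logged per epoch.

```python
from math import ceil, comb

counts = {"lexical": 41, "semantic": 24, "very_lexical": 17, "very_semantic": 33}
batch_size = 4  # embedding-phase batch size from the hyperparameters

total = sum(counts.values())
all_pairs = comb(total, 2)                                  # distinct sample pairs
positive_pairs = sum(comb(n, 2) for n in counts.values())   # same label
negative_pairs = all_pairs - positive_pairs                 # different labels

# Assumed oversampling: the smaller group is repeated up to the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(positive_pairs, negative_pairs)        # 1760 4795
print(steps_per_epoch, 2 * steps_per_epoch)  # 2398 4796
```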

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.6.1
Transformers: 4.39.0
PyTorch: 2.3.1+cu121
Datasets: 2.18.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}