SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
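The contrastive step above trains on sentence pairs rather than single sentences. The following is a minimal sketch of how such pairs could be built from labeled examples, assuming same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0. The helper `build_contrastive_pairs` is hypothetical and is not SetFit's own implementation; the texts are taken from the label table in this card.

```python
from itertools import combinations

# Toy labeled examples taken from the label table in this card.
examples = [
    ("How to identify mashru silk", "general_faq"),
    ("How can I verify the authenticity of Real Zari in a saree", "general_faq"),
    ("bakery boxes with custom designs", "product discoverability"),
    ("Do you offer express shipping for sneakers?", "product policy"),
]


def build_contrastive_pairs(labeled_texts):
    """Hypothetical helper: return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise
    (an assumed convention for a cosine-similarity style loss).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(labeled_texts, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


pairs = build_contrastive_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even a handful of labeled sentences yields many pairs, which is why this approach suits few-shot settings. The second step then fits the classification head on embeddings produced by the fine-tuned body.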

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 5

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
general_faq	'What makes Banarasi silk sarees unique compared to other types of sarees, and what are their main varieties?' 'How to identify mashru silk' 'How can I verify the authenticity of Real Zari in a saree'
product discoverability	'bakery boxes with custom designs' 'What are the different fabric options available for sarees?' 'show me some trending sneakers under 25k'
product faq	'Is the Wmns Dunk Low Harvest Moon available in size 7?' 'What type of color is the Pure Katan silk Kadhwa Bootidaar Banarasi Saree?' 'What type of color is the Pure Katan Silk Pastel Orange Kadhwa Satin Tanchoi Banarasi Saree?'
product policy	'What is the policy for returning a product that was part of a special sale celebration?' 'Can I return an item if it was damaged during delivery preparation?' 'Do you offer express shipping for sneakers?'
order tracking	'I ordered the Cupcake Cases 3 days ago with order no 34567 how long will it take to deliver?' 'Do you provide shipping insurance for high-value orders?' 'My order has been shipped 1 day ago but still not out for delivery. Can you tell how long will it take to deliver?'

Evaluation

Metrics

Label	Accuracy
all	0.9245

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Shankhdhar/classifier_woog_hkv")
# Run inference
preds = model("cookie boxes with inserts")
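For more than one query, it may be convenient to classify a whole list in a single call. The sketch below assumes the loaded model exposes a `predict` method that accepts a list of strings, as `SetFitModel` does; `classify_batch` and `main` are hypothetical helper names, and the queries are examples from this card. The exact type of each returned label (string or tensor) depends on how the model was saved, so it is passed through unchanged.

```python
from typing import Any, List, Tuple


def classify_batch(model: Any, texts: List[str]) -> List[Tuple[str, Any]]:
    """Hypothetical helper: pair each input text with its predicted label."""
    preds = model.predict(texts)  # one call for the whole batch
    return list(zip(texts, preds))


def main() -> None:
    # Imported here so the helper above can be used without SetFit installed.
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained("Shankhdhar/classifier_woog_hkv")
    queries = [
        "cookie boxes with inserts",
        "Do you offer express shipping for sneakers?",
    ]
    for text, label in classify_batch(model, queries):
        print(f"{label}\t{text}")
```

Calling `main()` downloads the model from the Hub and prints one label per query.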

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	4	11.9441	24
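The middle value, 11.9441, is consistent with an arithmetic mean: 1708 words over the 143 training samples listed below gives 1708 / 143 ≈ 11.9441. A minimal sketch of how such statistics could be computed follows, assuming words are split on whitespace. The helper `word_count_stats` is hypothetical, and the three texts are examples from this card rather than the full training set.

```python
from statistics import mean


def word_count_stats(texts):
    """Hypothetical helper: (min, mean, max) of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return min(counts), round(mean(counts), 4), max(counts)


sample_texts = [
    "How to identify mashru silk",
    "bakery boxes with custom designs",
    "Do you offer express shipping for sneakers?",
]
stats = word_count_stats(sample_texts)
print(stats)
```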

Label	Training Sample Count
general_faq	4
order tracking	28
product discoverability	40
product faq	40
product policy	31

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
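These settings, together with the training sample counts above, appear to account for the Epoch column in the training results. The sketch below is a worked calculation under stated assumptions: every within-class pair of samples is a positive pair, every cross-class pair is a negative pair, and with `sampling_strategy: oversampling` the smaller set is repeated to match the larger one. This is an interpretation, not taken from the SetFit source.

```python
from math import comb

# Training sample counts from this card.
sample_counts = {
    "general_faq": 4,
    "order tracking": 28,
    "product discoverability": 40,
    "product faq": 40,
    "product policy": 31,
}
batch_size = 16  # first element of batch_size: (16, 16)

total_samples = sum(sample_counts.values())

# Assumption: positives are all unordered pairs within a class,
# negatives are all remaining unordered pairs.
positive_pairs = sum(comb(n, 2) for n in sample_counts.values())
negative_pairs = comb(total_samples, 2) - positive_pairs

# Assumption: oversampling repeats the smaller set up to the larger set's size.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = -(-pairs_per_epoch // batch_size)  # ceiling division

print(total_samples, positive_pairs, negative_pairs, steps_per_epoch)
print(round(50 / steps_per_epoch, 4), round(1900 / steps_per_epoch, 4))
```

Under these assumptions one epoch is 968 steps, so step 50 falls at epoch 0.0517 and step 1900 at epoch 1.9628, matching the logged values.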

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0010	1	0.3031	-
0.0517	50	0.1396	-
0.1033	100	0.0959	-
0.1550	150	0.0036	-
0.2066	200	0.0009	-
0.2583	250	0.0008	-
0.3099	300	0.0011	-
0.3616	350	0.0005	-
0.4132	400	0.0004	-
0.4649	450	0.0003	-
0.5165	500	0.0003	-
0.5682	550	0.0003	-
0.6198	600	0.0003	-
0.6715	650	0.0001	-
0.7231	700	0.0002	-
0.7748	750	0.0001	-
0.8264	800	0.0002	-
0.8781	850	0.0002	-
0.9298	900	0.0001	-
0.0010	1	0.0002	-
0.0517	50	0.0002	-
0.1033	100	0.0007	-
0.1550	150	0.0001	-
0.2066	200	0.0002	-
0.2583	250	0.0002	-
0.3099	300	0.0001	-
0.3616	350	0.0502	-
0.4132	400	0.0001	-
0.4649	450	0.0001	-
0.5165	500	0.0001	-
0.5682	550	0.0001	-
0.6198	600	0.0	-
0.6715	650	0.0	-
0.7231	700	0.0001	-
0.7748	750	0.0	-
0.8264	800	0.0001	-
0.8781	850	0.0001	-
0.9298	900	0.0001	-
0.9814	950	0.0001	-
1.0331	1000	0.0001	-
1.0847	1050	0.0001	-
1.1364	1100	0.0	-
1.1880	1150	0.0	-
1.2397	1200	0.0	-
1.2913	1250	0.0	-
1.3430	1300	0.0001	-
1.3946	1350	0.0	-
1.4463	1400	0.0	-
1.4979	1450	0.0	-
1.5496	1500	0.0	-
1.6012	1550	0.0	-
1.6529	1600	0.0	-
1.7045	1650	0.0	-
1.7562	1700	0.0001	-
1.8079	1750	0.0	-
1.8595	1800	0.0	-
1.9112	1850	0.0	-
1.9628	1900	0.0	-
0.0010	1	0.0	-
0.0517	50	0.0	-
0.1033	100	0.0001	-
0.1550	150	0.0	-
0.2066	200	0.0001	-
0.2583	250	0.0001	-
0.3099	300	0.0	-
0.3616	350	0.0402	-
0.4132	400	0.0001	-
0.4649	450	0.0	-
0.5165	500	0.0	-
0.5682	550	0.0	-
0.6198	600	0.0	-
0.6715	650	0.0	-
0.7231	700	0.0	-
0.7748	750	0.0	-
0.8264	800	0.0	-
0.8781	850	0.0	-
0.9298	900	0.0	-
0.9814	950	0.0	-
1.0331	1000	0.0	-
1.0847	1050	0.0	-
1.1364	1100	0.0	-
1.1880	1150	0.0	-
1.2397	1200	0.0	-
1.2913	1250	0.0	-
1.3430	1300	0.0	-
1.3946	1350	0.0	-
1.4463	1400	0.0	-
1.4979	1450	0.0	-
1.5496	1500	0.0	-
1.6012	1550	0.0	-
1.6529	1600	0.0	-
1.7045	1650	0.0	-
1.7562	1700	0.0	-
1.8079	1750	0.0	-
1.8595	1800	0.0	-
1.9112	1850	0.0	-
1.9628	1900	0.0	-

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 3.0.1
Transformers: 4.39.0
PyTorch: 2.2.2+cu121
Datasets: 2.20.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
