SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for multi-label text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a OneVsRestClassifier instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
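
The first step can be pictured with a small sketch of how labeled examples are turned into sentence pairs for contrastive fine-tuning. This is an illustration under simplifying assumptions, not SetFit's actual sampler: it uses single-label examples and enumerates every pair instead of sampling, and the function name `make_pairs` and the toy sentences are hypothetical.

```python
from itertools import combinations


def make_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    target is 1.0 when both texts share a label (positive pair)
    and 0.0 otherwise (negative pair).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy, made-up examples for illustration only.
examples = [
    ("great movie", "pos"),
    ("loved it", "pos"),
    ("terrible plot", "neg"),
    ("boring and slow", "neg"),
]

pairs = make_pairs(examples)
for text_a, text_b, target in pairs:
    print(f"{target:.0f}  {text_a!r} / {text_b!r}")
```

Four examples already yield six pairs, which suggests why a small labeled set can still supply many training signals for the embedding model.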

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a OneVsRestClassifier instance
Maximum Sequence Length: 512 tokens

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Evaluation

Metrics

Label	Accuracy
all	0.7125
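
For a multi-label model, a single accuracy figure is commonly computed as exact-match (subset) accuracy, where a prediction counts as correct only if every label matches. The card does not state which definition was used, so the sketch below is an assumption about how such a number could be obtained; `subset_accuracy` and the toy label vectors are hypothetical.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label vector matches exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


# Toy indicator vectors: one row per sample, one column per label.
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 1, 0]]

score = subset_accuracy(y_true, y_pred)
print(score)
```

Under this definition the second toy sample counts as wrong even though one of its two labels was predicted correctly, so the metric is stricter than per-label accuracy.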

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("anismahmahi/G2-with-noPropaganda-multilabel-setfit-model")
# Run inference
preds = model("But the author is Bharath Ganesh.")
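
The call above returns a prediction for the input sentence. Assuming the multi-label head yields one 0/1 indicator per label (an assumption, since the card does not show the output), the indicators can be mapped back to label names as sketched below. The helper `decode_multilabel` and the placeholder label names are hypothetical; the card does not list the model's actual labels.

```python
def decode_multilabel(indicators, label_names):
    """Return the names of the labels whose indicator is set to 1."""
    if len(indicators) != len(label_names):
        raise ValueError("one indicator is needed per label name")
    return [name for name, flag in zip(label_names, indicators) if int(flag) == 1]


# Placeholder names: replace with the label order used at training time.
label_names = ["label_0", "label_1", "label_2"]

# Stand-in for one prediction; a tensor or array can be converted
# to a plain list with .tolist() before decoding.
example_pred = [0, 1, 1]

active = decode_multilabel(example_pred, label_names)
print(active)
```

An all-zero vector decodes to an empty list, meaning no label was assigned to the input.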

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	1	23.3972	129

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 10
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
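
With `loss: CosineSimilarityLoss`, each training pair is scored by the cosine similarity of its two embeddings, and the loss is the squared difference between that score and the pair's target (1 for a positive pair, 0 for a negative one). The sketch below works on plain Python lists to show the arithmetic for a single pair; the loss in Sentence Transformers operates on batched tensors, and the helper names and toy vectors here are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(u, v, target):
    """Squared error between the pair's cosine similarity and its target."""
    return (cosine_similarity(u, v) - target) ** 2


# A positive pair whose embeddings point the same way incurs no loss.
same_direction = cosine_similarity_loss([1.0, 0.0], [2.0, 0.0], target=1.0)

# A positive pair with orthogonal embeddings is penalized.
orthogonal = cosine_similarity_loss([1.0, 0.0], [0.0, 3.0], target=1.0)

print(same_direction, orthogonal)
```

As the SetFit documentation describes them, `distance_metric` and `margin` apply to triplet-style losses, so they likely had no effect in this run.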

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.3874	-
0.0135	50	0.3734	-
0.0270	100	0.2741	-
0.0405	150	0.2802	-
0.0539	200	0.2355	-
0.0674	250	0.2616	-
0.0809	300	0.262	-
0.0944	350	0.2302	-
0.1079	400	0.1962	-
0.1214	450	0.1438	-
0.1348	500	0.2001	-
0.1483	550	0.2126	-
0.1618	600	0.1244	-
0.1753	650	0.1968	-
0.1888	700	0.1473	-
0.2023	750	0.2407	-
0.2157	800	0.1607	-
0.2292	850	0.1376	-
0.2427	900	0.145	-
0.2562	950	0.1439	-
0.2697	1000	0.0418	-
0.2832	1050	0.0822	-
0.2967	1100	0.1042	-
0.3101	1150	0.0381	-
0.3236	1200	0.17	-
0.3371	1250	0.0253	-
0.3506	1300	0.1009	-
0.3641	1350	0.1355	-
0.3776	1400	0.0314	-
0.3910	1450	0.2185	-
0.4045	1500	0.0774	-
0.4180	1550	0.0512	-
0.4315	1600	0.0814	-
0.4450	1650	0.0169	-
0.4585	1700	0.0591	-
0.4720	1750	0.1232	-
0.4854	1800	0.0941	-
0.4989	1850	0.1024	-
0.5124	1900	0.0031	-
0.5259	1950	0.037	-
0.5394	2000	0.1418	-
0.5529	2050	0.0685	-
0.5663	2100	0.0326	-
0.5798	2150	0.0143	-
0.5933	2200	0.064	-
0.6068	2250	0.0612	-
0.6203	2300	0.0689	-
0.6338	2350	0.1402	-
0.6472	2400	0.288	-
0.6607	2450	0.0075	-
0.6742	2500	0.0785	-
0.6877	2550	0.0339	-
0.7012	2600	0.0668	-
0.7147	2650	0.0319	-
0.7282	2700	0.0622	-
0.7416	2750	0.1169	-
0.7551	2800	0.0249	-
0.7686	2850	0.0218	-
0.7821	2900	0.0621	-
0.7956	2950	0.0698	-
0.8091	3000	0.0562	-
0.8225	3050	0.0412	-
0.8360	3100	0.0048	-
0.8495	3150	0.0085	-
0.8630	3200	0.0122	-
0.8765	3250	0.0387	-
0.8900	3300	0.0053	-
0.9035	3350	0.0032	-
0.9169	3400	0.0156	-
0.9304	3450	0.0013	-
0.9439	3500	0.001	-
0.9574	3550	0.0009	-
0.9709	3600	0.0025	-
0.9844	3650	0.0006	-
0.9978	3700	0.0832	-
1.0	3708	-	0.2776
1.0113	3750	0.0735	-
1.0248	3800	0.0053	-
1.0383	3850	0.0614	-
1.0518	3900	0.0005	-
1.0653	3950	0.0046	-
1.0787	4000	0.0024	-
1.0922	4050	0.0004	-
1.1057	4100	0.0016	-
1.1192	4150	0.0789	-
1.1327	4200	0.0016	-
1.1462	4250	0.0018	-
1.1597	4300	0.0005	-
1.1731	4350	0.0051	-
1.1866	4400	0.0139	-
1.2001	4450	0.0021	-
1.2136	4500	0.0064	-
1.2271	4550	0.0025	-
1.2406	4600	0.0054	-
1.2540	4650	0.0022	-
1.2675	4700	0.0734	-
1.2810	4750	0.026	-
1.2945	4800	0.0004	-
1.3080	4850	0.0574	-
1.3215	4900	0.0043	-
1.3350	4950	0.0975	-
1.3484	5000	0.0125	-
1.3619	5050	0.0045	-
1.3754	5100	0.0011	-
1.3889	5150	0.0061	-
1.4024	5200	0.0004	-
1.4159	5250	0.0278	-
1.4293	5300	0.005	-
1.4428	5350	0.0302	-
1.4563	5400	0.0341	-
1.4698	5450	0.0007	-
1.4833	5500	0.0128	-
1.4968	5550	0.0459	-
1.5102	5600	0.0128	-
1.5237	5650	0.0003	-
1.5372	5700	0.004	-
1.5507	5750	0.0005	-
1.5642	5800	0.0005	-
1.5777	5850	0.001	-
1.5912	5900	0.0069	-
1.6046	5950	0.0124	-
1.6181	6000	0.0026	-
1.6316	6050	0.0143	-
1.6451	6100	0.0005	-
1.6586	6150	0.0362	-
1.6721	6200	0.0002	-
1.6855	6250	0.0608	-
1.6990	6300	0.0006	-
1.7125	6350	0.0003	-
1.7260	6400	0.0041	-
1.7395	6450	0.0045	-
1.7530	6500	0.0005	-
1.7665	6550	0.0014	-
1.7799	6600	0.0004	-
1.7934	6650	0.0211	-
1.8069	6700	0.0002	-
1.8204	6750	0.0048	-
1.8339	6800	0.0368	-
1.8474	6850	0.0107	-
1.8608	6900	0.0045	-
1.8743	6950	0.0062	-
1.8878	7000	0.0003	-
1.9013	7050	0.0001	-
1.9148	7100	0.0096	-
1.9283	7150	0.0008	-
1.9417	7200	0.0184	-
1.9552	7250	0.0006	-
1.9687	7300	0.0291	-
1.9822	7350	0.0335	-
1.9957	7400	0.0149	-
2.0	7416	-	0.2666

The bold row denotes the saved checkpoint.
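
The Epoch column is the step number divided by the number of steps in one epoch, which the table puts at 3708 (the step logged at epoch 1.0). The short sketch below reproduces a few rows from that relationship and, combining it with the embedding batch size of 16, gives an upper bound on the sentence pairs seen per epoch. The names are hypothetical, and the pair count is an inference from the table, not a figure stated in the card.

```python
STEPS_PER_EPOCH = 3708  # step logged at epoch 1.0 in the table
BATCH_SIZE = 16         # first value of batch_size


def epoch_at(step):
    """Epoch fraction for a given step, rounded as in the table."""
    return round(step / STEPS_PER_EPOCH, 4)


for step in (1, 50, 3700, 7416):
    print(step, epoch_at(step))

# The last batch of an epoch may be partial, so this is an upper bound.
max_pairs_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE
print(max_pairs_per_epoch)
```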

Framework Versions

Python: 3.10.12
SetFit: 1.0.1
Sentence Transformers: 2.2.2
Transformers: 4.35.2
PyTorch: 2.1.0+cu121
Datasets: 2.16.1
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
