SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
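
The first step relies on turning a handful of labeled sentences into many training pairs. Below is a minimal sketch of that idea in plain Python, assuming that pairs of same-label sentences get a target similarity of 1.0 and pairs of different-label sentences get 0.0. The helper name make_contrastive_pairs is hypothetical, and this is an illustration rather than SetFit's actual implementation.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    Hypothetical helper for illustration only: the target is 1.0 when both
    texts share a label and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# A few sentences taken from the label table of this model card.
examples = [
    ("Hello buddy", "greet-hi"),
    ("Salut", "greet-hi"),
    ("See you later", "greet-good_bye"),
    ("A plus tard", "greet-good_bye"),
]

pairs = make_contrastive_pairs(examples)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")
```

Four labeled sentences already produce six pairs (two positive, four negative), which illustrates how a small labeled set can still yield many contrastive training examples.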

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 128 tokens
Number of Classes: 6

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
matches-match_time	'Norwich City vs Newcastle United' 'will Manchester United play with chelsea' 'est-ce que Manchester United jouera avec chelsea'
matches-match_result	'Liverpool and West Ham result' 'what is the score of Wolverhampton match' 'who won in Liverpool vs Newcastle United match'
greet-who_are_you	'how can you help me' "pourquoi j'ai besoin de toi" 'je ne te comprends pas'
matches-team_next_match	'Real Madrid fixtures' 'quels sont les prochains matchs de Borussia Dortmund' 'próximos partidos de Atletico Madrid'
greet-good_bye	'See you later' 'A plus tard' 'stop'
greet-hi	'Hello buddy' 'Salut' 'Hey'

Evaluation

Metrics

Label	Accuracy
all	1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

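# Download from the 🤗 Hub ("setfit_model_id" is a placeholder for this model's Hub ID)
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference on a single sentence; the result is the predicted label
preds = model("au revoir")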

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	1	5.2	10

Label	Training Sample Count
greet-hi	5
greet-who_are_you	7
greet-good_bye	5
matches-team_next_match	21
matches-match_time	12
matches-match_result	15

Training Hyperparameters

batch_size: (4, 4)
num_epochs: (4, 4)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
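
These settings, combined with the per-label training sample counts, can be used to sanity-check the step counts in the training results. The sketch below assumes that the oversampling strategy draws equal numbers of positive (same-label) and negative (different-label) sentence pairs by repeating the smaller group, and that the embedding phase uses the first value of batch_size and num_epochs. The variable names are illustrative.

```python
from math import comb

# Training sample counts per label, from the training set metrics.
label_counts = {
    "greet-hi": 5,
    "greet-who_are_you": 7,
    "greet-good_bye": 5,
    "matches-team_next_match": 21,
    "matches-match_time": 12,
    "matches-match_result": 15,
}
batch_size = 4  # first value of batch_size: (4, 4)
num_epochs = 4  # first value of num_epochs: (4, 4)

total_samples = sum(label_counts.values())
# Unordered pairs of distinct sentences that share a label.
positive_pairs = sum(comb(n, 2) for n in label_counts.values())
# All remaining unordered pairs have different labels.
negative_pairs = comb(total_samples, 2) - positive_pairs

# Assumption: oversampling balances both groups to the size of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = pairs_per_epoch // batch_size
total_steps = steps_per_epoch * num_epochs

print(positive_pairs, negative_pairs)  # 422 1658
print(steps_per_epoch, total_steps)    # 829 3316
```

Under these assumptions the result is 829 steps per epoch and 3316 steps in total, which agrees with the steps logged at epochs 1.0 and 4.0 in the training results.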

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0012	1	0.1544	-
0.0121	10	0.0658	-
0.0241	20	0.1235	-
0.0362	30	0.2422	-
0.0483	40	0.2876	-
0.0603	50	0.1208	-
0.0724	60	0.1358	-
0.0844	70	0.1494	-
0.0965	80	0.1284	-
0.1086	90	0.1107	-
0.1206	100	0.2395	-
0.1327	110	0.0661	-
0.1448	120	0.1554	-
0.1568	130	0.0258	-
0.1689	140	0.0279	-
0.1809	150	0.1162	-
0.1930	160	0.0244	-
0.2051	170	0.0221	-
0.2171	180	0.0813	-
0.2292	190	0.0188	-
0.2413	200	0.03	-
0.2533	210	0.0019	-
0.2654	220	0.0076	-
0.2774	230	0.01	-
0.2895	240	0.0025	-
0.3016	250	0.0705	-
0.3136	260	0.0044	-
0.3257	270	0.0038	-
0.3378	280	0.006	-
0.3498	290	0.0018	-
0.3619	300	0.0003	-
0.3739	310	0.0007	-
0.3860	320	0.0128	-
0.3981	330	0.0022	-
0.4101	340	0.0008	-
0.4222	350	0.004	-
0.4343	360	0.0006	-
0.4463	370	0.0007	-
0.4584	380	0.0005	-
0.4704	390	0.0057	-
0.4825	400	0.0007	-
0.4946	410	0.0022	-
0.5066	420	0.0012	-
0.5187	430	0.0009	-
0.5308	440	0.0004	-
0.5428	450	0.0032	-
0.5549	460	0.0007	-
0.5669	470	0.0008	-
0.5790	480	0.0005	-
0.5911	490	0.0005	-
0.6031	500	0.0008	-
0.6152	510	0.0008	-
0.6273	520	0.0004	-
0.6393	530	0.0015	-
0.6514	540	0.0002	-
0.6634	550	0.0006	-
0.6755	560	0.0015	-
0.6876	570	0.0024	-
0.6996	580	0.0004	-
0.7117	590	0.0005	-
0.7238	600	0.0011	-
0.7358	610	0.0008	-
0.7479	620	0.0002	-
0.7600	630	0.0006	-
0.7720	640	0.0003	-
0.7841	650	0.0002	-
0.7961	660	0.0007	-
0.8082	670	0.0009	-
0.8203	680	0.0002	-
0.8323	690	0.0006	-
0.8444	700	0.0015	-
0.8565	710	0.0003	-
0.8685	720	0.0003	-
0.8806	730	0.0003	-
0.8926	740	0.0015	-
0.9047	750	0.0003	-
0.9168	760	0.0005	-
0.9288	770	0.0002	-
0.9409	780	0.0003	-
0.9530	790	0.0002	-
0.9650	800	0.0004	-
0.9771	810	0.0003	-
0.9891	820	0.001	-
1.0	829	-	0.0216
1.0012	830	0.0003	-
1.0133	840	0.0007	-
1.0253	850	0.0004	-
1.0374	860	0.0001	-
1.0495	870	0.0008	-
1.0615	880	0.0003	-
1.0736	890	0.0006	-
1.0856	900	0.0001	-
1.0977	910	0.0018	-
1.1098	920	0.0	-
1.1218	930	0.0001	-
1.1339	940	0.0007	-
1.1460	950	0.0009	-
1.1580	960	0.0004	-
1.1701	970	0.0003	-
1.1821	980	0.0015	-
1.1942	990	0.0002	-
1.2063	1000	0.0005	-
1.2183	1010	0.0002	-
1.2304	1020	0.0003	-
1.2425	1030	0.0001	-
1.2545	1040	0.0002	-
1.2666	1050	0.0004	-
1.2786	1060	0.0001	-
1.2907	1070	0.0002	-
1.3028	1080	0.0001	-
1.3148	1090	0.0002	-
1.3269	1100	0.0001	-
1.3390	1110	0.0002	-
1.3510	1120	0.0003	-
1.3631	1130	0.0001	-
1.3752	1140	0.0001	-
1.3872	1150	0.0001	-
1.3993	1160	0.0002	-
1.4113	1170	0.0001	-
1.4234	1180	0.0005	-
1.4355	1190	0.0002	-
1.4475	1200	0.0002	-
1.4596	1210	0.0002	-
1.4717	1220	0.0001	-
1.4837	1230	0.0001	-
1.4958	1240	0.0001	-
1.5078	1250	0.0001	-
1.5199	1260	0.001	-
1.5320	1270	0.0001	-
1.5440	1280	0.0003	-
1.5561	1290	0.0001	-
1.5682	1300	0.0002	-
1.5802	1310	0.0005	-
1.5923	1320	0.0002	-
1.6043	1330	0.0001	-
1.6164	1340	0.0004	-
1.6285	1350	0.0002	-
1.6405	1360	0.0001	-
1.6526	1370	0.0004	-
1.6647	1380	0.0003	-
1.6767	1390	0.0002	-
1.6888	1400	0.0001	-
1.7008	1410	0.0008	-
1.7129	1420	0.0003	-
1.7250	1430	0.0005	-
1.7370	1440	0.0001	-
1.7491	1450	0.0001	-
1.7612	1460	0.0001	-
1.7732	1470	0.0007	-
1.7853	1480	0.0001	-
1.7973	1490	0.0002	-
1.8094	1500	0.0001	-
1.8215	1510	0.001	-
1.8335	1520	0.0002	-
1.8456	1530	0.0003	-
1.8577	1540	0.0004	-
1.8697	1550	0.0005	-
1.8818	1560	0.0001	-
1.8938	1570	0.0006	-
1.9059	1580	0.0005	-
1.9180	1590	0.0002	-
1.9300	1600	0.0002	-
1.9421	1610	0.0001	-
1.9542	1620	0.0003	-
1.9662	1630	0.0005	-
1.9783	1640	0.0007	-
1.9903	1650	0.0001	-
2.0	1658	-	0.0186
2.0024	1660	0.0	-
2.0145	1670	0.0001	-
2.0265	1680	0.0002	-
2.0386	1690	0.0001	-
2.0507	1700	0.0002	-
2.0627	1710	0.0001	-
2.0748	1720	0.0001	-
2.0869	1730	0.0002	-
2.0989	1740	0.0001	-
2.1110	1750	0.0002	-
2.1230	1760	0.0001	-
2.1351	1770	0.0003	-
2.1472	1780	0.0006	-
2.1592	1790	0.0001	-
2.1713	1800	0.0002	-
2.1834	1810	0.0002	-
2.1954	1820	0.0001	-
2.2075	1830	0.0	-
2.2195	1840	0.0001	-
2.2316	1850	0.0002	-
2.2437	1860	0.0004	-
2.2557	1870	0.0003	-
2.2678	1880	0.0002	-
2.2799	1890	0.0002	-
2.2919	1900	0.0004	-
2.3040	1910	0.0002	-
2.3160	1920	0.0001	-
2.3281	1930	0.0	-
2.3402	1940	0.0002	-
2.3522	1950	0.0001	-
2.3643	1960	0.0	-
2.3764	1970	0.0003	-
2.3884	1980	0.0002	-
2.4005	1990	0.0001	-
2.4125	2000	0.0003	-
2.4246	2010	0.0003	-
2.4367	2020	0.0002	-
2.4487	2030	0.0002	-
2.4608	2040	0.0002	-
2.4729	2050	0.0001	-
2.4849	2060	0.0001	-
2.4970	2070	0.0002	-
2.5090	2080	0.0	-
2.5211	2090	0.0002	-
2.5332	2100	0.0004	-
2.5452	2110	0.0005	-
2.5573	2120	0.0003	-
2.5694	2130	0.0001	-
2.5814	2140	0.0002	-
2.5935	2150	0.0008	-
2.6055	2160	0.0002	-
2.6176	2170	0.0003	-
2.6297	2180	0.0001	-
2.6417	2190	0.0002	-
2.6538	2200	0.0001	-
2.6659	2210	0.0001	-
2.6779	2220	0.0	-
2.6900	2230	0.0002	-
2.7021	2240	0.0	-
2.7141	2250	0.0001	-
2.7262	2260	0.0001	-
2.7382	2270	0.0003	-
2.7503	2280	0.0001	-
2.7624	2290	0.0003	-
2.7744	2300	0.0001	-
2.7865	2310	0.0002	-
2.7986	2320	0.0001	-
2.8106	2330	0.0001	-
2.8227	2340	0.0001	-
2.8347	2350	0.0001	-
2.8468	2360	0.0002	-
2.8589	2370	0.0001	-
2.8709	2380	0.0001	-
2.8830	2390	0.0	-
2.8951	2400	0.0	-
2.9071	2410	0.0	-
2.9192	2420	0.0001	-
2.9312	2430	0.0002	-
2.9433	2440	0.0001	-
2.9554	2450	0.0001	-
2.9674	2460	0.0001	-
2.9795	2470	0.0003	-
2.9916	2480	0.0001	-
3.0	2487	-	0.0176
3.0036	2490	0.0001	-
3.0157	2500	0.0	-
3.0277	2510	0.0002	-
3.0398	2520	0.0	-
3.0519	2530	0.0002	-
3.0639	2540	0.0002	-
3.0760	2550	0.0	-
3.0881	2560	0.0001	-
3.1001	2570	0.0001	-
3.1122	2580	0.0003	-
3.1242	2590	0.0003	-
3.1363	2600	0.0001	-
3.1484	2610	0.0	-
3.1604	2620	0.0002	-
3.1725	2630	0.0001	-
3.1846	2640	0.0001	-
3.1966	2650	0.0001	-
3.2087	2660	0.0003	-
3.2207	2670	0.0001	-
3.2328	2680	0.0001	-
3.2449	2690	0.0001	-
3.2569	2700	0.0001	-
3.2690	2710	0.0002	-
3.2811	2720	0.0001	-
3.2931	2730	0.0005	-
3.3052	2740	0.0	-
3.3172	2750	0.0001	-
3.3293	2760	0.0002	-
3.3414	2770	0.0003	-
3.3534	2780	0.0001	-
3.3655	2790	0.0001	-
3.3776	2800	0.0001	-
3.3896	2810	0.0004	-
3.4017	2820	0.0001	-
3.4138	2830	0.0002	-
3.4258	2840	0.0001	-
3.4379	2850	0.0003	-
3.4499	2860	0.0001	-
3.4620	2870	0.0002	-
3.4741	2880	0.0001	-
3.4861	2890	0.0003	-
3.4982	2900	0.0003	-
3.5103	2910	0.0001	-
3.5223	2920	0.0	-
3.5344	2930	0.0	-
3.5464	2940	0.0001	-
3.5585	2950	0.0002	-
3.5706	2960	0.0002	-
3.5826	2970	0.0001	-
3.5947	2980	0.0	-
3.6068	2990	0.0001	-
3.6188	3000	0.0003	-
3.6309	3010	0.0001	-
3.6429	3020	0.0	-
3.6550	3030	0.0002	-
3.6671	3040	0.0003	-
3.6791	3050	0.0005	-
3.6912	3060	0.0001	-
3.7033	3070	0.0	-
3.7153	3080	0.0001	-
3.7274	3090	0.0002	-
3.7394	3100	0.0001	-
3.7515	3110	0.0001	-
3.7636	3120	0.0002	-
3.7756	3130	0.0001	-
3.7877	3140	0.0	-
3.7998	3150	0.0001	-
3.8118	3160	0.0001	-
3.8239	3170	0.0001	-
3.8359	3180	0.0001	-
3.8480	3190	0.0005	-
3.8601	3200	0.0	-
3.8721	3210	0.0001	-
3.8842	3220	0.0001	-
3.8963	3230	0.0001	-
3.9083	3240	0.0001	-
3.9204	3250	0.0001	-
3.9324	3260	0.0	-
3.9445	3270	0.0001	-
3.9566	3280	0.0001	-
3.9686	3290	0.0002	-
3.9807	3300	0.0002	-
3.9928	3310	0.0001	-
4.0	3316	-	0.0187

The bold row denotes the saved checkpoint; the lowest validation loss in the table is 0.0176 at step 2487 (epoch 3.0).
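
Because load_best_model_at_end is True, the saved checkpoint can be cross-checked against the validation losses logged at the end of each epoch. Here is a small sketch, assuming that "best" means the lowest validation loss (the selection metric is not stated in this card):

```python
# Validation loss logged at the end of each epoch, keyed by training step
# (values copied from the training results table).
validation_loss = {829: 0.0216, 1658: 0.0186, 2487: 0.0176, 3316: 0.0187}

# Assumption: the best checkpoint is the one with the lowest validation loss.
best_step = min(validation_loss, key=validation_loss.get)
print(best_step, validation_loss[best_step])  # 2487 0.0176
```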

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 3.1.0
Transformers: 4.39.0
PyTorch: 2.4.0+cu121
Datasets: 3.0.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
