SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
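
The contrastive step works on pairs of training texts rather than on single texts. The following is a minimal sketch of how such pairs could be built from labeled examples, assuming same-label pairs get a target of 1.0 and different-label pairs get 0.0. The function name make_contrastive_pairs is hypothetical and is not part of the SetFit API; the example texts are taken from the label examples in this card.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    Pairs that share a label get target 1.0, all other pairs get 0.0.
    This is an illustrative helper, not SetFit's own sampler.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


examples = [
    ("Hello buddy", "greet-hi"),
    ("Salut", "greet-hi"),
    ("See you later", "greet-good_bye"),
    ("Real Madrid fixtures", "matches-team_next_match"),
]

pairs = make_contrastive_pairs(examples)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")
```

Four examples yield six unordered pairs, of which only ('Hello buddy', 'Salut') is a positive pair. The number of pairs grows quadratically with the number of examples, which is why a small training set can still supply many training steps.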

Model Details

Model Description

Model Sources

Model Labels

Label Examples
matches-match_time
  • 'Norwich City vs Newcastle United'
  • 'will Manchester United play with chelsea'
  • 'est-ce que Manchester United jouera avec chelsea'
matches-match_result
  • 'Liverpool and West Ham result'
  • 'what is the score of Wolverhampton match'
  • 'who won in Liverpool vs Newcastle United match'
greet-who_are_you
  • 'how can you help me'
  • "pourquoi j'ai besoin de toi"
  • 'je ne te comprends pas'
matches-team_next_match
  • 'Real Madrid fixtures'
  • 'quels sont les prochains matchs de Borussia Dortmund'
  • 'próximos partidos de Atletico Madrid'
greet-good_bye
  • 'See you later'
  • 'A plus tard'
  • 'stop'
greet-hi
  • 'Hello buddy'
  • 'Salut'
  • 'Hey'

Evaluation

Metrics

Label Accuracy
all 1.0
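
For reference, accuracy is the fraction of evaluation texts whose predicted label equals the expected label, so a value of 1.0 means every evaluation text was classified correctly. The sketch below shows the computation. The label lists are made up for illustration and are not the evaluation data behind the figure above, whose size this card does not state.

```python
def accuracy(predicted, expected):
    """Return the fraction of positions where predicted equals expected."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical labels, for illustration only.
expected = ["greet-hi", "greet-good_bye", "matches-match_result", "matches-match_time"]
all_correct = list(expected)
one_wrong = ["greet-hi", "greet-hi", "matches-match_result", "matches-match_time"]

print(accuracy(all_correct, expected))
print(accuracy(one_wrong, expected))
```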

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

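from setfit import SetFitModel

# Download from the 🤗 Hub ("setfit_model_id" is a placeholder for the model's Hub repository ID)
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference on a single text
preds = model("au revoir")
print(preds)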
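
The label names in this card appear to follow a group-intent pattern such as greet-hi or matches-team_next_match. A caller that wants to route on the group first could split a predicted label at the first hyphen, as in the sketch below. The split_label helper and the separator convention are assumptions inferred from the label names, and predicted_label is a stand-in value, not an actual model output.

```python
def split_label(label):
    """Split '<group>-<intent>' at the first hyphen (assumed naming convention)."""
    group, sep, intent = label.partition("-")
    if not sep:
        raise ValueError(f"label has no '-' separator: {label!r}")
    return group, intent


# Stand-in for a prediction; in practice this would come from the model call.
predicted_label = "greet-good_bye"
group, intent = split_label(predicted_label)
print(group, intent)
```

Splitting only at the first hyphen keeps intents that contain underscores, such as team_next_match, in one piece.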

Training Details

Training Set Metrics

Training set Min Median Max
Word count 1 5.2 10

Label Training Sample Count
greet-hi 5
greet-who_are_you 7
greet-good_bye 5
matches-team_next_match 21
matches-match_time 12
matches-match_result 15

Training Hyperparameters

  • batch_size: (4, 4)
  • num_epochs: (4, 4)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
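
The per-label sample counts, the batch size of 4, and the oversampling strategy can be related to the step counts in the Training Results table. The sketch below is a rough check, assuming that training pairs are unordered combinations of distinct training texts and that oversampling repeats the smaller of the positive and negative pair sets until it matches the larger one. Both assumptions are simplifications and may differ in detail from SetFit's sampler.

```python
from math import ceil, comb

# Training sample counts per label, as listed in this card.
label_counts = {
    "greet-hi": 5,
    "greet-who_are_you": 7,
    "greet-good_bye": 5,
    "matches-team_next_match": 21,
    "matches-match_time": 12,
    "matches-match_result": 15,
}
batch_size = 4  # first value of batch_size: (4, 4)

n_total = sum(label_counts.values())
# Positive pairs: both texts share a label.
positive_pairs = sum(comb(n, 2) for n in label_counts.values())
# Negative pairs: every other unordered pair.
negative_pairs = comb(n_total, 2) - positive_pairs
# Oversampling (assumed): the smaller set is repeated up to the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(n_total, positive_pairs, negative_pairs, pairs_per_epoch, steps_per_epoch)
```

Under these assumptions, 65 training texts give 422 positive and 1658 negative pairs, so one epoch has 3316 pairs and 829 steps. This agrees with the Training Results table, where epoch 1.0 falls at step 829 and epoch 4.0 at step 3316.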

Training Results

Epoch Step Training Loss Validation Loss
0.0012 1 0.1544 -
0.0121 10 0.0658 -
0.0241 20 0.1235 -
0.0362 30 0.2422 -
0.0483 40 0.2876 -
0.0603 50 0.1208 -
0.0724 60 0.1358 -
0.0844 70 0.1494 -
0.0965 80 0.1284 -
0.1086 90 0.1107 -
0.1206 100 0.2395 -
0.1327 110 0.0661 -
0.1448 120 0.1554 -
0.1568 130 0.0258 -
0.1689 140 0.0279 -
0.1809 150 0.1162 -
0.1930 160 0.0244 -
0.2051 170 0.0221 -
0.2171 180 0.0813 -
0.2292 190 0.0188 -
0.2413 200 0.03 -
0.2533 210 0.0019 -
0.2654 220 0.0076 -
0.2774 230 0.01 -
0.2895 240 0.0025 -
0.3016 250 0.0705 -
0.3136 260 0.0044 -
0.3257 270 0.0038 -
0.3378 280 0.006 -
0.3498 290 0.0018 -
0.3619 300 0.0003 -
0.3739 310 0.0007 -
0.3860 320 0.0128 -
0.3981 330 0.0022 -
0.4101 340 0.0008 -
0.4222 350 0.004 -
0.4343 360 0.0006 -
0.4463 370 0.0007 -
0.4584 380 0.0005 -
0.4704 390 0.0057 -
0.4825 400 0.0007 -
0.4946 410 0.0022 -
0.5066 420 0.0012 -
0.5187 430 0.0009 -
0.5308 440 0.0004 -
0.5428 450 0.0032 -
0.5549 460 0.0007 -
0.5669 470 0.0008 -
0.5790 480 0.0005 -
0.5911 490 0.0005 -
0.6031 500 0.0008 -
0.6152 510 0.0008 -
0.6273 520 0.0004 -
0.6393 530 0.0015 -
0.6514 540 0.0002 -
0.6634 550 0.0006 -
0.6755 560 0.0015 -
0.6876 570 0.0024 -
0.6996 580 0.0004 -
0.7117 590 0.0005 -
0.7238 600 0.0011 -
0.7358 610 0.0008 -
0.7479 620 0.0002 -
0.7600 630 0.0006 -
0.7720 640 0.0003 -
0.7841 650 0.0002 -
0.7961 660 0.0007 -
0.8082 670 0.0009 -
0.8203 680 0.0002 -
0.8323 690 0.0006 -
0.8444 700 0.0015 -
0.8565 710 0.0003 -
0.8685 720 0.0003 -
0.8806 730 0.0003 -
0.8926 740 0.0015 -
0.9047 750 0.0003 -
0.9168 760 0.0005 -
0.9288 770 0.0002 -
0.9409 780 0.0003 -
0.9530 790 0.0002 -
0.9650 800 0.0004 -
0.9771 810 0.0003 -
0.9891 820 0.001 -
1.0 829 - 0.0216
1.0012 830 0.0003 -
1.0133 840 0.0007 -
1.0253 850 0.0004 -
1.0374 860 0.0001 -
1.0495 870 0.0008 -
1.0615 880 0.0003 -
1.0736 890 0.0006 -
1.0856 900 0.0001 -
1.0977 910 0.0018 -
1.1098 920 0.0 -
1.1218 930 0.0001 -
1.1339 940 0.0007 -
1.1460 950 0.0009 -
1.1580 960 0.0004 -
1.1701 970 0.0003 -
1.1821 980 0.0015 -
1.1942 990 0.0002 -
1.2063 1000 0.0005 -
1.2183 1010 0.0002 -
1.2304 1020 0.0003 -
1.2425 1030 0.0001 -
1.2545 1040 0.0002 -
1.2666 1050 0.0004 -
1.2786 1060 0.0001 -
1.2907 1070 0.0002 -
1.3028 1080 0.0001 -
1.3148 1090 0.0002 -
1.3269 1100 0.0001 -
1.3390 1110 0.0002 -
1.3510 1120 0.0003 -
1.3631 1130 0.0001 -
1.3752 1140 0.0001 -
1.3872 1150 0.0001 -
1.3993 1160 0.0002 -
1.4113 1170 0.0001 -
1.4234 1180 0.0005 -
1.4355 1190 0.0002 -
1.4475 1200 0.0002 -
1.4596 1210 0.0002 -
1.4717 1220 0.0001 -
1.4837 1230 0.0001 -
1.4958 1240 0.0001 -
1.5078 1250 0.0001 -
1.5199 1260 0.001 -
1.5320 1270 0.0001 -
1.5440 1280 0.0003 -
1.5561 1290 0.0001 -
1.5682 1300 0.0002 -
1.5802 1310 0.0005 -
1.5923 1320 0.0002 -
1.6043 1330 0.0001 -
1.6164 1340 0.0004 -
1.6285 1350 0.0002 -
1.6405 1360 0.0001 -
1.6526 1370 0.0004 -
1.6647 1380 0.0003 -
1.6767 1390 0.0002 -
1.6888 1400 0.0001 -
1.7008 1410 0.0008 -
1.7129 1420 0.0003 -
1.7250 1430 0.0005 -
1.7370 1440 0.0001 -
1.7491 1450 0.0001 -
1.7612 1460 0.0001 -
1.7732 1470 0.0007 -
1.7853 1480 0.0001 -
1.7973 1490 0.0002 -
1.8094 1500 0.0001 -
1.8215 1510 0.001 -
1.8335 1520 0.0002 -
1.8456 1530 0.0003 -
1.8577 1540 0.0004 -
1.8697 1550 0.0005 -
1.8818 1560 0.0001 -
1.8938 1570 0.0006 -
1.9059 1580 0.0005 -
1.9180 1590 0.0002 -
1.9300 1600 0.0002 -
1.9421 1610 0.0001 -
1.9542 1620 0.0003 -
1.9662 1630 0.0005 -
1.9783 1640 0.0007 -
1.9903 1650 0.0001 -
2.0 1658 - 0.0186
2.0024 1660 0.0 -
2.0145 1670 0.0001 -
2.0265 1680 0.0002 -
2.0386 1690 0.0001 -
2.0507 1700 0.0002 -
2.0627 1710 0.0001 -
2.0748 1720 0.0001 -
2.0869 1730 0.0002 -
2.0989 1740 0.0001 -
2.1110 1750 0.0002 -
2.1230 1760 0.0001 -
2.1351 1770 0.0003 -
2.1472 1780 0.0006 -
2.1592 1790 0.0001 -
2.1713 1800 0.0002 -
2.1834 1810 0.0002 -
2.1954 1820 0.0001 -
2.2075 1830 0.0 -
2.2195 1840 0.0001 -
2.2316 1850 0.0002 -
2.2437 1860 0.0004 -
2.2557 1870 0.0003 -
2.2678 1880 0.0002 -
2.2799 1890 0.0002 -
2.2919 1900 0.0004 -
2.3040 1910 0.0002 -
2.3160 1920 0.0001 -
2.3281 1930 0.0 -
2.3402 1940 0.0002 -
2.3522 1950 0.0001 -
2.3643 1960 0.0 -
2.3764 1970 0.0003 -
2.3884 1980 0.0002 -
2.4005 1990 0.0001 -
2.4125 2000 0.0003 -
2.4246 2010 0.0003 -
2.4367 2020 0.0002 -
2.4487 2030 0.0002 -
2.4608 2040 0.0002 -
2.4729 2050 0.0001 -
2.4849 2060 0.0001 -
2.4970 2070 0.0002 -
2.5090 2080 0.0 -
2.5211 2090 0.0002 -
2.5332 2100 0.0004 -
2.5452 2110 0.0005 -
2.5573 2120 0.0003 -
2.5694 2130 0.0001 -
2.5814 2140 0.0002 -
2.5935 2150 0.0008 -
2.6055 2160 0.0002 -
2.6176 2170 0.0003 -
2.6297 2180 0.0001 -
2.6417 2190 0.0002 -
2.6538 2200 0.0001 -
2.6659 2210 0.0001 -
2.6779 2220 0.0 -
2.6900 2230 0.0002 -
2.7021 2240 0.0 -
2.7141 2250 0.0001 -
2.7262 2260 0.0001 -
2.7382 2270 0.0003 -
2.7503 2280 0.0001 -
2.7624 2290 0.0003 -
2.7744 2300 0.0001 -
2.7865 2310 0.0002 -
2.7986 2320 0.0001 -
2.8106 2330 0.0001 -
2.8227 2340 0.0001 -
2.8347 2350 0.0001 -
2.8468 2360 0.0002 -
2.8589 2370 0.0001 -
2.8709 2380 0.0001 -
2.8830 2390 0.0 -
2.8951 2400 0.0 -
2.9071 2410 0.0 -
2.9192 2420 0.0001 -
2.9312 2430 0.0002 -
2.9433 2440 0.0001 -
2.9554 2450 0.0001 -
2.9674 2460 0.0001 -
2.9795 2470 0.0003 -
2.9916 2480 0.0001 -
3.0 2487 - 0.0176
3.0036 2490 0.0001 -
3.0157 2500 0.0 -
3.0277 2510 0.0002 -
3.0398 2520 0.0 -
3.0519 2530 0.0002 -
3.0639 2540 0.0002 -
3.0760 2550 0.0 -
3.0881 2560 0.0001 -
3.1001 2570 0.0001 -
3.1122 2580 0.0003 -
3.1242 2590 0.0003 -
3.1363 2600 0.0001 -
3.1484 2610 0.0 -
3.1604 2620 0.0002 -
3.1725 2630 0.0001 -
3.1846 2640 0.0001 -
3.1966 2650 0.0001 -
3.2087 2660 0.0003 -
3.2207 2670 0.0001 -
3.2328 2680 0.0001 -
3.2449 2690 0.0001 -
3.2569 2700 0.0001 -
3.2690 2710 0.0002 -
3.2811 2720 0.0001 -
3.2931 2730 0.0005 -
3.3052 2740 0.0 -
3.3172 2750 0.0001 -
3.3293 2760 0.0002 -
3.3414 2770 0.0003 -
3.3534 2780 0.0001 -
3.3655 2790 0.0001 -
3.3776 2800 0.0001 -
3.3896 2810 0.0004 -
3.4017 2820 0.0001 -
3.4138 2830 0.0002 -
3.4258 2840 0.0001 -
3.4379 2850 0.0003 -
3.4499 2860 0.0001 -
3.4620 2870 0.0002 -
3.4741 2880 0.0001 -
3.4861 2890 0.0003 -
3.4982 2900 0.0003 -
3.5103 2910 0.0001 -
3.5223 2920 0.0 -
3.5344 2930 0.0 -
3.5464 2940 0.0001 -
3.5585 2950 0.0002 -
3.5706 2960 0.0002 -
3.5826 2970 0.0001 -
3.5947 2980 0.0 -
3.6068 2990 0.0001 -
3.6188 3000 0.0003 -
3.6309 3010 0.0001 -
3.6429 3020 0.0 -
3.6550 3030 0.0002 -
3.6671 3040 0.0003 -
3.6791 3050 0.0005 -
3.6912 3060 0.0001 -
3.7033 3070 0.0 -
3.7153 3080 0.0001 -
3.7274 3090 0.0002 -
3.7394 3100 0.0001 -
3.7515 3110 0.0001 -
3.7636 3120 0.0002 -
3.7756 3130 0.0001 -
3.7877 3140 0.0 -
3.7998 3150 0.0001 -
3.8118 3160 0.0001 -
3.8239 3170 0.0001 -
3.8359 3180 0.0001 -
3.8480 3190 0.0005 -
3.8601 3200 0.0 -
3.8721 3210 0.0001 -
3.8842 3220 0.0001 -
3.8963 3230 0.0001 -
3.9083 3240 0.0001 -
3.9204 3250 0.0001 -
3.9324 3260 0.0 -
3.9445 3270 0.0001 -
3.9566 3280 0.0001 -
3.9686 3290 0.0002 -
3.9807 3300 0.0002 -
3.9928 3310 0.0001 -
4.0 3316 - 0.0187
  • The bold row denotes the saved checkpoint.
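
With load_best_model_at_end: True, the saved checkpoint is presumably the one with the lowest validation loss. The sketch below picks that row from the four validation losses reported in the table. The selection rule is an assumption, and which row was bold cannot be recovered from the table as shown, so treat the result as an inference rather than a confirmed fact.

```python
# Validation loss at the end of each epoch, keyed by step (from the table above).
validation_loss_by_step = {
    829: 0.0216,
    1658: 0.0186,
    2487: 0.0176,
    3316: 0.0187,
}

# Assumed rule: keep the checkpoint with the lowest validation loss.
best_step = min(validation_loss_by_step, key=validation_loss_by_step.get)
print(best_step, validation_loss_by_step[best_step])
```

Under that rule the selected checkpoint is step 2487, the end of epoch 3.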

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.1.0
  • Transformers: 4.39.0
  • PyTorch: 2.4.0+cu121
  • Datasets: 3.0.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 118M params (Safetensors, tensor type F32)

Model tree for alexandremn/botpress_football_sft_model