SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
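
As an illustration of the data that step 1 trains on, the sketch below builds contrastive pairs from a small labeled set: each text is paired once with a same-label text (target similarity 1.0) and once with a different-label text (target 0.0), repeated `num_iterations` times. The function name `make_contrastive_pairs` and the sampling details are assumptions made for this example; this is a simplified sketch, not the SetFit library's own implementation.

```python
import random
from typing import List, Tuple


def make_contrastive_pairs(
    texts: List[str],
    labels: List[int],
    num_iterations: int,
    seed: int = 42,
) -> List[Tuple[str, str, float]]:
    """Return (text_a, text_b, target_similarity) triples.

    Assumes every label occurs at least twice and at least two
    distinct labels are present, so both pair types can be drawn.
    """
    rng = random.Random(seed)
    pairs: List[Tuple[str, str, float]] = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates sharing the anchor's label (excluding the anchor itself).
            same = [t for j, t in enumerate(texts) if labels[j] == label and j != i]
            # Candidates with any other label.
            diff = [t for j, t in enumerate(texts) if labels[j] != label]
            pairs.append((text, rng.choice(same), 1.0))  # positive pair
            pairs.append((text, rng.choice(diff), 0.0))  # negative pair
    return pairs
```

Pairing every example against the others is what lets a handful of labeled texts yield enough training signal for the embedding model.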

Model Details

Model Description

Model Sources

Model Labels

Label Examples
0
  • ', metal unless it was 70s oldskool'
  • "do whichever you think would be best then: if you rename the account, just let me know, and i'll go over there to recreate it; but if you'd prefer to rename the account, recreate it, and send me the password which i can then change, that's fine with me."
  • '" no, it was a far-too-much-of-an-in-joke on the fact that principle→principal is usually one of the first things the fa regulars jump on (along with the dreaded spaced em dash)\xa0–\xa0scent "'
1
  • "i'm going to kill you zink dawg your a scrap and a fag and you need to die. i'm going to kill you if someone else doesn't. you better keep protecting your user space because i will keep vandalizing it."
  • "hope your head gets cut off and someone wipes there ass with it and then stabs you in your heart!!! no one can keep me from here you dumb piece of shit, not yamla's ass ass or you!!! now hurry up and sit in some traffic so i can see your head roll down the street"
  • 'unblock me now you piece of shit! or i will find you and kill you, slowly and painfully!'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("waterabbit114/my-setfit-classifier_threat")
# Run inference
preds = model("\"   link   thanks for fixing that disambiguation link on usher's album ) flash; \"")
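
For several texts at once, a small helper can attach readable names to the integer predictions. The sketch below is hypothetical: `classify_batch` and the names in `LABEL_NAMES` are not part of SetFit or of this card, and the names are only inferred from the label examples above. It assumes the model is callable on a list of strings and returns one integer-like prediction per input.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical display names; the card itself only defines labels 0 and 1.
LABEL_NAMES = {0: "non-threat", 1: "threat"}


def classify_batch(
    model: Callable[[List[str]], Sequence],
    texts: Sequence[str],
) -> List[Tuple[str, str]]:
    """Run the model on all texts and pair each text with a label name."""
    predictions = model(list(texts))
    # int() also converts single-element tensors or NumPy scalars.
    return [(text, LABEL_NAMES[int(pred)]) for text, pred in zip(texts, predictions)]
```

The `model` object loaded above could be passed as the first argument; any callable with the same input and output shape works as well, which keeps the helper easy to test.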

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     3     50.65    426

Label   Training Sample Count
0       10
1       10
Training Hyperparameters

  • batch_size: (1, 1)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
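
The `loss: CosineSimilarityLoss` setting refers to the Sentence Transformers loss of that name, which is the mean squared error between the cosine similarity of two embeddings and a target similarity. The sketch below restates that computation in plain Python for small vectors; the function names are chosen for this example, and the real loss operates on batched tensors produced by the embedding model.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(
    pairs: List[Tuple[Sequence[float], Sequence[float], float]],
) -> float:
    """Mean squared error between cosine similarity and the target.

    Each item is (embedding_a, embedding_b, target), where target is
    1.0 for a same-label pair and 0.0 for a different-label pair.
    """
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)
```

A loss near zero therefore means that same-label embeddings point in nearly the same direction while different-label embeddings are close to orthogonal.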

Training Results

Epoch Step Training Loss Validation Loss
0.0013 1 0.192 -
0.0625 50 0.0173 -
0.125 100 0.0013 -
0.1875 150 0.0024 -
0.25 200 0.0002 -
0.3125 250 0.0 -
0.375 300 0.0 -
0.4375 350 0.0006 -
0.5 400 0.0003 -
0.5625 450 0.0001 -
0.625 500 0.0001 -
0.6875 550 0.0002 -
0.75 600 0.0008 -
0.8125 650 0.0002 -
0.875 700 0.0001 -
0.9375 750 0.0009 -
1.0 800 0.0001 -
1.0625 850 0.0001 -
1.125 900 0.0001 -
1.1875 950 0.0 -
1.25 1000 0.0 -
1.3125 1050 0.0 -
1.375 1100 0.0001 -
1.4375 1150 0.0 -
1.5 1200 0.0 -
1.5625 1250 0.0 -
1.625 1300 0.0 -
1.6875 1350 0.0 -
1.75 1400 0.0003 -
1.8125 1450 0.0001 -
1.875 1500 0.0 -
1.9375 1550 0.0001 -
2.0 1600 0.0 -
2.0625 1650 0.0 -
2.125 1700 0.0001 -
2.1875 1750 0.0 -
2.25 1800 0.0 -
2.3125 1850 0.0 -
2.375 1900 0.0 -
2.4375 1950 0.0 -
2.5 2000 0.0 -
2.5625 2050 0.0 -
2.625 2100 0.0001 -
2.6875 2150 0.0 -
2.75 2200 0.0 -
2.8125 2250 0.0002 -
2.875 2300 0.0 -
2.9375 2350 0.0 -
3.0 2400 0.0002 -
3.0625 2450 0.0 -
3.125 2500 0.0001 -
3.1875 2550 0.0001 -
3.25 2600 0.0001 -
3.3125 2650 0.0 -
3.375 2700 0.0 -
3.4375 2750 0.0 -
3.5 2800 0.0 -
3.5625 2850 0.0 -
3.625 2900 0.0 -
3.6875 2950 0.0 -
3.75 3000 0.0 -
3.8125 3050 0.0 -
3.875 3100 0.0002 -
3.9375 3150 0.0 -
4.0 3200 0.0 -
4.0625 3250 0.0001 -
4.125 3300 0.0001 -
4.1875 3350 0.0 -
4.25 3400 0.0004 -
4.3125 3450 0.0001 -
4.375 3500 0.0001 -
4.4375 3550 0.0001 -
4.5 3600 0.0 -
4.5625 3650 0.0 -
4.625 3700 0.0 -
4.6875 3750 0.0 -
4.75 3800 0.0 -
4.8125 3850 0.0 -
4.875 3900 0.0001 -
4.9375 3950 0.0 -
5.0 4000 0.0 -
5.0625 4050 0.0 -
5.125 4100 0.0 -
5.1875 4150 0.0 -
5.25 4200 0.0 -
5.3125 4250 0.0002 -
5.375 4300 0.0 -
5.4375 4350 0.0 -
5.5 4400 0.0 -
5.5625 4450 0.0001 -
5.625 4500 0.0 -
5.6875 4550 0.0 -
5.75 4600 0.0002 -
5.8125 4650 0.0 -
5.875 4700 0.0 -
5.9375 4750 0.0 -
6.0 4800 0.0 -
6.0625 4850 0.0 -
6.125 4900 0.0 -
6.1875 4950 0.0 -
6.25 5000 0.0 -
6.3125 5050 0.0 -
6.375 5100 0.0001 -
6.4375 5150 0.0 -
6.5 5200 0.0 -
6.5625 5250 0.0 -
6.625 5300 0.0 -
6.6875 5350 0.0 -
6.75 5400 0.0 -
6.8125 5450 0.0 -
6.875 5500 0.0 -
6.9375 5550 0.0 -
7.0 5600 0.0 -
7.0625 5650 0.0 -
7.125 5700 0.0 -
7.1875 5750 0.0 -
7.25 5800 0.0001 -
7.3125 5850 0.0 -
7.375 5900 0.0 -
7.4375 5950 0.0 -
7.5 6000 0.0 -
7.5625 6050 0.0 -
7.625 6100 0.0 -
7.6875 6150 0.0 -
7.75 6200 0.0 -
7.8125 6250 0.0 -
7.875 6300 0.0 -
7.9375 6350 0.0 -
8.0 6400 0.0 -
8.0625 6450 0.0 -
8.125 6500 0.0 -
8.1875 6550 0.0 -
8.25 6600 0.0 -
8.3125 6650 0.0 -
8.375 6700 0.0 -
8.4375 6750 0.0 -
8.5 6800 0.0 -
8.5625 6850 0.0 -
8.625 6900 0.0 -
8.6875 6950 0.0 -
8.75 7000 0.0 -
8.8125 7050 0.0 -
8.875 7100 0.0 -
8.9375 7150 0.0 -
9.0 7200 0.0 -
9.0625 7250 0.0 -
9.125 7300 0.0 -
9.1875 7350 0.0 -
9.25 7400 0.0 -
9.3125 7450 0.0 -
9.375 7500 0.0 -
9.4375 7550 0.0 -
9.5 7600 0.0 -
9.5625 7650 0.0 -
9.625 7700 0.0 -
9.6875 7750 0.0 -
9.75 7800 0.0 -
9.8125 7850 0.0 -
9.875 7900 0.0 -
9.9375 7950 0.0 -
10.0 8000 0.0 -
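
The step counts in this table can be reproduced from the hyperparameters and training set size listed earlier. The sketch below assumes that each iteration generates one positive and one negative pair per training sample; that assumption is consistent with the 800 steps per epoch shown above but is not stated in this card. The function name `contrastive_steps` is hypothetical.

```python
import math
from typing import Tuple


def contrastive_steps(
    num_samples: int,
    num_iterations: int,
    batch_size: int,
    num_epochs: int,
) -> Tuple[int, int]:
    """Return (steps_per_epoch, total_steps) for the contrastive phase.

    Assumes 2 pairs (one positive, one negative) per sample per iteration.
    """
    pairs_per_epoch = 2 * num_iterations * num_samples
    steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
    return steps_per_epoch, steps_per_epoch * num_epochs


# Values from this card: 10 + 10 samples, num_iterations 20,
# batch_size 1, num_epochs 10.
steps_per_epoch, total_steps = contrastive_steps(20, 20, 1, 10)
print(steps_per_epoch, total_steps)  # 800 8000
```

Under the same assumption, the Epoch column is the step number divided by 800; step 50, for example, corresponds to epoch 0.0625.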

Framework Versions

  • Python: 3.11.7
  • SetFit: 1.0.3
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.1+cu121
  • Datasets: 2.14.5
  • Tokenizers: 0.15.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Safetensors

  • Model size: 109M params
  • Tensor type: F32