
SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
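
At inference time the two components run in sequence: the fine-tuned Sentence Transformer body embeds the input text, and the LogisticRegression head classifies the embedding. A minimal sketch of that pipeline, assuming the model_body/model_head attributes of setfit 1.x:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("lucienbaumgartner/mentalizing-class")

# Step 1: embed the text with the fine-tuned Sentence Transformer body.
embeddings = model.model_body.encode(["chatgpt confirmed it ."])

# Step 2: classify the embedding with the LogisticRegression head.
pred = model.model_head.predict(embeddings)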

The model has been trained using an efficient few-shot learning technique that involves two stages (sketched in code after this list):

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
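
A compact sketch of those two stages with the setfit Trainer; the toy dataset and single epoch are illustrative, not the original training setup (see Training Hyperparameters below for the actual configuration):

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Toy few-shot dataset (illustrative; not the original training data)
train_dataset = Dataset.from_dict({
    "text": [
        "chatgpt confirmed it .",
        "i panicked and made chatgpt write everything .",
        "chatgpt can generate content on a wide range of subjects .",
    ],
    "label": [0, 1, 2],
})

# The default SetFit head is a scikit-learn LogisticRegression.
model = SetFitModel.from_pretrained("sentence-transformers/all-mpnet-base-v2")

trainer = Trainer(
    model=model,
    args=TrainingArguments(batch_size=32, num_epochs=1),
    train_dataset=train_dataset,
)

# trainer.train() runs stage 1 (contrastive fine-tuning of the body)
# and stage 2 (fitting the classification head) in sequence.
trainer.train()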

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: sentence-transformers/all-mpnet-base-v2
  • Classification head: LogisticRegression
  • Number of Classes: 3
  • Model Size: 109M parameters (F32 tensors)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Model Labels

Label 1:
  • 'i panicked and made chatgpt write everything . '
  • 'it is fundamental that chatgpt with developer mode can say anything about anyone , at any time for any reason . '
  • "chatgpt itself mentioned that homebrewing requires good system mastery and an understanding of the developer 's game philosophy to make properly balanced feats . "
Label 0:
  • 'chatgpt confirmed it . '
  • "chatgpt does n't know that it writing text that is easily detected . "
  • "the timing of entering the initial prompt is essential to ensure that chatgpt understands the user 's request and can provide an accurate response . "
Label 2:
  • '3 . diversion : chatgpt might also create a diversion , directing a group of wasps to move away from the nest and act as a decoy . '
  • 'chatgpt can generate content on a wide range of subjects , so the possibilities are endless . '
  • 'does anyone know if chatgpt can generate the code of a sound wave , with the specifications that are requested , as it does with programming codes . '

Evaluation

Metrics

Label  Accuracy  Precision  Recall  F1
all    0.75      0.7667     0.7460  0.7488
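
These scores can be reproduced on a held-out split along the following lines; the macro averaging and the toy evaluation examples are assumptions, since the card does not state how the per-class scores were aggregated:

from setfit import SetFitModel
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

model = SetFitModel.from_pretrained("lucienbaumgartner/mentalizing-class")

# Hypothetical held-out examples; substitute a real test split.
eval_texts = ["chatgpt confirmed it .", "i panicked and made chatgpt write everything ."]
eval_labels = [0, 1]

preds = model.predict(eval_texts)
accuracy = accuracy_score(eval_labels, preds)
# average="macro" is an assumption; the card reports one score per metric.
precision, recall, f1, _ = precision_recall_fscore_support(
    eval_labels, preds, average="macro", zero_division=0
)
print(f"acc={accuracy:.4f} p={precision:.4f} r={recall:.4f} f1={f1:.4f}")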

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("lucienbaumgartner/mentalizing-class")
# Run inference
preds = model("chatgpt makes choices , algorithms are n't neutral . ")
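
Predictions are the integer labels described above (0, 1, or 2). Batch inference and class probabilities are also available; predict_proba assumes the default scikit-learn LogisticRegression head:

texts = [
    "chatgpt confirmed it .",
    "i panicked and made chatgpt write everything .",
]
preds = model.predict(texts)        # one label per input, e.g. [0, 1]
probs = model.predict_proba(texts)  # per-class probabilities from the head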

Training Details

Training Set Metrics

Training set  Min  Median   Max
Word count    3    20.7848  51

Label  Training Sample Count
0      26
1      27
2      26
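
Both tables are easy to recompute from the training split; a sketch with placeholder data (the real split is not distributed with the card):

from collections import Counter
import statistics

# Placeholder split; substitute the actual training texts and labels.
train_texts = ["chatgpt confirmed it .", "i panicked and made chatgpt write everything ."]
train_labels = [0, 1]

word_counts = [len(t.split()) for t in train_texts]
print(min(word_counts), statistics.median(word_counts), max(word_counts))
print(Counter(train_labels))  # training sample count per label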

Training Hyperparameters

  • batch_size: (32, 2)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • evaluation_strategy: epoch
  • eval_max_steps: -1
  • load_best_model_at_end: True
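
The list above maps one-to-one onto setfit.TrainingArguments. A sketch, assuming the setfit 1.1 API (where evaluation_strategy was renamed to eval_strategy):

from sentence_transformers.losses import (
    BatchHardTripletLossDistanceFunction,
    CosineSimilarityLoss,
)
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(32, 2),                 # (embedding phase, classifier phase)
    num_epochs=(10, 10),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    # distance_metric and margin only apply to triplet-style losses;
    # they are listed here because the card reports them.
    distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_strategy="epoch",              # evaluation_strategy in setfit < 1.1
    eval_max_steps=-1,
    load_best_model_at_end=True,
)

With load_best_model_at_end set, the checkpoint with the best validation loss should be the one restored after training; the minimum in the results below is 0.1353 at epoch 4.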

Training Results

Epoch Step Training Loss Validation Loss
0.0077 1 0.2555 -
0.3846 50 0.2528 -
0.7692 100 0.1993 -
1.0 130 - 0.1527
1.1538 150 0.0222 -
1.5385 200 0.0023 -
1.9231 250 0.0013 -
2.0 260 - 0.1461
2.3077 300 0.0015 -
2.6923 350 0.0005 -
3.0 390 - 0.1465
3.0769 400 0.0003 -
3.4615 450 0.0002 -
3.8462 500 0.0003 -
4.0 520 - 0.1353
4.2308 550 0.0007 -
4.6154 600 0.0002 -
5.0 650 0.0011 0.1491
5.3846 700 0.0002 -
5.7692 750 0.0002 -
6.0 780 - 0.1478
6.1538 800 0.0002 -
6.5385 850 0.0001 -
6.9231 900 0.0001 -
7.0 910 - 0.1472
7.3077 950 0.0001 -
7.6923 1000 0.0001 -
8.0 1040 - 0.1461
8.0769 1050 0.0001 -
8.4615 1100 0.0001 -
8.8462 1150 0.0001 -
9.0 1170 - 0.1393
9.2308 1200 0.0001 -
9.6154 1250 0.0001 -
10.0 1300 0.0001 0.1399

Framework Versions

  • Python: 3.11.7
  • SetFit: 1.1.0
  • Sentence Transformers: 3.2.0
  • Transformers: 4.45.2
  • PyTorch: 2.4.1
  • Datasets: 3.0.1
  • Tokenizers: 0.20.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}