SetFit with projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base

This is a SetFit model for text classification. It uses projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
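
The first step above relies on sentence pairs built from a handful of labeled examples. The following is a minimal sketch of how such pairs are commonly constructed, not SetFit's actual implementation; the function name `make_contrastive_pairs` is hypothetical, and the toy texts are taken from the label examples in this card.

```python
from itertools import combinations

def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise,
    which is the kind of target a cosine-similarity loss can regress to.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs

# Toy data: two texts with label 2 and one with label 0.
texts = ["Bona tarda! Què tal?", "Què tal, com va?", "Quin és el benefici de la matrícula?"]
labels = [2, 2, 0]
pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Three texts yield three unordered pairs: one positive (the two label-2 texts) and two negatives. The second step then fits a classifier on the embeddings produced by the fine-tuned body.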

Model Details

Model Description

Model Sources

Model Labels

Examples by label

Label 1
  • 'Sou uns fills de puta, no valen res, et feu fora, sou un inútil!'
  • 'Quin és el seu propòsit?'
  • "Aquest text és Ofensiu o fora del domini per a un cercador de tràmits d'un ajuntament"

Label 2
  • 'Ei, què tal? Com va tot?'
  • 'Bona tarda! Què tal?'
  • 'Què tal, com va?'

Label 0
  • "Hola Necessito saber si la modificació no substancial que faré a la meva activitat sotmesa a comunicació prèvia ambiental ha de ser comunicada a l'Ajuntament i no ha de figurar a les actes de control periòdic"
  • "Quin és l'objectiu de la Llei 11/2009?"
  • 'Quin és el benefici de la matrícula?'
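
This card does not name the label IDs. Judging only from the examples above, 0 looks like in-domain procedure queries, 1 like offensive or out-of-domain text, and 2 like greetings. The sketch below shows how an application might map the IDs to readable names; the names in `LABEL_NAMES` and the function `describe_label` are assumptions for illustration and are not part of the model.

```python
# Hypothetical names inferred from the label examples; the model itself only returns 0, 1, or 2.
LABEL_NAMES = {
    0: "in_domain_query",
    1: "offensive_or_out_of_domain",
    2: "greeting",
}

def describe_label(label_id):
    """Map a predicted label ID to a readable name, falling back to 'unknown'."""
    return LABEL_NAMES.get(int(label_id), "unknown")

print(describe_label(2))
```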

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("adriansanz/gret6")
# Run inference
preds = model("Puc canviar el meu idioma preferit?")
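
When several texts need to be classified at once, it can help to keep the probability of the winning class alongside the prediction. The sketch below assumes the loaded model exposes `predict_proba`, as `SetFitModel` does, and that the position of each probability corresponds to the label ID (plausible here, since the labels are 0, 1, and 2). The helper `classify_with_confidence` and the `StubModel` class are hypothetical; the stub only exists so that the example runs without downloading anything.

```python
def classify_with_confidence(model, texts):
    """Return (label_index, probability) for each text.

    `model` is any object exposing predict_proba(list_of_texts), such as a
    loaded SetFitModel. The label index is the position of the highest probability.
    """
    results = []
    for row in model.predict_proba(texts):
        probs = [float(p) for p in row]
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((best, probs[best]))
    return results

class StubModel:
    """Stand-in with fixed probabilities, used instead of a downloaded model."""
    def predict_proba(self, texts):
        return [[0.1, 0.2, 0.7] for _ in texts]

results = classify_with_confidence(StubModel(), ["Bona tarda! Què tal?"])
print(results)
```

With the real model, pass the `model` object loaded above in place of `StubModel()`.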

Training Details

Training Set Metrics

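Training set   Min   Median   Max
Word count     1     9.3443   36

The value reported as the median, 9.3443, is neither a whole nor a half number, so it is probably the mean word count.

Label   Training Sample Count
0       70
1       71
2       71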
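
The label counts above determine how many sentence pairs the contrastive stage can draw. The sketch below is a rough estimate, assuming that pairs are unordered and that the `oversampling` strategy repeats the smaller of the positive and negative sets until both have the same size; the batch size of 64 is the one listed under Training Hyperparameters. The function name `count_pairs` is hypothetical.

```python
from math import ceil, comb

def count_pairs(label_counts):
    """Count unordered same-label (positive) and different-label (negative) pairs."""
    counts = list(label_counts.values())
    total = sum(counts)
    positives = sum(comb(n, 2) for n in counts)
    negatives = comb(total, 2) - positives
    return positives, negatives

label_counts = {0: 70, 1: 71, 2: 71}  # training sample count per label
positives, negatives = count_pairs(label_counts)

# Assumption: oversampling balances both sets to the size of the larger one.
pairs_per_epoch = 2 * max(positives, negatives)
steps_per_epoch = ceil(pairs_per_epoch / 64)
print(positives, negatives, pairs_per_epoch, steps_per_epoch)
```

Under these assumptions the estimate is 29,962 pairs and 469 steps per epoch, which agrees with the step count reported at epoch 1.0 in the training results.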

Training Hyperparameters

  • batch_size: (64, 64)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • evaluation_strategy: epoch
  • eval_max_steps: -1
  • load_best_model_at_end: False
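
Several values above are tuples. In SetFit's training arguments, a tuple gives one value for the embedding fine-tuning phase and one for the classifier training phase. The sketch below records a subset of the listed settings in a plain dictionary and resolves them per phase; `HYPERPARAMS`, `PHASES`, and `value_for_phase` are hypothetical helpers, not SetFit APIs.

```python
# Subset of the hyperparameters listed above, copied as plain Python values.
HYPERPARAMS = {
    "batch_size": (64, 64),
    "num_epochs": (3, 3),
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "warmup_proportion": 0.1,
    "seed": 42,
}

PHASES = ("embedding", "classifier")

def value_for_phase(name, phase):
    """Return the setting for one phase; scalar settings apply to both phases."""
    value = HYPERPARAMS[name]
    if isinstance(value, tuple):
        return value[PHASES.index(phase)]
    return value

print(value_for_phase("body_learning_rate", "embedding"))
print(value_for_phase("body_learning_rate", "classifier"))
```

Because this card lists a LogisticRegression head and end_to_end: False, the classifier-phase body learning rate and the head learning rate most likely had no effect on this particular run.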

Training Results

Epoch    Step   Training Loss   Validation Loss
0.0021   1      0.1891          -
0.1066   50     0.1719          -
0.2132   100    0.0455          -
0.3198   150    0.0013          -
0.4264   200    0.0004          -
0.5330   250    0.0002          -
0.6397   300    0.0002          -
0.7463   350    0.0001          -
0.8529   400    0.0001          -
0.9595   450    0.0001          -
1.0      469    -               0.0062
1.0661   500    0.0001          -
1.1727   550    0.0001          -
1.2793   600    0.0001          -
1.3859   650    0.0001          -
1.4925   700    0.0001          -
1.5991   750    0.0001          -
1.7058   800    0.0001          -
1.8124   850    0.0001          -
1.9190   900    0.0001          -
2.0      938    -               0.0042
2.0256   950    0.0             -
2.1322   1000   0.0             -
2.2388   1050   0.0             -
2.3454   1100   0.0             -
2.4520   1150   0.0             -
2.5586   1200   0.0             -
2.6652   1250   0.0             -
2.7719   1300   0.0             -
2.8785   1350   0.0             -
2.9851   1400   0.0             -
3.0      1407   -               0.0034
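
The Epoch column can be recovered from the Step column, since epoch 1.0 is reported at step 469. The sketch below checks that relationship and finds where the training loss first falls below a chosen threshold. The rows are copied from the first epoch of the table above; the helper names and the threshold of 0.001 are arbitrary choices for illustration.

```python
STEPS_PER_EPOCH = 469  # step at which epoch 1.0 is reported

# (step, training_loss) rows copied from the first epoch of the table.
log = [(1, 0.1891), (50, 0.1719), (100, 0.0455), (150, 0.0013), (200, 0.0004), (250, 0.0002)]

def epoch_of(step):
    """Fractional epoch for a step, rounded to four decimals like the Epoch column."""
    return round(step / STEPS_PER_EPOCH, 4)

def first_step_below(rows, threshold):
    """First logged step whose training loss is below the threshold, or None."""
    for step, loss in rows:
        if loss < threshold:
            return step
    return None

print(epoch_of(50), epoch_of(938))
print(first_step_below(log, 0.001))
```

The training loss drops below 0.001 by step 200, less than halfway through the first epoch, while the validation loss keeps decreasing across all three epochs (0.0062, 0.0042, 0.0034).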

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0
  • Sentence Transformers: 3.2.1
  • Transformers: 4.42.2
  • PyTorch: 2.5.0+cu121
  • Datasets: 3.1.0
  • Tokenizers: 0.19.1
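
To compare a local environment against the versions listed above, something like the following could be used. It is a small sketch built on the standard library's `importlib.metadata`; the distribution names in `EXPECTED` are the usual PyPI names for these libraries, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions reported above, keyed by PyPI distribution name.
EXPECTED = {
    "setfit": "1.1.0",
    "sentence-transformers": "3.2.1",
    "transformers": "4.42.2",
    "torch": "2.5.0+cu121",
    "datasets": "3.1.0",
    "tokenizers": "0.19.1",
}

def compare_versions(expected):
    """Return {package: (expected, installed)}; installed is None if the package is missing."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed)
    return report

for package, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{package}: expected {wanted}, installed {installed} ({status})")
```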

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}