SetFit

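This is a SetFit model for text classification. Judging by the label examples below, it assigns short bank transaction descriptions (mostly French) to one of 44 category / subcategory labels. A LogisticRegression instance serves as the classification head.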

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
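
Step 1 can be pictured as turning the few labeled examples into sentence pairs. The sketch below is a minimal illustration in plain Python, not the SetFit library's own code: the helper name `make_contrastive_pairs` is hypothetical, and the 1.0 / 0.0 targets assume the CosineSimilarityLoss setup listed under Training Hyperparameters. The example texts are taken from the label examples in this card.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    The target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper, for illustration only.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Two labels with two examples each, mirroring the card's 2 samples per class.
examples = [
    ("prlv sepa grdf", "Housing / utilities & bills"),
    ("prlv sepa total direct energie elec", "Housing / utilities & bills"),
    ("prlv sepa eurostar", "Leisure & Entertainment / travel"),
    ("achat carte hertz location carte usa usd commission", "Leisure & Entertainment / travel"),
]

pairs = make_contrastive_pairs(examples)
positives = [pair for pair in pairs if pair[2] == 1.0]
print(len(pairs), len(positives))
```

The sentence encoder is then fine-tuned so that same-label pairs end up with similar embeddings and different-label pairs do not; step 2 fits the classification head on the resulting embeddings.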

Model Details

Model Description

  • Model Type: SetFit
  • Classification head: a LogisticRegression instance
  • Maximum Sequence Length: 128 tokens
  • Number of Classes: 44 classes

Model Sources

Model Labels

Label Examples
Shopping / electronics & multimedia
  • 'achat dji technology carte chn'
  • 'facture carte samsung paris opera carte'
Other / kids
  • 'virement sortant cadeau anniversaire neveu'
  • 'paiement carte lunapark family fun carte'
Bank services / other
  • 'paiement frais demande rib iban supplémentaires carte'
  • 'frais changement de pin carte'
Housing / rent
  • 'paiement loyer rue des oliviers carte'
  • 'sepa regl loyer resid les ormeaux carte'
Transportation / other
  • 'parking aeroport charles de gaulle carte'
  • 'frais douane import vehicule usa carte usd commission'
Bank services / transfers
  • 'transfer location vacances famille roux carte'
  • 'virement sepa entrant de loyer mars carte'
Investment / retirement & savings
  • 'alimentation plan epargne logement carte'
  • 'allocation retraite complémentaire carte'
Other / taxes
  • 'contribution economique territoriale siret frcte'
  • 'taxe apprentissage siret frapp'
Healthy & Beauty / other
  • 'adhésion club randonnée plein air'
  • 'achat en ligne produits aromatherapie naturesence carte'
Investment / securities
  • 'investissement silver etf carte silver oz'
  • 'transaction actions netflix carte usd'
Housing / other
  • 'virement recu du remboursement depot de garantie'
  • 'prlv sepa du alarmes securitas direct'
Housing / house loan
  • 'solde emprunt habitat fortuneo pret'
  • 'prelevement sepa pret habitation hsbc france'
Housing / utilities & bills
  • 'prlv sepa grdf'
  • 'prlv sepa total direct energie elec'
Bank services / general fees
  • 'frais opposition cheque perdu'
  • 'frais de gestion portefeuille titres'
Leisure & Entertainment / culture & events
  • 'prlv sepa cinema cgr lille'
  • 'achat carte festival rock en seine carte'
Transportation / taxi & carpool
  • 'prlv sepa blablacar carte'
  • 'facture carte du kakao taxi seoul carte kor krw commission'
Shopping / other
  • 'achat coffrets cadeaux pandore carte'
  • 'facture carte du magasin l unique montpellier carte'
Recurrent Payments / loans
  • 'retrait auto emma pret familial emmaprt carte'
  • 'paiement échéance axa pret professionnel carte'
Healthy & Beauty / doctor fees
  • 'facture carte du dr pierre neurologue carte'
  • 'facture carte du dr marchand orthopediste carte'
Bank services / withdrawal
  • 'retrait dab banque express toulouse carte fr'
  • 'retrait dab ecobanque lyon carte fr'
Other / other
  • 'facture carte du cinema rexy paris carte'
  • 'don association sos villages enfants'
Healthy & Beauty / pharmacy
  • 'prlv sepa pharmacie azureech'
  • 'debit carte pharmacie grand ciel carte'
Transportation / fuel
  • 'facture carte du total energies paris carte'
  • 'prlv sepa du q bruxelles carte bel'
Shopping / sporting goods
  • 'pmt carte fitnessboutique lyon carte'
  • 'paiement carte go sport montpellier carte'
Food & Drinks / groceries
  • 'facture carte du magasin asiatique lee carte'
  • 'debit charcuterie gourmets carte'
Other / pets
  • 'prlv sepa soins veterinaires urgences'
  • 'achat académie dressage canin carte'
Investment / real estate
  • 'virement sortant investissement immobilier crowdfunding carte'
  • 'virement recu vente local commercial nice carte'
Shopping / clothing
  • 'achat decathlon carte'
  • 'achat carte nike store carte usa usd commission'
Shopping / housing equipment
  • 'facture carte du conforama montpellier carte'
  • 'paiement par carte ambiances matieres marseille carte'
Transportation / maitenance
  • 'facture du vitres teintees luxe bordeaux carte'
  • 'debit du garage turbo moteurs strasbourg carte remise a neuf'
Recurrent Payments / other
  • 'abonnement annuel magazine interstellar transaction date'
  • 'cotisation annuelle club échecs rois et pions date'
Recurrent Payments / insurance
  • 'prelevement sepa assurance multirisque pro mma'
  • 'prélèvement mensuel assurance collective cnp'
Healthy & Beauty / veterinary
  • 'deworming petcare lyon carte'
  • 'prlv sepa hospital vet duval limoges'
Transportation / public transportation
  • 'achat titres v ville de lille carte'
  • 'abonnement tram strasbourg cts carte'
Healthy & Beauty / beauty & self-care
  • 'prlv sepa abonnement biotyfull box'
  • 'facture carte du mac cosmetics nice carte'
Leisure & Entertainment / other
  • 'paiement en ligne du amazon prime video carte usa'
  • 'facture carte du spotify premium carte usa'
Food & Drinks / eating out
  • 'facture carte du cafe de flore carte'
  • 'facture carte du mcdonald s carte usa usd commission'
Housing / services & maintenance
  • 'prlv sepa electricite generale flash'
  • 'virement recu soldes tuyauterie moderne'
Leisure & Entertainment / travel
  • 'prlv sepa eurostar'
  • 'achat carte hertz location carte usa usd commission'
Leisure & Entertainment / sports & hobbies
  • 'paiement en ligne du adidas fr carte'
  • 'facture carte du culture velo lyon carte'
Investment / other
  • 'souscription part sociale coop biolocal'
  • 'participation crowdfunding waterclean projet'
Transportation / car loan & leasing
  • 'virement mensualite bmw x debmwx'
  • 'prlv sepa dacia lodgy crdit auto'
Recurrent Payments / subscription
  • 'prlv sepa microsoft office svc carte'
  • 'facture carte du adobe creative cloud photo carte'
Food & Drinks / other
  • 'facture carte du café de flore carte'
  • 'debit carte caviste le grand cru carte'

Evaluation

Metrics

Label Accuracy
all 0.25
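
For context, accuracy is the fraction of evaluation samples whose predicted label equals the reference label. The sketch below shows that computation; the `accuracy` helper and the toy labels are illustrative assumptions, since this card does not describe the evaluation set. With 44 classes, uniform random guessing would score about 1/44.

```python
def accuracy(predictions, references):
    """Return the share of predictions that match their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)


# Toy data, not taken from the model's real evaluation set.
references = ["Housing / rent", "Other / pets", "Other / taxes", "Housing / rent"]
predictions = ["Housing / rent", "Other / kids", "Other / other", "Housing / other"]

toy_accuracy = accuracy(predictions, references)  # 1 correct out of 4
chance_level = 1 / 44  # uniform guessing over the 44 classes
print(toy_accuracy, round(chance_level, 4))
```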

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("HEN10/setfit-particular-transaction-solon-embeddings-labels-large-kaggle-automatisation-v1")
# Run inference
preds = model("achat académie dressage canin carte")
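
Each label in this card has the form "Category / subcategory", so a prediction can be split into its two levels once it is available as a plain string. The helper below is a hypothetical post-processing sketch and not part of SetFit; it assumes the " / " separator used in the label list above.

```python
def split_label(label):
    """Split 'Category / subcategory' into a (category, subcategory) tuple."""
    category, separator, subcategory = label.partition(" / ")
    if not separator:
        raise ValueError(f"expected 'Category / subcategory', got {label!r}")
    return category.strip(), subcategory.strip()


# "Other / pets" is the label this card lists for the example text above.
category, subcategory = split_label("Other / pets")
print(category, "|", subcategory)
```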

Training Details

Training Set Metrics

Training set Min Mean Max
Word count 3 6.0455 10

Label Training Sample Count
Housing / rent 2
Housing / house loan 2
Housing / utilities & bills 2
Housing / services & maintenance 2
Housing / other 2
Food & Drinks / groceries 2
Food & Drinks / eating out 2
Food & Drinks / other 2
Leisure & Entertainment / sports & hobbies 2
Leisure & Entertainment / culture & events 2
Leisure & Entertainment / travel 2
Leisure & Entertainment / other 2
Transportation / car loan & leasing 2
Transportation / fuel 2
Transportation / public transportation 2
Transportation / taxi & carpool 2
Transportation / maitenance 2
Transportation / other 2
Recurrent Payments / loans 2
Recurrent Payments / insurance 2
Recurrent Payments / subscription 2
Recurrent Payments / other 2
Investment / securities 2
Investment / retirement & savings 2
Investment / real estate 2
Investment / other 2
Shopping / clothing 2
Shopping / electronics & multimedia 2
Shopping / sporting goods 2
Shopping / housing equipment 2
Shopping / other 2
Healthy & Beauty / doctor fees 2
Healthy & Beauty / pharmacy 2
Healthy & Beauty / beauty & self-care 2
Healthy & Beauty / veterinary 2
Healthy & Beauty / other 2
Bank services / transfers 2
Bank services / withdrawal 2
Bank services / general fees 2
Bank services / other 2
Other / taxes 2
Other / kids 2
Other / pets 2
Other / other 2

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: True
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 6
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0021 1 0.1662 -
0.1057 50 0.1483 -
0.2114 100 0.0681 -
0.3171 150 0.0298 -
0.4228 200 0.0245 -
0.5285 250 0.0117 -
0.6342 300 0.032 -
0.7400 350 0.0112 -
0.8457 400 0.0072 -
0.9514 450 0.0176 -
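
The Epoch and Step columns are consistent with a simple count of contrastive pairs. The arithmetic below is a sketch under two assumptions that this card does not state: pairs are the unordered combinations of the 88 training samples, and the oversampling strategy draws as many positive pairs as there are negative pairs. Under those assumptions one epoch has 473 steps at batch size 16, which reproduces the logged epoch fractions.

```python
from math import ceil, comb

num_classes = 44        # from Model Description
samples_per_class = 2   # from Training Set Metrics
batch_size = 16         # from Training Hyperparameters

num_samples = num_classes * samples_per_class                # 88
total_pairs = comb(num_samples, 2)                           # all unordered pairs
positive_pairs = num_classes * comb(samples_per_class, 2)    # same-label pairs
negative_pairs = total_pairs - positive_pairs                # different-label pairs

# Assumption: oversampling repeats the smaller group until both are equal in size.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(num_samples, positive_pairs, negative_pairs, steps_per_epoch)
for step in (1, 50, 450):
    print(step, round(step / steps_per_epoch, 4))
```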

Framework Versions

  • Python: 3.10.13
  • SetFit: 1.0.3
  • Sentence Transformers: 2.6.1
  • Transformers: 4.39.3
  • PyTorch: 2.1.2
  • Datasets: 2.17.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 33.4M params (Safetensors, tensor type F32)