metadata

base_model: BAAI/bge-small-en-v1.5
library_name: setfit
metrics:
  - accuracy
pipeline_tag: text-classification
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
widget:
  - text: >-
      I have owned this NAS for almost a year now and actually purchased a
      second one It works flawlessly and QNAP live tech support is superb There
      is also a fairly comprehensive forum for users as well I have slowly
      upgraded my capacities as newer larger capacity drives have come out on
      the market All have been recognized and the space expanded without a hitch
      I highly recommend this product 
  - text: Good as expected
  - text: >-
      This is a very good video editing package In the past I ve only used Corel
      video editing products but Cyberlink s offering is on par It offers
      similar options but they are different enough for me to want to use both
      products depending on what I m trying to achieve There are quick uploading
      options that make it very easy to get video onto Youtube and other online
      video sites 
  - text: Works great
  - text: >-
      This is my favorite crack open the computer and amuse myself for a few
      hours software Easy to pick up if you have no prior experience with
      computer animation but advanced enough that someone with the right skills
      could pull together an impressive movie 
inference: true

SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
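
The contrastive step works on pairs of training texts rather than on single texts. The following is a minimal sketch of how labeled examples can be turned into such pairs; it is an illustration, not SetFit's actual sampler. The function name `make_pairs`, the toy texts, and the toy labels are all hypothetical. Pairs with the same label get a target similarity of 1.0 and pairs with different labels get 0.0, which is the kind of input a cosine-similarity loss is trained on.

```python
from itertools import combinations


def make_pairs(examples):
    """Turn (text, label) examples into (text_a, text_b, target) pairs.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data with arbitrary labels (hypothetical, not taken from the training set).
examples = [
    ("first toy review", 0),
    ("second toy review", 0),
    ("third toy review", 1),
    ("fourth toy review", 1),
]

pairs = make_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even four labeled texts yield six training pairs, which is why contrastive fine-tuning can work with few labeled examples.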

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-small-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

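Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)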

Model Labels

Label	Examples
0	'Have used Turbo Tax for years Never a problem I m pretty concerned now with the news that many of their users had their returns hacked by people who gained access to Turbo Tax and stole the information Not sure I will use it next year until I research how serious this is was ' 'Can t beat an Apple computer Like P KB best by test ' 'Works for Mac or Pc but not on widows '
1	'Would not install activation code not accepted Returned it ' 'Worth all four of the software programs which are included in this product ' 'The marketing information makes this software look like it should be fabulous lots of useful features that I would love to experiment with However the software just doesn t work I will keep using my very old JASC version of this software instead '

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("selina09/yt_setfit")
# Run inference
preds = model("Works great")
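
To classify several texts at once, the model can also be called with a list of strings. The helper below is a minimal sketch: `classify` and `run_on_hub_model` are hypothetical names, and the helper assumes the model returns one integer-like label (0 or 1) per input, in line with the two labels listed under Model Labels. The model is passed in as an argument, so the helper works with any callable that maps a list of strings to labels.

```python
def classify(texts, model):
    """Return a {text: label} mapping for a batch of input strings.

    `model` is any callable that maps a list of strings to a sequence of
    labels, for example a loaded SetFitModel.
    """
    texts = list(texts)
    preds = model(texts)
    return {text: int(label) for text, label in zip(texts, preds)}


def run_on_hub_model():
    """Classify two sample texts with the published model.

    Assumes `pip install setfit` and network access to the Hub.
    """
    from setfit import SetFitModel  # imported lazily; only needed here

    model = SetFitModel.from_pretrained("selina09/yt_setfit")
    for text, label in classify(["Works great", "Good as expected"], model).items():
        print(label, text)
```

Calling `run_on_hub_model()` downloads the model and prints one predicted label per text.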

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	1	34.9207	102

Label	Training Sample Count
0	123
1	41

Training Hyperparameters

batch_size: (32, 32)
num_epochs: (10, 10)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
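
The hyperparameters above could be passed to SetFit roughly as follows. This is a sketch, assuming the listed names map one-to-one onto keyword arguments of `setfit.TrainingArguments` in SetFit 1.0.3, the version listed under Framework Versions. The helper name `build_training_arguments` is hypothetical, and the original training script is not part of this card.

```python
# Plain-Python copy of the hyperparameters listed above. The loss and the
# distance metric are objects rather than literals, so they are set in the
# helper below.
HYPERPARAMETERS = {
    "batch_size": (32, 32),            # (embedding phase, classifier phase)
    "num_epochs": (10, 10),            # (embedding phase, classifier phase)
    "max_steps": -1,                   # -1 means no step limit
    "sampling_strategy": "oversampling",
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Build a TrainingArguments object from the values above.

    Assumes setfit and sentence-transformers are installed; the imports are
    lazy so the dictionary can be inspected without them.
    """
    from sentence_transformers.losses import (
        BatchHardTripletLossFunction,
        CosineSimilarityLoss,
    )
    from setfit import TrainingArguments

    return TrainingArguments(
        loss=CosineSimilarityLoss,
        distance_metric=BatchHardTripletLossFunction.cosine_distance,
        **HYPERPARAMETERS,
    )
```

The returned object would then be handed to a `setfit.Trainer` together with the model and a training dataset.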

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0019	1	0.2503	-
0.0942	50	0.2406	-
0.1883	100	0.2029	-
0.2825	150	0.2207	-
0.3766	200	0.1612	-
0.4708	250	0.0725	-
0.5650	300	0.0163	-
0.6591	350	0.0108	-
0.7533	400	0.0153	-
0.8475	450	0.0486	-
0.9416	500	0.0191	-
1.0358	550	0.0207	-
1.1299	600	0.0148	-
1.2241	650	0.0031	-
1.3183	700	0.001	-
1.4124	750	0.0287	-
1.5066	800	0.0146	-
1.6008	850	0.0147	-
1.6949	900	0.0165	-
1.7891	950	0.0008	-
1.8832	1000	0.0165	-
1.9774	1050	0.0007	-
2.0716	1100	0.0129	-
2.1657	1150	0.0143	-
2.2599	1200	0.0006	-
2.3540	1250	0.0008	-
2.4482	1300	0.0047	-
2.5424	1350	0.0005	-
2.6365	1400	0.0116	-
2.7307	1450	0.0093	-
2.8249	1500	0.0211	-
2.9190	1550	0.0076	-
3.0132	1600	0.0047	-
3.1073	1650	0.0005	-
3.2015	1700	0.0064	-
3.2957	1750	0.014	-
3.3898	1800	0.0479	-
3.4840	1850	0.0005	-
3.5782	1900	0.0045	-
3.6723	1950	0.0188	-
3.7665	2000	0.0004	-
3.8606	2050	0.0122	-
3.9548	2100	0.0004	-
4.0490	2150	0.008	-
4.1431	2200	0.0245	-
4.2373	2250	0.005	-
4.3315	2300	0.0244	-
4.4256	2350	0.0208	-
4.5198	2400	0.0237	-
4.6139	2450	0.0005	-
4.7081	2500	0.0004	-
4.8023	2550	0.02	-
4.8964	2600	0.0004	-
4.9906	2650	0.0067	-
5.0847	2700	0.0099	-
5.1789	2750	0.0138	-
5.2731	2800	0.0192	-
5.3672	2850	0.0217	-
5.4614	2900	0.0056	-
5.5556	2950	0.0003	-
5.6497	3000	0.0052	-
5.7439	3050	0.0123	-
5.8380	3100	0.0136	-
5.9322	3150	0.0221	-
6.0264	3200	0.0235	-
6.1205	3250	0.0144	-
6.2147	3300	0.0174	-
6.3089	3350	0.007	-
6.4030	3400	0.0044	-
6.4972	3450	0.0003	-
6.5913	3500	0.007	-
6.6855	3550	0.0004	-
6.7797	3600	0.0384	-
6.8738	3650	0.0055	-
6.9680	3700	0.0056	-
7.0621	3750	0.0118	-
7.1563	3800	0.0143	-
7.2505	3850	0.0289	-
7.3446	3900	0.0301	-
7.4388	3950	0.0119	-
7.5330	4000	0.012	-
7.6271	4050	0.0138	-
7.7213	4100	0.0148	-
7.8154	4150	0.0003	-
7.9096	4200	0.0268	-
8.0038	4250	0.0131	-
8.0979	4300	0.0237	-
8.1921	4350	0.0004	-
8.2863	4400	0.0211	-
8.3804	4450	0.0092	-
8.4746	4500	0.005	-
8.5687	4550	0.0056	-
8.6629	4600	0.0168	-
8.7571	4650	0.0045	-
8.8512	4700	0.0184	-
8.9454	4750	0.0049	-
9.0395	4800	0.0047	-
9.1337	4850	0.0099	-
9.2279	4900	0.0054	-
9.3220	4950	0.0185	-
9.4162	5000	0.005	-
9.5104	5050	0.0004	-
9.6045	5100	0.013	-
9.6987	5150	0.0002	-
9.7928	5200	0.0187	-
9.8870	5250	0.0003	-
9.9812	5300	0.0081	-
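
The Epoch column can be cross-checked against the training set size. The sketch below assumes that the contrastive sampler pairs every sample with every other sample and also with itself, and that `oversampling` repeats the smaller of the positive and negative pair sets until it matches the larger one. These assumptions are not stated in this card; they are used here because they reproduce the logged epoch fractions.

```python
from math import ceil, comb

count_label_0, count_label_1 = 123, 41  # training sample counts per label
batch_size = 32                         # embedding-phase batch size

# Same-label pairs, including each sample paired with itself (assumption).
positive_pairs = (
    comb(count_label_0, 2) + count_label_0
    + comb(count_label_1, 2) + count_label_1
)
# Different-label pairs: every label-0 sample with every label-1 sample.
negative_pairs = count_label_0 * count_label_1

# Oversampling (assumption): the smaller set is repeated to match the larger.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(positive_pairs, negative_pairs, steps_per_epoch)
for step in (1, 50, 5300):
    # Epoch fraction as it would appear in the Epoch column.
    print(step, round(step / steps_per_epoch, 4))
```

Under these assumptions one epoch is 531 steps, so steps 1, 50, and 5300 fall at epochs 0.0019, 0.0942, and 9.9812, matching the corresponding rows of the table.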

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 3.0.1
Transformers: 4.40.2
PyTorch: 2.4.0+cu121
Datasets: 2.21.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}