---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
base_model: sentence-transformers/paraphrase-mpnet-base-v2
metrics:
- accuracy
widget:
- text: What is the price of the organic honey?
- text: Variety of cookie boxes
- text: Is the Popcorn Box available in a pack of 50?
- text: What is the price range for the sugarfree chocolate heart sugarfree chocolate box pack of 5?
- text: Do you have the Off-White x Air Jordan 2 Retro Low SP Black Varsity Royal in size 10?
pipeline_tag: text-classification
inference: true
model-index:
- name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.8533333333333334
      name: Accuracy
---

# SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.

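
The contrastive step above learns from sentence pairs rather than single labeled sentences: two texts with the same label form a positive pair, two texts with different labels a negative pair. The following is a minimal sketch of that pairing idea, not SetFit's actual implementation; `build_pairs` and `count_pairs` are hypothetical helpers introduced only for illustration. It assumes that every unordered pair of training texts is used and that `sampling_strategy: oversampling` repeats the rarer pair type until both types are equally numerous. Neither assumption is stated in this card, although the resulting 2045 steps per epoch at batch size 16 agrees with the epoch/step ratio logged under Training Results (step 50 at epoch 0.0244).

```python
from itertools import combinations
from math import ceil, comb


def build_pairs(examples):
    """Split all unique unordered pairs into positives (same label) and negatives."""
    positives, negatives = [], []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = positives if label_a == label_b else negatives
        target.append((text_a, text_b))
    return positives, negatives


def count_pairs(label_counts):
    """Count positive and negative pairs from per-label sample counts alone."""
    total = sum(label_counts.values())
    positives = sum(comb(n, 2) for n in label_counts.values())
    negatives = comb(total, 2) - positives
    return positives, negatives


# Toy set drawn from the examples listed under Model Labels.
toy = [
    ("Do you have adidas Superstar shoes?", "product discoverability"),
    ("Do you have any bestseller teas available?", "product discoverability"),
    ("Is the Nike Dunk Low Premium Bacon available in size 7?", "product faq"),
]
toy_pos, toy_neg = build_pairs(toy)
print(len(toy_pos), len(toy_neg))  # 1 2

# Per-label sample counts as reported under Training Set Metrics.
label_counts = {
    "general faq": 24,
    "order tracking": 32,
    "product discoverability": 50,
    "product faq": 50,
    "product policy": 48,
}
n_pos, n_neg = count_pairs(label_counts)

# Assumed oversampling rule: the rarer pair type is repeated to match the other.
pairs_per_epoch = 2 * max(n_pos, n_neg)
batch_size = 16  # first element of batch_size in Training Hyperparameters
steps_per_epoch = ceil(pairs_per_epoch / batch_size)
print(n_pos, n_neg, pairs_per_epoch, steps_per_epoch)  # 4350 16356 32712 2045
```

Counting pairs this way shows why 204 labeled sentences are enough to drive thousands of optimization steps: the number of pairs grows quadratically with the number of examples.
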
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 5 classes

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label                   | Examples |
|:------------------------|:---------|
| product faq             | <ul><li>'Does the Meenakari jal jangla -Rani saree have meenakari?'</li><li>'Is the Nike Dunk Low Premium Bacon available in size 7?'</li><li>'What is the best way to recycle the packaging boxes for wholesale orders for wholesale orders?'</li></ul> |
| order tracking          | <ul><li>'I ordered the Cake Boards 7 days ago with order no 43210 how long will it take to deliver?'</li><li>'I want to deliver bags to Pune, how many days will it take to deliver?'</li><li>'I want to deliver packaging to Surat, how many days will it take to deliver?'</li></ul> |
| product policy          | <ul><li>'What is the procedure for returning a product that was part of a special promotion occasion?'</li><li>'Can I return an item if it was damaged during delivery preparation?'</li><li>'What is the procedure for returning a product that was part of a special occasion promotion?'</li></ul> |
| general faq             | <ul><li>'What are the key factors to consider when developing a personalized diet plan for weight loss?'</li><li>'What are some tips for maximizing the antioxidant content when brewing green tea?'</li><li>'Can you explain why Mashru silk is considered more comfortable to wear compared to pure silk sarees?'</li></ul> |
| product discoverability | <ul><li>'Can you show me sarees in bright colors suitable for weddings?'</li><li>'Do you have adidas Superstar shoes?'</li><li>'Do you have any bestseller teas available?'</li></ul> |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.8533   |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Shankhdhar/classifier_woog_firstbud")
# Run inference
preds = model("Variety of cookie boxes")
```

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 4   | 12.1961 | 28  |

| Label                   | Training Sample Count |
|:------------------------|:----------------------|
| general faq             | 24                    |
| order tracking          | 32                    |
| product discoverability | 50                    |
| product faq             | 50                    |
| product policy          | 48                    |

### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (2, 2)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0005 | 1    | 0.2265        | -               |
| 0.0244 | 50   | 0.1831        | -               |
| 0.0489 | 100  | 0.1876        | -               |
| 0.0733 | 150  | 0.1221        | -               |
| 0.0978 | 200  | 0.0228        | -               |
| 0.1222 | 250  | 0.0072        | -               |
| 0.1467 | 300  | 0.0282        | -               |
| 0.1711 | 350  | 0.0015        | -               |
| 0.1956 | 400  | 0.0005        | -               |
| 0.2200 | 450  | 0.0008        | -               |
| 0.2445 | 500  | 0.0004        | -               |
| 0.2689 | 550  | 0.0003        | -               |
| 0.2934 | 600  | 0.0003        | -               |
| 0.3178 | 650  | 0.0002        | -               |
| 0.3423 | 700  | 0.0002        | -               |
| 0.3667 | 750  | 0.0002        | -               |
| 0.3912 | 800  | 0.0003        | -               |
| 0.4156 | 850  | 0.0002        | -               |
| 0.4401 | 900  | 0.0002        | -               |
| 0.4645 | 950  | 0.0001        | -               |
| 0.4890 | 1000 | 0.0001        | -               |
| 0.5134 | 1050 | 0.0001        | -               |
| 0.5379 | 1100 | 0.0001        | -               |
| 0.5623 | 1150 | 0.0002        | -               |
| 0.5868 | 1200 | 0.0002        | -               |
| 0.6112 | 1250 | 0.0001        | -               |
| 0.6357 | 1300 | 0.0001        | -               |
| 0.6601 | 1350 | 0.0001        | -               |
| 0.6846 | 1400 | 0.0001        | -               |
| 0.7090 | 1450 | 0.0001        | -               |
| 0.7335 | 1500 | 0.0001        | -               |
| 0.7579 | 1550 | 0.0001        | -               |
| 0.7824 | 1600 | 0.0001        | -               |
| 0.8068 | 1650 | 0.0001        | -               |
| 0.8313 | 1700 | 0.0001        | -               |
| 0.8557 | 1750 | 0.0011        | -               |
| 0.8802 | 1800 | 0.0002        | -               |
| 0.9046 | 1850 | 0.0001        | -               |
| 0.9291 | 1900 | 0.0001        | -               |
| 0.9535 | 1950 | 0.0002        | -               |
| 0.9780 | 2000 | 0.0001        | -               |
| 1.0024 | 2050 | 0.0001        | -               |
| 1.0269 | 2100 | 0.0002        | -               |
| 1.0513 | 2150 | 0.0001        | -               |
| 1.0758 | 2200 | 0.0001        | -               |
| 1.1002 | 2250 | 0.0001        | -               |
| 1.1247 | 2300 | 0.0001        | -               |
| 1.1491 | 2350 | 0.0001        | -               |
| 1.1736 | 2400 | 0.0001        | -               |
| 1.1980 | 2450 | 0.0001        | -               |
| 1.2225 | 2500 | 0.0001        | -               |
| 1.2469 | 2550 | 0.0001        | -               |
| 1.2714 | 2600 | 0.0001        | -               |
| 1.2958 | 2650 | 0.0001        | -               |
| 1.3203 | 2700 | 0.0001        | -               |
| 1.3447 | 2750 | 0.0001        | -               |
| 1.3692 | 2800 | 0.0001        | -               |
| 1.3936 | 2850 | 0.0001        | -               |
| 1.4181 | 2900 | 0.0001        | -               |
| 1.4425 | 2950 | 0.0001        | -               |
| 1.4670 | 3000 | 0.0001        | -               |
| 1.4914 | 3050 | 0.0001        | -               |
| 1.5159 | 3100 | 0.0001        | -               |
| 1.5403 | 3150 | 0.0001        | -               |
| 1.5648 | 3200 | 0.0001        | -               |
| 1.5892 | 3250 | 0.0001        | -               |
| 1.6137 | 3300 | 0.0001        | -               |
| 1.6381 | 3350 | 0.0001        | -               |
| 1.6626 | 3400 | 0.0001        | -               |
| 1.6870 | 3450 | 0.0001        | -               |
| 1.7115 | 3500 | 0.0001        | -               |
| 1.7359 | 3550 | 0.0           | -               |
| 1.7604 | 3600 | 0.0001        | -               |
| 1.7848 | 3650 | 0.0001        | -               |
| 1.8093 | 3700 | 0.0001        | -               |
| 1.8337 | 3750 | 0.0           | -               |
| 1.8582 | 3800 | 0.0001        | -               |
| 1.8826 | 3850 | 0.0001        | -               |
| 1.9071 | 3900 | 0.0001        | -               |
| 1.9315 | 3950 | 0.0           | -               |
| 1.9560 | 4000 | 0.0           | -               |
| 1.9804 | 4050 | 0.0001        | -               |

### Framework Versions
- Python: 3.10.13
- SetFit: 1.0.3
- Sentence Transformers: 3.0.1
- Transformers: 4.39.0
- PyTorch: 2.2.2+cu121
- Datasets: 2.19.2
- Tokenizers: 0.15.2

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```