diff --git "a/README.md" "b/README.md" new file mode 100644--- /dev/null +++ "b/README.md" @@ -0,0 +1,271 @@ +--- +library_name: setfit +tags: +- setfit +- sentence-transformers +- text-classification +- generated_from_setfit_trainer +base_model: sentence-transformers/paraphrase-mpnet-base-v2 +metrics: +- accuracy +widget: +- text: I think it sounds pretty good, especially for a pic disc! Sounds on par with + rose pink Cadillac and probably better than smooth big cat. My one issue is I + have a few skips in the first song....but I'm using my backup scratch needle right + now so I'm not sure if it's actually the record The sea glass looks super cool + too, cheers! +- text: Nice. Were the chrome strips on the power assist steps wrapped or painted? + Thinking of dechroming mine and thinking the vinyl will get scuffed off pretty + quickly. +- text: Oh and consider yourself blessed you got meteorite, I have sonic and swirl + marks and scratches are so easily seen, with grey it hides much better +- text: https://preview.redd.it/by2gzb77m2wa1.jpeg?width=1284&format=pjpg&auto=webp&s=6d38c244f6a82b6af4b4eebe91c59f60536f289e + Under the light the paint looks terrible but outside of that, the car is sooo + clean. Wish I could add more than one pic. The interior and everything mechanical + is just amazingly clean. +- text: Not true. Once oxidation has begun there’s no stopping it you can minimize + the oxidation of the affected area by coating it but you can’t stop it +pipeline_tag: text-classification +inference: true +--- + +# SetFit with sentence-transformers/paraphrase-mpnet-base-v2 + +This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification. + +The model has been trained using an efficient few-shot learning technique that involves: + +1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning. +2. Training a classification head with features from the fine-tuned Sentence Transformer. 
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 36 classes

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
The 36 class ids are listed below; no example texts were logged for this card.

| Label | Examples |
|:------|:---------|
| 19    |          |
| 13    |          |
| 18    |          |
| 20    |          |
| 15    |          |
| 22    |          |
| 2     |          |
| 21    |          |
| 8     |          |
| 0     |          |
| 27    |          |
| 9     |          |
| 3     |          |
| 12    |          |
| 26    |          |
| 7     |          |
| 14    |          |
| 5     |          |
| 16    |          |
| 24    |          |
| 1     |          |
| 11    |          |
| 6     |          |
| 4     |          |
| 28    |          |
| 32    |          |
| 25    |          |
| 33    |          |
| 35    |          |
| 30    |          |
| 17    |          |
| 23    |          |
| 10    |          |
| 29    |          |
| 34    |          |
| 31    |          |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference:
```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("bhaskars113/toyota-paint-attribute-1.2")
# Run inference
preds = model("Oh and consider yourself blessed you got meteorite, I have sonic and swirl marks and scratches are so easily seen, with grey it hides much better")
# Class probabilities over the 36 labels are also available
# (the input below is a placeholder, not from this model's data)
probs = model.predict_proba(["The clear coat started peeling after two summers."])
```

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 2   | 46.2451 | 924 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 16                    |
| 1     | 16                    |
| 2     | 16                    |
| 3     | 16                    |
| 4     | 16                    |
| 5     | 16                    |
| 6     | 16                    |
| 7     | 16                    |
| 8     | 16                    |
| 9     | 16                    |
| 10    | 4                     |
| 11    | 11                    |
| 12    | 20                    |
| 13    | 13                    |
| 14    | 16                    |
| 15    | 2                     |
| 16    | 20                    |
| 17    | 2                     |
| 18    | 8                     |
| 19    | 5                     |
| 20    | 14                    |
| 21    | 15                    |
| 22    | 3                     |
| 23    | 5                     |
| 24    | 18                    |
| 25    | 3                     |
| 26    | 13                    |
| 27    | 7                     |
| 28    | 1                     |
| 29    | 1                     |
| 30    | 4                     |
| 31    | 1                     |
| 32    | 1                     |
| 33    | 2                     |
| 34    | 2                     |
| 35    | 4                     |

### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0011 | 1    | 0.1689        | -               |
| 0.0563 | 50   | 0.2155        | -               |
| 0.1126 | 100  | 0.139         | -               |
| 0.1689 | 150  | 0.0656        | -               |
| 0.2252 | 200  | 0.0359        | -               |
| 0.2815 | 250  | 0.0462        | -               |
| 0.3378 | 300  | 0.0182        | -               |
| 0.3941 | 350  | 0.0235        | -               |
| 0.4505 | 400  | 0.0401        | -               |
| 0.5068 | 450  | 0.042         | -               |
| 0.5631 | 500  | 0.0461        | -               |
| 0.6194 | 550  | 0.0034        | -               |
| 0.6757 | 600  | 0.0181        | -               |
| 0.7320 | 650  | 0.0094        | -               |
| 0.7883 | 700  | 0.0584        | -               |
| 0.8446 | 750  | 0.0175        | -               |
| 0.9009 | 800  | 0.0036        | -               |
| 0.9572 | 850  | 0.0274        | -               |

### Framework Versions
- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.7.0
- Transformers: 4.40.2
- PyTorch: 2.2.1+cu121
- Datasets: 2.19.1
- Tokenizers: 0.19.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```
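As a closing note: if you want to sanity-check this model on your own labelled data, the same trainer API can evaluate it. This is a hedged sketch, not part of the original card; the one-example dataset is a placeholder, and no evaluation results are reported above (`accuracy` is simply the metric declared in the card metadata).

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer

# Placeholder labelled data; replace with your own held-out examples.
eval_dataset = Dataset.from_dict({
    "text": ["paint chips far too easily on the front bumper"],
    "label": [0],
})

model = SetFitModel.from_pretrained("bhaskars113/toyota-paint-attribute-1.2")
trainer = Trainer(model=model, eval_dataset=eval_dataset)
print(trainer.evaluate())  # defaults to accuracy, the metric listed in the metadata
```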