Netta1994 committed
Commit 96f2176 · verified · 1 Parent(s): 243725d

Add SetFit model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
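This pooling config enables only `pooling_mode_cls_token`: the sentence embedding is the hidden state of the first (`[CLS]`) token, rather than a mean or max over all tokens. A minimal, library-free sketch of that operation (the tiny 4-dimensional vectors are stand-ins for the real 768-dimensional hidden states):

```python
def cls_pooling(token_embeddings):
    # CLS pooling: the sentence embedding is the first token's hidden state.
    return token_embeddings[0]

# Stand-in encoder output: 3 tokens, 4-dim states (real dim is 768).
token_embeddings = [
    [0.1, 0.2, 0.3, 0.4],  # [CLS]
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]
sentence_embedding = cls_pooling(token_embeddings)
assert sentence_embedding == [0.1, 0.2, 0.3, 0.4]
```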
README.md ADDED
@@ -0,0 +1,291 @@
---
base_model: BAAI/bge-base-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 'Reasoning:

    The answer provides a solid overview of identifying a funnel spider, including
    its dark brown or black body, shiny carapace, and large fangs. These points align
    well with the details in the provided document. However, while the answer includes
    the key features described in the document, it misses a few additional characteristics
    such as the spinnerets, size variations, and geographical habitat that are valuable
    in identifying funnel spiders more comprehensively. Nonetheless, the answer remains
    relevant and concise based on the essential points covered.

    Evaluation: Good'
- text: 'Reasoning:

    The answer provides a comprehensive and accurate description of how to write a
    paper in MLA format. It mentions key points such as setting up 1-inch margins,
    using 12-point font, and double-spacing the text, which are directly aligned with
    the instructions in the provided document. It also addresses creating a running
    header, typing the heading information in the upper left corner, and centering
    the paper''s title, all of which are specified in the document. The answer is
    well-supported by the document, relevant to the question asked, and presented
    concisely.


    Final Evaluation: Good

    Evaluation: Good'
- text: 'Reasoning:

    The provided answer offers relevant and practical advice for getting into medical
    school, including focusing on core science subjects, gaining clinical experience,
    and preparing for the MCAT. These points align well with the suggestions in the
    document. However, the answer overlooks several important details such as engaging
    in extracurricular activities, seeking leadership opportunities, and preparing
    a comprehensive application, which the document emphasizes. Including these aspects
    would have made the response more comprehensive and better aligned with the provided
    document.


    Evaluation: Good'
- text: 'Reasoning:

    The provided answer offers several strategic tips for becoming adept at hide and
    seek. The suggestions include creative hiding strategies like staying in the room
    where the seeker started, using camouflage, and hiding in plain sight, which align
    well with the document''s advice. The document recommends looking for long edges
    to hide behind, using dense curtains, hiding in laundry baskets, and looking for
    multi-colored areas, all of which are echoed in the answer. The advice given is
    concise, relevant, and matches the tips and guidelines from the document.


    Final Evaluation: good

    Evaluation: Good


    '
- text: 'Reasoning:

    1. **Context Grounding**: The answer refers to making a saline solution for treating
    a baby''s cough, using a method that significantly deviates from the details provided
    in the document. The document provides specific instructions for a saline solution
    and its administration, but the quantities and steps in the answer do not align
    with the document''s instructions. Additionally, the document does not suggest
    using 2 cups of water and the method of inserting the suction bulb into the baby''s
    nostril about an inch is incorrect and potentially harmful, deviating significantly
    from the recommended depth.

    2. **Relevance**: While the answer aims to address how to treat a baby''s cough,
    it includes incorrect measurements and method details, which makes it less reliable
    and potentially unsafe.

    3. **Conciseness**: The answer is fairly concise but includes inaccuracies that
    make it untrustworthy.


    Due to these points, the answer does not meet the necessary criteria of being
    well-grounded in the document, relevant, or correctly concise.


    Final Evaluation: Bad'
inference: true
model-index:
- name: SetFit with BAAI/bge-base-en-v1.5
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.8666666666666667
      name: Accuracy
---

# SetFit with BAAI/bge-base-en-v1.5

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.

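The contrastive fine-tuning in step 1 trains on *pairs* of labeled texts rather than single examples: two texts from the same class form a positive pair, two from different classes a negative pair, which is how SetFit extracts many training signals from few labeled examples. A minimal, library-free sketch of that pair generation (the texts and labels below are made-up stand-ins, not the training data):

```python
from itertools import combinations

def contrastive_pairs(texts, labels):
    # Label a pair 1.0 if both texts share a class, else 0.0.
    pairs = []
    for (t1, y1), (t2, y2) in combinations(zip(texts, labels), 2):
        pairs.append((t1, t2, 1.0 if y1 == y2 else 0.0))
    return pairs

texts = ["answer is well grounded", "answer matches the document",
         "answer contradicts the document", "answer is off-topic"]
labels = [1, 1, 0, 0]
pairs = contrastive_pairs(texts, labels)
assert len(pairs) == 6                  # C(4, 2) candidate pairs
assert sum(p[2] for p in pairs) == 2.0  # two positive pairs
```

Four labeled examples already yield six contrastive pairs, which is the core of SetFit's sample efficiency.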
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples |
|:------|:---------|
| 0 | <ul><li>"Reasoning:\nThe answer provided closely aligns with the specific instructions given in the document about petting a bearded dragon. It correctly mentions using 1 or 2 fingers to gently stroke the dragon's head, lowering your hand slowly to avoid startling it, and washing hands before and after petting to reduce the risk of bacteria transfer. However, the part about using a specific perfume or scent to help the dragon recognize you is not supported by the text and is, in fact, incorrect.\n\nFinal Evaluation: Bad\nResult: Bad\n"</li><li>'Reasoning:\nThe answer provided addresses the physical characteristics of a funnel spider but includes several inaccuracies and deviations from the information in the provided document. Key errors include describing the funnel spider as light brown or gray with a soft, dull carapace, which contradicts the document’s description of a dark brown or black body and a hard, shiny carapace. Additionally, the claim that funnel spiders have 3 non-poisonous fangs pointing sideways is incorrect based on the document, which states that the funnel spider has two large, downward-pointing fangs that are poisonous. The document provides clear and detailed descriptions that should form the basis for an accurate answer.\n\nFinal Evaluation: Bad'</li><li>'Reasoning:\nThe answer provided, "Luis Figo left Barcelona to join Real Madrid," while factually correct according to the provided document, is entirely unrelated to the question "How to Calculate Real Estate Commissions." The document and the answer focus on a historical event in soccer rather than providing any information or calculations related to real estate commissions. \n\nFinal Evaluation: Bad'</li></ul> |
| 1 | <ul><li>'Reasoning:\nThe answer is well-supported by the document and directly relates to the question of how to hold a note while singing. It addresses key aspects such as breathing techniques, posture, and controlled release of air, all of which are mentioned in the provided document. The answer stays concise and clear, without deviating into unrelated topics, effectively summarizing the necessary steps for holding a note.\n\nFinal result: Good'</li><li>'Reasoning:\nThe answer is well-founded in the provided document and directly relates to the question of how to stop feeling empty. It suggests practical actions like keeping a journal, trying new activities, and making new friends, all of which are discussed in the document. The recommendations in the answer are summarized clearly and are appropriate responses to the question without providing extraneous information.\n\nFinal Evaluation:\nGood'</li><li>'Reasoning:\nThe answer aligns well with the instructions provided in the document and effectively addresses the question of how to dry curly hair. It begins by recommending gently squeezing out excess water, followed by the application of a leave-in conditioner and the use of a wide-tooth comb for detangling, which are all steps mentioned in the document. The answer then advises adding styling products and parting the hair to lift the roots, which helps expedite the air-drying process. The key points from the document are reflected in the answer, ensuring it is contextually grounded and relevant.\n\nEvaluation: Good'</li></ul> |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.8667   |

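The reported metric is plain classification accuracy: the fraction of test predictions that exactly match the gold labels (the stored value 0.8666… corresponds to a fraction such as 13 of 15 correct). A minimal sketch with hypothetical predictions, not the model's actual test split:

```python
def accuracy(predictions, references):
    # Fraction of predictions that exactly match the gold labels.
    assert len(predictions) == len(references)
    return sum(p == r for p, r in zip(predictions, references)) / len(references)

# Hypothetical labels for illustration only.
assert accuracy([1, 1, 0, 1, 0], [1, 1, 0, 0, 0]) == 0.8
```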
## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wikisum_gpt-4o_improved-cot-instructions_chat_few_shot_generated_only_rea")
# Run inference
preds = model("""Reasoning:
The answer provides a solid overview of identifying a funnel spider, including its dark brown or black body, shiny carapace, and large fangs. These points align well with the details in the provided document. However, while the answer includes the key features described in the document, it misses a few additional characteristics such as the spinnerets, size variations, and geographical habitat that are valuable in identifying funnel spiders more comprehensively. Nonetheless, the answer remains relevant and concise based on the essential points covered.
Evaluation: Good""")
```

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 58  | 93.6479 | 177 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 34                    |
| 1     | 37                    |

### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (5, 5)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

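The `CosineSimilarityLoss` named above trains the embedding body so that the cosine similarity of a pair's two embeddings approaches the pair's label (1.0 for a same-class pair, 0.0 for a different-class pair), via a squared error. A library-free sketch of that objective (the exact batching and reduction used by sentence-transformers may differ):

```python
import math

def cosine_similarity(u, v):
    # Dot product of the two vectors divided by the product of their norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_pair_loss(u, v, pair_label):
    # Squared error between the pair's cosine similarity and its label.
    return (cosine_similarity(u, v) - pair_label) ** 2

# Identical embeddings in a positive pair incur zero loss...
assert cosine_similarity_pair_loss([1.0, 0.0], [1.0, 0.0], 1.0) == 0.0
# ...while orthogonal embeddings in a positive pair incur loss 1.
assert cosine_similarity_pair_loss([1.0, 0.0], [0.0, 1.0], 1.0) == 1.0
```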
### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0056 | 1    | 0.2213        | -               |
| 0.2809 | 50   | 0.2497        | -               |
| 0.5618 | 100  | 0.1439        | -               |
| 0.8427 | 150  | 0.0107        | -               |
| 1.1236 | 200  | 0.003         | -               |
| 1.4045 | 250  | 0.0022        | -               |
| 1.6854 | 300  | 0.002         | -               |
| 1.9663 | 350  | 0.0018        | -               |
| 2.2472 | 400  | 0.0017        | -               |
| 2.5281 | 450  | 0.0015        | -               |
| 2.8090 | 500  | 0.0015        | -               |
| 3.0899 | 550  | 0.0014        | -               |
| 3.3708 | 600  | 0.0014        | -               |
| 3.6517 | 650  | 0.0012        | -               |
| 3.9326 | 700  | 0.0013        | -               |
| 4.2135 | 750  | 0.0012        | -               |
| 4.4944 | 800  | 0.0013        | -               |
| 4.7753 | 850  | 0.0012        | -               |

### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.0
- Transformers: 4.44.0
- PyTorch: 2.4.1+cu121
- Datasets: 2.19.2
- Tokenizers: 0.19.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,32 @@
{
  "_name_or_path": "BAAI/bge-base-en-v1.5",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.44.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.1.0",
    "transformers": "4.44.0",
    "pytorch": "2.4.1+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
config_setfit.json ADDED
@@ -0,0 +1,4 @@
{
  "labels": null,
  "normalize_embeddings": false
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ec36f1eb276381073abc50769ce1e93527a7091881f91823529b80b8d1859e95
size 437951328
model_head.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bd3129265930c3a3d60a1a0ee774ec4d411c45201a554c39a600a7fc3a8ae160
size 7007
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
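`modules.json` chains three modules: the Transformer encoder, the pooling layer configured in `1_Pooling`, and a final `Normalize` step that rescales each sentence embedding to unit L2 norm, so that dot products between embeddings equal cosine similarities. A minimal sketch of that last module:

```python
import math

def l2_normalize(embedding):
    # Scale the vector so its L2 norm is 1.
    norm = math.sqrt(sum(x * x for x in embedding))
    return [x / norm for x in embedding]

v = l2_normalize([3.0, 4.0])          # norm of [3, 4] is 5
assert abs(math.sqrt(sum(x * x for x in v)) - 1.0) < 1e-12
```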
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": true
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff