
SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
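Step 1 relies on sentence pairs built from the few labelled examples. The following is a minimal sketch of that idea, assuming every pair of examples is used and that same-label pairs get a target similarity of 1.0 and different-label pairs 0.0. The function name and toy data are hypothetical, and SetFit's own sampler (controlled by `sampling_strategy` and `num_iterations`) selects pairs differently.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labelled examples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Illustrative only: this enumerates all pairs rather than sampling them.
    """
    pairs = []
    for (i, text_a), (j, text_b) in combinations(enumerate(texts), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy data: four examples, two per label.
texts = ["example a", "example b", "example c", "example d"]
labels = [1, 0, 0, 1]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
```

Even four labelled examples yield six training pairs here, which is why contrastive fine-tuning can work with very little labelled data.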

Model Details

Model Description

Model Sources

Model Labels

Label Examples
1
  • 'Reasoning:\n- **Good Aspects:**\n 1. Context Grounding: The answer addresses the specific context provided in the document, mentioning sitemap-related issues and resolutions.\n 2. Relevance: The answer directly addresses the problem of not discovering pages via sitemaps, following the steps mentioned in the document.\n 3. Conciseness: The instructions given are straightforward and to the point.\n\n- **Bad Aspects:**\n 1. Context Grounding: The use of is consistent with placeholders in the document but it maybe found more natural to use a real identifier.\n 2. Handling Details: It might miss elaborating on a particular detail that can be crucial, such as how to specifically use the inspection tool. \n\nFinal Result:\n- Good'
  • 'Reasoning:\n\n### Good Points:\n1. Context Grounding: The answer is grounded in the provided document, as it follows the steps described in the text for enabling clients to book multiple participants.\n2. Relevance: The steps provided are directly relevant to the question, outlining the specific process within Bookings.\n\n### Bad Points:\n1. Conciseness: There is an extraneous, seemingly miswritten phrase "John Youngimum" which should be "maximum." This introduces unnecessary confusion.\n2. Detailing: While the steps are mostly accurate, they missed clear labelling for each scroll and edit actions, causing potential confusion. The final step should also emphasize that the overall save action confirms the entire setup.\n3. Errors and Clarity: It doesn't maintain precise terminology, for instance, "John Youngimum number of participants." The correct term should be "maximum number of participants."\n\nFinal verdict:\nBased on the outlined reasoning, due to the critical mistake and slight inconsistencies, the evaluation is:\n \nBad'
  • 'Reasoning why the answer may be good:\n1. Context Grounding: The answer attempts to address an issue related to booking services, which could be inferred to be supported by a company offering a booking feature, as mentioned in the document.\n2. Relevance: The answer is somewhat relevant as it talks about a booking error, possibly correlating with the "Online Booking" feature from the document.\n3. Conciseness: The answer is concise and does not contain unnecessary information.\n4. Instructions: The answer advises that the issue has been resolved, which is a form of instruction or information on what to expect.\n\nReasoning why the answer may be bad:\n1. Context Grounding: The provided document does not contain any specific details about an error related to changing the location for booking services or its resolution. Thus, the answer lacks proper grounding in the document.\n2. Relevance: While it mentions booking, the document does not reference details related to the error or its resolution, making the answer's relevance questionable.\n3. Conciseness: Though the answer is concise, the lack of detailed instructions or steps to follow if the error persists is critical.\n4. Instructions: The answer does not give detailed actionable steps on what to do if the error still occurs, like refreshing the page or contacting support.\n\nFinal result: Bad'
0
  • "### Reasoning:\n\n#### Why the answer may be good:\n1. Context Grounding: The answer aligns with the context provided in the document, which clearly states that transferring the booking application from one site to another is not possible.\n2. Relevance: It directly addresses the specific question asked—whether the booking app can be updated on the site.\n3. Conciseness: The answer is brief and directly addresses the question.\n4. Correct Information: The instructions for voting on the feature are clearly detailed and reflect the information given in the document.\n\n#### Why the answer may be bad:\n1. Misinterpretation: The answer might be slightly misdirected if the user's intent was about updating the app in the sense of its features or versions rather than just transferring it between sites.\n2. Lack of Detail: It lacks information on what updating entails; it only covers site transferring limitations mentioned in the document.\n\n### Final Result:\n- Bad: The answer misinterprets the context as primarily transferring the app between sites and does not address potential updates within the app on the same site."
  • 'Reasoning for the answer being good:\n1. Context Grounding: The answer is well-supported by the provided document. It follows the steps outlined in the section "Adding and setting up an additional service list" as well as steps for "Setting up a page with services for site members only."\n2. Relevance: The answer directly addresses the question "What should I do to add a service?" by providing explicit instructions related to adding and displaying services.\n3. Conciseness: The answer is clear and to the point, although it might be detailed, it avoids unnecessary information.\n4. Correct and detailed instructions: The answer provides a step-by-step detailed guide on how to add a service list and how to make it visible either to all users or just to site members.\n\nReasoning for the answer being bad:\n1. Context Grounding: Although the instructions provided are based on the document, the reference to "setting up a page with services for site members only" may be considered slightly out of scope since the question did not specify this requirement.\n2. Relevance: The answer might slightly stray by including membership-specific instructions which were not explicitly asked.\n3. Conciseness: While comprehensive, the inclusion of additional steps related to member-only pages may make the answer longer than necessary for a general query about adding services.\n4. Correct and detailed instructions: The instructions are generally correct and detailed but could be streamlined to focus solely on the core question asked.\n\nFinal result: Good'
  • 'Reasoning:\n\nGood Aspects:\n1. Context Grounding: The answer largely draws from the provided document and relates directly to the technical process described.\n2. Relevance: It focuses on the exact procedure necessary to display blog categories on the blog feed.\n3. Conciseness: The steps provided are relatively brief and follow the logical sequence in the document.\n4. Correct and Detailed Instructions: It lists out tasks such as creating datasets, connecting to blog categories, and setting up filters, which align with the document's guidance.\n\nBad Aspects:\n1. Detail: The steps are vague; specifically, "95593638" is repeatedly used in place of what should be an action (e.g., "create," "add"), rendering the instructions confusing and incomplete.\n2. Accuracy: The inserted number sequence (95593638) disrupts the clarity and comprehension, making it unclear how to proceed with each step.\n3. Completeness: There is missing information on how to carry out tasks, such as clicking specific options and connecting fields, making it challenging to follow through without referring back to the document.\n\nFinal Result: \n\nBad'

Evaluation

Metrics

Label Accuracy
all 0.4375
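
Accuracy is the fraction of evaluation examples whose predicted label matches the true label. The sketch below shows the computation on hypothetical data. The card does not state the evaluation set size, so the 16 examples with 7 correct predictions are an assumption chosen only because 7/16 reproduces the reported value.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: 16 examples, the first 7 predicted correctly,
# the remaining 9 predicted as the opposite label.
y_true = [0, 1] * 8
y_pred = y_true[:7] + [1 - t for t in y_true[7:]]

score = accuracy(y_true, y_pred)
```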

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wix_qa_gpt-4o_improved-cot-instructions_two_reasoning_only_reasoning_1726")
# Run inference
# A triple-quoted string is needed because the input spans multiple lines
preds = model("""Reasoning for Good:
1. **Context Grounding**: The answer is well-supported by the provided document, accurately reflecting the steps outlined.
2. **Relevance**: The answer directly addresses the specific question posed about changing the reservation reference from the service page to the booking calendar.
3. **Conciseness**: The answer is concise and clear, providing straightforward steps without unnecessary information.
4. **Correct and Detailed Instructions**: It provides precise, step-by-step instructions that align correctly with the provided document.

Reasoning for Bad:
- There are no significant deviations from the document or extraneous information.
- There are no contradictions or errors in the steps mentioned.

Final Result:
Good""")

Training Details

Training Set Metrics

Training set   Min   Median     Max
Word count     91    151.7556   233

Label   Training Sample Count
0       22
1       23

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (5, 5)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0088 1 0.1829 -
0.4425 50 0.2598 -
0.8850 100 0.1764 -
1.3274 150 0.0079 -
1.7699 200 0.0026 -
2.2124 250 0.0021 -
2.6549 300 0.0019 -
3.0973 350 0.0016 -
3.5398 400 0.0015 -
3.9823 450 0.0016 -
4.4248 500 0.0015 -
4.8673 550 0.0015 -
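
The Epoch column can be cross-checked against the hyperparameters and training set size listed above. The sketch below assumes that `num_iterations` positive pairs and `num_iterations` negative pairs are generated per training example, and that the final partial batch counts as a step. Under those assumptions the arithmetic reproduces the logged epoch values.

```python
import math

n_samples = 22 + 23   # training sample counts for labels 0 and 1
num_iterations = 20
batch_size = 16
num_epochs = 5

# Assumption: one positive and one negative pair per example per iteration.
n_pairs = 2 * num_iterations * n_samples
steps_per_epoch = math.ceil(n_pairs / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step):
    """Fractional epoch reached after `step` optimisation steps."""
    return round(step / steps_per_epoch, 4)


logged_steps = [1, 50, 100, 550]
epochs = [epoch_at(s) for s in logged_steps]
```

With 1800 pairs and a batch size of 16, an epoch is 113 steps, so the last logged step (550) falls shortly before the end of the fifth epoch.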

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.0
  • Transformers: 4.44.0
  • PyTorch: 2.4.1+cu121
  • Datasets: 2.19.2
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 109M params (Safetensors, tensor type F32)