Dragneel
/

ticket-classification-v1

Safetensors

distilbert

Dragneel committed 9 days ago

Commit

ae026c2

verified ·

1 Parent(s): dcb14a5

Create README.md

Files changed (1)

README.md +165 -0

README.md ADDED

	@@ -0,0 +1,165 @@

+---
+language:
+- en
+license: apache-2.0
+tags:
+- text-classification
+- customer-support
+- ticket-classification
+- distilbert
+datasets:
+- custom
+metrics:
+- accuracy
+model-index:
+- name: ticket-classification-v1
+  results:
+  - task:
+      type: text-classification
+      name: Text Classification
+    dataset:
+      name: Custom Ticket Dataset
+      type: custom
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.9485
+---
+# Model Card for Dragneel/ticket-classification-v1
+This model is a fine-tuned version of DistilBERT base uncased that classifies customer support tickets into four categories. It achieves **94.85% accuracy** on the evaluation dataset.
+## Model Details
+### Model Description
+The model assigns each customer support ticket to one of the following categories based on its content:
+- **Billing Question**: Issues related to billing, payments, subscriptions, etc.
+- **Feature Request**: Suggestions for new features or improvements
+- **General Inquiry**: General questions about products or services
+- **Technical Issue**: Technical problems, bugs, errors, etc.
+The model uses DistilBERT as its base architecture. DistilBERT is a distilled version of BERT that is smaller and faster while retaining most of BERT's accuracy.
+- **Developed by:** Dragneel
+- **Model type:** Text Classification
+- **Language(s):** English
+- **License:** Apache 2.0
+- **Finetuned from model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased)
+## Uses
+### Direct Use
+This model can be directly used for:
+- Automated ticket routing and prioritization
+- Customer support workflow optimization
+- Analytics on ticket categories
+- Real-time ticket classification
+### Out-of-Scope Use
+This model should not be used for:
+- Processing sensitive customer information without proper privacy measures
+- Making final decisions without human review for complex or critical issues
+- Classifying tickets in languages other than English
+- Categorizing content outside the customer support domain
+## Bias, Risks, and Limitations
+- The model was trained on a specific dataset and may not generalize well to significantly different customer support contexts
+- Performance may degrade for very technical or domain-specific tickets not represented in the training data
+- Very short or ambiguous tickets might be misclassified
+### Recommendations
+Users should review classifications for accuracy, especially for tickets that fall on the boundary between categories. Consider retraining the model on domain-specific data if you use it in a specialized industry.
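One way to act on this recommendation is to send low-confidence predictions to a human reviewer. The sketch below assumes predictions in the `{'label', 'score'}` format returned by the `text-classification` pipeline. The `needs_review` helper and the 0.80 threshold are illustrative assumptions, not values taken from the model's evaluation.

```python
def needs_review(prediction, threshold=0.80):
    """Return True when a prediction is too uncertain to route automatically.

    `threshold` is a hypothetical cutoff; tune it on your own tickets.
    """
    return prediction["score"] < threshold


# Hand-written predictions in the pipeline's output format (not real model output).
predictions = [
    {"label": "Billing Question", "score": 0.97},
    {"label": "General Inquiry", "score": 0.55},
]

for prediction in predictions:
    action = "human review" if needs_review(prediction) else "auto-route"
    print(f"{prediction['label']}: {action}")
```

A suitable threshold depends on how costly a misrouted ticket is compared with a manual review, so it is best chosen from a labeled sample of your own data.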
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+from transformers import pipeline
+# Load the model
+classifier = pipeline("text-classification", model="Dragneel/ticket-classification-v1")
+# Example tickets
+tickets = [
+    "I was charged twice for my subscription this month. Can you help?",
+    "The app keeps crashing whenever I try to upload a file",
+    "Would it be possible to add dark mode to the dashboard?",
+    "What are your business hours?"
+]
+# Classify tickets
+for ticket in tickets:
+    result = classifier(ticket)
+    print(f"Ticket: {ticket}")
+    print(f"Category: {result[0]['label']}")
+    print(f"Confidence: {result[0]['score']:.4f}")
+    print()
+```
+### ID to Label Mapping
+```python
+id_to_label = {
+    0: 'Billing Question',
+    1: 'Feature Request',
+    2: 'General Inquiry',
+    3: 'Technical Issue'
+}
+```
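Depending on how the model's configuration is set up, the pipeline may return generic names such as `LABEL_2` instead of category names. That behavior is an assumption here, not something this card confirms. A minimal sketch of a hypothetical helper, `readable_label`, that handles both cases using the mapping above:

```python
ID_TO_LABEL = {
    0: "Billing Question",
    1: "Feature Request",
    2: "General Inquiry",
    3: "Technical Issue",
}


def readable_label(raw_label):
    """Translate 'LABEL_<id>' into a category name; pass other labels through."""
    prefix = "LABEL_"
    suffix = raw_label[len(prefix):]
    if raw_label.startswith(prefix) and suffix.isdigit():
        # Unknown ids fall back to the raw label instead of raising an error.
        return ID_TO_LABEL.get(int(suffix), raw_label)
    return raw_label


print(readable_label("LABEL_2"))          # General Inquiry
print(readable_label("Technical Issue"))  # Technical Issue
```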
+## Training Details
+### Training Data
+The model was trained on a dataset of customer support tickets that includes diverse examples from all four categories. Each ticket typically contains a title and a description of the customer's issue or request.
+### Training Procedure
+#### Training Hyperparameters
+- **Learning rate:** 0.001
+- **Batch size:** 2
+- **Epochs:** 10 (with early stopping)
+- **Weight decay:** 0.01
+- **Early stopping patience:** 2 epochs
+- **Optimizer:** AdamW
+- **Training regime:** fp32
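The card does not say which metric early stopping monitored. As an illustration only, the sketch below assumes evaluation loss is checked once per epoch and that training stops after 2 consecutive epochs without improvement. The `stopping_epoch` function is hypothetical and the loss values are made up.

```python
def stopping_epoch(eval_losses, patience=2, max_epochs=10):
    """Return the 1-based epoch at which training would stop."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(eval_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # No early stop: training ran through every available epoch.
    return min(len(eval_losses), max_epochs)


# Made-up losses: the best value is at epoch 3, so training stops after epoch 5.
print(stopping_epoch([0.60, 0.35, 0.25, 0.27, 0.26, 0.24]))  # 5
```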
+## Evaluation
+### Testing Data, Factors & Metrics
+#### Metrics
+The model is evaluated using the following metrics:
+- Accuracy: Percentage of correctly classified tickets
+- Loss: Cross-entropy loss on the evaluation dataset
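For reference, both metrics can be computed from class probabilities and true class ids in a few lines. This is a sketch on toy data; the function names and numbers are illustrative and are not the evaluation code used for this model.

```python
import math


def accuracy(true_ids, predicted_ids):
    """Fraction of tickets whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(true_ids, predicted_ids))
    return correct / len(true_ids)


def mean_cross_entropy(true_ids, probabilities):
    """Average negative log-probability assigned to the true class."""
    losses = [-math.log(probs[t]) for t, probs in zip(true_ids, probabilities)]
    return sum(losses) / len(losses)


# Toy data: 3 tickets, 4 classes; each row of probabilities sums to 1.
true_ids = [0, 3, 2]
probabilities = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.10, 0.10, 0.70],
    [0.20, 0.50, 0.20, 0.10],
]
# The predicted class is the one with the highest probability.
predicted_ids = [row.index(max(row)) for row in probabilities]

print(f"Accuracy: {accuracy(true_ids, predicted_ids):.4f}")
print(f"Loss: {mean_cross_entropy(true_ids, probabilities):.4f}")
```

Note that the loss penalizes confident mistakes more heavily than accuracy does, which is why the two metrics can move in different directions between epochs.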
+### Results
+The model achieved the following metrics on the evaluation dataset:
+| Metric | Value |
+|--------|-------|
+| Accuracy | 94.85% |
+| Loss | 0.248 |
+| Evaluation runtime | 16.01 s |
+| Evaluation samples/second | 23.05 |
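The runtime and throughput rows imply the approximate size of the evaluation set, which the card does not state directly. The check below is an inference from the table, assuming both figures describe the same evaluation run:

```python
runtime_seconds = 16.01
samples_per_second = 23.05

# Estimated evaluation set size (inferred from the table, not stated in the card).
estimated_samples = round(runtime_seconds * samples_per_second)
print(estimated_samples)  # 369

# 350 correct predictions out of 369 reproduces the reported accuracy.
print(f"{350 / estimated_samples:.4f}")  # 0.9485
```

Under that assumption, the reported accuracy corresponds to roughly 19 misclassified tickets.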
+## Technical Specifications
+### Model Architecture and Objective
+The model consists of the base DistilBERT encoder with a classification head on top that outputs one score per category. It was fine-tuned with cross-entropy loss to predict the correct category for each ticket.
+## Model Card Contact
+For inquiries about this model, please open an issue on the model repository.