---
language:
- en
license: apache-2.0
tags:
- text-classification
- customer-support
- ticket-classification
- distilbert
datasets:
- custom
metrics:
- accuracy
model-index:
- name: ticket-classification-v1
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Custom Ticket Dataset
      type: custom
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9485
---
# Model Card for Dragneel/ticket-classification-v1
This model fine-tunes the DistilBERT base uncased model to classify customer support tickets into four categories. It achieves **94.85% accuracy** on the evaluation dataset.
## Model Details
### Model Description
This model is designed to automatically categorize customer support tickets based on their content. It can classify tickets into the following categories:
- **Billing Question**: Issues related to billing, payments, subscriptions, etc.
- **Feature Request**: Suggestions for new features or improvements
- **General Inquiry**: General questions about products or services
- **Technical Issue**: Technical problems, bugs, errors, etc.
The model uses DistilBERT as its base architecture: a distilled version of BERT that is smaller and faster while retaining most of BERT's performance.
- **Developed by:** Dragneel
- **Model type:** Text Classification
- **Language(s):** English
- **License:** Apache 2.0
- **Finetuned from model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased)
## Uses
### Direct Use
This model can be directly used for:
- Automated ticket routing and prioritization
- Customer support workflow optimization
- Analytics on ticket categories
- Real-time ticket classification
### Out-of-Scope Use
This model should not be used for:
- Processing sensitive customer information without proper privacy measures
- Making final decisions without human review for complex or critical issues
- Classifying tickets in languages other than English
- Categorizing content outside the customer support domain
## Bias, Risks, and Limitations
- The model was trained on a specific dataset and may not generalize well to significantly different customer support contexts
- Performance may degrade for very technical or domain-specific tickets not represented in the training data
- Very short or ambiguous tickets might be misclassified
### Recommendations
Users should review classifications for accuracy, especially for tickets that fall on the boundary between categories. Consider retraining the model on domain-specific data before deploying it in a specialized industry.
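One way to act on this recommendation is to escalate low-confidence predictions to a human agent instead of trusting them automatically. The helper below is a hypothetical sketch, not part of the model: it assumes the standard `pipeline` output format (`{'label': ..., 'score': ...}`), and the `0.80` threshold is a placeholder to be tuned on your own data.

```python
# Hypothetical triage helper: escalate low-confidence predictions for review.
REVIEW_THRESHOLD = 0.80  # assumed cutoff; tune on a held-out set of your own tickets

def route_ticket(prediction, threshold=REVIEW_THRESHOLD):
    """prediction: a dict like {'label': 'Technical Issue', 'score': 0.97},
    as returned by the text-classification pipeline for one ticket."""
    if prediction["score"] < threshold:
        return "human-review"
    return prediction["label"]
```

For example, `route_ticket({"label": "Billing Question", "score": 0.55})` returns `"human-review"`, while a confident prediction passes its category label straight through to the routing system.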
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import pipeline

# Load the model
classifier = pipeline("text-classification", model="Dragneel/ticket-classification-v1")

# Example tickets
tickets = [
    "I was charged twice for my subscription this month. Can you help?",
    "The app keeps crashing whenever I try to upload a file",
    "Would it be possible to add dark mode to the dashboard?",
    "What are your business hours?",
]

# Classify tickets
for ticket in tickets:
    result = classifier(ticket)
    print(f"Ticket: {ticket}")
    print(f"Category: {result[0]['label']}")
    print(f"Confidence: {result[0]['score']:.4f}")
    print()
```
### ID to Label Mapping
```python
id_to_label = {
    0: 'Billing Question',
    1: 'Feature Request',
    2: 'General Inquiry',
    3: 'Technical Issue'
}
```
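If the checkpoint's config does not carry human-readable label names, the pipeline may emit raw ids such as `LABEL_2`. A small hypothetical helper (not part of the model) can translate those using the mapping above:

```python
id_to_label = {
    0: 'Billing Question',
    1: 'Feature Request',
    2: 'General Inquiry',
    3: 'Technical Issue',
}

def to_category(raw_label):
    """Map a raw pipeline label such as 'LABEL_2' to its category name.
    Labels that are already human-readable pass through unchanged."""
    if raw_label.startswith("LABEL_"):
        return id_to_label[int(raw_label.split("_")[1])]
    return raw_label
```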
## Training Details
### Training Data
The model was trained on a dataset of customer support tickets with diverse examples across all four categories. Each ticket typically contains a title and a description detailing the customer's issue or request.
### Training Procedure
#### Training Hyperparameters
- **Learning rate:** 0.001
- **Batch size:** 2
- **Epochs:** 10 (with early stopping)
- **Weight decay:** 0.01
- **Early stopping patience:** 2 epochs
- **Optimizer:** AdamW
- **Training regime:** fp32
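The hyperparameters above can be assembled into a `transformers` `Trainer` setup. The sketch below is a hedged reconstruction, not the author's actual training script: the output directory is a placeholder, and dataset loading and tokenization are omitted.

```python
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments, EarlyStoppingCallback)

# Four-way classification head on top of DistilBERT
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=4)

args = TrainingArguments(
    output_dir="ticket-classification-v1",  # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=2,
    num_train_epochs=10,
    weight_decay=0.01,
    eval_strategy="epoch",        # 'evaluation_strategy' in older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,  # required by EarlyStoppingCallback
)

trainer = Trainer(
    model=model,
    args=args,
    # train_dataset=..., eval_dataset=...,  # the custom ticket dataset goes here
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
# trainer.train()
```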
## Evaluation
### Testing Data, Factors & Metrics
#### Metrics
The model is evaluated using the following metrics:
- Accuracy: Percentage of correctly classified tickets
- Loss: Cross-entropy loss on the evaluation dataset
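Accuracy here is simply the fraction of tickets whose predicted category matches the gold label; a minimal reference implementation:

```python
def accuracy(predictions, labels):
    """Fraction of tickets whose predicted category matches the gold label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```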
### Results
The model achieved the following metrics on the evaluation dataset:
| Metric | Value |
|--------|-------|
| Accuracy | 94.85% |
| Loss | 0.248 |
| Runtime | 16.01s |
| Samples/second | 23.05 |
## Technical Specifications
### Model Architecture and Objective
The model architecture is based on DistilBERT, a distilled version of BERT. It consists of the base DistilBERT model with a classification head layer on top. The model was fine-tuned using cross-entropy loss to predict the correct category for each ticket.
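Concretely, the classification head maps the pooled DistilBERT representation to four logits, a softmax turns those into category probabilities, and the predicted category is the argmax. A minimal sketch of that final step, using made-up logits rather than real model output:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits from the classification head for one ticket
logits = [0.3, -1.2, 0.1, 2.4]
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)  # index 3 -> 'Technical Issue'
```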
## Model Card Contact
For inquiries about this model, please open an issue on the model repository.