Dragneel committed · Commit ae026c2 · verified · 1 Parent(s): dcb14a5

Create README.md

Files changed (1)
  1. README.md +165 -0
README.md ADDED
@@ -0,0 +1,165 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - text-classification
+ - customer-support
+ - ticket-classification
+ - distilbert
+ datasets:
+ - custom
+ metrics:
+ - accuracy
+ model-index:
+ - name: ticket-classification-v1
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Custom Ticket Dataset
+       type: custom
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.9485
+ ---
+
+ # Model Card for Dragneel/ticket-classification-v1
+
+ This model is a fine-tuned version of DistilBERT base uncased that classifies customer support tickets into four categories. It achieves **94.85% accuracy** on the evaluation dataset.
+
+ ## Model Details
+
+ ### Model Description
+
+ This model automatically categorizes customer support tickets based on their content. It assigns each ticket to one of the following categories:
+
+ - **Billing Question**: Issues related to billing, payments, or subscriptions
+ - **Feature Request**: Suggestions for new features or improvements
+ - **General Inquiry**: General questions about products or services
+ - **Technical Issue**: Technical problems such as bugs and errors
+
+ The model uses DistilBERT as its base architecture, a distilled version of BERT that is smaller, faster, and more efficient while retaining good performance.
+
+ - **Developed by:** Dragneel
+ - **Model type:** Text Classification
+ - **Language(s):** English
+ - **License:** Apache 2.0
+ - **Finetuned from model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased)
+
+ ## Uses
+
+ ### Direct Use
+
+ This model can be directly used for:
+ - Automated ticket routing and prioritization
+ - Customer support workflow optimization
+ - Analytics on ticket categories
+ - Real-time ticket classification
+
+ ### Out-of-Scope Use
+
+ This model should not be used for:
+ - Processing sensitive customer information without proper privacy measures
+ - Making final decisions without human review for complex or critical issues
+ - Classifying tickets in languages other than English
+ - Categorizing content outside the customer support domain
+
+ ## Bias, Risks, and Limitations
+
+ - The model was trained on a specific dataset and may not generalize well to significantly different customer support contexts
+ - Performance may degrade for very technical or domain-specific tickets not represented in the training data
+ - Very short or ambiguous tickets might be misclassified
+
+ ### Recommendations
+
+ Users should review classifications for accuracy, especially for tickets that fall on the boundary between categories. Consider retraining the model on domain-specific data if you use it in a specialized industry.
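One way to act on this recommendation is to send low-confidence predictions to a human reviewer. The sketch below is illustrative only: the `needs_review` helper and the 0.80 threshold are assumptions, not part of this model or its training. It expects predictions in the `{"label": ..., "score": ...}` shape that the `text-classification` pipeline returns.

```python
# Hypothetical helper: flag predictions whose confidence falls below a threshold.
# The 0.80 cutoff is an assumed starting point and should be tuned on real tickets.
REVIEW_THRESHOLD = 0.80


def needs_review(prediction: dict, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when the top prediction's score is below the threshold."""
    return prediction["score"] < threshold


# Made-up predictions in the shape of pipeline output.
sample_predictions = [
    {"label": "Billing Question", "score": 0.97},
    {"label": "General Inquiry", "score": 0.55},
]

for pred in sample_predictions:
    route = "human review" if needs_review(pred) else "auto-route"
    print(f"{pred['label']}: {pred['score']:.2f} -> {route}")
```

A single global threshold is the simplest choice; a per-category threshold may work better if some categories are confused more often than others.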
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ ```python
+ from transformers import pipeline
+
+ # Load the model
+ classifier = pipeline("text-classification", model="Dragneel/ticket-classification-v1")
+
+ # Example tickets
+ tickets = [
+     "I was charged twice for my subscription this month. Can you help?",
+     "The app keeps crashing whenever I try to upload a file",
+     "Would it be possible to add dark mode to the dashboard?",
+     "What are your business hours?"
+ ]
+
+ # Classify each ticket and print the top label with its confidence score
+ for ticket in tickets:
+     result = classifier(ticket)
+     print(f"Ticket: {ticket}")
+     print(f"Category: {result[0]['label']}")
+     print(f"Confidence: {result[0]['score']:.4f}")
+     print()
+ ```
+
+ ### ID to Label Mapping
+
+ ```python
+ id_to_label = {
+     0: 'Billing Question',
+     1: 'Feature Request',
+     2: 'General Inquiry',
+     3: 'Technical Issue'
+ }
+ ```
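As a minimal sketch of how this mapping could be applied, the snippet below turns raw class scores (logits) into a label and a probability with a softmax. It assumes the model's output indices follow the mapping above; the `softmax` and `decode` helpers and the logit values are made up for illustration.

```python
import math

id_to_label = {
    0: "Billing Question",
    1: "Feature Request",
    2: "General Inquiry",
    3: "Technical Issue",
}


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id_to_label[best], probs[best]


# Made-up logits for illustration; index 3 holds the largest score.
label, prob = decode([0.1, -1.2, 0.3, 2.5])
print(label, round(prob, 4))
```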
+
+ ## Training Details
+
+ ### Training Data
+
+ The model was trained on a dataset of customer support tickets that includes diverse examples across all four categories. Each ticket typically contains a title and a description detailing the customer's issue or request.
+
+ ### Training Procedure
+
+ #### Training Hyperparameters
+
+ - **Learning rate:** 0.001
+ - **Batch size:** 2
+ - **Epochs:** 10 (with early stopping)
+ - **Weight decay:** 0.01
+ - **Early stopping patience:** 2 epochs
+ - **Optimizer:** AdamW
+ - **Training regime:** fp32
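To illustrate what a patience of 2 epochs means, here is a minimal sketch of patience-based early stopping. The card does not state which metric was monitored or how early stopping was implemented, so this assumes evaluation loss is tracked; the `stopping_epoch` helper and the loss values are hypothetical.

```python
def stopping_epoch(eval_losses, patience=2):
    """Return the 1-based epoch at which training stops, or None if it never stops early.

    Training stops once the evaluation loss has failed to improve on the best
    value seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(eval_losses, start=1):
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return None


# Hypothetical per-epoch evaluation losses (not taken from the actual training run).
losses = [0.60, 0.40, 0.30, 0.31, 0.33, 0.29]
print(stopping_epoch(losses))
```

With these made-up values the loss stops improving after epoch 3, so training would halt at epoch 5 and the remaining epochs of the 10-epoch budget would be skipped.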
+
+ ## Evaluation
+
+ ### Testing Data, Factors & Metrics
+
+ #### Metrics
+
+ The model is evaluated using the following metrics:
+ - Accuracy: Percentage of correctly classified tickets
+ - Loss: Cross-entropy loss on the evaluation dataset
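The two metrics can be sketched in a few lines of plain Python. This is an illustration of the definitions only, assuming the standard formulas; the helper names and the toy data are made up and are not the evaluation code used for this model.

```python
import math


def accuracy(predicted_ids, true_ids):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predicted_ids, true_ids))
    return correct / len(true_ids)


def cross_entropy(probabilities, true_ids):
    """Mean negative log-probability assigned to the true class."""
    losses = [-math.log(probs[t]) for probs, t in zip(probabilities, true_ids)]
    return sum(losses) / len(losses)


# Toy data: four tickets, four classes (values are made up).
true_ids = [0, 3, 1, 2]
probabilities = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.10, 0.10, 0.70],
    [0.40, 0.30, 0.20, 0.10],  # misclassified: class 0 scores highest, truth is class 1
    [0.05, 0.05, 0.80, 0.10],
]
predicted_ids = [max(range(4), key=row.__getitem__) for row in probabilities]

print(f"accuracy = {accuracy(predicted_ids, true_ids):.2f}")
print(f"loss = {cross_entropy(probabilities, true_ids):.3f}")
```

Accuracy only counts whether the top class is right, while cross-entropy also penalizes correct predictions made with low confidence, which is why both are reported.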
+
+ ### Results
+
+ The model achieved the following metrics on the evaluation dataset:
+
+ | Metric | Value |
+ |--------|-------|
+ | Accuracy | 94.85% |
+ | Loss | 0.248 |
+ | Runtime | 16.01s |
+ | Samples/second | 23.05 |
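The card does not state the size of the evaluation set, but the runtime and throughput figures allow a rough estimate. The snippet below is a back-of-the-envelope check, assuming the reported values are rounded and that throughput equals samples divided by runtime; the resulting counts are inferences, not figures from the card.

```python
runtime_s = 16.01           # reported evaluation runtime in seconds
samples_per_s = 23.05       # reported throughput
reported_accuracy = 0.9485  # reported accuracy as a fraction

# Estimated evaluation-set size (an inference, not a figure from the card).
estimated_samples = round(runtime_s * samples_per_s)

# Number of correct predictions implied by the reported accuracy.
estimated_correct = round(reported_accuracy * estimated_samples)

print(estimated_samples, estimated_correct)
print(f"{estimated_correct / estimated_samples:.4f}")
```

Under these assumptions the figures are mutually consistent: about 369 evaluation samples with 350 classified correctly reproduces the reported 94.85%.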
+
+ ## Technical Specifications
+
+ ### Model Architecture and Objective
+
+ The model architecture is based on DistilBERT, a distilled version of BERT. It consists of the base DistilBERT model with a classification head on top. The model was fine-tuned with cross-entropy loss to predict the correct category for each ticket.
+
+ ## Model Card Contact
+
+ For inquiries about this model, please open an issue on the model repository.