Joe Davison committed
Commit 8f145be
1 Parent(s): 7e7f809
Update README.md
README.md CHANGED
@@ -7,6 +7,8 @@ tags:
 datasets:
 - go_emotions
 license: mit
+widget:
+- text: "I feel lucky to be here."
 ---
 
 # distilbert-base-uncased-go-emotions-student
@@ -15,15 +17,12 @@ license: mit
 
 This model is distilled from the zero-shot classification pipeline on the unlabeled GoEmotions dataset using [this
 script](https://github.com/huggingface/transformers/tree/master/examples/research_projects/zero-shot-distillation).
-It
-[here](https://colab.research.google.com/drive/1mjBjd0cR8G57ZpsnFCS3ngGyo5nCa9ya?usp=sharing), where more details
-about the model can be found.
-
-- Teacher model: [roberta-large-mnli](https://huggingface.co/roberta-large-mnli)
+It was trained with mixed precision for 10 epochs and otherwise used the default script arguments.
 
 ## Intended Usage
 
 The model can be used like any other model trained on GoEmotions, but will likely not perform as well as a model
 trained with full supervision. It is primarily intended as a demo of how an expensive NLI-based zero-shot model
-can be distilled to a more efficient student
-per instance, the teacher used single-label
+can be distilled to a more efficient student, allowing a classifier to be trained with only unlabeled data. Note
+that although the GoEmotions dataset allows multiple labels per instance, the teacher used single-label
+classification to create pseudo-labels.
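The updated README notes that the teacher produced single-label pseudo-labels even though GoEmotions allows several labels per instance. The sketch below illustrates that distinction in plain Python. It is not the distillation script's code: the label names, the logit values, and the helper functions `single_label_scores` and `multi_label_scores` are hypothetical, and it assumes the usual NLI zero-shot setup in which each candidate label yields a contradiction logit and an entailment logit.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_scores(entailment_logits):
    """Softmax across labels: scores compete with each other and sum to 1."""
    return softmax(entailment_logits)


def multi_label_scores(contradiction_logits, entailment_logits):
    """Per-label softmax over (contradiction, entailment): scores are independent."""
    return [
        softmax([contra, entail])[1]
        for contra, entail in zip(contradiction_logits, entailment_logits)
    ]


# Hypothetical teacher logits for one unlabeled text (made-up values).
labels = ["joy", "gratitude", "sadness"]
entailment = [2.0, 1.5, -3.0]
contradiction = [-1.0, -0.5, 2.5]

single = single_label_scores(entailment)
multi = multi_label_scores(contradiction, entailment)

# Under single-label scoring, one pseudo-label wins per instance.
pseudo_label = labels[single.index(max(single))]

for name, s, m in zip(labels, single, multi):
    print(f"{name:10s} single={s:.3f} multi={m:.3f}")
print("pseudo-label:", pseudo_label)
```

With single-label scoring, two plausible emotions split the probability mass between them, whereas independent per-label scoring can rate both highly. A student trained on single-label targets inherits that competition between labels.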