Joe Davison committed on
Commit
8f145be
1 Parent(s): 7e7f809

Update README.md

Files changed (1): README.md +6 -7
README.md CHANGED

@@ -7,6 +7,8 @@ tags:
 datasets:
 - go_emotions
 license: mit
+widget:
+- text: "I feel lucky to be here."
 ---
 
 # distilbert-base-uncased-go-emotions-student
@@ -15,15 +17,12 @@ license: mit
 
 This model is distilled from the zero-shot classification pipeline on the unlabeled GoEmotions dataset using [this
 script](https://github.com/huggingface/transformers/tree/master/examples/research_projects/zero-shot-distillation).
-It is the result of the demo notebook
-[here](https://colab.research.google.com/drive/1mjBjd0cR8G57ZpsnFCS3ngGyo5nCa9ya?usp=sharing), where more details
-about the model can be found.
-
-- Teacher model: [roberta-large-mnli](https://huggingface.co/roberta-large-mnli)
+It was trained with mixed precision for 10 epochs and otherwise used the default script arguments.
 
 ## Intended Usage
 
 The model can be used like any other model trained on GoEmotions, but will likely not perform as well as a model
 trained with full supervision. It is primarily intended as a demo of how an expensive NLI-based zero-shot model
-can be distilled to a more efficient student. Note that although the GoEmotions dataset allow multiple labels
-per instance, the teacher used single-label classification to create psuedo-labels.
+can be distilled to a more efficient student, allowing a classifier to be trained with only unlabeled data. Note
+that although the GoEmotions dataset allows multiple labels per instance, the teacher used single-label
+classification to create pseudo-labels.
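
The single-label pseudo-labeling described in the note above can be sketched as follows. This is a minimal illustration, not the linked distillation script: the label names and scores are made up, and in practice the teacher's per-label entailment scores come from running `roberta-large-mnli` through the zero-shot pipeline over the unlabeled corpus.

```python
def pseudo_label(entailment_scores: dict[str, float]) -> str:
    """Collapse a teacher's per-label scores to a single hard label.

    Even though GoEmotions allows multiple labels per instance, the
    teacher here keeps only the top-scoring label as the student's
    training target (single-label pseudo-labeling).
    """
    return max(entailment_scores, key=entailment_scores.get)


# Hypothetical teacher output for one instance: two labels score highly,
# but only the argmax survives as the pseudo-label.
scores = {"joy": 0.91, "gratitude": 0.88, "anger": 0.02}
print(pseudo_label(scores))  # joy
```

This is one reason the student may underperform a fully supervised model: instances that genuinely carry several emotions are reduced to a single target during distillation.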