---
language: "en"
tags:
- sentiment
- emotion
- twitter
widget:
- text: "Oh wow. I didn't know that."
- text: "This movie always makes me cry.."
- text: "Oh Happy Day"
---
## Description
With this model, you can classify emotions in English text data. The model was trained on 6 diverse datasets and predicts 7 emotions:
1) anger
2) disgust
3) fear
4) joy
5) neutral
6) sadness
7) surprise
The model is a fine-tuned checkpoint of DistilRoBERTa-base. The emotions reflect Ekman's 6 basic emotions, plus a neutral class.
## Application
a) Run the emotion model on a single text example with 3 lines of code, using Hugging Face's pipeline command on Google Colab:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/j-hartmann/emotion-english-distilroberta-base/blob/main/simple_emotion_pipeline.ipynb)
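
For running the model outside Colab, the following is a minimal sketch of the pipeline call. It assumes the `transformers` library is installed and that the model is published on the Hugging Face Hub under the ID `j-hartmann/emotion-english-distilroberta-base` (inferred from the notebook URL above). The helper names `load_classifier` and `classify` are illustrative and not part of any library.

```python
# Assumed Hub ID, inferred from the Colab notebook URL.
MODEL_ID = "j-hartmann/emotion-english-distilroberta-base"


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline that scores every emotion label."""
    # Imported lazily so classify() can also be used with any compatible callable.
    from transformers import pipeline

    # top_k=None requests the scores of all labels instead of only the best one.
    # (Older transformers releases used return_all_scores=True for this.)
    return pipeline("text-classification", model=model_id, top_k=None)


def classify(texts, classifier=None):
    """Return one (text, predicted_label, score) tuple per input text."""
    if classifier is None:
        classifier = load_classifier()  # downloads the model on first use
    texts = list(texts)
    predictions = []
    # Given a list of texts, the pipeline returns one list of
    # {"label": ..., "score": ...} dicts per text.
    for text, scores in zip(texts, classifier(texts)):
        best = max(scores, key=lambda item: item["score"])
        predictions.append((text, best["label"], best["score"]))
    return predictions
```

Calling `classify(["This movie always makes me cry.."])` should return a single tuple holding the text, the highest-scoring of the 7 emotion labels, and its score.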
b) Run the emotion model on multiple examples and full datasets (e.g., .csv files) on Google Colab:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/j-hartmann/emotion-english-distilroberta-base/blob/main/emotion_prediction_example.ipynb)
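
As a rough local counterpart to the notebook, the sketch below labels every row of a .csv file in batches. It is not the notebook's code. The function name `label_csv`, the `text` column name, the added `emotion` column, and the file names in the usage comment are all hypothetical, and the classifier is assumed to be a `transformers` text-classification pipeline created with `top_k=None`.

```python
import csv


def label_csv(in_path, out_path, classifier, text_column="text", batch_size=32):
    """Copy a CSV file, adding an 'emotion' column with the top label per row.

    `classifier` is any callable that takes a list of strings and returns,
    for each string, a list of {"label": ..., "score": ...} dicts.
    Returns the number of rows written.
    """
    with open(in_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        rows = list(reader)
        fieldnames = list(reader.fieldnames or []) + ["emotion"]

    # Classify in batches so large files do not go through the model in one call.
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        outputs = classifier([row[text_column] for row in batch])
        for row, scores in zip(batch, outputs):
            row["emotion"] = max(scores, key=lambda item: item["score"])["label"]

    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Example usage (downloads the model; file names are hypothetical):
#   from transformers import pipeline
#   clf = pipeline("text-classification",
#                  model="j-hartmann/emotion-english-distilroberta-base",
#                  top_k=None)
#   label_csv("tweets.csv", "tweets_labeled.csv", clf)
```

Passing the classifier in as an argument keeps the file handling independent of the model, so the same function can be tried with a stub before downloading anything.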
## Contact
Please reach out to jochen.hartmann@uni-hamburg.de if you have any questions or feedback.
Thanks to Samuel Domdey and chrsiebert for their support in making this model available.
## Appendix
Please find an overview of the datasets used for training below. All datasets contain English text. The table summarizes which emotions are available in each of the datasets.
|Name|anger|disgust|fear|joy|neutral|sadness|surprise|
|---|---|---|---|---|---|---|---|
|Crowdflower (2016)|Yes|-|-|Yes|Yes|Yes|Yes|
|Emotion Dataset, Elvis et al. (2018)|Yes|-|Yes|Yes|-|Yes|Yes|
|GoEmotions, Demszky et al. (2020)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
|ISEAR, Vikash (2018)|Yes|Yes|Yes|Yes|-|Yes|-|
|MELD, Poria et al. (2019)|Yes|Yes|Yes|Yes|Yes|Yes|Yes|
|SemEval-2018, EI-reg, Mohammad et al. (2018)|Yes|-|Yes|Yes|-|Yes|-|
The datasets represent a diverse collection of text types. Specifically, they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. As MELD (Multimodal EmotionLines Dataset) extends the popular EmotionLines dataset, EmotionLines itself is not included here.
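
To make the coverage table easier to query, the sketch below encodes it as a plain dictionary and counts how many of the 6 datasets provide each emotion. The shortened dataset keys and the names `COVERAGE` and `datasets_per_emotion` are illustrative. The counts refer to datasets only, not to the number of training examples.

```python
EMOTIONS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]

# One entry per table row: the set of emotions marked "Yes" for that dataset.
COVERAGE = {
    "Crowdflower": {"anger", "joy", "neutral", "sadness", "surprise"},
    "Emotion Dataset": {"anger", "fear", "joy", "sadness", "surprise"},
    "GoEmotions": set(EMOTIONS),
    "ISEAR": {"anger", "disgust", "fear", "joy", "sadness"},
    "MELD": set(EMOTIONS),
    "SemEval-2018 EI-reg": {"anger", "fear", "joy", "sadness"},
}


def datasets_per_emotion(coverage=COVERAGE):
    """Count, for each emotion, how many datasets include it."""
    return {
        emotion: sum(emotion in labels for labels in coverage.values())
        for emotion in EMOTIONS
    }


if __name__ == "__main__":
    for emotion, count in datasets_per_emotion().items():
        print(f"{emotion}: {count} of {len(COVERAGE)} datasets")
```

By this count, anger, joy, and sadness appear in all 6 datasets, while disgust and neutral appear in only 3.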