dahongj committed
Commit eca74fe
1 Parent(s): 05d66dd

Update README.md

Files changed (1):
  1. README.md +29 -1
https://huggingface.co/finiteautomata/bertweet-base-sentiment-analysis
https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest

In order to use the HuggingFace Space for our application, we first had to create an empty model
repository on HuggingFace. From there, we included the information box shown at the top of this
README. We created a secret token on GitHub linked with our HuggingFace account and used that token
in a GitHub Actions workflow file (.yml). This file ensures that every time there is an update to
main, the website on HuggingFace starts rebuilding from the updated code.
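The sync step described above can be sketched as a workflow like the following. This is a sketch of the common HuggingFace Spaces sync pattern, not the repository's actual file; the secret name `HF_TOKEN`, the workflow filename, and the Space path `SPACE_NAME` are assumptions.

```yaml
# .github/workflows/sync_to_hub.yml  (hypothetical filename)
name: Sync to Hugging Face Space
on:
  push:
    branches: [main]          # runs on every update to main

jobs:
  sync-to-hub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0       # full history, so the push is not a shallow push
      - name: Push to Hugging Face
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}   # secret linked to the HuggingFace account
        run: git push https://dahongj:$HF_TOKEN@huggingface.co/spaces/dahongj/SPACE_NAME main
```

With a workflow like this in place, every push to main triggers the Space on HuggingFace to rebuild from the updated code.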
Using the Streamlit library, we were able to incorporate the pipeline function, which allows us to
load and use a pretrained model from HuggingFace with ease. We created an app interface that
includes a text box and a selection menu: users input any text into the text box and then select
the model they would like to use. The models used are listed above. Each model outputs a sentiment
analysis of the input text as well as a probability score, which, with the help of the pipeline
functionality, we display back on the HuggingFace interface for the user to see. This was done for
all three models.
Milestone 3

Finetuned Model URL: https://huggingface.co/dahongj/finetuned_toxictweets

…into variables and ran through a Dataset class. A tokenizer for Distilbert was c… We then used the
multivariable version of the distilbert-base-uncased model, because there are 6 forms of toxicity
included in the dataset that we want to finetune for. Using the native PyTorch method of training
as demonstrated in the HuggingFace documentation, the model was trained and evaluated. Both the
finetuned model and its tokenizer are saved and uploaded onto HuggingFace.
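A minimal sketch of that multi-label setup, assuming a native PyTorch loop as in the HuggingFace docs; the Dataset shape, hyperparameters, and function names are illustrative, not the authors' exact training script.

```python
# Sketch: finetune distilbert-base-uncased for 6 toxicity labels at once.
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["toxic", "severe_toxic", "obscene", "insult", "threat", "identity_hate"]

class ToxicDataset(Dataset):
    """Wraps tokenized texts plus a float vector of 6 labels per example."""
    def __init__(self, texts, labels, tokenizer):
        self.enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
        self.labels = torch.tensor(labels, dtype=torch.float)  # multi-label targets
    def __len__(self):
        return self.labels.shape[0]
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

def train(texts, labels, epochs=1):
    tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased",
        num_labels=len(LABELS),
        problem_type="multi_label_classification",  # sigmoid outputs + BCE loss
    )
    loader = DataLoader(ToxicDataset(texts, labels, tok), batch_size=16, shuffle=True)
    optim = torch.optim.AdamW(model.parameters(), lr=5e-5)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            optim.zero_grad()
            loss = model(**batch).loss   # computed against the 6-dim label vector
            loss.backward()
            optim.step()
    return model, tok
```

The key choice is `problem_type="multi_label_classification"`, which makes the model score each of the six toxicity forms independently rather than picking a single class.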
Milestone 4

Results:
The resulting web application on HuggingFace is a sentiment analysis application that allows users
to input a text of any kind and receive results on its toxicity levels. The first three pretrained
models have only two classes in their output, stating whether the text is predominantly positive or
negative, as well as the degree to which it is so. The fourth option on the selection bar allows
users to select our finetuned model, which scores six forms of toxicity: toxic, severe_toxic,
obscene, insult, threat, and identity_hate. This option reports the highest toxicity label as well
as the second-highest one. An example of 10 texts and their results is shown as an image on the
website.
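The top-two reporting described above can be sketched as follows; the label order comes from this README, and the scores are made up for illustration.

```python
# Hypothetical post-processing: pick the highest and second-highest toxicity
# labels from the six per-label scores produced by the finetuned model.
LABELS = ["toxic", "severe_toxic", "obscene", "insult", "threat", "identity_hate"]

def top_two(scores):
    """Return the two (label, score) pairs with the highest scores."""
    ranked = sorted(zip(LABELS, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[0], ranked[1]

# Made-up scores for illustration:
first, second = top_two([0.91, 0.10, 0.64, 0.33, 0.02, 0.05])
print(first, second)  # ('toxic', 0.91) ('obscene', 0.64)
```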
The landing page for the application and a video demonstrating how to use the application are
included in this GitHub repository.