{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>label</th>\n",
       "      <th>comment</th>\n",
       "      <th>user</th>\n",
       "      <th>subreddit</th>\n",
       "      <th>date</th>\n",
       "      <th>sup_comment</th>\n",
       "      <th>prediction</th>\n",
       "      <th>confidence</th>\n",
       "      <th>Topic_key_word</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>Actually most of her supporters and sane peopl...</td>\n",
       "      <td>Quinnjester</td>\n",
       "      <td>politics</td>\n",
       "      <td>2016-09</td>\n",
       "      <td>Hillary's Surrogotes Told to Blame Media for '...</td>\n",
       "      <td>0</td>\n",
       "      <td>0.974983</td>\n",
       "      <td>TODO 2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>They can't survive without an echo chamber whi...</td>\n",
       "      <td>TheGettysburgAddress</td>\n",
       "      <td>The_Donald</td>\n",
       "      <td>2016-11</td>\n",
       "      <td>Thank God Liberals like to live in concentrate...</td>\n",
       "      <td>1</td>\n",
       "      <td>0.956885</td>\n",
       "      <td>TODO 2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>you're pretty cute yourself 1729 total</td>\n",
       "      <td>Sempiternally_free</td>\n",
       "      <td>2007scape</td>\n",
       "      <td>2016-11</td>\n",
       "      <td>Saw this cutie training his Attack today...</td>\n",
       "      <td>0</td>\n",
       "      <td>0.899885</td>\n",
       "      <td>TODO 2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>If you kill me you'll crash the meme market</td>\n",
       "      <td>Catacomb82</td>\n",
       "      <td>AskReddit</td>\n",
       "      <td>2016-10</td>\n",
       "      <td>If you were locked in a room with 49 other peo...</td>\n",
       "      <td>0</td>\n",
       "      <td>0.905721</td>\n",
       "      <td>TODO 2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>I bet he wrote that last message as he was sob...</td>\n",
       "      <td>Dorian-throwaway</td>\n",
       "      <td>niceguys</td>\n",
       "      <td>2016-11</td>\n",
       "      <td>You're not even that pretty!</td>\n",
       "      <td>1</td>\n",
       "      <td>0.589593</td>\n",
       "      <td>TODO 2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>251603</th>\n",
       "      <td>1</td>\n",
       "      <td>Respect your elders you little snot.</td>\n",
       "      <td>Tiffany_Butler</td>\n",
       "      <td>sports</td>\n",
       "      <td>2009-06</td>\n",
       "      <td>Aren't you a little old to be on the internet,...</td>\n",
       "      <td>1</td>\n",
       "      <td>0.852649</td>\n",
       "      <td>TODO 1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>251604</th>\n",
       "      <td>1</td>\n",
       "      <td>I'm just glad they won't be using taxpayer mon...</td>\n",
       "      <td>harryballsagna</td>\n",
       "      <td>canada</td>\n",
       "      <td>2009-06</td>\n",
       "      <td>\"I'm sorry, I can't hear you over the sound of...</td>\n",
       "      <td>1</td>\n",
       "      <td>0.974458</td>\n",
       "      <td>TODO 0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>251605</th>\n",
       "      <td>1</td>\n",
       "      <td>what.. with this awesome narration?</td>\n",
       "      <td>aberant</td>\n",
       "      <td>lost</td>\n",
       "      <td>2009-04</td>\n",
       "      <td>So far, so lame.</td>\n",
       "      <td>1</td>\n",
       "      <td>0.809398</td>\n",
       "      <td>TODO 1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>251606</th>\n",
       "      <td>1</td>\n",
       "      <td>He looks trustworthy.</td>\n",
       "      <td>permaculture</td>\n",
       "      <td>unitedkingdom</td>\n",
       "      <td>2009-01</td>\n",
       "      <td>\"I don't care\" says Lapland boss</td>\n",
       "      <td>1</td>\n",
       "      <td>0.979738</td>\n",
       "      <td>TODO 4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>251607</th>\n",
       "      <td>1</td>\n",
       "      <td>Well yeah, but it'll work this time.</td>\n",
       "      <td>SovereignMan</td>\n",
       "      <td>politics</td>\n",
       "      <td>2009-02</td>\n",
       "      <td>When their efforts failed, as they usually did...</td>\n",
       "      <td>1</td>\n",
       "      <td>0.975283</td>\n",
       "      <td>TODO 1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5000 rows × 9 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        label                                            comment  \\\n",
       "0           0  Actually most of her supporters and sane peopl...   \n",
       "1           0  They can't survive without an echo chamber whi...   \n",
       "2           0             you're pretty cute yourself 1729 total   \n",
       "3           0        If you kill me you'll crash the meme market   \n",
       "4           0  I bet he wrote that last message as he was sob...   \n",
       "...       ...                                                ...   \n",
       "251603      1               Respect your elders you little snot.   \n",
       "251604      1  I'm just glad they won't be using taxpayer mon...   \n",
       "251605      1                what.. with this awesome narration?   \n",
       "251606      1                              He looks trustworthy.   \n",
       "251607      1               Well yeah, but it'll work this time.   \n",
       "\n",
       "                        user      subreddit     date  \\\n",
       "0                Quinnjester       politics  2016-09   \n",
       "1       TheGettysburgAddress     The_Donald  2016-11   \n",
       "2         Sempiternally_free      2007scape  2016-11   \n",
       "3                 Catacomb82      AskReddit  2016-10   \n",
       "4           Dorian-throwaway       niceguys  2016-11   \n",
       "...                      ...            ...      ...   \n",
       "251603        Tiffany_Butler         sports  2009-06   \n",
       "251604        harryballsagna         canada  2009-06   \n",
       "251605               aberant           lost  2009-04   \n",
       "251606          permaculture  unitedkingdom  2009-01   \n",
       "251607          SovereignMan       politics  2009-02   \n",
       "\n",
       "                                              sup_comment  prediction  \\\n",
       "0       Hillary's Surrogotes Told to Blame Media for '...           0   \n",
       "1       Thank God Liberals like to live in concentrate...           1   \n",
       "2             Saw this cutie training his Attack today...           0   \n",
       "3       If you were locked in a room with 49 other peo...           0   \n",
       "4                            You're not even that pretty!           1   \n",
       "...                                                   ...         ...   \n",
       "251603  Aren't you a little old to be on the internet,...           1   \n",
       "251604  \"I'm sorry, I can't hear you over the sound of...           1   \n",
       "251605                                   So far, so lame.           1   \n",
       "251606                   \"I don't care\" says Lapland boss           1   \n",
       "251607  When their efforts failed, as they usually did...           1   \n",
       "\n",
       "        confidence Topic_key_word  \n",
       "0         0.974983         TODO 2  \n",
       "1         0.956885         TODO 2  \n",
       "2         0.899885         TODO 2  \n",
       "3         0.905721         TODO 2  \n",
       "4         0.589593         TODO 2  \n",
       "...            ...            ...  \n",
       "251603    0.852649         TODO 1  \n",
       "251604    0.974458         TODO 0  \n",
       "251605    0.809398         TODO 1  \n",
       "251606    0.979738         TODO 4  \n",
       "251607    0.975283         TODO 1  \n",
       "\n",
       "[5000 rows x 9 columns]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
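   ],
   "source": [
    "# Load the extended results: dataset labels, model predictions, confidence scores, and topic keywords\n",
    "d = pd.read_csv('./data/results extended.csv', index_col=0)\n",
    "d"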
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TODO:\n",
    "- [x] Show LDA top words for each topic\n",
    "- [ ] Are the topics with a low percentage of irony the ones considered more \"serious\"?\n",
    "- [x] For now I am using the labels assigned by the dataset; if I had no labels and had to predict irony, would LDA still be reliable?"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "torch_new",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}