KnutJaegersberg committed
Commit 3534230 • 1 Parent(s): a40f15e
Update README.md
README.md CHANGED
@@ -12,6 +12,9 @@ tags:
 
 A SetFit model fit on 166 downsampled multilingual IPTC Subject labels (concatenated for the lowest hierarchy level into artificial sentences of keywords) to predict the mid level news categories.
 The purpose of this classifier is to support exploring corpora as a weak labeler, since the representations of these descriptions are only approximations of real documents from those topics.
+The dataset I used to train the model is based on this file:
+https://huggingface.co/datasets/KnutJaegersberg/News_topics_IPTC_codes_long
+
 Accuracy on highest level labels in eval:
 0.9779412
 Accuracy/F1/mcc on mid level labels in eval: