finiteautomata committed on
Commit
6f2ecbb
1 Parent(s): 2c5fed9

Create README.md

  1. README.md +66 -0
README.md ADDED
@@ -0,0 +1,66 @@
+ ---
+ language:
+ - es
+
+ tags:
+ - twitter
+ - sentiment-analysis
+
+ ---
+ # Named Entity Recognition model for Spanish/English
+ ## robertuito-ner
+
+ Repository: [https://github.com/pysentimiento/pysentimiento/](https://github.com/finiteautomata/pysentimiento/)
+
+ This model was trained on the Spanish/English split of the [LinCE NER corpus](https://ritual.uh.edu/lince/), a code-switched benchmark. The base model is [RoBERTuito](https://github.com/pysentimiento/robertuito), a RoBERTa model trained on Spanish tweets.
+
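Token-classification models for NER typically emit one label per token, which then has to be grouped into entity spans. The sketch below illustrates that grouping step under the assumption that labels follow the common BIO scheme (`B-X` opens an entity of type `X`, `I-X` continues it, `O` is outside any entity). The function name `bio_to_spans`, the example sentence, and the `PER`/`LOC` label names are hypothetical illustrations, not part of the pysentimiento API or a statement about this model's actual label set.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity currently being built, if there is one.
        nonlocal current_tokens, current_type
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity.
            flush()
    flush()
    return spans


# Hypothetical code-switched example with made-up predictions.
tokens = ["Ayer", "vi", "a", "Lionel", "Messi", "in", "New", "York"]
tags = ["O", "O", "O", "B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
entities = bio_to_spans(tokens, tags)
print(entities)
```

A stray `I-` tag with no matching open entity is treated like `O` here; other decoders choose to start a new entity instead, so this is one design choice among several.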
+ ## Results
+
+ Results are taken from the LinCE benchmark.
+
+ | Model       | Sentiment | NER    | POS    |
+ |:------------|:----------|:-------|:-------|
+ | RoBERTuito  | *60.6*    | 68.5   | 97.2   |
+ | XLM-R large | --        | *69.5* | *97.2* |
+ | XLM-R base  | --        | 64.9   | 97.0   |
+ | C2S mBERT   | 59.1      | 64.6   | 96.9   |
+ | mBERT       | 56.4      | 64.0   | 97.1   |
+ | BERT        | 58.4      | 61.1   | 96.9   |
+ | BETO        | 56.5      | --     | --     |
+
+ ## Citation
+
+ If you use this model in your research, please cite the pysentimiento, RoBERTuito, and LinCE papers:
+
+ ```
+ @misc{perez2021pysentimiento,
+   title={pysentimiento: A Python Toolkit for Sentiment Analysis and SocialNLP tasks},
+   author={Juan Manuel Pérez and Juan Carlos Giudici and Franco Luque},
+   year={2021},
+   eprint={2106.09462},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+
+ @misc{perez2021robertuito,
+   title={RoBERTuito: a pre-trained language model for social media text in Spanish},
+   author={Juan Manuel Pérez and Damián A. Furman and Laura Alonso Alemany and Franco Luque},
+   year={2021},
+   eprint={2111.09453},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+
+ @inproceedings{aguilar2020lince,
+   title={LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation},
+   author={Aguilar, Gustavo and Kar, Sudipta and Solorio, Thamar},
+   booktitle={Proceedings of the 12th Language Resources and Evaluation Conference},
+   pages={1803--1813},
+   year={2020}
+ }
+ ```