finiteautomata committed on commit 6f2ecbb (1 parent: 2c5fed9)
Create README.md

README.md ADDED
@@ -0,0 +1,66 @@
---
language:
- es

tags:
- twitter
- sentiment-analysis

---
# Named Entity Recognition model for Spanish/English
## robertuito-ner

Repository: [https://github.com/pysentimiento/pysentimiento/](https://github.com/finiteautomata/pysentimiento/)

Model trained with the Spanish/English split of the [LinCE NER corpus](https://ritual.uh.edu/lince/), a code-switched benchmark. The base model is [RoBERTuito](https://github.com/pysentimiento/robertuito), a RoBERTa model trained on Spanish tweets.

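A token-level NER model such as this one typically emits one tag per token, which then has to be grouped into entity spans. The sketch below shows that grouping step under stated assumptions: it assumes BIO-style tags, and the `PER`/`LOC` labels, the example sentence, and the helper name `bio_to_spans` are hypothetical illustrations, not this model's confirmed label set or API.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Assumption: tags look like "B-PER", "I-PER", or "O".
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity that is currently open, if any.
        nonlocal current_tokens, current_type
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and skip this token.
            flush()
    flush()
    return spans


# Hypothetical code-switched example (not taken from LinCE).
tokens = ["Ayer", "vi", "a", "Lionel", "Messi", "in", "Miami", "with", "my", "friends"]
tags = ["O", "O", "O", "B-PER", "I-PER", "O", "B-LOC", "O", "O", "O"]
print(bio_to_spans(tokens, tags))
```

Dropping an `I-` tag that has no matching open entity is one simple policy among several; some evaluation scripts instead treat such a tag as the start of a new entity.
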
## Results

Results are taken from the LinCE

| Model        | Sentiment | NER    | POS    |
|:-------------|:----------|:-------|:-------|
| RoBERTuito   | *60.6*    | 68.5   | 97.2   |
| XLM-R large  | --        | *69.5* | *97.2* |
| XLM-R base   | --        | 64.9   | 97.0   |
| C2S mBERT    | 59.1      | 64.6   | 96.9   |
| mBERT        | 56.4      | 64.0   | 97.1   |
| BERT         | 58.4      | 61.1   | 96.9   |
| BETO         | 56.5      | --     | --     |

## Citation

If you use this model in your research, please cite the pysentimiento, RoBERTuito, and LinCE papers:

```
@misc{perez2021pysentimiento,
      title={pysentimiento: A Python Toolkit for Sentiment Analysis and SocialNLP tasks},
      author={Juan Manuel Pérez and Juan Carlos Giudici and Franco Luque},
      year={2021},
      eprint={2106.09462},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{perez2021robertuito,
      title={RoBERTuito: a pre-trained language model for social media text in Spanish},
      author={Juan Manuel Pérez and Damián A. Furman and Laura Alonso Alemany and Franco Luque},
      year={2021},
      eprint={2111.09453},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@inproceedings{aguilar2020lince,
      title={LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation},
      author={Aguilar, Gustavo and Kar, Sudipta and Solorio, Thamar},
      booktitle={Proceedings of the 12th Language Resources and Evaluation Conference},
      pages={1803--1813},
      year={2020}
}
```