This is a Flair sequence tagger trained on a corpus of 44 Spanish-language case reports from the European Court of Human Rights (ECHR). The corpus was built and manually annotated for anonymization as part of the Master's thesis "Anonymization of case reports from the ECHR in Spanish and French: exploration of two alternative annotation approaches".
It predicts 11 tags: DATE, TIME, CODE, PER, LEGAL_PROFESSIONAL, NATIONALITY, ETHNIC_CATEGORY, ORG, LOC, QUANTITY, CURRENCY.
The corpus and the code used for training this sequence tagger are available on GitHub: https://github.com/mariasierro/automatic-anonymization-ECHR-French-Spanish.
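
The sketch below illustrates one way the tagger's predictions could be used to mask entities in a text. It is not taken from the repository. The helper names `mask_entities` and `anonymize_with_flair`, the `model_path` argument, and the `"ner"` label type are assumptions made for illustration; check the training code for the label type actually used. The Flair import is kept inside the function so that the masking logic can be tried without loading a model.

```python
from typing import Iterable, List, Tuple

# The 11 tags listed in this model card.
TAGS = {
    "DATE", "TIME", "CODE", "PER", "LEGAL_PROFESSIONAL", "NATIONALITY",
    "ETHNIC_CATEGORY", "ORG", "LOC", "QUANTITY", "CURRENCY",
}

Span = Tuple[int, int, str]  # (start offset, end offset, tag)


def mask_entities(text: str, spans: Iterable[Span]) -> str:
    """Replace each character span with a "[TAG]" placeholder.

    Hypothetical helper: spans with an unknown tag, or spans that
    overlap an already masked span, are skipped.
    """
    pieces: List[str] = []
    cursor = 0
    for start, end, tag in sorted(spans):
        if tag not in TAGS or start < cursor:
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"[{tag}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def anonymize_with_flair(text: str, model_path: str) -> str:
    """Tag `text` with a Flair model and mask the predicted entities.

    `model_path` is a placeholder for a local model file or a hub
    identifier; the "ner" label type is an assumption.
    """
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_path)
    sentence = Sentence(text)
    tagger.predict(sentence)
    spans = [
        (span.start_position, span.end_position, span.tag)
        for span in sentence.get_spans("ner")
    ]
    return mask_entities(text, spans)


if __name__ == "__main__":
    # Hand-written spans stand in for model predictions here.
    example = "El señor García nació en Madrid."
    per = example.index("García")
    loc = example.index("Madrid")
    print(mask_entities(example, [(per, per + 6, "PER"), (loc, loc + 6, "LOC")]))
```

Replacing each entity with its tag name is only one possible anonymization strategy; pseudonyms or generic placeholders may suit other uses better.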