<div style="text-align: center; max-width: 650px; margin: 0 auto;">
<div>
<h1 style="font-weight: 900; font-size: 3rem; margin: 20px;">
Porttagger
</h1>
<p class="slogan">A Brazilian Portuguese part-of-speech tagger following the <a
href="https://universaldependencies.org/">Universal Dependencies</a> model
</p>
</div>
<p style="margin-top: 30px; margin-bottom: 10px; font-size: 94%; text-align: left;">
Porttagger is a state-of-the-art part-of-speech tagger for Brazilian Portuguese that automatically assigns
morphosyntactic classes to the words of a sentence, following the international Universal Dependencies model. You
may submit a single sentence or several at once (as a plain text file containing multiple sentences).
You may also choose which trained model to use: one trained on news texts (the
<a href="https://sites.google.com/icmc.usp.br/poetisa/resources-and-tools">Porttinari-base</a> corpus), one
trained on stock market tweets (the <a
href="https://www.kaggle.com/datasets/fernandojvdasilva/stock-tweets-ptbr-emotions">DANTE</a> corpus), one
trained on academic texts from the oil &amp; gas domain (the <a
href="https://github.com/UniversalDependencies/UD_Portuguese-PetroGold/blob/master/README.md">PetroGold</a>
corpus), or one trained on all three corpora together. Porttagger is
part of the <a href="https://sites.google.com/icmc.usp.br/poetisa/">POeTiSA</a> project, whose website offers
much more information.
</p>
</div>