<div style="text-align: center; max-width: 650px; margin: 0 auto;">
<div>
<h1 style="font-weight: 900; font-size: 3rem; margin: 20px;">
Porttagger
</h1>
<p class="slogan">A Brazilian Portuguese part-of-speech tagger based on Universal
Dependencies</p>
</div>
<p style="margin-top: 30px; margin-bottom: 10px; font-size: 94%; text-align: left;">
Porttagger (Porttinari Part-Of-Speech tagger) was trained on the <a
href="https://sites.google.com/icmc.usp.br/poetisa/resources-and-tools">Porttinari-base</a> corpus, which is
a collection of news articles extracted from the Folha de São Paulo newspaper website. The trained model is a
fine-tuned version of <a href="https://huggingface.co/neuralmind/bert-base-portuguese-cased">BERTimbau</a> that
receives tokens and outputs part-of-speech tags. Since the model expects a sequence of tokens as input,
<a href="https://spacy.io/models/pt">spaCy's</a> tokenizer is used to split the input text into tokens.
</p>
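<p style="margin-top: 10px; margin-bottom: 10px; font-size: 94%; text-align: left;">
Because BERT-style models split tokens further into sub-word pieces, the per-piece predictions usually have to be
mapped back to one tag per token. The sketch below shows one common convention, keeping the tag of each token's
first piece. It is an assumption for illustration and not necessarily how Porttagger does it: the function name
and the sample data are hypothetical, and <code>word_ids</code> mirrors the list returned by
<code>BatchEncoding.word_ids()</code> in Hugging Face Transformers fast tokenizers.
</p>

```python
from typing import List, Optional


def first_subword_tags(word_ids: List[Optional[int]], subword_tags: List[str]) -> List[str]:
    """Keep one tag per token: the tag predicted for its first sub-word piece."""
    tags = []
    previous = None
    for word_id, tag in zip(word_ids, subword_tags):
        # None marks special tokens such as [CLS] and [SEP]; skip them.
        # A repeated word_id marks a continuation piece; skip it as well.
        if word_id is not None and word_id != previous:
            tags.append(tag)
        previous = word_id
    return tags


# Hypothetical data: three tokens, the last one split into two sub-word pieces.
tokens = ["O", "menino", "brincava"]
word_ids = [None, 0, 1, 2, 2, None]
subword_tags = ["X", "DET", "NOUN", "VERB", "VERB", "X"]

tagged = list(zip(tokens, first_subword_tags(word_ids, subword_tags)))
print(tagged)
```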
</div>