porttagger / top.html

Emanuel Huber

Updated project description

c541339 about 2 years ago

1.67 kB

	<div style="text-align: center; max-width: 650px; margin: 0 auto;">
	<div>
	<h1 style="font-weight: 900; font-size: 3rem; margin: 20px;">
	Porttagger
	</h1>
	<p class="slogan">A Brazilian Portuguese part-of-speech tagger that follows the <a
	href="https://universaldependencies.org/">Universal Dependencies</a> model
	</p>
	</div>
	<p style="margin-top: 30px; margin-bottom: 10px; font-size: 94%; text-align: left;">
	Porttagger is a state-of-the-art part-of-speech tagger for Brazilian Portuguese. It automatically assigns a
	morphosyntactic class to each word of a sentence, following the international Universal Dependencies model. You
	may provide a single sentence, or a plain-text file containing several sentences, to be tagged.
	You may also choose which trained model to use. The available models were trained on news texts (from the
	<a href="https://sites.google.com/icmc.usp.br/poetisa/resources-and-tools">Porttinari-base</a> corpus), on stock
	market tweets (from the <a
	href="https://www.kaggle.com/datasets/fernandojvdasilva/stock-tweets-ptbr-emotions">DANTE</a> corpus), on
	academic texts from the oil &amp; gas
	domain (from the <a
	href="https://github.com/UniversalDependencies/UD_Portuguese-PetroGold/blob/master/README.md">PetroGold</a>
	corpus), and on all of them together. Porttagger is
	part of the <a href="https://sites.google.com/icmc.usp.br/poetisa/">POeTiSA</a> project, whose site provides
	further information.
	</p>
	</div>