jarodrigues
committed on
Commit bf5da32
1 Parent(s): 2dc7d77
Update README.md
README.md CHANGED
@@ -84,7 +84,7 @@ DeBERTa is distributed under an [MIT license](https://github.com/microsoft/DeBER
 - [ParlamentoPT](https://huggingface.co/datasets/PORTULAN/parlamento-pt): the ParlamentoPT is a data set we obtained by gathering the publicly available documents with the transcription of the debates in the Portuguese Parliament.
 
 
-[**Albertina PT-BR Base**](https://huggingface.co/PORTULAN/albertina-ptbr), in turn, was trained over a 3.7 billion token curated selection of documents from the [OSCAR](https://huggingface.co/datasets/oscar-corpus/OSCAR-2301) data set, specifically filtered by the Internet country code top-level domain of Brazil.
+[**Albertina PT-BR Base**](https://huggingface.co/PORTULAN/albertina-ptbr-base), in turn, was trained over a 3.7 billion token curated selection of documents from the [OSCAR](https://huggingface.co/datasets/oscar-corpus/OSCAR-2301) data set, specifically filtered by the Internet country code top-level domain of Brazil.
 
 
 ## Preprocessing