Commit 0e4e895 by LuisAVasquez (1 parent: e5aaad7)
Update training_notebooks/README.md
training_notebooks/README.md CHANGED
@@ -1 +1,8 @@
-#
+# Training notebooks for simple Latin BERT uncased
+
+These notebooks and scripts include the code to train this Masked Language Model and its tokenizer, from scratch.
+
+The notebooks should be ready to execute in any computer with a GPU, with minimal changes.
+
+
+Note: The scripts will create a file `03_full_corpus.txt` with the combination of all the corpora into a single raw text file.
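
The note about `03_full_corpus.txt` describes combining all the corpora into a single raw text file. A minimal sketch of such a step follows, assuming the corpora are UTF-8 plain-text files in one directory. The function name `combine_corpora`, the `*.txt` pattern, and the sorted file order are hypothetical and not taken from the notebooks themselves.

```python
from pathlib import Path


def combine_corpora(corpus_dir, output_path="03_full_corpus.txt"):
    """Concatenate every .txt file in corpus_dir into one raw text file.

    Hypothetical helper: the actual notebooks may gather and order
    their corpora differently.
    """
    corpus_dir = Path(corpus_dir)
    output_path = Path(output_path)
    n_files = 0
    with output_path.open("w", encoding="utf-8") as out:
        # Sort for a reproducible order across runs.
        for path in sorted(corpus_dir.glob("*.txt")):
            # Skip the output file itself if it lives in corpus_dir.
            if path.resolve() == output_path.resolve():
                continue
            text = path.read_text(encoding="utf-8")
            out.write(text)
            # Make sure consecutive corpora do not run together on one line.
            if not text.endswith("\n"):
                out.write("\n")
            n_files += 1
    return n_files
```

Calling `combine_corpora("corpora")` would write `03_full_corpus.txt` in the working directory and return the number of files merged. Here `corpora` is a placeholder directory name.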