Update README.md
README.md
CHANGED
@@ -10,7 +10,12 @@ model-index:
 
 A GPT model for Estonian (large-size), trained from scratch on 2.2 billion words (Estonian National Corpus + News Crawl + Common Crawl). Currently trained for 1 epoch (but already better than gpt-4-est-base :-) to be updated)
 
-
+### Format
+
+For training, the data was prepended with a text domain tag, which should also be added as a prefix when using the model: >general<, >web<, >news<, >doaj< and >wiki< (standing for general texts, web-crawled texts, news, article abstracts and Wikipedia texts). Use the prefixes like this: ">web< Kas tead, et".
+
+### Model details
+
 - num. of layers: 24
 - num. of heads: 24
 - embedding size: 1536
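The prefix convention described under "Format" can be sketched in a few lines of Python. This is only an illustration: the helper name `add_domain_prefix` and the `DOMAIN_TAGS` table are hypothetical and not part of the model repository, and the single space between the tag and the text is assumed from the README's own example.

```python
# Domain tags as listed in the README's "Format" section.
# The dictionary and helper below are hypothetical, for illustration only.
DOMAIN_TAGS = {
    "general": "general texts",
    "web": "web-crawled texts",
    "news": "news",
    "doaj": "article abstracts",
    "wiki": "Wikipedia texts",
}


def add_domain_prefix(text: str, domain: str = "general") -> str:
    """Return `text` prefixed with a domain tag such as '>web< '."""
    if domain not in DOMAIN_TAGS:
        known = ", ".join(sorted(DOMAIN_TAGS))
        raise ValueError(f"unknown domain {domain!r}; expected one of: {known}")
    # Assumption: one space separates the tag from the text,
    # matching the README example '>web< Kas tead, et'.
    return f">{domain}< {text}"


prompt = add_domain_prefix("Kas tead, et", "web")
print(prompt)  # >web< Kas tead, et
```

The resulting string would then be passed as the prompt to whatever text-generation interface is used to run the model.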