Update README.md
README.md
CHANGED
@@ -6,7 +6,7 @@ language:
 - en
 ---

-#
+# RetrievaBERT Model
 The **RetrievaBERT** is the pre-trained Transformer Encoder using Megatron-LM.
 It is designed for use in Japanese.

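For readers of the updated card, below is a minimal sketch of how such a checkpoint is typically loaded with Hugging Face Transformers for masked-LM inference. The repository id `retrieva-jp/bert-1.3b`, the `trust_remote_code=True` flag, and the `[MASK]` token are assumptions made for illustration (none of them appear in this diff); check the model page for the actual identifiers.

```python
# Hedged sketch: loading a RetrievaBERT-style checkpoint for masked-LM inference.
# The repo id and the trust_remote_code flag are assumptions, not taken from the card.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "retrieva-jp/bert-1.3b"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

text = "東京は日本の[MASK]です。"  # "Tokyo is the [MASK] of Japan."; mask token assumed to be [MASK]
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Find the masked position and print the highest-scoring replacement token.
mask_positions = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```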
@@ -70,7 +70,7 @@ For detailed configuration, refer to the config.json file.
 ## Training Details

 ### Training Data
-The
+The RetrievaBERT model was pre-trained on the reunion of five datasets:
 - [Japanese CommonCrawl Dataset by LLM-jp](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v2).
 - [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).
 - Chinese Wikipedia dumped on 20240120.
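To inspect one of the corpora listed above, the sketch below streams a few records from RefinedWeb, the only entry in this hunk with a public Hugging Face dataset id. Streaming avoids downloading the full dump; the `content` field name is taken from that dataset's own card and should be verified, and the LLM-jp CommonCrawl corpus requires separate access via the linked GitLab project.

```python
# Hedged sketch: streaming a handful of RefinedWeb records for inspection.
from datasets import load_dataset

refinedweb = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
for i, record in enumerate(refinedweb):
    print(record["content"][:200])  # "content" field name assumed from the dataset card
    if i == 2:
        break
```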
@@ -112,7 +112,7 @@ We adjusted the learning rate and training epochs for each model and task in acc
 ## Technical Specifications

 ### Model Architectures
-The
+The RetrievaBERT model is based on BERT with the following hyperparameters:

 - Number of layers: 48
 - Hidden layer size: 1536
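To make the two figures above concrete, the sketch below drops them into a stock BERT-style configuration and counts parameters. Only the layer count and hidden size come from the card; the attention-head count, feed-forward size, and vocabulary size are placeholders, and RetrievaBERT itself is a Megatron-LM implementation rather than the vanilla `BertModel` class, so the total is only a rough estimate.

```python
# Hedged sketch: rough parameter count from the two hyperparameters listed in the diff.
from transformers import BertConfig, BertModel

config = BertConfig(
    num_hidden_layers=48,    # "Number of layers: 48" (from the card)
    hidden_size=1536,        # "Hidden layer size: 1536" (from the card)
    num_attention_heads=24,  # placeholder; must divide hidden_size
    intermediate_size=4096,  # placeholder
    vocab_size=32768,        # placeholder
)
model = BertModel(config)
total = sum(p.numel() for p in model.parameters())
print(f"~{total / 1e9:.2f}B parameters")
```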