Update README.md
README.md
CHANGED
@@ -23,9 +23,9 @@ widget:
 
 SEC-BERT is a family of BERT models for the financial domain, intended to assist financial NLP research and FinTech applications.
 SEC-BERT consists of the following models:
-* SEC-BERT-BASE (this model): Same architecture as BERT-BASE trained on financial documents.
-* [SEC-BERT-NUM](https://huggingface.co/nlpaueb/sec-bert-num): Same as SEC-BERT-BASE but we replace every number token with a [NUM] pseudo-token handling all numeric expressions in a uniform manner, disallowing their fragmentation
-* [SEC-BERT-SHAPE](https://huggingface.co/nlpaueb/sec-bert-shape): Same as SEC-BERT-BASE but we replace numbers with pseudo-tokens that represent the number’s shape, so numeric expressions (of known shapes) are no longer fragmented, e.g., '53.2' becomes '[XX.X]' and '40,200.5' becomes '[XX,XXX.X]'.
+* **SEC-BERT-BASE** (this model): Same architecture as BERT-BASE trained on financial documents.
+* [**SEC-BERT-NUM**](https://huggingface.co/nlpaueb/sec-bert-num): Same as SEC-BERT-BASE but we replace every number token with a [NUM] pseudo-token handling all numeric expressions in a uniform manner, disallowing their fragmentation
+* [**SEC-BERT-SHAPE**](https://huggingface.co/nlpaueb/sec-bert-shape): Same as SEC-BERT-BASE but we replace numbers with pseudo-tokens that represent the number’s shape, so numeric expressions (of known shapes) are no longer fragmented, e.g., '53.2' becomes '[XX.X]' and '40,200.5' becomes '[XX,XXX.X]'.
 </div>
 
 ## Pre-training corpus
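
Illustrative note on the number handling described in the SEC-BERT-NUM and SEC-BERT-SHAPE bullets: the sketch below shows one way such a replacement could work on pre-split tokens. It is not taken from this commit. The regular expression, the `KNOWN_SHAPES` set, the function names `to_num` and `to_shape`, and the fallback to `[NUM]` for shapes outside the known set are all assumptions made for illustration; the real models define their own tokenization and shape vocabulary.

```python
import re

# Assumption: a "number token" is a run of digits that may contain ',' or '.',
# or a leading ','/'.' followed by digits. The actual rule may differ.
NUMBER_RE = re.compile(r"(\d+[\d,.]*)|([,.]\d+)")

# Hypothetical subset of known shapes, for illustration only.
KNOWN_SHAPES = {"[X]", "[XX]", "[XX.X]", "[XX,XXX.X]"}


def to_num(tokens):
    """Replace every number token with the single [NUM] pseudo-token."""
    return ["[NUM]" if NUMBER_RE.fullmatch(t) else t for t in tokens]


def to_shape(tokens, known_shapes=KNOWN_SHAPES):
    """Replace each number token with a pseudo-token describing its shape."""
    out = []
    for t in tokens:
        if NUMBER_RE.fullmatch(t):
            # Every digit becomes 'X'; separators are kept as they are.
            shape = "[" + re.sub(r"\d", "X", t) + "]"
            # Assumption: shapes outside the known set fall back to [NUM].
            out.append(shape if shape in known_shapes else "[NUM]")
        else:
            out.append(t)
    return out


tokens = "Revenue rose 53.2 to 40,200.5".split()
print(to_num(tokens))
print(to_shape(tokens))
```

Because each number maps to exactly one pseudo-token, a subword tokenizer that treats these pseudo-tokens as atomic would no longer split '40,200.5' into several fragments.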