**Training Dataset:** The model was fine-tuned on a custom dataset of financial communication texts. The dataset was split into training, validation, and test sets as follows:

- Training Set: 10,918,272 tokens
- Validation Set: 1,213,184 tokens
- Test Set: 1,347,968 tokens

**Pre-training Dataset:** FinBERT was pre-trained on a large financial corpus totaling 4.9 billion tokens, including:

- Corporate Reports (10-K & 10-Q): 2.5 billion tokens
- Earnings Call Transcripts: 1.3 billion tokens
- Analyst Reports: 1.1 billion tokens
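The token counts above imply the relative size of each split. As a quick sanity check (a minimal sketch, not part of the model's tooling), the proportions can be computed directly from the numbers stated in this README:

```python
# Fine-tuning split sizes in tokens, as stated above.
splits = {
    "train": 10_918_272,
    "validation": 1_213_184,
    "test": 1_347_968,
}

total = sum(splits.values())  # 13,479,424 tokens in all
for name, tokens in splits.items():
    # Roughly an 81% / 9% / 10% train/validation/test split.
    print(f"{name}: {tokens:,} tokens ({tokens / total:.1%})")

# Pre-training corpus breakdown in billions of tokens, as stated above;
# the three sources sum to the quoted 4.9 billion total.
pretrain_total = 2.5 + 1.3 + 1.1
print(f"pre-training corpus: {pretrain_total} billion tokens")
```

This confirms the stated figures are internally consistent: the three fine-tuning splits sum to about 13.5 million tokens, and the three pre-training sources sum to exactly 4.9 billion.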
## Evaluation