Update README.md
README.md
CHANGED
@@ -40,21 +40,6 @@ model-index:
A historical Swedish BERT model is released by the National Swedish Archives to better generalise to Swedish historical text. Researchers are well aware that the Swedish language has been subject to change over time, which makes models trained from a present-day point of view less ideal candidates for the job.
However, this model can be used to interpret and analyse historical textual material and can be fine-tuned for different downstream tasks.

-## Model Description
-
-The following hyperparameters were used during training:
-- learning_rate: 3e-05
-- train_batch_size: 8
-- eval_batch_size: 8
-- seed: 42
-- gradient_accumulation_steps: 0
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- num_epochs: 6
-- fp16: False
-
-Dataset:
-- Khubist2, which has been cleaned and chunked

## Intended uses & limitations
This model should primarily be used for further fine-tuning on downstream tasks.
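As a minimal sketch of that fine-tuning workflow, assuming a Hugging Face `transformers` setup and placeholder values for the model path, dataset columns, and label count (none of which come from this card), a downstream sequence-classification run could look roughly like this:

```python
# Minimal sketch, not the authors' code: fine-tuning this checkpoint for
# sequence classification. Model path, column names, and label count are
# placeholders/assumptions, not values taken from the model card.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)


def finetune(model_path: str, train_ds, eval_ds, num_labels: int = 2):
    """Fine-tune the checkpoint at `model_path` on datasets with "text"/"label" columns."""
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=num_labels)

    def tokenize(batch):
        # Truncate to the encoder's maximum sequence length; the Trainer's
        # default collator pads dynamically when a tokenizer is supplied.
        return tokenizer(batch["text"], truncation=True)

    train_ds = train_ds.map(tokenize, batched=True)
    eval_ds = eval_ds.map(tokenize, batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="historical-bert-finetuned"),
        train_dataset=train_ds,
        eval_dataset=eval_ds,
        tokenizer=tokenizer,
    )
    trainer.train()
    return trainer
```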
@@ -70,8 +55,24 @@ print(summarizer(historical_text))
```


+## Model Description
+The training procedure can be recreated from here: https://github.com/Borg93/kbuhist2/tree/main
+The preprocessing procedure can be recreated from here: https://github.com/Borg93/kbuhist2/tree/main

+### Model
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 0
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 6
+- fp16: False
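For orientation, these values map roughly onto Hugging Face `transformers` `TrainingArguments` as in the sketch below (this is not the linked training script); the output directory is a placeholder, and `gradient_accumulation_steps` is written as `1` because the API requires a value of at least 1, which is what `0` above effectively means (no accumulation):

```python
# Rough mapping of the listed hyperparameters onto TrainingArguments.
# A sketch for orientation, not the actual training configuration.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="swedish-historical-bert",  # placeholder output path
    learning_rate=3e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=1,  # the card lists 0, i.e. no accumulation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=6,
    fp16=False,
)
```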

+### Dataset (WIP)
+- Khubist2, which has been cleaned and chunked.

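The cleaning and chunking code itself lives in the kbuhist2 repository linked above; as a rough illustration, a typical chunking step for masked-language-model pretraining, assuming a block size of 512 tokens and an already tokenised Hugging Face `datasets` corpus, looks something like this:

```python
# Illustrative chunking step for MLM pretraining (the actual preprocessing is
# in the linked kbuhist2 repo). Block size and column names are assumptions.
block_size = 512  # assumed maximum sequence length


def group_texts(examples):
    # Concatenate all tokenised sequences in the batch into one long list per key.
    concatenated = {k: sum(examples[k], []) for k in examples.keys()}
    total_length = len(concatenated["input_ids"])
    # Drop the tail so every chunk is exactly `block_size` tokens long.
    total_length = (total_length // block_size) * block_size
    # Split each key into block_size-sized chunks.
    return {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }


# Typical use, assuming `tokenized_ds` contains only tokeniser output columns:
# chunked_ds = tokenized_ds.map(group_texts, batched=True)
```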
## Acknowledgements
We gratefully acknowledge EuroHPC (https://eurohpc-ju.europa.eu) for funding this research by providing computing resources of the HPC system Vega at the Institute of Information Science (https://www.izum.si).