Commit 151bbf4 by Philip May
Parent(s): 87fae06

Update README.md

README.md CHANGED
@@ -62,7 +62,7 @@ The resulting model called `xlm-r-distilroberta-base-paraphrase-v1` has been rel
 
 Building on this cross language model we fine-tuned it for English and German language on the [STSbenchmark](http://ixa2.si.ehu.es/stswiki/index.php/STSbenchmark) dataset. For German language we used the dataset of our [German STSbenchmark dataset](https://github.com/t-systems-on-site-services-gmbh/german-STSbenchmark) which has been translated with [deepl.com](https://www.deepl.com/translator). Additionally to the German and English training samples we generated samples of English and German crossed. We call this _multilingual finetuning with language-crossing_. It doubled the traing-datasize and tests show that it further improves performance.
 
-We did an automatic hyperparameter search for 33 trials with [Optuna](https://github.com/optuna/optuna). Using 10-fold crossvalidation on the deepl.com test and dev dataset we found the following best
+We did an automatic hyperparameter search for 33 trials with [Optuna](https://github.com/optuna/optuna). Using 10-fold crossvalidation on the deepl.com test and dev dataset we found the following best hyperparameters:
 - batch_size = 8
 - num_epochs = 2
 - lr = 1.026343323298136e-05,
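
The README text in this hunk describes _language-crossing_ only in prose. Below is a minimal sketch of one way such crossed samples could be generated. It assumes the English and German datasets are aligned line by line, and that a crossed sample pairs the first sentence in one language with the second sentence in the other, in both directions. The function name `cross_languages`, the `Sample` type, and the example sentences and score are hypothetical and are not taken from the repository.

```python
from typing import List, Tuple

# One STS sample: (sentence_1, sentence_2, similarity_score).
Sample = Tuple[str, str, float]


def cross_languages(en: List[Sample], de: List[Sample]) -> List[Sample]:
    """Build crossed samples from two aligned datasets (hypothetical helper).

    Assumes en[i] and de[i] are translations of each other, so both
    share the same similarity score.
    """
    if len(en) != len(de):
        raise ValueError("datasets must be aligned and of equal length")

    crossed: List[Sample] = []
    for (en_1, en_2, score), (de_1, de_2, _) in zip(en, de):
        crossed.append((en_1, de_2, score))  # English first, German second
        crossed.append((de_1, en_2, score))  # German first, English second
    return crossed


# Illustrative data only; the sentences and score are made up.
en_samples = [("A man is playing a guitar.", "A person plays an instrument.", 3.8)]
de_samples = [("Ein Mann spielt Gitarre.", "Eine Person spielt ein Instrument.", 3.8)]

train = en_samples + de_samples + cross_languages(en_samples, de_samples)
```

Under these assumptions the crossed samples add as many rows as the English and German sets combined, which matches the README's statement that the training data size doubled.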