Update README.md
README.md
# t5-base-dutch-demo

Created by [Yeb Havinga](https://www.linkedin.com/in/yeb-havinga-86530825/) & [Dat Nguyen](https://www.linkedin.com/in/dat-nguyen-49a641138/) during the [Hugging Face community week](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104).

This model is based on [t5-base-dutch](https://huggingface.co/flax-community/t5-base-dutch)
and fine-tuned to create summaries of news articles.

For a demo of the model, head over to the Hugging Face Spaces for the [Netherformer](https://huggingface.co/spaces/flax-community/netherformer) example application!
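
A minimal usage sketch is shown below. The repository id, the `summarize:` prefix, and the generation settings are illustrative assumptions, not values prescribed by this README:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repository id; adjust if the model is published under a different name.
model_name = "flax-community/t5-base-dutch-demo"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# If only Flax weights are available, add from_flax=True.
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = (
    "Het kabinet presenteerde vandaag nieuwe plannen om de woningnood aan te pakken. "
    "Gemeenten krijgen extra geld om sneller betaalbare huizen te bouwen."
)

# T5-style summarizers are often queried with a task prefix; whether this
# checkpoint needs one is an assumption here.
inputs = tokenizer("summarize: " + article, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, num_beams=4, max_length=64, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```
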
## Dataset

`t5-base-dutch-demo` is fine-tuned on three mixed news sources:

1. **CNN DailyMail**, translated to Dutch with MarianMT.
2. **XSUM**, translated to Dutch with MarianMT (a translation sketch follows the list).
3. News article summaries distilled from the nu.nl website.
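
The English-to-Dutch translation step for sources 1 and 2 can be reproduced along these lines; the `Helsinki-NLP/opus-mt-en-nl` checkpoint is an assumption, since this README does not name the exact MarianMT model that was used:

```python
from transformers import MarianMTModel, MarianTokenizer

# Assumed English -> Dutch MarianMT checkpoint; not specified in this README.
mt_name = "Helsinki-NLP/opus-mt-en-nl"
tokenizer = MarianTokenizer.from_pretrained(mt_name)
model = MarianMTModel.from_pretrained(mt_name)

def translate_to_dutch(texts):
    """Translate a batch of English texts to Dutch."""
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch, num_beams=4, max_length=512)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

print(translate_to_dutch(["The new law was approved by parliament on Tuesday."]))
```
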
## Training

The pre-trained model [t5-base-dutch](https://huggingface.co/flax-community/t5-base-dutch) was fine-tuned with a constant learning rate of 0.0005 and a batch size of 64 for 10,000 steps.
The performance of this model could be improved with longer training. Unfortunately, due to a bug, an earlier training run that had been started for 6 epochs did not save intermediate checkpoints, and it would not have finished before the TPU-VM became unavailable. With limited time left, fine-tuning was restarted without evaluation and run for only half an epoch (10,000 steps).
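
Restated as a configuration sketch, the reported hyperparameters look as follows. The original run used a JAX/Flax training script, so the `Seq2SeqTrainingArguments` form below is only an illustration, and anything not stated above (optimizer, output directory, checkpoint frequency) is an assumption:

```python
from transformers import Seq2SeqTrainingArguments

# Reported: constant learning rate 0.0005, batch size 64, 10,000 steps, no evaluation.
# Values not listed in the README (output_dir, save_steps, ...) are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-dutch-demo",
    learning_rate=5e-4,
    lr_scheduler_type="constant",
    per_device_train_batch_size=64,   # the README reports a total batch size of 64
    max_steps=10_000,                 # roughly half an epoch on the mixed dataset
    evaluation_strategy="no",
    save_steps=1_000,
    predict_with_generate=True,
)
```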