---
language:
- de
tags:
- question-generation
- german
- text2text-generation
- generated_from_trainer
datasets:
- lmqg/qg_dequad
metrics:
- bleu4
- f1
- rouge
- exact_match
model-index:
- name: german-jeopardy-mt5-base-128
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: lmqg/qg_dequad
      type: default
      args: default
    metrics:
    - name: BLEU-4
      type: bleu4
      value: 14.62
    - name: F1
      type: f1
      value: 39.47
    - name: ROUGE-1
      type: rouge1
      value: 40.45
    - name: ROUGE-2
      type: rouge2
      value: 21.49
    - name: ROUGE-L
      type: rougel
      value: 39.02
    - name: ROUGE-Lsum
      type: rougelsum
      value: 39.01
    - name: Exact Match
      type: exact_match
      value: 2.68
---
# german-jeopardy-mt5-base-128
This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set (ROUGE, Exact Match, and F1 are given as percentages):
- Loss: 1.56
- Brevity Penalty: 0.8709
- System Length: 18267
- Reference Length: 20793
- ROUGE-1: 40.45
- ROUGE-2: 21.49
- ROUGE-L: 39.02
- ROUGE-Lsum: 39.01
- Exact Match: 2.68
- BLEU: 14.62
- F1: 39.47
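
The brevity penalty listed above can be reproduced from the system and reference lengths. The following is a minimal sketch, assuming the standard BLEU definition (a penalty of exp(1 - r/c) when the system output is shorter than the reference, and 1 otherwise); the function name `brevity_penalty` is illustrative and not part of the training code.

```python
import math


def brevity_penalty(system_length: int, reference_length: int) -> float:
    """Standard BLEU brevity penalty (assumed definition, not taken from the training code)."""
    if system_length == 0:
        return 0.0
    if system_length >= reference_length:
        return 1.0
    # Output shorter than the reference: penalize exponentially.
    return math.exp(1.0 - reference_length / system_length)


# Lengths taken from the results list above.
bp = brevity_penalty(system_length=18267, reference_length=20793)
print(round(bp, 4))  # 0.8709
```

A value below 1 indicates that the generated questions are, on average, shorter than the reference questions.
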
## Model description
See [google/mt5-base](https://huggingface.co/google/mt5-base) for the model architecture.
The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.
## Intended uses & limitations
This model can be used for question generation on German text.
## Training and evaluation data
See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 7
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
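
The total train batch size follows from the per-device batch size and the gradient accumulation steps. A small sketch of that arithmetic, assuming a single device (as stated in the model description); the variable names are illustrative. The last lines derive a rough estimate of the number of training examples per epoch from the final row of the training log (step 1440 at epoch 19.79); because the logged epoch is rounded, treat this as an approximation only.

```python
per_device_batch = 4    # train_batch_size
grad_accum_steps = 32   # gradient_accumulation_steps
num_devices = 1         # assumption: single GPU, per the model description

# Number of examples that contribute to each optimizer step.
effective_batch = per_device_batch * grad_accum_steps * num_devices
print(effective_batch)  # 128

# Rough estimate only: examples seen per epoch, from the last logged step.
final_step, final_epoch = 1440, 19.79
examples_per_epoch = final_step * effective_batch / final_epoch
print(round(examples_per_epoch))
```

Gradient accumulation is what allows an effective batch of 128 to fit on one 24 GB card, at the cost of 32 forward/backward passes per optimizer step.
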
### Training results
| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
| 6.6905 | 0.99 | 72 | 2.0972 | 5515 | 1394 | 522 | 191 | 28172 | 25968 | 23764 | 21560 | 19.5762 | 5.3681 | 2.1966 | 0.8859 | 1.0 | 28172 | 21250 | 0.1942 | 0.0761 | 0.1837 | 0.1841 | 0.0 | 3.7816 | 11.2786 | 0.2106 |
| 2.4978 | 1.99 | 145 | 1.6211 | 7079 | 2339 | 1027 | 446 | 16544 | 14340 | 12136 | 9932 | 42.7889 | 16.311 | 8.4624 | 4.4905 | 0.7524 | 16544 | 21250 | 0.3097 | 0.1455 | 0.2971 | 0.2969 | 0.01 | 9.6021 | 12.0159 | 0.3032 |
| 2.1021 | 3.0 | 218 | 1.5342 | 7507 | 2637 | 1222 | 575 | 17211 | 15007 | 12803 | 10599 | 43.6175 | 17.5718 | 9.5446 | 5.425 | 0.7908 | 17211 | 21250 | 0.3304 | 0.1642 | 0.3172 | 0.3171 | 0.0141 | 11.162 | 12.6375 | 0.3228 |
| 1.9208 | 4.0 | 291 | 1.4862 | 7599 | 2755 | 1296 | 620 | 16871 | 14667 | 12463 | 10259 | 45.0418 | 18.7837 | 10.3988 | 6.0435 | 0.7714 | 16871 | 21250 | 0.3377 | 0.1721 | 0.3232 | 0.3229 | 0.015 | 11.7136 | 12.3938 | 0.33 |
| 1.8135 | 4.99 | 363 | 1.4626 | 7831 | 2955 | 1424 | 694 | 17184 | 14980 | 12776 | 10572 | 45.5715 | 19.7263 | 11.1459 | 6.5645 | 0.7893 | 17184 | 21250 | 0.3497 | 0.1837 | 0.3358 | 0.3354 | 0.0177 | 12.6402 | 12.6366 | 0.3417 |
| 1.6907 | 5.99 | 436 | 1.4392 | 7872 | 3023 | 1482 | 740 | 16907 | 14703 | 12499 | 10295 | 46.5606 | 20.5604 | 11.8569 | 7.188 | 0.7735 | 16907 | 21250 | 0.3566 | 0.1896 | 0.3432 | 0.343 | 0.0177 | 13.0722 | 12.564 | 0.3483 |
| 1.6159 | 6.99 | 509 | 1.4288 | 7981 | 3128 | 1542 | 773 | 17016 | 14812 | 12608 | 10404 | 46.9029 | 21.118 | 12.2303 | 7.4298 | 0.7797 | 17016 | 21250 | 0.363 | 0.1952 | 0.3504 | 0.3502 | 0.0191 | 13.5053 | 12.5749 | 0.3543 |
| 1.556 | 8.0 | 582 | 1.4132 | 8014 | 3046 | 1496 | 748 | 17320 | 15116 | 12912 | 10708 | 46.2702 | 20.1508 | 11.5861 | 6.9854 | 0.797 | 17320 | 21250 | 0.3632 | 0.1903 | 0.3489 | 0.3491 | 0.0222 | 13.2095 | 12.7641 | 0.355 |
| 1.4951 | 9.0 | 655 | 1.3926 | 8342 | 3271 | 1622 | 819 | 17178 | 14974 | 12770 | 10566 | 48.5621 | 21.8445 | 12.7016 | 7.7513 | 0.789 | 17178 | 21250 | 0.3843 | 0.2059 | 0.3704 | 0.3704 | 0.0218 | 14.1831 | 12.7654 | 0.3769 |
| 1.4522 | 9.99 | 727 | 1.3769 | 8639 | 3449 | 1740 | 891 | 17708 | 15504 | 13300 | 11096 | 48.7859 | 22.2459 | 13.0827 | 8.0299 | 0.8187 | 17708 | 21250 | 0.3972 | 0.2129 | 0.3821 | 0.3823 | 0.024 | 15.0442 | 13.1016 | 0.3895 |
| 1.3663 | 10.99 | 800 | 1.3677 | 8736 | 3468 | 1747 | 924 | 17674 | 15470 | 13266 | 11062 | 49.4285 | 22.4176 | 13.169 | 8.3529 | 0.8168 | 17674 | 21250 | 0.4027 | 0.215 | 0.3871 | 0.387 | 0.0245 | 15.2622 | 13.0399 | 0.3946 |
| 1.3122 | 11.99 | 873 | 1.3521 | 8833 | 3533 | 1780 | 915 | 17927 | 15723 | 13519 | 11315 | 49.272 | 22.4703 | 13.1667 | 8.0866 | 0.8308 | 17927 | 21250 | 0.4055 | 0.219 | 0.3915 | 0.3915 | 0.0222 | 15.3943 | 13.3494 | 0.3975 |
| 1.2641 | 13.0 | 946 | 1.3494 | 9048 | 3668 | 1864 | 989 | 18242 | 16038 | 13834 | 11630 | 49.5998 | 22.8707 | 13.474 | 8.5039 | 0.848 | 18242 | 21250 | 0.4165 | 0.2265 | 0.4011 | 0.401 | 0.0268 | 16.1011 | 13.5508 | 0.408 |
| 1.2359 | 13.99 | 1018 | 1.3488 | 9075 | 3709 | 1907 | 1013 | 18098 | 15894 | 13690 | 11486 | 50.1437 | 23.3359 | 13.9299 | 8.8194 | 0.8402 | 18098 | 21250 | 0.4195 | 0.2298 | 0.4041 | 0.4038 | 0.0259 | 16.3595 | 13.5681 | 0.4113 |
| 1.1754 | 14.99 | 1091 | 1.3482 | 9182 | 3777 | 1957 | 1048 | 18366 | 16162 | 13958 | 11754 | 49.9946 | 23.3696 | 14.0206 | 8.9161 | 0.8547 | 18366 | 21250 | 0.4227 | 0.2314 | 0.406 | 0.4058 | 0.0268 | 16.7083 | 13.6534 | 0.4145 |
| 1.1367 | 15.99 | 1164 | 1.3501 | 9164 | 3761 | 1935 | 1033 | 18310 | 16106 | 13902 | 11698 | 50.0492 | 23.3515 | 13.9189 | 8.8306 | 0.8517 | 18310 | 21250 | 0.4225 | 0.2316 | 0.4078 | 0.4079 | 0.0245 | 16.5803 | 13.6152 | 0.4147 |
| 1.096 | 17.0 | 1237 | 1.3586 | 9126 | 3712 | 1922 | 1050 | 18277 | 16073 | 13869 | 11665 | 49.9316 | 23.0946 | 13.8582 | 9.0013 | 0.8499 | 18277 | 21250 | 0.4217 | 0.2304 | 0.4066 | 0.4066 | 0.0295 | 16.5513 | 13.6325 | 0.4141 |
| 1.0571 | 18.0 | 1310 | 1.3658 | 9087 | 3707 | 1923 | 1033 | 18179 | 15975 | 13771 | 11567 | 49.9862 | 23.205 | 13.9641 | 8.9306 | 0.8446 | 18179 | 21250 | 0.4196 | 0.2301 | 0.4049 | 0.4049 | 0.029 | 16.4708 | 13.5172 | 0.4116 |
| 1.036 | 18.99 | 1382 | 1.3672 | 9206 | 3806 | 1976 | 1059 | 18332 | 16128 | 13924 | 11720 | 50.2182 | 23.5987 | 14.1913 | 9.0358 | 0.8528 | 18332 | 21250 | 0.4254 | 0.2348 | 0.4106 | 0.4107 | 0.0309 | 16.8386 | 13.7205 | 0.4174 |
| 0.9785 | 19.79 | 1440 | 1.3819 | 9180 | 3796 | 1973 | 1059 | 18164 | 15960 | 13756 | 11552 | 50.5395 | 23.7845 | 14.3428 | 9.1672 | 0.8438 | 18164 | 21250 | 0.4254 | 0.2344 | 0.4116 | 0.4117 | 0.0327 | 16.8234 | 13.5113 | 0.4172 |
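
The BLEU column can be cross-checked against the precision and brevity penalty columns. The sketch below assumes the standard BLEU formula (brevity penalty times the geometric mean of the four n-gram precisions, with precisions in percent); the helper `bleu_from_precisions` is hypothetical and only illustrates the relationship between the columns.

```python
import math


def bleu_from_precisions(precisions, brevity_penalty):
    """BLEU as brevity penalty times the geometric mean of n-gram precisions (in percent)."""
    if min(precisions) <= 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty * math.exp(log_mean)


# Values from the final row of the table (step 1440).
precisions = [50.5395, 23.7845, 14.3428, 9.1672]
bleu = bleu_from_precisions(precisions, brevity_penalty=0.8438)
print(round(bleu, 2))  # 16.82
```

The small difference from the logged 16.8234 comes from using the rounded brevity penalty shown in the table.
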
### Framework versions
- Transformers 4.32.1
- PyTorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3