---
language:
- de
tags:
- question-generation
- german
- text2text-generation
- generated_from_trainer
datasets:
- lmqg/qg_dequad
metrics:
- bleu4
- f1
- rouge
- exact_match
model-index:
- name: german-jeopardy-mt5-large
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: lmqg/qg_dequad
      type: default
      args: default
    metrics:
    - name: BLEU-4
      type: bleu4
      value: 15.09
    - name: F1
      type: f1
      value: 40.69
    - name: ROUGE-1
      type: rouge1
      value: 41.68
    - name: ROUGE-2
      type: rouge2
      value: 22.07
    - name: ROUGE-L
      type: rougel
      value: 40.2
    - name: ROUGE-Lsum
      type: rougelsum
      value: 40.19
    - name: Exact Match
      type: exact_match
      value: 2.77
---

# german-jeopardy-mt5-large-1k-64-constant
This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset. It achieves the following results on the evaluation set:
- Loss: 1.8162
- Brevity Penalty: 0.9152
- System Length: 19102
- Reference Length: 20793
- ROUGE-1: 41.68
- ROUGE-2: 22.07
- ROUGE-L: 40.20
- ROUGE-Lsum: 40.19
- Exact Match: 2.77
- BLEU: 15.09
- F1: 40.69
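The brevity penalty above follows the standard BLEU definition, BP = exp(1 − ref_len / sys_len) when the system output is shorter than the reference; plugging in the system and reference lengths reported here reproduces the listed value up to rounding:

```python
import math


def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """BLEU brevity penalty: discounts outputs shorter than the reference."""
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)


# Lengths taken from the evaluation summary above.
print(round(brevity_penalty(sys_len=19102, ref_len=20793), 3))  # 0.915
```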
## Model description

See [google/mt5-large](https://huggingface.co/google/mt5-large) for the model architecture. The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.
## Intended uses & limitations

This model can be used for question generation on German text.
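Inference with this checkpoint might look like the sketch below. Both the repository id and the answer-highlighting input convention are assumptions (this card does not document the expected input format); the `<hl>` highlight style is borrowed from lmqg-style question-generation models and must be adjusted to match the actual training setup.

```python
def highlight_answer(context: str, answer: str) -> str:
    """Mark the answer span in the context with <hl> tokens.

    NOTE: this highlight-style input format is an assumption, not
    confirmed by this model card.
    """
    return context.replace(answer, f"<hl> {answer} <hl>", 1)


def generate_question(context: str, answer: str) -> str:
    """Load the model lazily and generate one question (untested sketch)."""
    from transformers import pipeline  # heavy import kept out of module scope

    generator = pipeline(
        "text2text-generation",
        model="german-jeopardy-mt5-large",  # hypothetical repo id -- replace
    )
    result = generator(highlight_answer(context, answer), max_length=64)
    return result[0]["generated_text"]


context = "Die Donau ist mit 2857 Kilometern der zweitlängste Fluss Europas."
print(highlight_answer(context, "Donau"))
```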
## Training and evaluation data

See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 7
- gradient_accumulation_steps: 64
- total_train_batch_size: 64
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
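The per-device batch size of 1 combined with 64 gradient-accumulation steps yields the effective batch size of 64 listed above; together with the ~145 optimizer steps logged per epoch, this implies a training split of roughly 9,300 examples (a back-of-the-envelope check, not a figure stated in this card):

```python
train_batch_size = 1
gradient_accumulation_steps = 64

# Effective (total) batch size seen by each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 64

# ~145 optimizer steps per epoch (from the training log below) implies
# roughly this many training examples per epoch.
steps_per_epoch = 145
print(steps_per_epoch * total_train_batch_size)  # 9280
```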
### Training results
| Training Loss | Epoch | Step | BLEU | Brevity Penalty | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Exact Match | F1 | Mean Generated Length | Validation Loss | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | System Length | Totals 1 | Totals 2 | Totals 3 | Totals 4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.732 | 1.0 | 145 | 12.4473 | 0.7805 | 7779 | 2893 | 1393 | 685 | 0.0168 | 0.3393 | 12.2523 | 1.2989 | 45.6809 | 19.5143 | 11.0372 | 6.5758 | 21250 | 0.3487 | 0.1796 | 0.3329 | 0.3327 | 17029 | 17029 | 14825 | 12621 | 10417 |
| 1.5514 | 2.0 | 291 | 14.7663 | 0.7871 | 8297 | 3336 | 1711 | 899 | 0.025 | 0.3743 | 12.441 | 1.2100 | 48.3931 | 22.3278 | 13.4333 | 8.5351 | 21250 | 0.3839 | 0.2089 | 0.3688 | 0.369 | 17145 | 17145 | 14941 | 12737 | 10533 |
| 1.3546 | 3.0 | 435 | 16.3903 | 0.7798 | 8930 | 3713 | 1905 | 1022 | 0.034 | 0.4155 | 12.6021 | 1.1428 | 52.4739 | 25.0641 | 15.1071 | 9.8213 | 21250 | 0.4225 | 0.2345 | 0.4075 | 0.4074 | 17018 | 17018 | 14814 | 12610 | 10406 |
| 1.1969 | 4.0 | 581 | 17.8161 | 0.8441 | 9456 | 3994 | 2096 | 1157 | 0.0386 | 0.4334 | 13.4061 | 1.1113 | 52.039 | 25.0141 | 15.2292 | 10.0095 | 21250 | 0.4409 | 0.246 | 0.4251 | 0.4251 | 18171 | 18171 | 15967 | 13763 | 11559 |
| 1.0876 | 5.0 | 726 | 18.6911 | 0.8446 | 9606 | 4162 | 2233 | 1243 | 0.0377 | 0.443 | 13.5599 | 1.1032 | 52.8412 | 26.0532 | 16.2152 | 10.7461 | 21250 | 0.4504 | 0.2571 | 0.4356 | 0.4357 | 18179 | 18179 | 15975 | 13771 | 11567 |
| 0.9881 | 6.0 | 872 | 18.7071 | 0.8481 | 9608 | 4167 | 2235 | 1246 | 0.044 | 0.4429 | 13.6978 | 1.1119 | 52.661 | 25.9772 | 16.1523 | 10.7109 | 21250 | 0.4505 | 0.2567 | 0.4348 | 0.4349 | 18245 | 18245 | 16041 | 13837 | 11633 |
| 0.9142 | 7.0 | 1017 | 19.3053 | 0.8506 | 9757 | 4285 | 2311 | 1310 | 0.0495 | 0.451 | 13.5826 | 1.1106 | 53.3432 | 26.6364 | 16.6463 | 11.2167 | 21250 | 0.4587 | 0.2641 | 0.4427 | 0.443 | 18291 | 18291 | 16087 | 13883 | 11679 |
| 0.8323 | 8.0 | 1163 | 19.4102 | 0.8507 | 9757 | 4300 | 2341 | 1317 | 0.0472 | 0.4513 | 13.6239 | 1.1327 | 53.3373 | 26.7263 | 16.8599 | 11.2747 | 21250 | 0.4587 | 0.2662 | 0.4429 | 0.4426 | 18293 | 18293 | 16089 | 13885 | 11681 |
| 0.7742 | 9.0 | 1308 | 19.3574 | 0.8497 | 9757 | 4273 | 2324 | 1320 | 0.049 | 0.451 | 13.5944 | 1.1574 | 53.3957 | 26.5916 | 16.7616 | 11.3198 | 21250 | 0.4585 | 0.2653 | 0.4431 | 0.443 | 18273 | 18273 | 16069 | 13865 | 11661 |
| 0.7101 | 10.0 | 1454 | 20.1003 | 0.8694 | 9861 | 4403 | 2438 | 1416 | 0.0531 | 0.4525 | 13.9133 | 1.1674 | 52.8995 | 26.7871 | 17.1292 | 11.7716 | 21250 | 0.4594 | 0.2689 | 0.444 | 0.4435 | 18641 | 18641 | 16437 | 14233 | 12029 |
| 0.6642 | 10.99 | 1599 | 19.655 | 0.8558 | 9868 | 4380 | 2358 | 1337 | 0.0476 | 0.4551 | 13.9142 | 1.1889 | 53.6713 | 27.0671 | 16.8694 | 11.3555 | 21250 | 0.4622 | 0.2694 | 0.4469 | 0.4466 | 18386 | 18386 | 16182 | 13978 | 11774 |
| 0.6067 | 12.0 | 1745 | 19.9169 | 0.8828 | 9872 | 4384 | 2408 | 1395 | 0.0472 | 0.4489 | 14.2482 | 1.2207 | 52.2494 | 26.2672 | 16.6229 | 11.3581 | 21250 | 0.4569 | 0.2667 | 0.441 | 0.4408 | 18894 | 18894 | 16690 | 14486 | 12282 |
| 0.5684 | 12.99 | 1890 | 19.5451 | 0.8831 | 9870 | 4356 | 2360 | 1329 | 0.0485 | 0.4506 | 14.2432 | 1.2587 | 52.2195 | 26.0885 | 16.2837 | 10.8145 | 21250 | 0.4581 | 0.2651 | 0.4414 | 0.4409 | 18901 | 18901 | 16697 | 14493 | 12289 |
| 0.5288 | 14.0 | 2036 | 19.6648 | 0.8547 | 9815 | 4360 | 2389 | 1335 | 0.0454 | 0.4504 | 13.7432 | 1.2804 | 53.4382 | 26.9752 | 17.1144 | 11.3569 | 21250 | 0.4592 | 0.2671 | 0.4443 | 0.4436 | 18367 | 18367 | 16163 | 13959 | 11755 |
| 0.4902 | 14.99 | 2181 | 19.8138 | 0.8766 | 9886 | 4407 | 2398 | 1359 | 0.0495 | 0.451 | 14.1225 | 1.3211 | 52.6495 | 26.5914 | 16.6887 | 11.1714 | 21250 | 0.4582 | 0.2674 | 0.4426 | 0.4421 | 18777 | 18777 | 16573 | 14369 | 12165 |
| 0.4498 | 16.0 | 2327 | 20.0703 | 0.909 | 10008 | 4477 | 2456 | 1381 | 0.0476 | 0.4491 | 14.3725 | 1.3621 | 51.5903 | 26.0366 | 16.3832 | 10.8 | 21250 | 0.4569 | 0.2679 | 0.4415 | 0.4412 | 19399 | 19399 | 17195 | 14991 | 12787 |
| 0.4216 | 16.99 | 2472 | 20.1319 | 0.8948 | 10016 | 4483 | 2455 | 1385 | 0.0481 | 0.4531 | 14.3008 | 1.3967 | 52.3712 | 26.4937 | 16.6814 | 11.0685 | 21250 | 0.4615 | 0.2705 | 0.4457 | 0.4451 | 19125 | 19125 | 16921 | 14717 | 12513 |
| 0.3829 | 18.0 | 2618 | 19.8508 | 0.9123 | 9976 | 4407 | 2412 | 1374 | 0.0476 | 0.4479 | 14.7046 | 1.4460 | 51.2536 | 25.533 | 16.0202 | 10.6909 | 21250 | 0.4556 | 0.2627 | 0.4387 | 0.4385 | 19464 | 19464 | 17260 | 15056 | 12852 |
| 0.3551 | 19.0 | 2764 | 20.0572 | 0.8952 | 10010 | 4451 | 2438 | 1385 | 0.0463 | 0.4523 | 14.3807 | 1.4725 | 52.3235 | 26.2953 | 16.5591 | 11.0632 | 21250 | 0.4606 | 0.2672 | 0.4438 | 0.4434 | 19131 | 19131 | 16927 | 14723 | 12519 |
| 0.3301 | 19.93 | 2900 | 19.8047 | 0.8816 | 9858 | 4378 | 2406 | 1368 | 0.0495 | 0.4483 | 14.2795 | 1.5030 | 52.2361 | 26.2659 | 16.6344 | 11.1582 | 21250 | 0.4569 | 0.2644 | 0.4412 | 0.4405 | 18872 | 18872 | 16668 | 14464 | 12260 |
### Framework versions
- Transformers 4.32.1
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3