# oop-de-qg-flan-t5-base-v4
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.8388
- Rouge1: 59.6054
- Rouge2: 46.5045
- RougeL: 58.1566
- RougeLsum: 58.1981
- Gen Len: 14.5287
- Bleu: 0.3568
- Precisions (1- to 4-gram): [0.6571719226856562, 0.4774637127578304, 0.3935286401399213, 0.32975460122699385]
- Brevity Penalty: 0.7943
- Length Ratio: 0.8128
- Translation Length: 2949
- Reference Length: 3628
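
The intended prompt format is not documented, so the following is only a minimal usage sketch with the standard `transformers` seq2seq API; the German example context is hypothetical, and the model id is taken from this repository.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "LunaticTanuki/oop-de-qg-flan-t5-base-v4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input; the expected prompt format is not documented.
context = (
    "Vererbung erlaubt es einer Klasse, Attribute und Methoden "
    "einer anderen Klasse zu übernehmen."
)
inputs = tokenizer(context, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```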
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 40
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
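
For reference, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as sketched below; the output directory is hypothetical, and the listed Adam betas and epsilon are the library defaults, so they need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative sketch only: the dataset and preprocessing are not documented.
training_args = Seq2SeqTrainingArguments(
    output_dir="oop-de-qg-flan-t5-base-v4",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    gradient_accumulation_steps=4,  # effective train batch size of 40
    num_train_epochs=8,
    lr_scheduler_type="linear",
    seed=42,
)
```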
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len | Bleu | Precisions (1- to 4-gram) | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 58 | 0.9821 | 56.9689 | 42.8079 | 55.2974 | 55.3949 | 14.6133 | 0.3163 | [0.6255566974991436, 0.43508500772797526, 0.34603455914931325, 0.2808930425752856] | 0.7844 | 0.8046 | 2919 | 3628 |
| No log | 1.99 | 116 | 0.9155 | 57.2587 | 43.5697 | 55.6919 | 55.8245 | 14.4018 | 0.3208 | [0.629286694101509, 0.43945841392649904, 0.35226264418811004, 0.28861154446177845] | 0.7834 | 0.8037 | 2916 | 3628 |
| No log | 2.99 | 174 | 0.8738 | 58.2245 | 44.9681 | 56.8015 | 56.9123 | 14.2719 | 0.3356 | [0.6483021483021483, 0.461839530332681, 0.37634892086330934, 0.31484416270470156] | 0.7733 | 0.7955 | 2886 | 3628 |
| No log | 4.0 | 233 | 0.8636 | 59.608 | 46.4603 | 58.1054 | 58.1991 | 14.4381 | 0.3494 | [0.6566037735849056, 0.47794117647058826, 0.38925876608965826, 0.32466181061394384] | 0.7830 | 0.8035 | 2915 | 3628 |
| No log | 5.0 | 291 | 0.8460 | 59.0765 | 45.7893 | 57.418 | 57.573 | 14.5196 | 0.3450 | [0.6488601565158217, 0.46702453987730064, 0.3790074659639877, 0.31500513874614594] | 0.7910 | 0.8101 | 2939 | 3628 |
| No log | 5.99 | 349 | 0.8427 | 58.3394 | 44.9653 | 56.6741 | 56.7472 | 14.6254 | 0.3439 | [0.6412933647692826, 0.4586808188021228, 0.3740788903337668, 0.30870445344129555] | 0.8009 | 0.8184 | 2969 | 3628 |
| No log | 6.99 | 407 | 0.8398 | 59.5629 | 46.3188 | 58.0534 | 58.1472 | 14.5257 | 0.3549 | [0.65625, 0.47722923842326825, 0.39176161262050835, 0.32752434648898] | 0.7927 | 0.8115 | 2944 | 3628 |
| No log | 7.97 | 464 | 0.8388 | 59.6054 | 46.5045 | 58.1566 | 58.1981 | 14.5287 | 0.3568 | [0.6571719226856562, 0.4774637127578304, 0.3935286401399213, 0.32975460122699385] | 0.7943 | 0.8128 | 2949 | 3628 |
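
The ROUGE and BLEU columns can be recomputed with the `evaluate` library, whose `bleu` metric returns the same fields reported above (`bleu`, `precisions`, `brevity_penalty`, `length_ratio`, `translation_length`, `reference_length`); the predictions and references below are placeholders, not data from the actual evaluation set.

```python
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

# Placeholder data; the actual evaluation set is not documented.
predictions = ["Was ist Vererbung?"]
references = ["Was versteht man unter Vererbung?"]

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
```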
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1