---
license: apache-2.0
base_model: google-t5/t5-small
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: lm43-course
  results: []
---

# lm43-course
This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.9161
- Rouge1: 0.4161
- Rouge2: 0.1903
- Rougel: 0.2908
- Rougelsum: 0.2907
- Gen Len: 79.0133
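
The ROUGE numbers above are overlap-based summarization metrics; as a rough illustration of what Rouge1 measures, here is a minimal unigram-overlap F1 in plain Python. The `rouge1_f` helper is a sketch for this card only (whitespace tokenization; the real `rouge_score` package also lowercases, stems, and aggregates over the whole eval set):

```python
from collections import Counter

def rouge1_f(prediction: str, reference: str) -> float:
    """Unigram-overlap F1: the core of the Rouge1 score reported above."""
    pred_counts = Counter(prediction.split())
    ref_counts = Counter(reference.split())
    # Clipped overlap: each unigram counts at most as often as it appears in both.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f("the cat sat on the mat", "the cat lay on the mat"), 4))  # → 0.8333
```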
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
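
The hyperparameters above could be expressed as a `Seq2SeqTrainingArguments` sketch. This is an assumption-laden reconstruction, not the exact script: `output_dir` and the 100-step eval cadence (inferred from the results table below) are not stated in the card, and Adam's betas/epsilon here are just the defaults matching the listed values.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: output_dir and eval cadence are assumed, not taken from the card.
args = Seq2SeqTrainingArguments(
    output_dir="lm43-course",        # assumed
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=8,
    eval_strategy="steps",           # the table logs validation every 100 steps
    eval_steps=100,
    predict_with_generate=True,      # required to compute ROUGE on generated text
)
```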
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 1.3159 | 0.3195 | 100 | 1.8766 | 0.4184 | 0.1928 | 0.2864 | 0.2863 | 81.1667 |
| 1.3138 | 0.6390 | 200 | 1.8798 | 0.4202 | 0.1939 | 0.2903 | 0.2896 | 79.66 |
| 1.3551 | 0.9585 | 300 | 1.8812 | 0.4227 | 0.1944 | 0.2955 | 0.2949 | 78.4733 |
| 1.3084 | 1.2780 | 400 | 1.8913 | 0.4188 | 0.1901 | 0.2884 | 0.2877 | 81.12 |
| 1.2807 | 1.5974 | 500 | 1.9028 | 0.4155 | 0.1867 | 0.2832 | 0.2834 | 80.38 |
| 1.3219 | 1.9169 | 600 | 1.8966 | 0.4184 | 0.1935 | 0.2889 | 0.2886 | 80.56 |
| 1.3058 | 2.2364 | 700 | 1.9024 | 0.4114 | 0.1829 | 0.2857 | 0.2852 | 79.5 |
| 1.2941 | 2.5559 | 800 | 1.9028 | 0.4241 | 0.1911 | 0.2898 | 0.2894 | 82.3667 |
| 1.2649 | 2.8754 | 900 | 1.8978 | 0.4232 | 0.1954 | 0.2941 | 0.2939 | 79.2067 |
| 1.3272 | 3.1949 | 1000 | 1.9019 | 0.4235 | 0.1945 | 0.2917 | 0.2917 | 78.9667 |
| 1.2759 | 3.5144 | 1100 | 1.9102 | 0.4211 | 0.1955 | 0.2916 | 0.2915 | 79.24 |
| 1.2979 | 3.8339 | 1200 | 1.9041 | 0.4246 | 0.1964 | 0.2932 | 0.2926 | 79.5 |
| 1.2568 | 4.1534 | 1300 | 1.9104 | 0.4193 | 0.1919 | 0.2894 | 0.2892 | 80.6533 |
| 1.2749 | 4.4728 | 1400 | 1.9104 | 0.4157 | 0.1897 | 0.2863 | 0.2862 | 79.3667 |
| 1.2646 | 4.7923 | 1500 | 1.9126 | 0.4114 | 0.1827 | 0.281 | 0.2815 | 79.7333 |
| 1.2972 | 5.1118 | 1600 | 1.9099 | 0.4219 | 0.1937 | 0.29 | 0.29 | 80.4467 |
| 1.2578 | 5.4313 | 1700 | 1.9186 | 0.4219 | 0.193 | 0.2891 | 0.289 | 81.8733 |
| 1.3036 | 5.7508 | 1800 | 1.9180 | 0.4163 | 0.1885 | 0.2894 | 0.289 | 80.1333 |
| 1.2715 | 6.0703 | 1900 | 1.9160 | 0.4149 | 0.1886 | 0.2878 | 0.2877 | 80.3533 |
| 1.2504 | 6.3898 | 2000 | 1.9187 | 0.423 | 0.1953 | 0.2922 | 0.2922 | 80.22 |
| 1.3025 | 6.7093 | 2100 | 1.9166 | 0.4172 | 0.1884 | 0.2872 | 0.2871 | 80.5667 |
| 1.2842 | 7.0288 | 2200 | 1.9149 | 0.4147 | 0.1877 | 0.287 | 0.2873 | 79.22 |
| 1.2693 | 7.3482 | 2300 | 1.9171 | 0.4138 | 0.1883 | 0.2868 | 0.2868 | 80.4467 |
| 1.2936 | 7.6677 | 2400 | 1.9163 | 0.4122 | 0.1882 | 0.2883 | 0.2883 | 79.2533 |
| 1.2776 | 7.9872 | 2500 | 1.9161 | 0.4161 | 0.1903 | 0.2908 | 0.2907 | 79.0133 |
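
Note that the validation loss is lowest at the very first eval (step 100, 1.8766) and drifts upward over the remaining epochs, which suggests mild overfitting past epoch 1. A quick scan over (step, validation_loss) pairs pulled from the table above shows this; the `rows` list here is just a hand-copied subset for illustration:

```python
# (step, validation_loss) pairs copied from the first and last rows of the table.
rows = [
    (100, 1.8766), (200, 1.8798), (300, 1.8812),
    (2400, 1.9163), (2500, 1.9161),
]
best_step, best_loss = min(rows, key=lambda r: r[1])
print(best_step, best_loss)  # → 100 1.8766
```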
### Framework versions
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1