lm43-course

This model is a fine-tuned version of google-t5/t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9161
  • Rouge1: 0.4161
  • Rouge2: 0.1903
  • RougeL: 0.2908
  • RougeLsum: 0.2907
  • Gen Len: 79.0133
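
The ROUGE scores above are reported as fractions in [0, 1]. A minimal sketch of how such scores are typically computed with the `evaluate` library (requires the `evaluate` and `rouge_score` packages; the predictions and references here are hypothetical placeholders):

```python
# Compute ROUGE scores for generated summaries against references.
import evaluate

rouge = evaluate.load("rouge")

# Hypothetical model outputs and gold references, for illustration only.
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

# Returns a dict keyed like the metrics above: rouge1, rouge2, rougeL, rougeLsum.
scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```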

Model description

More information needed

Intended uses & limitations

More information needed
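
The card does not document the task, but the ROUGE and generation-length metrics above are characteristic of summarization fine-tuning. Under that assumption, a minimal loading sketch (the model id is a placeholder for wherever this checkpoint is stored):

```python
from transformers import pipeline

# Placeholder id: substitute the hub repo id or local path of this checkpoint.
summarizer = pipeline("summarization", model="path/to/lm43-course")

# t5 checkpoints are often trained with a task prefix such as "summarize: ";
# whether one is needed depends on how the fine-tuning data was preprocessed.
text = "Long input document to be summarized ..."
print(summarizer(text, max_length=80)[0]["summary_text"])
```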

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
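
The optimizer and schedule listed above are the Transformers `Trainer` defaults. A minimal sketch of how these settings map onto `Seq2SeqTrainingArguments` (the output directory is a placeholder; the eval cadence is inferred from the 100-step intervals in the results table below):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="lm43-course",        # placeholder
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",      # Trainer default, as are Adam's betas/epsilon
    num_train_epochs=8,
    eval_strategy="steps",           # the table below reports eval every 100 steps
    eval_steps=100,
    predict_with_generate=True,      # needed to compute ROUGE / Gen Len at eval time
)
```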

Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 1.3159        | 0.3195 | 100  | 1.8766          | 0.4184 | 0.1928 | 0.2864 | 0.2863    | 81.1667 |
| 1.3138        | 0.6390 | 200  | 1.8798          | 0.4202 | 0.1939 | 0.2903 | 0.2896    | 79.66   |
| 1.3551        | 0.9585 | 300  | 1.8812          | 0.4227 | 0.1944 | 0.2955 | 0.2949    | 78.4733 |
| 1.3084        | 1.2780 | 400  | 1.8913          | 0.4188 | 0.1901 | 0.2884 | 0.2877    | 81.12   |
| 1.2807        | 1.5974 | 500  | 1.9028          | 0.4155 | 0.1867 | 0.2832 | 0.2834    | 80.38   |
| 1.3219        | 1.9169 | 600  | 1.8966          | 0.4184 | 0.1935 | 0.2889 | 0.2886    | 80.56   |
| 1.3058        | 2.2364 | 700  | 1.9024          | 0.4114 | 0.1829 | 0.2857 | 0.2852    | 79.5    |
| 1.2941        | 2.5559 | 800  | 1.9028          | 0.4241 | 0.1911 | 0.2898 | 0.2894    | 82.3667 |
| 1.2649        | 2.8754 | 900  | 1.8978          | 0.4232 | 0.1954 | 0.2941 | 0.2939    | 79.2067 |
| 1.3272        | 3.1949 | 1000 | 1.9019          | 0.4235 | 0.1945 | 0.2917 | 0.2917    | 78.9667 |
| 1.2759        | 3.5144 | 1100 | 1.9102          | 0.4211 | 0.1955 | 0.2916 | 0.2915    | 79.24   |
| 1.2979        | 3.8339 | 1200 | 1.9041          | 0.4246 | 0.1964 | 0.2932 | 0.2926    | 79.5    |
| 1.2568        | 4.1534 | 1300 | 1.9104          | 0.4193 | 0.1919 | 0.2894 | 0.2892    | 80.6533 |
| 1.2749        | 4.4728 | 1400 | 1.9104          | 0.4157 | 0.1897 | 0.2863 | 0.2862    | 79.3667 |
| 1.2646        | 4.7923 | 1500 | 1.9126          | 0.4114 | 0.1827 | 0.281  | 0.2815    | 79.7333 |
| 1.2972        | 5.1118 | 1600 | 1.9099          | 0.4219 | 0.1937 | 0.29   | 0.29      | 80.4467 |
| 1.2578        | 5.4313 | 1700 | 1.9186          | 0.4219 | 0.193  | 0.2891 | 0.289     | 81.8733 |
| 1.3036        | 5.7508 | 1800 | 1.9180          | 0.4163 | 0.1885 | 0.2894 | 0.289     | 80.1333 |
| 1.2715        | 6.0703 | 1900 | 1.9160          | 0.4149 | 0.1886 | 0.2878 | 0.2877    | 80.3533 |
| 1.2504        | 6.3898 | 2000 | 1.9187          | 0.423  | 0.1953 | 0.2922 | 0.2922    | 80.22   |
| 1.3025        | 6.7093 | 2100 | 1.9166          | 0.4172 | 0.1884 | 0.2872 | 0.2871    | 80.5667 |
| 1.2842        | 7.0288 | 2200 | 1.9149          | 0.4147 | 0.1877 | 0.287  | 0.2873    | 79.22   |
| 1.2693        | 7.3482 | 2300 | 1.9171          | 0.4138 | 0.1883 | 0.2868 | 0.2868    | 80.4467 |
| 1.2936        | 7.6677 | 2400 | 1.9163          | 0.4122 | 0.1882 | 0.2883 | 0.2883    | 79.2533 |
| 1.2776        | 7.9872 | 2500 | 1.9161          | 0.4161 | 0.1903 | 0.2908 | 0.2907    | 79.0133 |

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
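
A minimal sketch for checking an installed environment against the versions above:

```python
# Compare installed versions against the ones this model was trained with.
import datasets
import tokenizers
import torch
import transformers

expected = {
    transformers: "4.41.2",
    torch: "2.3.0+cu121",
    datasets: "2.20.0",
    tokenizers: "0.19.1",
}
for module, trained_with in expected.items():
    print(f"{module.__name__}: installed {module.__version__}, trained with {trained_with}")
```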