metadata
license: mit
base_model: gpt2-medium
tags:
  - generated_from_trainer
model-index:
  - name: gpt2-medium-custom-functions-dataset-python
    results: []

gpt2-medium-custom-functions-dataset-python

This model is a fine-tuned version of gpt2-medium. The training dataset was not recorded (the card lists its name as None). It achieves the following results on the evaluation set:

  • Loss: 0.4735
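
For a more intuitive reading of this number, the loss can be converted to perplexity. The sketch below assumes the reported value is the mean token-level cross-entropy in nats, which is the usual convention for causal language models but is not stated on this card.

```python
import math

# Final evaluation loss reported above
# (assumed to be mean cross-entropy in nats).
eval_loss = 0.4735

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the evaluation perplexity is roughly 1.61.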

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
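
As a rough illustration of what the linear schedule implies, the sketch below computes the learning rate after a given number of optimizer steps. It assumes zero warmup steps (the card lists none) and 430 total steps (the last step in the training results). `linear_lr` is a hypothetical helper written for this example, not a Transformers function.

```python
def linear_lr(step: int,
              base_lr: float = 2e-5,
              total_steps: int = 430,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase (unused here because warmup_steps is assumed to be 0).
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale by the fraction of post-warmup steps still remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Start, midpoint, and end of training.
for step in (0, 215, 430):
    print(step, linear_lr(step))
```

With these assumptions the rate starts at 2e-05, is halved by step 215, and reaches zero at step 430.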

Training results

Training Loss Epoch Step Validation Loss
2.6637 0.02 1 2.1553
2.4565 0.05 2 2.0239
2.1968 0.07 3 1.9137
2.2327 0.09 4 1.8194
2.0672 0.12 5 1.7425
1.9292 0.14 6 1.6721
1.8293 0.16 7 1.6049
1.71 0.19 8 1.5385
1.8569 0.21 9 1.4786
1.7208 0.23 10 1.4236
1.6461 0.26 11 1.3770
1.6146 0.28 12 1.3361
1.5799 0.3 13 1.3006
1.515 0.33 14 1.2690
1.4448 0.35 15 1.2488
1.2871 0.37 16 1.2254
1.6566 0.4 17 1.1972
1.4823 0.42 18 1.1638
1.4655 0.44 19 1.1379
1.3227 0.47 20 1.1172
1.4135 0.49 21 1.0973
1.4835 0.51 22 1.0784
1.401 0.53 23 1.0607
1.3294 0.56 24 1.0455
1.4781 0.58 25 1.0302
1.1167 0.6 26 1.0153
1.3876 0.63 27 1.0017
1.1708 0.65 28 0.9911
1.2199 0.67 29 0.9833
1.2328 0.7 30 0.9709
1.5262 0.72 31 0.9599
1.1906 0.74 32 0.9501
1.191 0.77 33 0.9404
1.0422 0.79 34 0.9291
1.277 0.81 35 0.9183
1.1522 0.84 36 0.9092
1.1841 0.86 37 0.9006
1.2538 0.88 38 0.8931
1.1318 0.91 39 0.8862
1.012 0.93 40 0.8807
1.0553 0.95 41 0.8753
1.0566 0.98 42 0.8691
1.1235 1.0 43 0.8638
1.1207 1.02 44 0.8591
1.0835 1.05 45 0.8544
1.3731 1.07 46 0.8505
0.9843 1.09 47 0.8450
0.9201 1.12 48 0.8385
1.0392 1.14 49 0.8340
1.1158 1.16 50 0.8297
0.8518 1.19 51 0.8247
0.8871 1.21 52 0.8185
1.0378 1.23 53 0.8124
1.1116 1.26 54 0.8083
1.0364 1.28 55 0.8043
0.8949 1.3 56 0.7988
1.068 1.33 57 0.7925
0.9319 1.35 58 0.7859
0.7654 1.37 59 0.7818
0.8887 1.4 60 0.7787
1.0294 1.42 61 0.7748
1.1351 1.44 62 0.7711
0.998 1.47 63 0.7689
1.1106 1.49 64 0.7679
0.9606 1.51 65 0.7660
0.9273 1.53 66 0.7628
0.9725 1.56 67 0.7595
1.0205 1.58 68 0.7569
1.0131 1.6 69 0.7549
0.9203 1.63 70 0.7530
0.898 1.65 71 0.7508
0.817 1.67 72 0.7478
0.9439 1.7 73 0.7447
1.079 1.72 74 0.7427
0.9806 1.74 75 0.7398
1.261 1.77 76 0.7369
1.0824 1.79 77 0.7340
0.9523 1.81 78 0.7317
0.9734 1.84 79 0.7300
1.0786 1.86 80 0.7302
0.8675 1.88 81 0.7298
0.851 1.91 82 0.7279
1.066 1.93 83 0.7254
1.137 1.95 84 0.7239
1.1387 1.98 85 0.7224
0.739 2.0 86 0.7207
0.8809 2.02 87 0.7192
1.0253 2.05 88 0.7178
0.8942 2.07 89 0.7160
0.8436 2.09 90 0.7134
0.8356 2.12 91 0.7115
0.9951 2.14 92 0.7110
0.7637 2.16 93 0.7098
0.722 2.19 94 0.7087
1.023 2.21 95 0.7072
0.7015 2.23 96 0.7044
0.8949 2.26 97 0.7017
0.9573 2.28 98 0.6996
0.8989 2.3 99 0.6987
0.9738 2.33 100 0.6983
0.8317 2.35 101 0.6970
0.9778 2.37 102 0.6951
0.7919 2.4 103 0.6924
0.653 2.42 104 0.6898
0.9133 2.44 105 0.6873
0.8521 2.47 106 0.6841
0.8673 2.49 107 0.6808
0.8792 2.51 108 0.6777
0.8635 2.53 109 0.6747
1.0299 2.56 110 0.6719
0.7554 2.58 111 0.6694
0.9195 2.6 112 0.6671
0.8374 2.63 113 0.6649
0.8847 2.65 114 0.6628
0.938 2.67 115 0.6615
0.8967 2.7 116 0.6603
0.8264 2.72 117 0.6594
0.9195 2.74 118 0.6591
0.8584 2.77 119 0.6588
0.8058 2.79 120 0.6578
1.0978 2.81 121 0.6560
0.7889 2.84 122 0.6544
0.7865 2.86 123 0.6527
0.8553 2.88 124 0.6507
0.9134 2.91 125 0.6486
0.7911 2.93 126 0.6463
0.9675 2.95 127 0.6439
0.761 2.98 128 0.6417
0.6347 3.0 129 0.6394
0.7608 3.02 130 0.6368
0.7563 3.05 131 0.6352
0.8059 3.07 132 0.6333
0.8825 3.09 133 0.6320
0.7952 3.12 134 0.6307
0.9209 3.14 135 0.6299
0.8556 3.16 136 0.6295
0.8613 3.19 137 0.6289
0.7908 3.21 138 0.6288
0.7728 3.23 139 0.6285
0.707 3.26 140 0.6280
0.8353 3.28 141 0.6270
0.9482 3.3 142 0.6265
0.726 3.33 143 0.6260
0.7594 3.35 144 0.6250
0.9403 3.37 145 0.6237
0.8986 3.4 146 0.6218
0.7309 3.42 147 0.6204
0.8011 3.44 148 0.6197
0.7373 3.47 149 0.6193
0.6195 3.49 150 0.6174
0.8668 3.51 151 0.6154
0.8096 3.53 152 0.6136
0.9364 3.56 153 0.6116
0.7081 3.58 154 0.6105
0.7799 3.6 155 0.6091
0.7862 3.63 156 0.6090
0.7221 3.65 157 0.6097
0.7605 3.67 158 0.6090
0.7481 3.7 159 0.6071
0.776 3.72 160 0.6045
0.9396 3.74 161 0.6022
0.7166 3.77 162 0.6001
0.709 3.79 163 0.5985
0.8412 3.81 164 0.5970
0.7692 3.84 165 0.5956
0.7621 3.86 166 0.5942
0.7832 3.88 167 0.5930
0.7455 3.91 168 0.5919
0.7888 3.93 169 0.5913
0.7197 3.95 170 0.5908
0.7936 3.98 171 0.5900
0.5976 4.0 172 0.5890
0.6375 4.02 173 0.5874
0.7342 4.05 174 0.5859
0.644 4.07 175 0.5845
0.7232 4.09 176 0.5831
0.7743 4.12 177 0.5819
0.8015 4.14 178 0.5808
0.7475 4.16 179 0.5801
0.7005 4.19 180 0.5797
0.7032 4.21 181 0.5795
0.8204 4.23 182 0.5789
0.7674 4.26 183 0.5787
0.7219 4.28 184 0.5781
0.624 4.3 185 0.5771
0.7429 4.33 186 0.5755
0.6445 4.35 187 0.5730
0.7782 4.37 188 0.5712
0.7882 4.4 189 0.5698
0.7005 4.42 190 0.5687
0.7509 4.44 191 0.5678
0.6764 4.47 192 0.5671
0.6529 4.49 193 0.5667
0.6101 4.51 194 0.5668
0.8211 4.53 195 0.5674
0.7529 4.56 196 0.5667
0.8615 4.58 197 0.5651
0.8099 4.6 198 0.5641
0.7145 4.63 199 0.5635
0.7437 4.65 200 0.5632
0.873 4.67 201 0.5631
0.7937 4.7 202 0.5620
0.7493 4.72 203 0.5608
0.7614 4.74 204 0.5596
0.6642 4.77 205 0.5585
0.5854 4.79 206 0.5576
0.6442 4.81 207 0.5572
0.859 4.84 208 0.5562
0.6627 4.86 209 0.5553
0.8024 4.88 210 0.5540
0.7443 4.91 211 0.5526
0.6725 4.93 212 0.5520
0.749 4.95 213 0.5521
0.7687 4.98 214 0.5521
0.5998 5.0 215 0.5522
0.7578 5.02 216 0.5526
0.7074 5.05 217 0.5536
0.5647 5.07 218 0.5543
0.7475 5.09 219 0.5539
0.5776 5.12 220 0.5523
0.7232 5.14 221 0.5507
0.6487 5.16 222 0.5491
0.6446 5.19 223 0.5477
0.8951 5.21 224 0.5467
0.7706 5.23 225 0.5460
0.6351 5.26 226 0.5453
0.7336 5.28 227 0.5445
0.6329 5.3 228 0.5436
0.5795 5.33 229 0.5430
0.7553 5.35 230 0.5428
0.6959 5.37 231 0.5430
0.5945 5.4 232 0.5427
0.6274 5.42 233 0.5422
0.7024 5.44 234 0.5414
0.8223 5.47 235 0.5402
0.6441 5.49 236 0.5386
0.749 5.51 237 0.5368
0.6654 5.53 238 0.5357
0.8781 5.56 239 0.5346
0.7139 5.58 240 0.5340
0.587 5.6 241 0.5339
0.8308 5.63 242 0.5340
0.5613 5.65 243 0.5334
0.7108 5.67 244 0.5330
0.6884 5.7 245 0.5322
0.6955 5.72 246 0.5310
0.5989 5.74 247 0.5301
0.7517 5.77 248 0.5295
0.6765 5.79 249 0.5291
0.6223 5.81 250 0.5285
0.6694 5.84 251 0.5277
0.6235 5.86 252 0.5267
0.6591 5.88 253 0.5259
0.6832 5.91 254 0.5251
0.7346 5.93 255 0.5246
0.6574 5.95 256 0.5242
0.704 5.98 257 0.5236
0.7269 6.0 258 0.5234
0.6097 6.02 259 0.5231
0.5369 6.05 260 0.5224
0.7094 6.07 261 0.5214
0.608 6.09 262 0.5207
0.6112 6.12 263 0.5200
0.6414 6.14 264 0.5192
0.6254 6.16 265 0.5186
0.8219 6.19 266 0.5184
0.6536 6.21 267 0.5183
0.601 6.23 268 0.5184
0.672 6.26 269 0.5182
0.6646 6.28 270 0.5179
0.7228 6.3 271 0.5179
0.6542 6.33 272 0.5182
0.6003 6.35 273 0.5185
0.4799 6.37 274 0.5195
0.7062 6.4 275 0.5203
0.7557 6.42 276 0.5199
0.7419 6.44 277 0.5189
0.5468 6.47 278 0.5179
0.6142 6.49 279 0.5168
0.5953 6.51 280 0.5161
0.602 6.53 281 0.5152
0.6168 6.56 282 0.5146
0.815 6.58 283 0.5141
0.7738 6.6 284 0.5138
0.64 6.63 285 0.5136
0.6377 6.65 286 0.5133
0.7254 6.67 287 0.5131
0.6416 6.7 288 0.5128
0.6555 6.72 289 0.5123
0.6812 6.74 290 0.5118
0.7116 6.77 291 0.5113
0.6046 6.79 292 0.5104
0.7386 6.81 293 0.5095
0.733 6.84 294 0.5088
0.6579 6.86 295 0.5081
0.5418 6.88 296 0.5076
0.5853 6.91 297 0.5071
0.6488 6.93 298 0.5070
0.5726 6.95 299 0.5069
0.5821 6.98 300 0.5068
0.9157 7.0 301 0.5068
0.6769 7.02 302 0.5061
0.7632 7.05 303 0.5049
0.7479 7.07 304 0.5037
0.5632 7.09 305 0.5028
0.6493 7.12 306 0.5015
0.6517 7.14 307 0.5007
0.6944 7.16 308 0.5000
0.5862 7.19 309 0.4996
0.6161 7.21 310 0.4993
0.6396 7.23 311 0.4988
0.5506 7.26 312 0.4985
0.7518 7.28 313 0.4982
0.7445 7.3 314 0.4977
0.6228 7.33 315 0.4974
0.5555 7.35 316 0.4968
0.7457 7.37 317 0.4964
0.579 7.4 318 0.4961
0.528 7.42 319 0.4956
0.5286 7.44 320 0.4953
0.591 7.47 321 0.4952
0.5903 7.49 322 0.4953
0.6155 7.51 323 0.4955
0.5907 7.53 324 0.4954
0.6028 7.56 325 0.4949
0.5852 7.58 326 0.4943
0.6156 7.6 327 0.4934
0.582 7.63 328 0.4925
0.6091 7.65 329 0.4918
0.5877 7.67 330 0.4912
0.7017 7.7 331 0.4908
0.6496 7.72 332 0.4905
0.6089 7.74 333 0.4903
0.5807 7.77 334 0.4901
0.5553 7.79 335 0.4897
0.8058 7.81 336 0.4894
0.6147 7.84 337 0.4892
0.6289 7.86 338 0.4891
0.5883 7.88 339 0.4891
0.6048 7.91 340 0.4890
0.6411 7.93 341 0.4889
0.5575 7.95 342 0.4887
0.6509 7.98 343 0.4884
0.764 8.0 344 0.4882
0.6364 8.02 345 0.4880
0.561 8.05 346 0.4880
0.5949 8.07 347 0.4878
0.6904 8.09 348 0.4874
0.647 8.12 349 0.4868
0.6374 8.14 350 0.4862
0.7048 8.16 351 0.4859
0.6085 8.19 352 0.4854
0.5246 8.21 353 0.4852
0.531 8.23 354 0.4849
0.4605 8.26 355 0.4844
0.6132 8.28 356 0.4839
0.6378 8.3 357 0.4835
0.7885 8.33 358 0.4831
0.6008 8.35 359 0.4827
0.7118 8.37 360 0.4823
0.6792 8.4 361 0.4821
0.6317 8.42 362 0.4819
0.5942 8.44 363 0.4817
0.6184 8.47 364 0.4815
0.5902 8.49 365 0.4813
0.5353 8.51 366 0.4812
0.685 8.53 367 0.4812
0.5232 8.56 368 0.4811
0.6393 8.58 369 0.4812
0.5685 8.6 370 0.4812
0.6234 8.63 371 0.4813
0.5456 8.65 372 0.4810
0.6159 8.67 373 0.4807
0.6575 8.7 374 0.4804
0.5769 8.72 375 0.4803
0.5939 8.74 376 0.4801
0.5721 8.77 377 0.4800
0.5283 8.79 378 0.4797
0.5275 8.81 379 0.4795
0.5907 8.84 380 0.4794
0.6058 8.86 381 0.4792
0.7202 8.88 382 0.4790
0.6811 8.91 383 0.4787
0.5979 8.93 384 0.4785
0.5572 8.95 385 0.4783
0.5893 8.98 386 0.4781
0.6796 9.0 387 0.4779
0.5412 9.02 388 0.4780
0.5453 9.05 389 0.4781
0.7475 9.07 390 0.4782
0.6222 9.09 391 0.4781
0.5177 9.12 392 0.4778
0.6182 9.14 393 0.4775
0.6124 9.16 394 0.4772
0.6485 9.19 395 0.4769
0.5852 9.21 396 0.4765
0.5656 9.23 397 0.4761
0.6162 9.26 398 0.4758
0.6965 9.28 399 0.4755
0.5342 9.3 400 0.4753
0.718 9.33 401 0.4751
0.5089 9.35 402 0.4750
0.5738 9.37 403 0.4748
0.5612 9.4 404 0.4746
0.5628 9.42 405 0.4744
0.6512 9.44 406 0.4743
0.6717 9.47 407 0.4742
0.5937 9.49 408 0.4741
0.5906 9.51 409 0.4741
0.529 9.53 410 0.4741
0.6554 9.56 411 0.4741
0.5074 9.58 412 0.4741
0.6997 9.6 413 0.4741
0.5573 9.63 414 0.4741
0.6113 9.65 415 0.4741
0.5129 9.67 416 0.4741
0.5428 9.7 417 0.4740
0.5363 9.72 418 0.4739
0.5862 9.74 419 0.4739
0.6119 9.77 420 0.4738
0.6698 9.79 421 0.4738
0.5966 9.81 422 0.4737
0.5309 9.84 423 0.4737
0.5924 9.86 424 0.4736
0.6133 9.88 425 0.4736
0.6869 9.91 426 0.4736
0.5508 9.93 427 0.4735
0.6858 9.95 428 0.4735
0.5681 9.98 429 0.4735
0.7834 10.0 430 0.4735
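
The table also hints at the size of the training set: epoch 1.0 falls at step 43, so there are 43 optimizer steps per epoch. The sketch below bounds the number of training examples, assuming a single device, no gradient accumulation, and that the last partial batch is kept. None of these assumptions is stated on the card.

```python
import math

train_batch_size = 8
steps_per_epoch = 43  # step 43 is logged at epoch 1.0 (430 steps over 10 epochs)

# ceil(n / batch) == steps  <=>  (steps - 1) * batch < n <= steps * batch
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

# Sanity check: every size in the range yields exactly 43 steps per epoch.
assert all(
    math.ceil(n / train_batch_size) == steps_per_epoch
    for n in range(min_examples, max_examples + 1)
)

print(f"between {min_examples} and {max_examples} training examples")
```

Under those assumptions the training split holds between 337 and 344 examples, which is small enough that the validation loss curve should be read with some caution.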

Framework versions

  • Transformers 4.32.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3
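
When trying to reproduce the run, it may help to compare the local environment against the versions listed above. This is a minimal sketch using only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI names, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions listed on this card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.32.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def compare_versions(expected: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package to (expected version, installed version or None if missing)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily prevent loading the model. It only flags where the local setup differs from the one recorded here.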