
flan-t5-base-sheldon-chat-v2

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the metrics):

  • Loss: 2.2012
  • Rouge 1: 12.02
  • Rouge 2: 2.8059
  • Rouge L: 11.2005
  • Avg Len: 14.229
  • Bertscore Prec: 0.8673
  • Bertscore Rec: 0.8568
  • Bertscore F1: 0.8616
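
For reference, the checkpoint can be loaded with the standard Transformers seq2seq classes. The snippet below is a minimal sketch: the repository id is taken from this card, while the prompt and generation settings are illustrative assumptions rather than documented values.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Fine-tuned checkpoint from this card.
model_id = "Shakhovak/flan-t5-base-sheldon-chat-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative prompt; the expected input format is not documented here.
prompt = "Do you want to grab dinner later?"
inputs = tokenizer(prompt, return_tensors="pt")

# Generation settings are assumptions, not values from the training run.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```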

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
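
For readers who want to reproduce a comparable run, the list above maps onto Seq2SeqTrainingArguments roughly as sketched below. This is an assumption-laden sketch, not the original training script: the output directory is a placeholder, and the evaluation and generation settings are inferred from the results table in the next section.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-sheldon-chat-v2",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5,
    # Matches the listed Adam settings (also the AdamW defaults in Transformers).
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    # Assumptions: the results table reports validation metrics every 200 steps,
    # and ROUGE/BERTScore require generated text at evaluation time.
    evaluation_strategy="steps",
    eval_steps=200,
    predict_with_generate=True,
)
```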

Training results

Training Loss Epoch Step Validation Loss Rouge 1 Rouge 2 Rouge L Avg Len Bertscore Prec Bertscore Rec Bertscore F1
3.5731 0.07 200 3.2556 9.2886 1.7826 8.8558 10.8814 0.8661 0.8494 0.8573
3.421 0.13 400 3.1758 9.2559 1.9064 8.8243 10.7076 0.8714 0.8538 0.8622
3.3457 0.2 600 3.1197 9.0991 2.1854 8.7076 10.1063 0.8741 0.8546 0.8639
3.3013 0.26 800 3.0755 8.5523 1.8896 8.2224 12.3681 0.8706 0.8537 0.8617
3.2509 0.33 1000 3.0383 9.2396 2.0409 8.8314 12.0511 0.8729 0.8552 0.8636
3.2376 0.4 1200 3.0005 8.7356 1.8181 8.3039 12.9796 0.8702 0.8541 0.8617
3.2075 0.46 1400 2.9602 9.4303 2.0003 8.7571 16.0736 0.8633 0.8548 0.8587
3.1342 0.53 1600 2.9263 10.8148 2.2698 9.8735 15.4826 0.866 0.8571 0.8611
3.1376 0.6 1800 2.8915 9.9381 1.8569 9.1291 15.1718 0.8635 0.8553 0.859
3.0862 0.66 2000 2.8648 9.6534 1.7925 8.9539 14.9939 0.8603 0.8538 0.8567
3.0633 0.73 2200 2.8372 7.4573 1.6883 7.2688 15.411 0.8597 0.8512 0.855
3.0286 0.79 2400 2.8118 9.964 2.1269 9.281 13.5706 0.8708 0.8556 0.8628
3.0248 0.86 2600 2.7839 8.9688 1.901 8.5439 14.5665 0.8649 0.8527 0.8583
2.9777 0.93 2800 2.7637 10.2338 2.0737 9.67 13.3415 0.8692 0.8555 0.8619
2.9803 0.99 3000 2.7438 8.8705 1.8115 8.3491 15.7914 0.8644 0.8537 0.8586
2.9313 1.06 3200 2.7196 8.5001 1.8396 8.1555 14.4008 0.8684 0.854 0.8607
2.9094 1.12 3400 2.7013 9.9201 1.8958 9.1863 16.0777 0.8587 0.8542 0.856
2.8625 1.19 3600 2.6804 9.6675 2.0412 9.1107 13.5235 0.8691 0.8546 0.8614
2.8871 1.26 3800 2.6595 9.9853 2.0365 9.3471 14.9836 0.8655 0.8546 0.8596
2.8607 1.32 4000 2.6436 9.5726 1.625 8.9096 14.4151 0.8635 0.8541 0.8584
2.8028 1.39 4200 2.6298 9.6501 1.8484 9.1131 13.4765 0.8636 0.8544 0.8586
2.8357 1.46 4400 2.6080 8.961 1.9923 8.5776 13.6115 0.864 0.853 0.8581
2.8085 1.52 4600 2.5954 9.2942 1.8072 8.7018 15.9427 0.8611 0.8529 0.8566
2.7853 1.59 4800 2.5773 9.6865 2.001 9.1058 14.6626 0.8658 0.8548 0.8599
2.7934 1.65 5000 2.5613 9.3635 2.0014 8.8713 13.6401 0.8699 0.8533 0.8611
2.7697 1.72 5200 2.5508 8.6314 1.7812 8.2446 13.4417 0.8662 0.8523 0.8588
2.7451 1.79 5400 2.5361 9.7799 2.1524 9.2527 12.3926 0.8711 0.854 0.8621
2.7692 1.85 5600 2.5220 9.7707 2.258 9.2042 14.0102 0.8665 0.8539 0.8598
2.7435 1.92 5800 2.5110 9.6982 2.1742 9.2157 14.773 0.8621 0.8536 0.8574
2.7033 1.98 6000 2.4948 9.7766 2.0843 9.141 13.3926 0.8677 0.8537 0.8602
2.6664 2.05 6200 2.4792 10.4019 2.2247 9.8155 13.5174 0.8684 0.8547 0.861
2.6542 2.12 6400 2.4708 10.2339 2.2096 9.5914 14.7669 0.8668 0.8548 0.8603
2.6467 2.18 6600 2.4582 10.3021 2.3528 9.7578 12.9448 0.8723 0.8547 0.863
2.6376 2.25 6800 2.4477 10.3056 2.1854 9.8499 12.9489 0.8712 0.855 0.8626
2.6267 2.31 7000 2.4364 10.6358 2.2434 10.002 13.4254 0.8691 0.8559 0.8621
2.6489 2.38 7200 2.4246 10.5459 2.1935 9.8531 13.4949 0.8682 0.8549 0.861
2.6174 2.45 7400 2.4160 9.8943 2.2126 9.2111 15.0164 0.8649 0.854 0.859
2.6094 2.51 7600 2.4052 10.8941 2.3246 10.1182 14.5337 0.8676 0.8561 0.8614
2.5877 2.58 7800 2.3966 10.5092 2.2398 9.7792 14.454 0.8675 0.8553 0.861
2.5525 2.65 8000 2.3845 10.0878 2.1701 9.4132 13.9059 0.8678 0.8547 0.8608
2.5912 2.71 8200 2.3709 10.504 2.3081 9.8336 13.7076 0.8686 0.8558 0.8618
2.5748 2.78 8400 2.3625 10.6169 2.4216 10.0053 13.9202 0.8671 0.8554 0.8608
2.5921 2.84 8600 2.3550 10.4233 2.3283 9.9305 14.1963 0.8642 0.8547 0.859
2.5617 2.91 8800 2.3414 10.5877 2.3495 9.9807 13.7894 0.8669 0.8555 0.8608
2.5632 2.98 9000 2.3338 10.8744 2.5212 10.2281 14.1022 0.8665 0.8554 0.8605
2.5257 3.04 9200 2.3217 11.3215 2.4995 10.589 14.9264 0.8647 0.8562 0.8601
2.4889 3.11 9400 2.3154 11.2608 2.4103 10.5319 14.1043 0.8668 0.8554 0.8606
2.5173 3.17 9600 2.3077 10.9017 2.3331 10.3352 14.4315 0.8643 0.8553 0.8593
2.5151 3.24 9800 2.3006 10.7286 2.2598 10.2093 13.728 0.8666 0.8557 0.8607
2.4883 3.31 10000 2.2952 10.858 2.3735 10.3329 14.1227 0.8655 0.8555 0.8601
2.5006 3.37 10200 2.2861 10.7892 2.2598 10.1382 13.4254 0.8688 0.8563 0.8621
2.4661 3.44 10400 2.2827 10.917 2.3892 10.1933 13.9693 0.8685 0.8556 0.8616
2.4945 3.51 10600 2.2767 11.0385 2.551 10.3031 14.2004 0.867 0.8558 0.8609
2.4588 3.57 10800 2.2692 10.9102 2.3942 10.169 14.2331 0.8671 0.8562 0.8612
2.4649 3.64 11000 2.2632 10.9139 2.3421 10.2154 14.3374 0.8654 0.8556 0.86
2.4763 3.7 11200 2.2556 11.7755 2.6463 11.0983 14.2065 0.8684 0.8563 0.8619
2.4497 3.77 11400 2.2493 11.1075 2.372 10.4027 14.0777 0.8662 0.8556 0.8605
2.4567 3.84 11600 2.2467 11.1779 2.2802 10.4564 14.6646 0.8666 0.8565 0.8611
2.4421 3.9 11800 2.2395 11.0069 2.3448 10.3473 14.3681 0.8662 0.856 0.8607
2.4269 3.97 12000 2.2360 11.0892 2.3662 10.4705 14.9325 0.8651 0.8562 0.8602
2.4406 4.03 12200 2.2318 11.4857 2.4196 10.7859 14.3783 0.8669 0.8565 0.8613
2.4121 4.1 12400 2.2283 11.3331 2.3757 10.6601 14.3661 0.867 0.8561 0.8611
2.4076 4.17 12600 2.2246 11.7127 2.437 10.98 14.5583 0.8664 0.8559 0.8607
2.4133 4.23 12800 2.2199 11.3607 2.467 10.65 13.9877 0.8667 0.8557 0.8607
2.4241 4.3 13000 2.2158 11.4633 2.5719 10.7229 14.0757 0.8664 0.8562 0.8608
2.4116 4.37 13200 2.2137 11.6146 2.4491 10.8822 14.0777 0.8662 0.8559 0.8606
2.4188 4.43 13400 2.2130 11.2618 2.3805 10.5944 14.2883 0.866 0.8557 0.8604
2.4074 4.5 13600 2.2105 11.3653 2.4117 10.6008 14.1472 0.8668 0.8559 0.8609
2.4063 4.56 13800 2.2092 11.6648 2.7294 10.9428 14.3395 0.8665 0.8566 0.8611
2.3941 4.63 14000 2.2058 11.7881 2.7584 11.0606 14.2086 0.8661 0.8562 0.8607
2.3943 4.7 14200 2.2051 11.6491 2.7424 10.8839 14.2597 0.8665 0.8564 0.861
2.3948 4.76 14400 2.2038 11.6683 2.7736 10.8596 14.4315 0.8662 0.8565 0.8609
2.3716 4.83 14600 2.2024 11.8931 2.7546 11.0867 14.3088 0.8666 0.8565 0.8611
2.4017 4.89 14800 2.2017 11.973 2.7664 11.1738 14.2495 0.8671 0.8567 0.8615
2.4215 4.96 15000 2.2012 12.02 2.8059 11.2005 14.229 0.8673 0.8568 0.8616
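
The ROUGE and BERTScore columns above correspond to metrics available in the evaluate library. The snippet below is a sketch of how such numbers can be computed; the predictions and references are placeholders, and the exact BERTScore backbone and aggregation used for this card are not documented.

```python
import evaluate

# Placeholders; in practice these come from model.generate on the validation set.
predictions = ["That seat is mine. Bazinga."]
references = ["That is my spot."]

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")

print("ROUGE-1:", rouge_scores["rouge1"])
print("ROUGE-L:", rouge_scores["rougeL"])
# BERTScore returns per-example lists; the card presumably reports the mean.
print("BERTScore F1 (mean):", sum(bert_scores["f1"]) / len(bert_scores["f1"]))
```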

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.18.0
  • Tokenizers 0.15.2