lapp0 committed on
Commit d5fca29 · verified · 1 Parent(s): 5dc2a6a

End of training

README.md CHANGED
@@ -16,13 +16,13 @@ This student model is distilled from the teacher model [gpt2](https://huggingfac
16
  The [Distily](https://github.com/lapp0/distily) library was used for this distillation.
17
 
18
  It achieves the following results on the evaluation set:
19
- - eval_enwikippl: 611.0229
20
- - eval_frwikippl: 4057.1116
21
- - eval_zhwikippl: 17857.3535
22
- - eval_loss: 7319.6802
23
- - eval_runtime: 21.6635
24
- - eval_samples_per_second: 46.161
25
- - eval_steps_per_second: 11.54
26
 
27
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
28
  should probably proofread and complete it, then remove this comment.
@@ -45,10 +45,10 @@ More information needed
45
  ### Training hyperparameters
46
 
47
  The following hyperparameters were used during training:
48
- - distillation_objective: <distily.objectives.LegacyObjective object at 0x7f7f68372f20>
49
  - train_embeddings: True
50
  - learning_rate: 4e-05
51
- - train_batch_size: 4
52
  - eval_batch_size: 4
53
  - seed: 42
54
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
@@ -56,63 +56,112 @@ The following hyperparameters were used during training:
56
  - num_epochs: 1.0
57
 
58
  ### Resource Usage
59
- Peak GPU Memory: 15.7299 GB
60
 
61
  ### Eval-Phase Metrics
62
  | step | epoch | enwikippl | frwikippl | loss | runtime | samples_per_second | steps_per_second | zhwikippl |
63
  | --- | --- | --- | --- | --- | --- | --- | --- | --- |
64
  | **teacher eval** | | 30.2385 | 57.2728 | | | | | 18.1772 |
65
- | 0 | 0 | 54063.0 | 57132.0859 | 330301.4375 | 21.4102 | 46.707 | 11.677 | 54288.6797 |
66
- | 500 | 0.0202 | 2200.8281 | 11269.5508 | 12821.6318 | 21.5139 | 46.482 | 11.62 | 56764.8438 |
67
- | 1000 | 0.0404 | 1665.7762 | 6195.2891 | 10439.8076 | 21.4414 | 46.639 | 11.66 | 26684.8594 |
68
- | 1500 | 0.0606 | 1395.5389 | 6104.2388 | 9813.1201 | 21.6061 | 46.283 | 11.571 | 24228.9746 |
69
- | 2000 | 0.0808 | 1240.8671 | 5769.0522 | 9419.9043 | 21.4266 | 46.671 | 11.668 | 25033.0820 |
70
- | 2500 | 0.1010 | 1090.4764 | 4890.7593 | 9141.3115 | 21.621 | 46.251 | 11.563 | 20563.0996 |
71
- | 3000 | 0.1212 | 1002.9867 | 4993.7222 | 8898.1758 | 21.5727 | 46.355 | 11.589 | 19028.0254 |
72
- | 3500 | 0.1414 | 938.0837 | 5002.0063 | 8604.9277 | 21.6847 | 46.115 | 11.529 | 16472.4883 |
73
- | 4000 | 0.1616 | 887.3153 | 5051.4448 | 8486.5283 | 21.6081 | 46.279 | 11.57 | 16085.5645 |
74
- | 4500 | 0.1818 | 845.5585 | 4589.0835 | 8366.6562 | 21.4849 | 46.544 | 11.636 | 15081.8213 |
75
- | 5000 | 0.2020 | 793.8573 | 4711.3706 | 8115.8398 | 21.5826 | 46.334 | 11.583 | 17669.9570 |
76
- | 5500 | 0.2222 | 769.3093 | 4488.2871 | 8050.4321 | 21.6441 | 46.202 | 11.55 | 14243.0586 |
77
- | 6000 | 0.2424 | 728.3535 | 4591.8325 | 7877.5039 | 21.5753 | 46.349 | 11.587 | 17997.4102 |
78
- | 6500 | 0.2626 | 710.1467 | 4239.1436 | 7786.5601 | 21.6028 | 46.29 | 11.573 | 13550.0938 |
79
- | 7000 | 0.2828 | 687.4926 | 4355.0205 | 7698.4961 | 21.646 | 46.198 | 11.549 | 17154.9648 |
80
- | 7500 | 0.3030 | 671.2838 | 4329.4575 | 7575.6479 | 21.6404 | 46.21 | 11.552 | 14209.8184 |
81
- | 8000 | 0.3232 | 651.2464 | 4281.1934 | 7469.9521 | 21.5348 | 46.437 | 11.609 | 11029.1934 |
82
- | 8500 | 0.3434 | 644.4048 | 4005.5222 | 7476.8638 | 21.677 | 46.132 | 11.533 | 14790.6338 |
83
- | 9000 | 0.3636 | 617.8220 | 3900.7300 | 7369.7598 | 21.5714 | 46.358 | 11.589 | 14878.7871 |
84
- | 9500 | 0.3838 | 611.0229 | 4057.1116 | 7319.6802 | 21.6635 | 46.161 | 11.54 | 17857.3535 |
85
- | 10000 | 0.4040 | 607.8168 | 4144.8633 | 7314.2402 | 21.6408 | 46.209 | 11.552 | 8414.2871 |
86
- | 10500 | 0.4242 | 597.2198 | 3698.7766 | 7203.5840 | 21.435 | 46.653 | 11.663 | 6787.8618 |
87
- | 11000 | 0.4444 | 583.0488 | 3682.1257 | 7166.8481 | 21.6697 | 46.147 | 11.537 | 8091.4233 |
88
- | 11500 | 0.4646 | 571.3620 | 3917.5435 | 7146.4961 | 21.6445 | 46.201 | 11.55 | 6319.4146 |
89
- | 12000 | 0.4848 | 573.1504 | 3768.2695 | 7058.4961 | 21.5215 | 46.465 | 11.616 | 7570.3271 |
90
- | 12500 | 0.5051 | 566.9861 | 3949.1577 | 7061.6318 | 22.0963 | 45.256 | 11.314 | 8727.2812 |
91
- | 13000 | 0.5253 | 558.8017 | 3803.5042 | 6961.4082 | 21.5517 | 46.4 | 11.6 | 9476.1670 |
92
- | 13500 | 0.5455 | 550.8849 | 3855.4763 | 7017.2798 | 21.6135 | 46.267 | 11.567 | 11842.0234 |
93
- | 14000 | 0.5657 | 546.9738 | 3748.3950 | 6947.0718 | 21.5141 | 46.481 | 11.62 | 11424.9463 |
94
- | 14500 | 0.5859 | 535.9684 | 3708.3093 | 6870.6240 | 21.6892 | 46.106 | 11.526 | 9886.5801 |
95
- | 15000 | 0.6061 | 528.7446 | 3590.0920 | 6851.8398 | 21.6211 | 46.251 | 11.563 | 14917.5742 |
96
- | 15500 | 0.6263 | 521.6382 | 3602.8992 | 6849.3442 | 21.4696 | 46.577 | 11.644 | 10334.7578 |
97
- | 16000 | 0.6465 | 517.2215 | 3595.4131 | 6779.7759 | 21.4612 | 46.596 | 11.649 | 13237.1143 |
98
- | 16500 | 0.6667 | 518.6794 | 3385.7922 | 6766.4639 | 21.7793 | 45.915 | 11.479 | 10972.6348 |
99
- | 17000 | 0.6869 | 515.1771 | 3393.7964 | 6732.6719 | 21.9049 | 45.652 | 11.413 | 9149.7510 |
100
- | 17500 | 0.7071 | 501.9872 | 3414.7986 | 6733.2480 | 21.6388 | 46.213 | 11.553 | 6785.5962 |
101
- | 18000 | 0.7273 | 501.7729 | 3411.7908 | 6693.3442 | 21.6999 | 46.083 | 11.521 | 5611.2793 |
102
- | 18500 | 0.7475 | 497.0320 | 3345.2139 | 6665.5361 | 21.7517 | 45.973 | 11.493 | 6820.1172 |
103
- | 19000 | 0.7677 | 494.9137 | 3287.4609 | 6658.4321 | 21.5993 | 46.298 | 11.574 | 8307.6650 |
104
- | 19500 | 0.7879 | 489.8856 | 3310.1404 | 6703.3921 | 21.703 | 46.077 | 11.519 | 7612.3970 |
105
- | 20000 | 0.8081 | 485.6058 | 3133.0156 | 6542.2720 | 21.6927 | 46.098 | 11.525 | 6796.0293 |
106
- | 20500 | 0.8283 | 482.7485 | 3159.1875 | 6622.9761 | 21.8265 | 45.816 | 11.454 | 6659.4741 |
107
- | 21000 | 0.8485 | 469.4856 | 3162.3074 | 6523.4561 | 21.7088 | 46.064 | 11.516 | 11412.7490 |
108
- | 21500 | 0.8687 | 475.0967 | 3214.5735 | 6542.4639 | 21.8362 | 45.796 | 11.449 | 8429.4746 |
109
- | 22000 | 0.8889 | 471.5591 | 3336.0288 | 6522.2080 | 21.7328 | 46.013 | 11.503 | 5576.1724 |
110
- | 22500 | 0.9091 | 467.7843 | 3244.9744 | 6460.9600 | 21.7546 | 45.967 | 11.492 | 3847.1572 |
111
- | 23000 | 0.9293 | 459.4413 | 3435.3245 | 6459.0400 | 21.7279 | 46.024 | 11.506 | 5404.9829 |
112
- | 23500 | 0.9495 | 466.1707 | 3223.0857 | 6445.9839 | 21.8014 | 45.869 | 11.467 | 6019.9951 |
113
- | 24000 | 0.9697 | 451.2469 | 3094.4868 | 6465.6318 | 21.8221 | 45.825 | 11.456 | 5374.3945 |
114
- | 24500 | 0.9899 | 461.2016 | 3168.5593 | 6462.3042 | 21.5588 | 46.385 | 11.596 | 7845.6411 |
115
- | 24750 | 1.0 | 453.6186 | 3034.7439 | 6427.3281 | 21.8641 | 45.737 | 11.434 | 7226.5781 |
116
 
117
  ### Framework versions
118
  - Distily 0.2.0
 
16
  The [Distily](https://github.com/lapp0/distily) library was used for this distillation.
17
 
18
  It achieves the following results on the evaluation set:
19
+ - eval_enwikippl: 579.5842
20
+ - eval_frwikippl: 3891.8010
21
+ - eval_zhwikippl: 6702.2964
22
+ - eval_loss: 7658.3999
23
+ - eval_runtime: 21.5573
24
+ - eval_samples_per_second: 46.388
25
+ - eval_steps_per_second: 11.597
26
 
27
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
28
  should probably proofread and complete it, then remove this comment.
 
45
  ### Training hyperparameters
46
 
47
  The following hyperparameters were used during training:
48
+ - distillation_objective: <distily.objectives.LegacyObjective object at 0x7fd56ca85c90>
49
  - train_embeddings: True
50
  - learning_rate: 4e-05
51
+ - train_batch_size: 2
52
  - eval_batch_size: 4
53
  - seed: 42
54
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 
56
  - num_epochs: 1.0
57
 
58
  ### Resource Usage
59
+ Peak GPU Memory: 4.0814 GB
60
 
61
  ### Eval-Phase Metrics
62
  | step | epoch | enwikippl | frwikippl | loss | runtime | samples_per_second | steps_per_second | zhwikippl |
63
  | --- | --- | --- | --- | --- | --- | --- | --- | --- |
64
  | **teacher eval** | | 30.2385 | 57.2728 | | | | | 18.1772 |
65
+ | 0 | 0 | 56994.4609 | 58386.3438 | 333144.0625 | 21.6098 | 46.275 | 11.569 | 60802.0039 |
66
+ | 500 | 0.0101 | 2099.8235 | 11678.0371 | 13260.6084 | 21.3836 | 46.765 | 11.691 | 69590.5391 |
67
+ | 1000 | 0.0202 | 1574.0366 | 8011.8809 | 10600.8320 | 21.2508 | 47.057 | 11.764 | 52850.8906 |
68
+ | 1500 | 0.0303 | 1301.3883 | 6674.1611 | 10162.4316 | 21.459 | 46.601 | 11.65 | 34488.375 |
69
+ | 2000 | 0.0404 | 1113.5813 | 5583.1753 | 9478.5283 | 21.4684 | 46.58 | 11.645 | 27443.7676 |
70
+ | 2500 | 0.0505 | 1004.2922 | 5359.0864 | 9228.7998 | 21.3125 | 46.921 | 11.73 | 26546.2461 |
71
+ | 3000 | 0.0606 | 914.3858 | 4987.7397 | 9218.7520 | 21.3671 | 46.801 | 11.7 | 13178.9082 |
72
+ | 3500 | 0.0707 | 860.5787 | 4993.3696 | 8780.2881 | 21.3231 | 46.898 | 11.724 | 20241.6133 |
73
+ | 4000 | 0.0808 | 810.8665 | 4433.4043 | 8697.4404 | 21.2626 | 47.031 | 11.758 | 18648.1777 |
74
+ | 4500 | 0.0909 | 769.4886 | 4542.2461 | 8522.4639 | 21.4471 | 46.626 | 11.657 | 14555.5088 |
75
+ | 5000 | 0.1010 | 741.9254 | 4665.9185 | 8346.4316 | 21.1682 | 47.241 | 11.81 | 10137.9199 |
76
+ | 5500 | 0.1111 | 714.7664 | 4329.6104 | 8166.9438 | 21.4303 | 46.663 | 11.666 | 13222.1006 |
77
+ | 6000 | 0.1212 | 692.0859 | 4471.0703 | 8177.6001 | 21.4078 | 46.712 | 11.678 | 10649.9824 |
78
+ | 6500 | 0.1313 | 659.9261 | 4580.1948 | 8073.7598 | 21.198 | 47.174 | 11.794 | 12113.9268 |
79
+ | 7000 | 0.1414 | 636.1021 | 4219.9077 | 7905.0562 | 21.2741 | 47.005 | 11.751 | 11793.8877 |
80
+ | 7500 | 0.1515 | 623.1702 | 4116.0293 | 7826.2402 | 21.2569 | 47.044 | 11.761 | 11638.9893 |
81
+ | 8000 | 0.1616 | 614.8783 | 4148.5176 | 7826.7520 | 21.2964 | 46.956 | 11.739 | 13476.1084 |
82
+ | 8500 | 0.1717 | 601.8520 | 4003.9678 | 7738.1118 | 21.4281 | 46.668 | 11.667 | 11412.7490 |
83
+ | 9000 | 0.1818 | 580.8234 | 3757.6580 | 7625.6001 | 21.6505 | 46.188 | 11.547 | 6242.6709 |
84
+ | 9500 | 0.1919 | 579.5842 | 3891.8010 | 7658.3999 | 21.5573 | 46.388 | 11.597 | 6702.2964 |
85
+ | 10000 | 0.2020 | 563.3217 | 3843.6697 | 7557.6641 | 21.4934 | 46.526 | 11.631 | 6892.9072 |
86
+ | 10500 | 0.2121 | 554.2101 | 3611.4167 | 7487.4878 | 21.5876 | 46.323 | 11.581 | 6533.5151 |
87
+ | 11000 | 0.2222 | 533.5391 | 3924.4539 | 7479.3599 | 21.5677 | 46.366 | 11.591 | 4041.2058 |
88
+ | 11500 | 0.2323 | 539.1932 | 3840.4197 | 7422.5601 | 21.3741 | 46.786 | 11.696 | 2984.6418 |
89
+ | 12000 | 0.2424 | 530.1937 | 3717.7319 | 7437.3760 | 21.5909 | 46.316 | 11.579 | 4198.5317 |
90
+ | 12500 | 0.2525 | 517.9953 | 3501.5972 | 7306.0801 | 21.5337 | 46.439 | 11.61 | 3271.1892 |
91
+ | 13000 | 0.2626 | 515.5474 | 3430.8489 | 7287.4878 | 21.6197 | 46.254 | 11.564 | 4228.9199 |
92
+ | 13500 | 0.2727 | 516.9103 | 3583.5176 | 7331.5840 | 21.6005 | 46.295 | 11.574 | 6539.6245 |
93
+ | 14000 | 0.2828 | 496.1355 | 3821.2432 | 7329.7920 | 21.4982 | 46.516 | 11.629 | 5327.2339 |
94
+ | 14500 | 0.2929 | 498.4330 | 3740.2107 | 7232.8960 | 21.5819 | 46.335 | 11.584 | 5059.5977 |
95
+ | 15000 | 0.3030 | 495.7023 | 3717.9944 | 7149.5361 | 21.4158 | 46.694 | 11.674 | 2332.2563 |
96
+ | 15500 | 0.3131 | 491.6768 | 3593.3838 | 7156.4482 | 21.2342 | 47.094 | 11.773 | 3195.2048 |
97
+ | 16000 | 0.3232 | 483.2642 | 3478.8335 | 7121.9521 | 21.2238 | 47.117 | 11.779 | 3729.5500 |
98
+ | 16500 | 0.3333 | 477.9181 | 3424.2036 | 7113.9839 | 21.3606 | 46.815 | 11.704 | 4778.8506 |
99
+ | 17000 | 0.3434 | 473.8991 | 3581.3721 | 7150.6240 | 21.2836 | 46.985 | 11.746 | 2268.9734 |
100
+ | 17500 | 0.3535 | 471.4035 | 3375.7810 | 7056.4482 | 21.4184 | 46.689 | 11.672 | 2958.4526 |
101
+ | 18000 | 0.3636 | 466.1978 | 3323.2354 | 7070.1118 | 21.3173 | 46.91 | 11.728 | 3852.8152 |
102
+ | 18500 | 0.3737 | 464.6797 | 3391.8843 | 6952.3521 | 21.5144 | 46.481 | 11.62 | 6839.7295 |
103
+ | 19000 | 0.3838 | 462.5197 | 3305.7080 | 6933.4399 | 21.3481 | 46.843 | 11.711 | 3396.2700 |
104
+ | 19500 | 0.3939 | 456.2503 | 3340.5020 | 6974.1440 | 21.3181 | 46.909 | 11.727 | 4338.4556 |
105
+ | 20000 | 0.4040 | 453.3807 | 3245.5469 | 6936.5439 | 21.3635 | 46.809 | 11.702 | 3513.4419 |
106
+ | 20500 | 0.4141 | 453.9622 | 3146.9612 | 6961.3442 | 21.3014 | 46.945 | 11.736 | 10044.2734 |
107
+ | 21000 | 0.4242 | 452.8354 | 2937.5862 | 6912.8638 | 21.428 | 46.668 | 11.667 | 4067.4631 |
108
+ | 21500 | 0.4343 | 441.9103 | 2893.3921 | 6879.7119 | 21.3113 | 46.923 | 11.731 | 5412.9268 |
109
+ | 22000 | 0.4444 | 445.0268 | 2878.3350 | 6833.9839 | 21.5124 | 46.485 | 11.621 | 3586.4441 |
110
+ | 22500 | 0.4545 | 433.9949 | 3140.9766 | 6801.0562 | 21.4889 | 46.536 | 11.634 | 4264.9297 |
111
+ | 23000 | 0.4646 | 432.1537 | 3241.2009 | 6835.2002 | 21.4958 | 46.521 | 11.63 | 7089.4131 |
112
+ | 23500 | 0.4747 | 438.6622 | 3099.2891 | 6846.0479 | 21.3978 | 46.734 | 11.683 | 2764.0474 |
113
+ | 24000 | 0.4848 | 434.6780 | 3037.6338 | 6746.4639 | 21.4299 | 46.664 | 11.666 | 6095.2222 |
114
+ | 24500 | 0.4949 | 433.0188 | 3190.7532 | 6871.6479 | 21.4752 | 46.565 | 11.641 | 6818.7515 |
115
+ | 25000 | 0.5051 | 424.1827 | 2884.4297 | 6806.0479 | 21.2002 | 47.169 | 11.792 | 5655.6611 |
116
+ | 25500 | 0.5152 | 427.9544 | 2899.9268 | 6739.4878 | 21.4326 | 46.658 | 11.664 | 10928.7627 |
117
+ | 26000 | 0.5253 | 418.4491 | 2792.2812 | 6741.0562 | 21.4399 | 46.642 | 11.661 | 4652.5972 |
118
+ | 26500 | 0.5354 | 420.5338 | 2771.0999 | 6723.6162 | 21.5377 | 46.43 | 11.608 | 5530.9321 |
119
+ | 27000 | 0.5455 | 414.0452 | 2715.1108 | 6704.3521 | 21.8117 | 45.847 | 11.462 | 4411.1870 |
120
+ | 27500 | 0.5556 | 405.4073 | 2623.3743 | 6684.0 | 21.6362 | 46.219 | 11.555 | 4443.4106 |
121
+ | 28000 | 0.5657 | 410.8664 | 2691.8567 | 6677.0562 | 21.5795 | 46.34 | 11.585 | 1948.9584 |
122
+ | 28500 | 0.5758 | 418.1162 | 2795.4333 | 6772.7041 | 21.5011 | 46.509 | 11.627 | 2152.1055 |
123
+ | 29000 | 0.5859 | 407.0003 | 2837.7319 | 6612.7358 | 21.6658 | 46.156 | 11.539 | 2232.7546 |
124
+ | 29500 | 0.5960 | 407.4271 | 2949.1045 | 6649.2158 | 21.6025 | 46.291 | 11.573 | 3101.2493 |
125
+ | 30000 | 0.6061 | 406.1163 | 2778.8286 | 6607.7759 | 21.5146 | 46.48 | 11.62 | 3840.7419 |
126
+ | 30500 | 0.6162 | 397.9757 | 2956.0779 | 6601.0562 | 21.4872 | 46.539 | 11.635 | 2564.0315 |
127
+ | 31000 | 0.6263 | 398.2077 | 2838.1323 | 6594.9121 | 22.1693 | 45.107 | 11.277 | 2501.1306 |
128
+ | 31500 | 0.6364 | 393.3900 | 2667.1082 | 6559.9360 | 21.4915 | 46.53 | 11.633 | 5743.9526 |
129
+ | 32000 | 0.6465 | 393.8561 | 2583.0869 | 6566.1758 | 21.5166 | 46.476 | 11.619 | 8028.9990 |
130
+ | 32500 | 0.6566 | 391.7058 | 2675.8672 | 6583.2002 | 21.6273 | 46.238 | 11.559 | 5334.7124 |
131
+ | 33000 | 0.6667 | 396.9419 | 2743.4949 | 6698.2402 | 21.5042 | 46.503 | 11.626 | 11934.8896 |
132
+ | 33500 | 0.6768 | 388.6004 | 2891.6582 | 6570.7520 | 21.2945 | 46.961 | 11.74 | 4139.7988 |
133
+ | 34000 | 0.6869 | 386.5763 | 2826.3506 | 6525.6318 | 21.3684 | 46.798 | 11.7 | 3156.8203 |
134
+ | 34500 | 0.6970 | 387.0721 | 2805.7012 | 6572.9600 | 21.2897 | 46.971 | 11.743 | 2896.1072 |
135
+ | 35000 | 0.7071 | 386.0813 | 2637.3757 | 6580.5439 | 21.2409 | 47.079 | 11.77 | 7566.7905 |
136
+ | 35500 | 0.7172 | 381.5364 | 3025.4507 | 6588.3198 | 21.5446 | 46.415 | 11.604 | 4902.9575 |
137
+ | 36000 | 0.7273 | 386.6814 | 2880.9741 | 6570.8481 | 21.3516 | 46.835 | 11.709 | 3154.9243 |
138
+ | 36500 | 0.7374 | 379.9471 | 2795.0400 | 6521.5679 | 21.4418 | 46.638 | 11.659 | 3810.8567 |
139
+ | 37000 | 0.7475 | 383.0058 | 2805.8992 | 6537.6641 | 21.3615 | 46.813 | 11.703 | 5655.2837 |
140
+ | 37500 | 0.7576 | 375.7296 | 2787.7578 | 6456.9922 | 21.3662 | 46.803 | 11.701 | 3055.8257 |
141
+ | 38000 | 0.7677 | 374.0701 | 2868.8132 | 6484.3198 | 21.3768 | 46.78 | 11.695 | 2952.7307 |
142
+ | 38500 | 0.7778 | 377.5502 | 2659.9729 | 6455.3921 | 21.3661 | 46.803 | 11.701 | 3218.3279 |
143
+ | 39000 | 0.7879 | 370.5863 | 2806.0972 | 6473.3120 | 21.2561 | 47.045 | 11.761 | 2280.2119 |
144
+ | 39500 | 0.7980 | 371.9195 | 2613.6814 | 6536.6719 | 21.3516 | 46.835 | 11.709 | 2672.7583 |
145
+ | 40000 | 0.8081 | 377.1619 | 2487.1150 | 6439.7441 | 21.4296 | 46.664 | 11.666 | 2315.8076 |
146
+ | 40500 | 0.8182 | 370.4856 | 2678.1318 | 6437.2798 | 21.3153 | 46.915 | 11.729 | 1819.0656 |
147
+ | 41000 | 0.8283 | 369.2075 | 2614.6948 | 6462.3999 | 21.4041 | 46.72 | 11.68 | 2854.2568 |
148
+ | 41500 | 0.8384 | 372.8739 | 2305.3298 | 6431.2002 | 21.4425 | 46.636 | 11.659 | 3267.0427 |
149
+ | 42000 | 0.8485 | 368.2697 | 2281.5596 | 6418.3042 | 21.2858 | 46.98 | 11.745 | 2240.3704 |
150
+ | 42500 | 0.8586 | 365.9109 | 2410.3772 | 6468.8638 | 21.4759 | 46.564 | 11.641 | 3584.7686 |
151
+ | 43000 | 0.8687 | 367.1704 | 2442.8845 | 6401.3760 | 21.5525 | 46.398 | 11.6 | 2345.6868 |
152
+ | 43500 | 0.8788 | 363.9908 | 2523.0574 | 6458.4961 | 21.7663 | 45.943 | 11.486 | 3812.3833 |
153
+ | 44000 | 0.8889 | 363.7012 | 2468.5098 | 6388.8638 | 21.7639 | 45.948 | 11.487 | 4788.1108 |
154
+ | 44500 | 0.8990 | 363.1368 | 2572.5454 | 6479.6479 | 21.67 | 46.147 | 11.537 | 3193.9253 |
155
+ | 45000 | 0.9091 | 356.2796 | 2622.3564 | 6405.2158 | 21.6556 | 46.177 | 11.544 | 1944.5388 |
156
+ | 45500 | 0.9192 | 360.0483 | 2560.6021 | 6401.0239 | 21.3614 | 46.813 | 11.703 | 6363.8784 |
157
+ | 46000 | 0.9293 | 358.6112 | 2230.1096 | 6385.6958 | 21.3445 | 46.85 | 11.713 | 2245.4624 |
158
+ | 46500 | 0.9394 | 359.0361 | 2364.5928 | 6378.6558 | 21.4319 | 46.659 | 11.665 | 2161.8982 |
159
+ | 47000 | 0.9495 | 356.5909 | 2449.0066 | 6407.8081 | 21.4857 | 46.543 | 11.636 | 3063.7917 |
160
+ | 47500 | 0.9596 | 359.0292 | 2401.2183 | 6344.3521 | 21.5028 | 46.505 | 11.626 | 3229.5225 |
161
+ | 48000 | 0.9697 | 359.6570 | 2497.3064 | 6563.9038 | 21.3228 | 46.898 | 11.725 | 3209.3140 |
162
+ | 48500 | 0.9798 | 353.2013 | 2481.0728 | 6333.3442 | 21.4465 | 46.628 | 11.657 | 2960.4282 |
163
+ | 49000 | 0.9899 | 355.4300 | 2554.2913 | 6356.8638 | 21.2635 | 47.029 | 11.757 | 3479.5901 |
164
+ | 49500 | 1.0 | 352.3520 | 2577.0833 | 6367.2959 | 21.3211 | 46.902 | 11.725 | 3190.5127 |
165
 
166
  ### Framework versions
167
  - Distily 0.2.0
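
The two Eval-Phase Metrics tables can be cross-checked with a little arithmetic. The sketch below uses only values copied from the rows above. It assumes a single device and no gradient accumulation, which the model card does not state, and the helper names `samples_seen` and `eval_samples` are hypothetical.

```python
# Values copied from the final table rows and the summary metrics above.
# Assumption (not stated in the card): one device, no gradient accumulation.
RUNS = {
    "removed (train_batch_size=4)": {
        "train_batch_size": 4,
        "final_step": 24750,
        "eval_runtime": 21.6635,
        "eval_samples_per_second": 46.161,
    },
    "added (train_batch_size=2)": {
        "train_batch_size": 2,
        "final_step": 49500,
        "eval_runtime": 21.5573,
        "eval_samples_per_second": 46.388,
    },
}


def samples_seen(train_batch_size: int, steps: int) -> int:
    """Training samples consumed after `steps` optimizer steps."""
    return train_batch_size * steps


def eval_samples(runtime_s: float, samples_per_second: float) -> int:
    """Approximate evaluation set size implied by the throughput figures."""
    return round(runtime_s * samples_per_second)


if __name__ == "__main__":
    for name, run in RUNS.items():
        seen = samples_seen(run["train_batch_size"], run["final_step"])
        n_eval = eval_samples(run["eval_runtime"], run["eval_samples_per_second"])
        print(f"{name}: {seen} training samples, about {n_eval} eval samples")
```

Under those assumptions both runs cover the same number of training samples in one epoch. Halving the batch size doubles the step count, which is why the same step number maps to half the epoch fraction in the added table.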
logs/per_device_train_batch_size=2/events.out.tfevents.1723393725.93d6cbb3ad53 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:59e85d8a57e2231a9165065c8623fc45b1012dbb11629cf45b73c3c1d0131402
3
+ size 253
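
The three added lines are a Git LFS pointer file, not the TensorBoard event data itself. As a minimal sketch of how such a pointer can be read, the snippet below splits each line into a key and a value. The function name `parse_lfs_pointer` and the returned field names are hypothetical and are not part of any Git LFS tooling.

```python
# Pointer text copied from the added file above (diff markers removed).
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:59e85d8a57e2231a9165065c8623fc45b1012dbb11629cf45b73c3c1d0131402
size 253
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a dictionary."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER_TEXT))
```

The `size` field is the byte length of the real object stored in LFS, so the event file referenced here is 253 bytes.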