0%|                                                                                         | 0/10701 [00:00<?, ?it/s][WARNING|modeling_utils.py:388] 2022-03-02 09:44:42,427 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.6092, 'learning_rate': 6.000000000000001e-08, 'epoch': 0.0}
[WARNING|modeling_utils.py:388] 2022-03-02 09:44:43,577 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                               | 1/10701 [00:02<8:17:45,  2.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:44:44,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:44:46,010 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                               | 2/10701 [00:05<7:26:39,  2.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:44:47,245 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:44:48,319 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                               | 3/10701 [00:07<7:10:19,  2.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:44:49,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:44:50,553 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                               | 4/10701 [00:09<6:57:56,  2.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:44:51,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:44:52,834 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                               | 5/10701 [00:11<6:53:45,  2.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:44:54,015 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:44:55,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.5912, 'learning_rate': 3.6e-07, 'epoch': 0.0}
  0%|                                                                               | 6/10701 [00:14<6:49:17,  2.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:44:56,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.5078, 'learning_rate': 4.2e-07, 'epoch': 0.0}
[WARNING|modeling_utils.py:388] 2022-03-02 09:44:57,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                               | 7/10701 [00:16<6:45:16,  2.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:44:58,455 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.5746, 'learning_rate': 4.800000000000001e-07, 'epoch': 0.0}
[WARNING|modeling_utils.py:388] 2022-03-02 09:44:59,496 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                               | 8/10701 [00:18<6:40:17,  2.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:00,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:01,715 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                               | 9/10701 [00:20<6:38:39,  2.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:02,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.4953, 'learning_rate': 5.4e-07, 'epoch': 0.0}
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:03,911 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                              | 10/10701 [00:23<6:36:34,  2.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:05,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:06,680 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.4469, 'learning_rate': 6.599999999999999e-07, 'epoch': 0.0}
{'loss': 10.4409, 'learning_rate': 7.2e-07, 'epoch': 0.0}
  0%|                                                                              | 11/10701 [00:25<7:05:59,  2.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:07,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:08,889 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                              | 12/10701 [00:27<6:56:09,  2.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:10,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:11,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                              | 13/10701 [00:30<6:47:31,  2.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:12,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:13,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                              | 14/10701 [00:32<6:40:48,  2.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:14,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:15,422 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.2975, 'learning_rate': 9e-07, 'epoch': 0.0}
  0%|                                                                              | 15/10701 [00:34<6:37:34,  2.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:16,513 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:17,524 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                              | 16/10701 [00:36<6:30:33,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:18,640 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.3363, 'learning_rate': 9.600000000000001e-07, 'epoch': 0.0}
{'loss': 10.2833, 'learning_rate': 1.0200000000000002e-06, 'epoch': 0.0}
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:19,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                              | 17/10701 [00:38<6:26:50,  2.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:20,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:21,804 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 18/10701 [00:40<6:25:56,  2.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:22,934 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:23,940 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 19/10701 [00:43<6:24:12,  2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:25,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:26,054 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.1454, 'learning_rate': 1.2000000000000002e-06, 'epoch': 0.01}
  0%|▏                                                                             | 20/10701 [00:45<6:21:56,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:27,164 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:28,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.0815, 'learning_rate': 1.26e-06, 'epoch': 0.01}
  0%|▏                                                                             | 21/10701 [00:47<6:18:42,  2.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:29,221 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.1507, 'learning_rate': 1.3199999999999999e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:30,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 22/10701 [00:49<6:14:40,  2.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:31,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:32,261 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 23/10701 [00:51<6:12:34,  2.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:33,334 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.0909, 'learning_rate': 1.38e-06, 'epoch': 0.01}
{'loss': 10.0759, 'learning_rate': 1.44e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:34,307 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 24/10701 [00:53<6:09:53,  2.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:35,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.0475, 'learning_rate': 1.5e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:36,344 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 25/10701 [00:55<6:07:43,  2.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:37,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:38,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 26/10701 [00:57<6:06:37,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:39,468 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.0029, 'learning_rate': 1.62e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:40,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 27/10701 [00:59<6:05:33,  2.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:41,492 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:42,441 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 28/10701 [01:01<6:02:58,  2.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:43,482 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 10.0417, 'learning_rate': 1.68e-06, 'epoch': 0.01}
{'loss': 9.8293, 'learning_rate': 1.74e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:44,410 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 29/10701 [01:03<5:59:15,  2.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:45,461 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:46,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.9804, 'learning_rate': 1.8e-06, 'epoch': 0.01}
  0%|▏                                                                             | 30/10701 [01:05<5:56:52,  2.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:47,423 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.8897, 'learning_rate': 1.86e-06, 'epoch': 0.01}
{'loss': 9.7528, 'learning_rate': 1.9200000000000003e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:48,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 31/10701 [01:07<5:58:21,  2.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:49,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:50,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▏                                                                             | 32/10701 [01:09<5:59:04,  2.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:51,492 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:52,380 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.9566, 'learning_rate': 1.98e-06, 'epoch': 0.01}
  0%|▏                                                                             | 33/10701 [01:11<5:54:15,  1.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:53,382 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:54,265 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.7642, 'learning_rate': 2.0400000000000004e-06, 'epoch': 0.01}
{'loss': 9.8169, 'learning_rate': 2.1000000000000002e-06, 'epoch': 0.01}
  0%|▏                                                                             | 34/10701 [01:13<5:48:37,  1.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:55,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:56,132 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 35/10701 [01:15<5:43:23,  1.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:57,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:57,945 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.678, 'learning_rate': 2.16e-06, 'epoch': 0.01}
  0%|▎                                                                             | 36/10701 [01:17<5:37:06,  1.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:45:58,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:45:59,733 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 37/10701 [01:18<5:31:26,  1.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:00,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.6785, 'learning_rate': 2.28e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:01,437 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 38/10701 [01:20<5:22:42,  1.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:02,347 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.8736, 'learning_rate': 2.34e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:03,145 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 39/10701 [01:22<5:16:57,  1.78s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:04,059 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.6238, 'learning_rate': 2.4000000000000003e-06, 'epoch': 0.01}
{'loss': 9.6723, 'learning_rate': 2.46e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:04,849 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 40/10701 [01:23<5:12:29,  1.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:05,715 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:06,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 41/10701 [01:25<5:06:03,  1.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:07,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:08,126 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 42/10701 [01:27<5:01:34,  1.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:08,970 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:09,691 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 43/10701 [01:28<4:54:37,  1.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:10,494 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.5222, 'learning_rate': 2.6399999999999997e-06, 'epoch': 0.01}
{'loss': 9.7702, 'learning_rate': 2.7e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:11,151 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 44/10701 [01:30<4:43:49,  1.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:11,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:12,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 45/10701 [01:31<4:31:22,  1.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:13,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:13,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 46/10701 [01:32<4:20:15,  1.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:14,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:15,030 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.5447, 'learning_rate': 2.82e-06, 'epoch': 0.01}
{'loss': 9.5612, 'learning_rate': 2.88e-06, 'epoch': 0.01}
  0%|▎                                                                             | 47/10701 [01:34<4:05:54,  1.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:15,644 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:16,135 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 48/10701 [01:35<3:50:51,  1.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:16,713 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:17,161 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 49/10701 [01:36<3:36:33,  1.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:17,739 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:18,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.1771, 'learning_rate': 2.9400000000000002e-06, 'epoch': 0.01}
  0%|▎                                                                             | 50/10701 [01:37<3:56:26,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:20,074 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.3124, 'learning_rate': 3e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:21,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▎                                                                             | 51/10701 [01:40<4:54:22,  1.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:22,369 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.6376, 'learning_rate': 3.06e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:23,446 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▍                                                                             | 52/10701 [01:42<5:26:56,  1.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:24,635 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.3595, 'learning_rate': 3.1199999999999998e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:25,739 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|▍                                                                             | 53/10701 [01:44<5:50:48,  1.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:26,911 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.4499, 'learning_rate': 3.18e-06, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:27,995 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 54/10701 [01:47<6:05:56,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:29,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.4092, 'learning_rate': 3.24e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:30,271 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 55/10701 [01:49<6:16:54,  2.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:31,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:32,489 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 56/10701 [01:51<6:21:53,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:33,659 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:34,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 57/10701 [01:53<6:28:57,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:35,940 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:37,000 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 58/10701 [01:56<6:30:40,  2.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:38,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:39,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 59/10701 [01:58<6:32:16,  2.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:40,383 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:41,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 60/10701 [02:00<6:30:41,  2.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:42,580 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:43,638 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.1439, 'learning_rate': 3.66e-06, 'epoch': 0.02}
  1%|▍                                                                             | 61/10701 [02:02<6:31:40,  2.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:44,798 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.3793, 'learning_rate': 3.72e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:45,847 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 62/10701 [02:04<6:31:42,  2.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:46,992 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:48,017 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 63/10701 [02:07<6:29:34,  2.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:49,145 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.6052, 'learning_rate': 3.7800000000000002e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:50,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 64/10701 [02:09<6:26:14,  2.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:51,280 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.2966, 'learning_rate': 3.8400000000000005e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:52,316 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 65/10701 [02:11<6:25:15,  2.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:53,423 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.3686, 'learning_rate': 3.9e-06, 'epoch': 0.02}
{'loss': 9.1426, 'learning_rate': 3.96e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:54,427 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 66/10701 [02:13<6:21:57,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:55,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.3207, 'learning_rate': 4.0200000000000005e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:56,587 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 67/10701 [02:15<6:22:19,  2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:57,731 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.1614, 'learning_rate': 4.080000000000001e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:46:58,764 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▍                                                                             | 68/10701 [02:17<6:23:22,  2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:46:59,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.2762, 'learning_rate': 4.14e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:00,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 69/10701 [02:19<6:18:36,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:01,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:02,962 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 70/10701 [02:22<6:17:58,  2.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:04,135 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:05,125 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 71/10701 [02:24<6:19:17,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:06,220 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.157, 'learning_rate': 4.26e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:07,220 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 72/10701 [02:26<6:16:52,  2.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:08,298 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.4008, 'learning_rate': 4.32e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:09,285 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 73/10701 [02:28<6:13:32,  2.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:10,372 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.297, 'learning_rate': 4.3799999999999996e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:11,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 74/10701 [02:30<6:16:29,  2.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:12,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.3809, 'learning_rate': 4.44e-06, 'epoch': 0.02}
{'loss': 9.257, 'learning_rate': 4.5e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:13,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 75/10701 [02:32<6:11:10,  2.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:14,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:15,478 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 76/10701 [02:34<6:06:05,  2.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:16,545 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.0765, 'learning_rate': 4.56e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:17,506 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 77/10701 [02:36<6:03:57,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:18,587 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.5525, 'learning_rate': 4.62e-06, 'epoch': 0.02}
{'loss': 9.1723, 'learning_rate': 4.68e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:19,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 78/10701 [02:38<6:02:44,  2.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:20,588 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:21,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 79/10701 [02:40<6:00:44,  2.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:22,613 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:23,544 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 80/10701 [02:42<5:58:16,  2.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:24,603 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:25,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 81/10701 [02:44<5:56:53,  2.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:26,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:27,478 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 82/10701 [02:46<5:52:42,  1.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:28,505 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:29,417 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 83/10701 [02:48<5:49:36,  1.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:30,447 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.1196, 'learning_rate': 5.04e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:31,356 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 84/10701 [02:50<5:47:42,  1.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:32,342 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.025, 'learning_rate': 5.1e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:33,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▌                                                                             | 85/10701 [02:52<5:44:06,  1.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:34,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.1792, 'learning_rate': 5.16e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:35,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 86/10701 [02:54<5:39:21,  1.92s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:36,085 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.1091, 'learning_rate': 5.22e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:36,931 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 87/10701 [02:56<5:34:15,  1.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:37,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:38,729 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.9929, 'learning_rate': 5.279999999999999e-06, 'epoch': 0.02}
  1%|▋                                                                             | 88/10701 [02:57<5:29:22,  1.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:39,678 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:40,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.0523, 'learning_rate': 5.34e-06, 'epoch': 0.02}
  1%|▋                                                                             | 89/10701 [02:59<5:24:44,  1.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:41,439 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:42,240 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 90/10701 [03:01<5:19:29,  1.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:43,174 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:43,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 91/10701 [03:03<5:14:09,  1.78s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:44,799 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.8228, 'learning_rate': 5.46e-06, 'epoch': 0.03}
{'loss': 9.01, 'learning_rate': 5.52e-06, 'epoch': 0.03}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:45,532 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 92/10701 [03:04<5:03:52,  1.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:46,347 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.1774, 'learning_rate': 5.58e-06, 'epoch': 0.03}
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:47,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 93/10701 [03:06<4:51:57,  1.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:47,810 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:48,466 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.7715, 'learning_rate': 5.64e-06, 'epoch': 0.03}
  1%|▋                                                                             | 94/10701 [03:07<4:40:42,  1.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:49,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:49,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 95/10701 [03:08<4:30:24,  1.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:50,578 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 96/10701 [03:10<4:17:48,  1.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:51,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:52,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 97/10701 [03:11<4:06:23,  1.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:53,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.0813, 'learning_rate': 5.82e-06, 'epoch': 0.03}
  1%|▋                                                                             | 98/10701 [03:12<3:52:43,  1.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:54,156 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:47:54,621 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                             | 99/10701 [03:13<3:40:27,  1.25s/it]
  1%|▋                                                                            | 100/10701 [03:15<4:00:48,  1.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:54,156 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▋                                                                            | 100/10701 [03:15<4:00:48,  1.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:47:57,565 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.1754, 'learning_rate': 6.0600000000000004e-06, 'epoch': 0.03}
  1%|▋                                                                            | 102/10701 [03:19<5:26:42,  1.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:02,093 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.0879, 'learning_rate': 6.18e-06, 'epoch': 0.03}
  1%|▋                                                                            | 104/10701 [03:24<6:04:46,  2.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:06,637 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 9.124, 'learning_rate': 6.3e-06, 'epoch': 0.03}
  1%|▊                                                                            | 106/10701 [03:29<6:21:46,  2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:11,144 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 108/10701 [03:33<6:26:08,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:11,144 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.9893, 'learning_rate': 6.48e-06, 'epoch': 0.03}
  1%|▊                                                                            | 108/10701 [03:33<6:26:08,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:15,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 110/10701 [03:37<6:29:45,  2.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:15,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 110/10701 [03:37<6:29:45,  2.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:20,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 112/10701 [03:42<6:27:38,  2.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:20,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.8539, 'learning_rate': 6.72e-06, 'epoch': 0.03}
  1%|▊                                                                            | 112/10701 [03:42<6:27:38,  2.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:24,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 114/10701 [03:46<6:23:28,  2.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:24,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 114/10701 [03:46<6:23:28,  2.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:28,687 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 116/10701 [03:50<6:20:58,  2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:28,687 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 116/10701 [03:50<6:20:58,  2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:32,966 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▊                                                                            | 118/10701 [03:55<6:17:14,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:37,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.9842, 'learning_rate': 7.08e-06, 'epoch': 0.03}
  1%|▊                                                                            | 120/10701 [03:59<6:13:50,  2.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:37,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.984, 'learning_rate': 7.2e-06, 'epoch': 0.03}
  1%|▊                                                                            | 120/10701 [03:59<6:13:50,  2.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:41,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 122/10701 [04:03<6:12:51,  2.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:41,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.6948, 'learning_rate': 7.32e-06, 'epoch': 0.03}
  1%|▉                                                                            | 122/10701 [04:03<6:12:51,  2.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:45,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 124/10701 [04:07<6:09:05,  2.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:45,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 124/10701 [04:07<6:09:05,  2.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:45,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 124/10701 [04:07<6:09:05,  2.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:49,733 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 124/10701 [04:07<6:09:05,  2.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:49,733 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 126/10701 [04:11<6:04:30,  2.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:49,733 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 126/10701 [04:11<6:04:30,  2.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:53,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 126/10701 [04:11<6:04:30,  2.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:53,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 128/10701 [04:15<5:57:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:53,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 128/10701 [04:15<5:57:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:53,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 128/10701 [04:15<5:57:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:57,763 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 128/10701 [04:15<5:57:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:48:57,763 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 130/10701 [04:19<5:51:20,  1.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:01,648 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 130/10701 [04:19<5:51:20,  1.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:01,648 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.5824, 'learning_rate': 7.8e-06, 'epoch': 0.04}
  1%|▉                                                                            | 130/10701 [04:19<5:51:20,  1.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:01,648 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 132/10701 [04:23<5:47:11,  1.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:05,552 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 132/10701 [04:23<5:47:11,  1.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:05,552 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.5085, 'learning_rate': 7.92e-06, 'epoch': 0.04}
  1%|▉                                                                            | 132/10701 [04:23<5:47:11,  1.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:05,552 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.6462, 'learning_rate': 7.98e-06, 'epoch': 0.04}
  1%|▉                                                                            | 134/10701 [04:27<5:41:21,  1.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:09,329 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 134/10701 [04:27<5:41:21,  1.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:09,329 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 134/10701 [04:27<5:41:21,  1.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:09,329 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 136/10701 [04:31<5:33:29,  1.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:13,062 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 136/10701 [04:31<5:33:29,  1.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:13,062 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 138/10701 [04:34<5:25:30,  1.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:13,062 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 138/10701 [04:34<5:25:30,  1.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:13,062 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.7108, 'learning_rate': 8.220000000000001e-06, 'epoch': 0.04}
  1%|▉                                                                            | 138/10701 [04:34<5:25:30,  1.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:16,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|▉                                                                            | 138/10701 [04:34<5:25:30,  1.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:16,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 140/10701 [04:38<5:15:07,  1.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:20,062 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 140/10701 [04:38<5:15:07,  1.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:20,062 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.6197, 'learning_rate': 8.400000000000001e-06, 'epoch': 0.04}
  1%|█                                                                            | 140/10701 [04:38<5:15:07,  1.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:20,062 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 142/10701 [04:41<4:59:33,  1.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:23,189 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 142/10701 [04:41<4:59:33,  1.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:23,189 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.3621, 'learning_rate': 8.52e-06, 'epoch': 0.04}
  1%|█                                                                            | 144/10701 [04:44<4:32:31,  1.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:25,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 144/10701 [04:44<4:32:31,  1.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:25,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 146/10701 [04:46<4:07:51,  1.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:25,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 146/10701 [04:46<4:07:51,  1.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:25,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.7401, 'learning_rate': 8.7e-06, 'epoch': 0.04}
  1%|█                                                                            | 148/10701 [04:49<3:43:06,  1.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 148/10701 [04:49<3:43:06,  1.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.5904, 'learning_rate': 8.82e-06, 'epoch': 0.04}
  1%|█                                                                            | 148/10701 [04:49<3:43:06,  1.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:30,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 148/10701 [04:49<3:43:06,  1.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:30,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 150/10701 [04:51<3:48:03,  1.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:30,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 150/10701 [04:51<3:48:03,  1.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:30,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.7839, 'learning_rate': 9e-06, 'epoch': 0.04}
  1%|█                                                                            | 150/10701 [04:51<3:48:03,  1.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:33,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 150/10701 [04:51<3:48:03,  1.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:33,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 152/10701 [04:56<5:23:50,  1.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:33,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 152/10701 [04:56<5:23:50,  1.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:33,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 152/10701 [04:56<5:23:50,  1.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 152/10701 [04:56<5:23:50,  1.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.4699, 'learning_rate': 9.3e-06, 'epoch': 0.04}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.4494, 'learning_rate': 9.36e-06, 'epoch': 0.04}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.4977, 'learning_rate': 9.42e-06, 'epoch': 0.04}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.2952, 'learning_rate': 9.48e-06, 'epoch': 0.04}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.5739, 'learning_rate': 9.54e-06, 'epoch': 0.04}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.4479, 'learning_rate': 9.600000000000001e-06, 'epoch': 0.04}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.2847, 'learning_rate': 9.66e-06, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.4738, 'learning_rate': 9.72e-06, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.3257, 'learning_rate': 9.780000000000001e-06, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.343, 'learning_rate': 9.84e-06, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.2969, 'learning_rate': 9.9e-06, 'epoch': 0.05}
{'loss': 8.467, 'learning_rate': 9.960000000000001e-06, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.3063, 'learning_rate': 1.002e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.1785, 'learning_rate': 1.008e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.223, 'learning_rate': 1.0140000000000001e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.3273, 'learning_rate': 1.02e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.3348, 'learning_rate': 1.0260000000000002e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.2142, 'learning_rate': 1.032e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.3112, 'learning_rate': 1.0379999999999999e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.499, 'learning_rate': 1.044e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.2831, 'learning_rate': 1.05e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.2228, 'learning_rate': 1.0559999999999999e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.2267, 'learning_rate': 1.062e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.1144, 'learning_rate': 1.068e-05, 'epoch': 0.05}
{'loss': 8.2748, 'learning_rate': 1.074e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.1962, 'learning_rate': 1.08e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.2714, 'learning_rate': 1.086e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.9911, 'learning_rate': 1.092e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.0264, 'learning_rate': 1.098e-05, 'epoch': 0.05}
  1%|█                                                                            | 154/10701 [05:01<6:02:49,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.1794, 'learning_rate': 1.104e-05, 'epoch': 0.05}
[WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.0667, 'learning_rate': 1.116e-05, 'epoch': 0.05}
{'loss': 8.1043, 'learning_rate': 1.1220000000000001e-05, 'epoch': 0.05}
[WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.9771, 'learning_rate': 1.128e-05, 'epoch': 0.05}
[WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.9608, 'learning_rate': 1.134e-05, 'epoch': 0.05}
[WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.8684, 'learning_rate': 1.1400000000000001e-05, 'epoch': 0.05}
[WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  2%|█▍                                                                           | 192/10701 [06:17<4:45:37,  1.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  2%|█▍                                                                           | 192/10701 [06:17<4:45:37,  1.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.7387, 'learning_rate': 1.152e-05, 'epoch': 0.05}
{'loss': 7.7955, 'learning_rate': 1.1580000000000001e-05, 'epoch': 0.05}
  2%|█▍                                                                           | 192/10701 [06:17<4:45:37,  1.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  2%|█▍                                                                           | 195/10701 [06:21<4:09:08,  1.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  2%|█▍                                                                           | 195/10701 [06:21<4:09:08,  1.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.8601, 'learning_rate': 1.1700000000000001e-05, 'epoch': 0.05}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:04,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:04,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.6168, 'learning_rate': 1.182e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.008, 'learning_rate': 1.1940000000000001e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.6062, 'learning_rate': 1.2e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.0935, 'learning_rate': 1.2060000000000001e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.1204, 'learning_rate': 1.2120000000000001e-05, 'epoch': 0.06}
{'loss': 7.9509, 'learning_rate': 1.2180000000000002e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.0621, 'learning_rate': 1.224e-05, 'epoch': 0.06}
{'loss': 7.9402, 'learning_rate': 1.2299999999999999e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.9452, 'learning_rate': 1.236e-05, 'epoch': 0.06}
{'loss': 7.6453, 'learning_rate': 1.242e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.0714, 'learning_rate': 1.2479999999999999e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.7938, 'learning_rate': 1.254e-05, 'epoch': 0.06}
{'loss': 7.7403, 'learning_rate': 1.26e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.9862, 'learning_rate': 1.2659999999999999e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.776, 'learning_rate': 1.272e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.7963, 'learning_rate': 1.278e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.8802, 'learning_rate': 1.284e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.8575, 'learning_rate': 1.29e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.9569, 'learning_rate': 1.296e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.0029, 'learning_rate': 1.302e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.8815, 'learning_rate': 1.308e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.6815, 'learning_rate': 1.314e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.8216, 'learning_rate': 1.32e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.7761, 'learning_rate': 1.326e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 8.0696, 'learning_rate': 1.3320000000000001e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.8774, 'learning_rate': 1.338e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.7813, 'learning_rate': 1.344e-05, 'epoch': 0.06}
[WARNING|modeling_utils.py:388] 2022-03-02 09:51:06,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 09:49:38,589 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.8391, 'learning_rate': 1.3500000000000001e-05, 'epoch': 0.06}
{'loss': 7.9505, 'learning_rate': 1.356e-05, 'epoch': 0.06}
{'loss': 7.6234, 'learning_rate': 1.362e-05, 'epoch': 0.06}
{'loss': 7.5466, 'learning_rate': 1.3680000000000001e-05, 'epoch': 0.06}
{'loss': 7.6627, 'learning_rate': 1.374e-05, 'epoch': 0.06}
{'loss': 7.662, 'learning_rate': 1.3800000000000002e-05, 'epoch': 0.06}
{'loss': 7.8887, 'learning_rate': 1.3860000000000001e-05, 'epoch': 0.06}
{'loss': 7.8448, 'learning_rate': 1.392e-05, 'epoch': 0.07}
{'loss': 7.7958, 'learning_rate': 1.3980000000000002e-05, 'epoch': 0.07}
{'loss': 7.4826, 'learning_rate': 1.4040000000000001e-05, 'epoch': 0.07}
{'loss': 7.5913, 'learning_rate': 1.4099999999999999e-05, 'epoch': 0.07}
{'loss': 7.7514, 'learning_rate': 1.416e-05, 'epoch': 0.07}
{'loss': 7.661, 'learning_rate': 1.422e-05, 'epoch': 0.07}
{'loss': 7.5707, 'learning_rate': 1.428e-05, 'epoch': 0.07}
{'loss': 7.5866, 'learning_rate': 1.434e-05, 'epoch': 0.07}
{'loss': 7.6309, 'learning_rate': 1.44e-05, 'epoch': 0.07}
  2%|█▋                                                                           | 242/10701 [07:53<4:51:17,  1.67s/it]
{'loss': 7.9284, 'learning_rate': 1.452e-05, 'epoch': 0.07}
{'loss': 7.7646, 'learning_rate': 1.458e-05, 'epoch': 0.07}
{'loss': 7.3227, 'learning_rate': 1.464e-05, 'epoch': 0.07}
{'loss': 7.5144, 'learning_rate': 1.47e-05, 'epoch': 0.07}
  2%|█▊                                                                           | 246/10701 [07:58<4:09:30,  1.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:52:40,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  2%|█▊                                                                           | 248/10701 [08:01<3:50:12,  1.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:52:40,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.6973, 'learning_rate': 1.488e-05, 'epoch': 0.07}
{'loss': 7.4415, 'learning_rate': 1.4940000000000001e-05, 'epoch': 0.07}
{'loss': 6.8044, 'learning_rate': 1.5e-05, 'epoch': 0.07}
{'loss': 7.7265, 'learning_rate': 1.506e-05, 'epoch': 0.07}
{'loss': 7.6512, 'learning_rate': 1.5120000000000001e-05, 'epoch': 0.07}
{'loss': 7.8493, 'learning_rate': 1.518e-05, 'epoch': 0.07}
{'loss': 7.4895, 'learning_rate': 1.524e-05, 'epoch': 0.07}
{'loss': 7.6161, 'learning_rate': 1.53e-05, 'epoch': 0.07}
{'loss': 7.8947, 'learning_rate': 1.5360000000000002e-05, 'epoch': 0.07}
{'loss': 7.7195, 'learning_rate': 1.542e-05, 'epoch': 0.07}
{'loss': 7.2944, 'learning_rate': 1.548e-05, 'epoch': 0.07}
{'loss': 7.6675, 'learning_rate': 1.554e-05, 'epoch': 0.07}
{'loss': 7.342, 'learning_rate': 1.56e-05, 'epoch': 0.07}
{'loss': 7.4145, 'learning_rate': 1.5660000000000003e-05, 'epoch': 0.07}
{'loss': 7.4339, 'learning_rate': 1.5720000000000002e-05, 'epoch': 0.07}
{'loss': 7.723, 'learning_rate': 1.578e-05, 'epoch': 0.07}
{'loss': 7.3121, 'learning_rate': 1.584e-05, 'epoch': 0.07}
{'loss': 7.458, 'learning_rate': 1.59e-05, 'epoch': 0.07}
{'loss': 7.481, 'learning_rate': 1.596e-05, 'epoch': 0.07}
{'loss': 7.423, 'learning_rate': 1.6020000000000002e-05, 'epoch': 0.07}
{'loss': 7.452, 'learning_rate': 1.6080000000000002e-05, 'epoch': 0.08}
{'loss': 7.3395, 'learning_rate': 1.614e-05, 'epoch': 0.08}
{'loss': 7.5545, 'learning_rate': 1.62e-05, 'epoch': 0.08}
{'loss': 7.297, 'learning_rate': 1.626e-05, 'epoch': 0.08}
{'loss': 7.5605, 'learning_rate': 1.6320000000000003e-05, 'epoch': 0.08}
{'loss': 7.3852, 'learning_rate': 1.6380000000000002e-05, 'epoch': 0.08}
{'loss': 7.3872, 'learning_rate': 1.6440000000000002e-05, 'epoch': 0.08}
{'loss': 7.1434, 'learning_rate': 1.65e-05, 'epoch': 0.08}
{'loss': 7.5026, 'learning_rate': 1.656e-05, 'epoch': 0.08}
{'loss': 7.3842, 'learning_rate': 1.6620000000000004e-05, 'epoch': 0.08}
{'loss': 7.4802, 'learning_rate': 1.6680000000000003e-05, 'epoch': 0.08}
{'loss': 7.3137, 'learning_rate': 1.6740000000000002e-05, 'epoch': 0.08}
{'loss': 7.1712, 'learning_rate': 1.6800000000000002e-05, 'epoch': 0.08}
{'loss': 7.2147, 'learning_rate': 1.686e-05, 'epoch': 0.08}
{'loss': 7.2708, 'learning_rate': 1.6919999999999997e-05, 'epoch': 0.08}
{'loss': 7.3019, 'learning_rate': 1.698e-05, 'epoch': 0.08}
{'loss': 7.3527, 'learning_rate': 1.704e-05, 'epoch': 0.08}
{'loss': 7.2426, 'learning_rate': 1.71e-05, 'epoch': 0.08}
  3%|██                                                                           | 287/10701 [09:21<5:20:54,  1.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:52:40,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.383, 'learning_rate': 1.7219999999999998e-05, 'epoch': 0.08}
{'loss': 7.2301, 'learning_rate': 1.728e-05, 'epoch': 0.08}
{'loss': 7.4012, 'learning_rate': 1.734e-05, 'epoch': 0.08}
{'loss': 7.4235, 'learning_rate': 1.74e-05, 'epoch': 0.08}
{'loss': 7.5482, 'learning_rate': 1.746e-05, 'epoch': 0.08}
{'loss': 7.4363, 'learning_rate': 1.7519999999999998e-05, 'epoch': 0.08}
{'loss': 7.2763, 'learning_rate': 1.758e-05, 'epoch': 0.08}
{'loss': 7.2757, 'learning_rate': 1.764e-05, 'epoch': 0.08}
  3%|██▏                                                                          | 296/10701 [09:34<4:03:30,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:54:16,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2012, 'learning_rate': 1.776e-05, 'epoch': 0.08}
  3%|██▏                                                                          | 298/10701 [09:37<3:40:09,  1.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:54:16,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.3281, 'learning_rate': 1.7879999999999998e-05, 'epoch': 0.08}
{'loss': 6.8298, 'learning_rate': 1.794e-05, 'epoch': 0.08}
  3%|██▏                                                                          | 298/10701 [09:37<3:40:09,  1.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:54:16,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4168, 'learning_rate': 1.8e-05, 'epoch': 0.08}
  3%|██▏                                                                          | 298/10701 [09:37<3:40:09,  1.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:54:16,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2235, 'learning_rate': 1.806e-05, 'epoch': 0.08}
{'loss': 7.3229, 'learning_rate': 1.812e-05, 'epoch': 0.08}
{'loss': 7.2181, 'learning_rate': 1.818e-05, 'epoch': 0.08}
{'loss': 7.3975, 'learning_rate': 1.824e-05, 'epoch': 0.09}
{'loss': 7.2585, 'learning_rate': 1.83e-05, 'epoch': 0.09}
{'loss': 7.3365, 'learning_rate': 1.836e-05, 'epoch': 0.09}
{'loss': 7.455, 'learning_rate': 1.842e-05, 'epoch': 0.09}
{'loss': 7.0969, 'learning_rate': 1.848e-05, 'epoch': 0.09}
{'loss': 7.3453, 'learning_rate': 1.854e-05, 'epoch': 0.09}
{'loss': 7.1884, 'learning_rate': 1.86e-05, 'epoch': 0.09}
{'loss': 7.3112, 'learning_rate': 1.866e-05, 'epoch': 0.09}
{'loss': 7.4935, 'learning_rate': 1.872e-05, 'epoch': 0.09}
{'loss': 7.3492, 'learning_rate': 1.878e-05, 'epoch': 0.09}
{'loss': 7.2252, 'learning_rate': 1.884e-05, 'epoch': 0.09}
{'loss': 7.2, 'learning_rate': 1.8900000000000002e-05, 'epoch': 0.09}
{'loss': 7.3023, 'learning_rate': 1.896e-05, 'epoch': 0.09}
{'loss': 7.0567, 'learning_rate': 1.902e-05, 'epoch': 0.09}
{'loss': 7.1876, 'learning_rate': 1.908e-05, 'epoch': 0.09}
{'loss': 7.3179, 'learning_rate': 1.914e-05, 'epoch': 0.09}
{'loss': 7.3303, 'learning_rate': 1.9200000000000003e-05, 'epoch': 0.09}
{'loss': 7.2902, 'learning_rate': 1.9260000000000002e-05, 'epoch': 0.09}
{'loss': 7.0884, 'learning_rate': 1.932e-05, 'epoch': 0.09}
{'loss': 7.2818, 'learning_rate': 1.938e-05, 'epoch': 0.09}
{'loss': 7.1258, 'learning_rate': 1.95e-05, 'epoch': 0.09}
{'loss': 7.0437, 'learning_rate': 1.9560000000000002e-05, 'epoch': 0.09}
{'loss': 7.192, 'learning_rate': 1.9620000000000002e-05, 'epoch': 0.09}
{'loss': 7.1372, 'learning_rate': 1.968e-05, 'epoch': 0.09}
{'loss': 7.0059, 'learning_rate': 1.974e-05, 'epoch': 0.09}
{'loss': 7.1942, 'learning_rate': 1.98e-05, 'epoch': 0.09}
{'loss': 7.1989, 'learning_rate': 1.9860000000000003e-05, 'epoch': 0.09}
{'loss': 7.0647, 'learning_rate': 1.9920000000000002e-05, 'epoch': 0.09}
{'loss': 7.1529, 'learning_rate': 1.9980000000000002e-05, 'epoch': 0.09}
{'loss': 7.0687, 'learning_rate': 2.004e-05, 'epoch': 0.09}
{'loss': 7.2183, 'learning_rate': 2.01e-05, 'epoch': 0.09}
{'loss': 6.9497, 'learning_rate': 2.016e-05, 'epoch': 0.09}
{'loss': 7.095, 'learning_rate': 2.0220000000000003e-05, 'epoch': 0.09}
{'loss': 6.9722, 'learning_rate': 2.0280000000000002e-05, 'epoch': 0.09}
{'loss': 7.302, 'learning_rate': 2.0340000000000002e-05, 'epoch': 0.1}
{'loss': 7.1676, 'learning_rate': 2.04e-05, 'epoch': 0.1}
  3%|██▍                                                                          | 342/10701 [11:05<4:34:25,  1.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:54:16,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0249, 'learning_rate': 2.0520000000000003e-05, 'epoch': 0.1}
{'loss': 7.105, 'learning_rate': 2.0580000000000003e-05, 'epoch': 0.1}
[WARNING|modeling_utils.py:388] 2022-03-02 09:55:50,319 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0939, 'learning_rate': 2.07e-05, 'epoch': 0.1}
{'loss': 6.9841, 'learning_rate': 2.0759999999999998e-05, 'epoch': 0.1}
  3%|██▌                                                                          | 348/10701 [11:13<3:40:47,  1.28s/it]
{'loss': 6.6784, 'learning_rate': 2.088e-05, 'epoch': 0.1}
[WARNING|modeling_utils.py:388] 2022-03-02 09:55:56,461 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.648, 'learning_rate': 2.1e-05, 'epoch': 0.1}
{'loss': 7.1398, 'learning_rate': 2.1059999999999998e-05, 'epoch': 0.1}
{'loss': 7.2444, 'learning_rate': 2.1119999999999998e-05, 'epoch': 0.1}
{'loss': 7.1684, 'learning_rate': 2.118e-05, 'epoch': 0.1}
{'loss': 7.3969, 'learning_rate': 2.124e-05, 'epoch': 0.1}
{'loss': 6.9408, 'learning_rate': 2.13e-05, 'epoch': 0.1}
{'loss': 7.3996, 'learning_rate': 2.136e-05, 'epoch': 0.1}
{'loss': 6.9589, 'learning_rate': 2.1419999999999998e-05, 'epoch': 0.1}
{'loss': 7.2723, 'learning_rate': 2.148e-05, 'epoch': 0.1}
{'loss': 7.1407, 'learning_rate': 2.154e-05, 'epoch': 0.1}
{'loss': 7.338, 'learning_rate': 2.16e-05, 'epoch': 0.1}
{'loss': 7.4253, 'learning_rate': 2.166e-05, 'epoch': 0.1}
{'loss': 7.2386, 'learning_rate': 2.172e-05, 'epoch': 0.1}
{'loss': 7.2326, 'learning_rate': 2.178e-05, 'epoch': 0.1}
{'loss': 7.1585, 'learning_rate': 2.184e-05, 'epoch': 0.1}
{'loss': 7.2236, 'learning_rate': 2.19e-05, 'epoch': 0.1}
{'loss': 7.1333, 'learning_rate': 2.196e-05, 'epoch': 0.1}
{'loss': 7.2608, 'learning_rate': 2.202e-05, 'epoch': 0.1}
{'loss': 7.0539, 'learning_rate': 2.208e-05, 'epoch': 0.1}
{'loss': 7.2314, 'learning_rate': 2.214e-05, 'epoch': 0.1}
{'loss': 6.8448, 'learning_rate': 2.22e-05, 'epoch': 0.1}
{'loss': 7.1213, 'learning_rate': 2.226e-05, 'epoch': 0.1}
{'loss': 7.0645, 'learning_rate': 2.232e-05, 'epoch': 0.1}
{'loss': 7.2308, 'learning_rate': 2.238e-05, 'epoch': 0.1}
{'loss': 7.2085, 'learning_rate': 2.2440000000000002e-05, 'epoch': 0.1}
{'loss': 6.9134, 'learning_rate': 2.25e-05, 'epoch': 0.11}
{'loss': 7.0166, 'learning_rate': 2.256e-05, 'epoch': 0.11}
{'loss': 7.1703, 'learning_rate': 2.262e-05, 'epoch': 0.11}
{'loss': 6.6971, 'learning_rate': 2.268e-05, 'epoch': 0.11}
{'loss': 7.084, 'learning_rate': 2.274e-05, 'epoch': 0.11}
{'loss': 7.2437, 'learning_rate': 2.2800000000000002e-05, 'epoch': 0.11}
{'loss': 7.2564, 'learning_rate': 2.286e-05, 'epoch': 0.11}
{'loss': 6.9667, 'learning_rate': 2.292e-05, 'epoch': 0.11}
{'loss': 6.9449, 'learning_rate': 2.298e-05, 'epoch': 0.11}
{'loss': 7.2372, 'learning_rate': 2.304e-05, 'epoch': 0.11}
{'loss': 6.9318, 'learning_rate': 2.3100000000000002e-05, 'epoch': 0.11}
{'loss': 7.1121, 'learning_rate': 2.3160000000000002e-05, 'epoch': 0.11}
{'loss': 6.9505, 'learning_rate': 2.322e-05, 'epoch': 0.11}
  4%|██▊                                                                          | 389/10701 [12:37<5:18:40,  1.85s/it]
{'loss': 7.1731, 'learning_rate': 2.334e-05, 'epoch': 0.11}
{'loss': 7.0853, 'learning_rate': 2.3400000000000003e-05, 'epoch': 0.11}
{'loss': 6.8283, 'learning_rate': 2.3460000000000002e-05, 'epoch': 0.11}
{'loss': 7.2011, 'learning_rate': 2.3520000000000002e-05, 'epoch': 0.11}
  4%|██▊                                                                          | 394/10701 [12:45<4:38:42,  1.62s/it]
{'loss': 6.9783, 'learning_rate': 2.364e-05, 'epoch': 0.11}
{'loss': 7.3072, 'learning_rate': 2.37e-05, 'epoch': 0.11}
  4%|██▊                                                                          | 397/10701 [12:49<4:00:48,  1.40s/it]
{'loss': 6.8423, 'learning_rate': 2.3820000000000002e-05, 'epoch': 0.11}
[WARNING|modeling_utils.py:388] 2022-03-02 09:57:32,729 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6825, 'learning_rate': 2.394e-05, 'epoch': 0.11}
{'loss': 6.3885, 'learning_rate': 2.4e-05, 'epoch': 0.11}
{'loss': 7.2664, 'learning_rate': 2.4060000000000003e-05, 'epoch': 0.11}
{'loss': 6.9272, 'learning_rate': 2.4120000000000003e-05, 'epoch': 0.11}
{'loss': 7.2954, 'learning_rate': 2.4180000000000002e-05, 'epoch': 0.11}
{'loss': 6.88, 'learning_rate': 2.4240000000000002e-05, 'epoch': 0.11}
{'loss': 6.8738, 'learning_rate': 2.43e-05, 'epoch': 0.11}
{'loss': 7.419, 'learning_rate': 2.4360000000000004e-05, 'epoch': 0.11}
{'loss': 7.1816, 'learning_rate': 2.442e-05, 'epoch': 0.11}
{'loss': 7.1283, 'learning_rate': 2.448e-05, 'epoch': 0.11}
{'loss': 7.2607, 'learning_rate': 2.454e-05, 'epoch': 0.11}
{'loss': 6.9234, 'learning_rate': 2.4599999999999998e-05, 'epoch': 0.11}
{'loss': 7.0308, 'learning_rate': 2.4659999999999998e-05, 'epoch': 0.12}
{'loss': 6.8637, 'learning_rate': 2.472e-05, 'epoch': 0.12}
{'loss': 6.9558, 'learning_rate': 2.478e-05, 'epoch': 0.12}
{'loss': 7.2137, 'learning_rate': 2.484e-05, 'epoch': 0.12}
{'loss': 7.1712, 'learning_rate': 2.49e-05, 'epoch': 0.12}
{'loss': 7.2463, 'learning_rate': 2.4959999999999998e-05, 'epoch': 0.12}
  4%|███                                                                          | 418/10701 [13:32<6:02:57,  2.12s/it]
{'loss': 6.7192, 'learning_rate': 2.508e-05, 'epoch': 0.12}
{'loss': 7.0369, 'learning_rate': 2.514e-05, 'epoch': 0.12}
{'loss': 7.2517, 'learning_rate': 2.52e-05, 'epoch': 0.12}
{'loss': 7.1193, 'learning_rate': 2.526e-05, 'epoch': 0.12}
{'loss': 7.0365, 'learning_rate': 2.5319999999999998e-05, 'epoch': 0.12}
{'loss': 7.3968, 'learning_rate': 2.538e-05, 'epoch': 0.12}
{'loss': 7.0466, 'learning_rate': 2.544e-05, 'epoch': 0.12}
{'loss': 7.1708, 'learning_rate': 2.55e-05, 'epoch': 0.12}
{'loss': 7.2789, 'learning_rate': 2.556e-05, 'epoch': 0.12}
{'loss': 7.3759, 'learning_rate': 2.562e-05, 'epoch': 0.12}
{'loss': 7.1169, 'learning_rate': 2.568e-05, 'epoch': 0.12}
  4%|███                                                                          | 430/10701 [13:57<5:46:39,  2.03s/it]
{'loss': 6.9739, 'learning_rate': 2.58e-05, 'epoch': 0.12}
{'loss': 6.8157, 'learning_rate': 2.586e-05, 'epoch': 0.12}
{'loss': 7.3529, 'learning_rate': 2.592e-05, 'epoch': 0.12}
{'loss': 7.2562, 'learning_rate': 2.5980000000000002e-05, 'epoch': 0.12}
{'loss': 7.007, 'learning_rate': 2.604e-05, 'epoch': 0.12}
{'loss': 7.115, 'learning_rate': 2.61e-05, 'epoch': 0.12}
{'loss': 7.3253, 'learning_rate': 2.616e-05, 'epoch': 0.12}
{'loss': 7.2258, 'learning_rate': 2.622e-05, 'epoch': 0.12}
{'loss': 7.028, 'learning_rate': 2.628e-05, 'epoch': 0.12}
{'loss': 7.0361, 'learning_rate': 2.6340000000000002e-05, 'epoch': 0.12}
{'loss': 7.2074, 'learning_rate': 2.64e-05, 'epoch': 0.12}
{'loss': 7.1562, 'learning_rate': 2.646e-05, 'epoch': 0.12}
{'loss': 6.8889, 'learning_rate': 2.652e-05, 'epoch': 0.12}
{'loss': 6.8523, 'learning_rate': 2.658e-05, 'epoch': 0.12}
  4%|███▏                                                                         | 444/10701 [14:21<4:11:34,  1.47s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 09:59:04,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8937, 'learning_rate': 2.676e-05, 'epoch': 0.13}
{'loss': 6.3308, 'learning_rate': 2.682e-05, 'epoch': 0.13}
{'loss': 6.7587, 'learning_rate': 2.688e-05, 'epoch': 0.13}
  4%|███▏                                                                         | 449/10701 [14:27<3:22:42,  1.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5485, 'learning_rate': 2.7000000000000002e-05, 'epoch': 0.13}
{'loss': 7.2754, 'learning_rate': 2.7060000000000002e-05, 'epoch': 0.13}
{'loss': 7.0801, 'learning_rate': 2.712e-05, 'epoch': 0.13}
{'loss': 7.3503, 'learning_rate': 2.718e-05, 'epoch': 0.13}
{'loss': 7.0553, 'learning_rate': 2.724e-05, 'epoch': 0.13}
{'loss': 6.8619, 'learning_rate': 2.7300000000000003e-05, 'epoch': 0.13}
{'loss': 7.0834, 'learning_rate': 2.7360000000000002e-05, 'epoch': 0.13}
{'loss': 7.0272, 'learning_rate': 2.7420000000000002e-05, 'epoch': 0.13}
{'loss': 6.7911, 'learning_rate': 2.748e-05, 'epoch': 0.13}
{'loss': 6.8926, 'learning_rate': 2.754e-05, 'epoch': 0.13}
{'loss': 7.1745, 'learning_rate': 2.7600000000000003e-05, 'epoch': 0.13}
{'loss': 7.2602, 'learning_rate': 2.7660000000000003e-05, 'epoch': 0.13}
{'loss': 7.2217, 'learning_rate': 2.7720000000000002e-05, 'epoch': 0.13}
{'loss': 7.1336, 'learning_rate': 2.778e-05, 'epoch': 0.13}
{'loss': 6.9218, 'learning_rate': 2.784e-05, 'epoch': 0.13}
{'loss': 6.7562, 'learning_rate': 2.79e-05, 'epoch': 0.13}
{'loss': 6.696, 'learning_rate': 2.7960000000000003e-05, 'epoch': 0.13}
{'loss': 6.9161, 'learning_rate': 2.8020000000000003e-05, 'epoch': 0.13}
{'loss': 7.025, 'learning_rate': 2.8080000000000002e-05, 'epoch': 0.13}
{'loss': 6.8767, 'learning_rate': 2.8139999999999998e-05, 'epoch': 0.13}
{'loss': 7.207, 'learning_rate': 2.8199999999999998e-05, 'epoch': 0.13}
{'loss': 7.0644, 'learning_rate': 2.826e-05, 'epoch': 0.13}
{'loss': 6.831, 'learning_rate': 2.832e-05, 'epoch': 0.13}
{'loss': 6.7322, 'learning_rate': 2.838e-05, 'epoch': 0.13}
{'loss': 7.0429, 'learning_rate': 2.844e-05, 'epoch': 0.13}
{'loss': 6.999, 'learning_rate': 2.8499999999999998e-05, 'epoch': 0.13}
{'loss': 7.259, 'learning_rate': 2.856e-05, 'epoch': 0.13}
{'loss': 7.141, 'learning_rate': 2.862e-05, 'epoch': 0.13}
{'loss': 6.9479, 'learning_rate': 2.868e-05, 'epoch': 0.13}
{'loss': 7.2214, 'learning_rate': 2.874e-05, 'epoch': 0.13}
{'loss': 7.0184, 'learning_rate': 2.88e-05, 'epoch': 0.13}
{'loss': 7.1725, 'learning_rate': 2.8859999999999998e-05, 'epoch': 0.13}
{'loss': 6.9388, 'learning_rate': 2.892e-05, 'epoch': 0.14}
{'loss': 7.1595, 'learning_rate': 2.898e-05, 'epoch': 0.14}
{'loss': 6.9544, 'learning_rate': 2.904e-05, 'epoch': 0.14}
{'loss': 6.8867, 'learning_rate': 2.91e-05, 'epoch': 0.14}
{'loss': 7.0088, 'learning_rate': 2.916e-05, 'epoch': 0.14}
{'loss': 7.0532, 'learning_rate': 2.922e-05, 'epoch': 0.14}
{'loss': 6.9692, 'learning_rate': 2.934e-05, 'epoch': 0.14}
{'loss': 6.8579, 'learning_rate': 2.94e-05, 'epoch': 0.14}
{'loss': 6.8996, 'learning_rate': 2.946e-05, 'epoch': 0.14}
[WARNING|modeling_utils.py:388] 2022-03-02 10:00:36,663 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8518, 'learning_rate': 2.958e-05, 'epoch': 0.14}
{'loss': 7.1199, 'learning_rate': 2.964e-05, 'epoch': 0.14}
{'loss': 6.9929, 'learning_rate': 2.97e-05, 'epoch': 0.14}
  5%|███▌                                                                         | 496/10701 [15:59<3:43:05,  1.31s/it]
{'loss': 6.5701, 'learning_rate': 2.982e-05, 'epoch': 0.14}
[WARNING|modeling_utils.py:388] 2022-03-02 10:00:42,630 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5625, 'learning_rate': 2.994e-05, 'epoch': 0.14}
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:00:45,367 >>   Num examples = 2642timate the number of tokens of the input, floating-point operations will not be computed-02 09:59:08,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
03/02/2022 10:07:58 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
{'eval_loss': 6.984083652496338, 'eval_wer': 0.9473381352064607, 'eval_runtime': 433.6165, 'eval_samples_per_second': 6.093, 'eval_steps_per_second': 1.524, 'epoch': 0.14}
03/02/2022 10:08:22 - WARNING - huggingface_hub.repository - Adding files tracked by Git LFS: ['wandb/run-20220302_094439-2kys49al/run-2kys49al.wandb']. This may take a bit of time if the files are large.
{'loss': 7.0985, 'learning_rate': 2.999705911185178e-05, 'epoch': 0.14}
{'loss': 7.5763, 'learning_rate': 2.9994118223703557e-05, 'epoch': 0.14}
{'loss': 7.264, 'learning_rate': 2.999117733555534e-05, 'epoch': 0.14}
{'loss': 7.0838, 'learning_rate': 2.998823644740712e-05, 'epoch': 0.14}
{'loss': 7.1975, 'learning_rate': 2.9985295559258896e-05, 'epoch': 0.14}
{'loss': 7.0288, 'learning_rate': 2.9982354671110676e-05, 'epoch': 0.14}
{'loss': 7.0503, 'learning_rate': 2.9979413782962455e-05, 'epoch': 0.14}
{'loss': 7.1201, 'learning_rate': 2.9976472894814235e-05, 'epoch': 0.14}
{'loss': 6.8325, 'learning_rate': 2.9973532006666015e-05, 'epoch': 0.14}
{'loss': 6.9237, 'learning_rate': 2.997059111851779e-05, 'epoch': 0.14}
{'loss': 6.8013, 'learning_rate': 2.996765023036957e-05, 'epoch': 0.14}
{'loss': 7.193, 'learning_rate': 2.9964709342221354e-05, 'epoch': 0.14}
{'loss': 7.4087, 'learning_rate': 2.996176845407313e-05, 'epoch': 0.14}
{'loss': 6.97, 'learning_rate': 2.995882756592491e-05, 'epoch': 0.14}
{'loss': 6.9992, 'learning_rate': 2.995588667777669e-05, 'epoch': 0.14}
{'loss': 7.033, 'learning_rate': 2.9952945789628466e-05, 'epoch': 0.14}
{'loss': 7.2033, 'learning_rate': 2.995000490148025e-05, 'epoch': 0.14}
{'loss': 7.0968, 'learning_rate': 2.994706401333203e-05, 'epoch': 0.15}
{'loss': 7.1296, 'learning_rate': 2.9944123125183805e-05, 'epoch': 0.15}
{'loss': 6.6102, 'learning_rate': 2.9941182237035585e-05, 'epoch': 0.15}
{'loss': 6.7838, 'learning_rate': 2.9938241348887365e-05, 'epoch': 0.15}
{'loss': 6.9899, 'learning_rate': 2.9935300460739145e-05, 'epoch': 0.15}
{'loss': 6.8082, 'learning_rate': 2.9932359572590924e-05, 'epoch': 0.15}
{'loss': 7.5751, 'learning_rate': 2.99294186844427e-05, 'epoch': 0.15}
{'loss': 6.8012, 'learning_rate': 2.992647779629448e-05, 'epoch': 0.15}
{'loss': 7.2513, 'learning_rate': 2.992353690814626e-05, 'epoch': 0.15}
{'loss': 6.8706, 'learning_rate': 2.992059601999804e-05, 'epoch': 0.15}
{'loss': 7.2806, 'learning_rate': 2.991765513184982e-05, 'epoch': 0.15}
{'loss': 7.125, 'learning_rate': 2.99147142437016e-05, 'epoch': 0.15}
  5%|███▊                                                                         | 531/10701 [25:28<6:06:51,  2.16s/it]
{'loss': 7.1444, 'learning_rate': 2.990883246740516e-05, 'epoch': 0.15}
  5%|███▊                                                                         | 533/10701 [25:33<5:58:23,  2.11s/it]
{'loss': 7.0237, 'learning_rate': 2.9902950691108715e-05, 'epoch': 0.15}
{'loss': 7.0889, 'learning_rate': 2.9900009802960494e-05, 'epoch': 0.15}
{'loss': 7.0149, 'learning_rate': 2.9897068914812274e-05, 'epoch': 0.15}
{'loss': 6.6774, 'learning_rate': 2.9894128026664054e-05, 'epoch': 0.15}
{'loss': 6.7822, 'learning_rate': 2.9891187138515834e-05, 'epoch': 0.15}
{'loss': 6.9534, 'learning_rate': 2.988824625036761e-05, 'epoch': 0.15}
{'loss': 6.8698, 'learning_rate': 2.988530536221939e-05, 'epoch': 0.15}
{'loss': 6.797, 'learning_rate': 2.988236447407117e-05, 'epoch': 0.15}
{'loss': 7.1033, 'learning_rate': 2.987942358592295e-05, 'epoch': 0.15}
{'loss': 6.8316, 'learning_rate': 2.987648269777473e-05, 'epoch': 0.15}
  5%|███▉                                                                         | 544/10701 [25:53<4:46:47,  1.69s/it]
{'loss': 6.9694, 'learning_rate': 2.9870600921478285e-05, 'epoch': 0.15}
{'loss': 6.721, 'learning_rate': 2.9867660033330068e-05, 'epoch': 0.15}
  5%|███▉                                                                         | 547/10701 [25:57<4:09:35,  1.47s/it]
{'loss': 6.8381, 'learning_rate': 2.9861778257033624e-05, 'epoch': 0.15}
{'loss': 7.1894, 'learning_rate': 2.9858837368885404e-05, 'epoch': 0.15}
  5%|███▉                                                                         | 550/10701 [26:01<4:07:02,  1.46s/it]
{'loss': 6.2109, 'learning_rate': 2.9852955592588963e-05, 'epoch': 0.15}
{'loss': 7.1571, 'learning_rate': 2.9850014704440743e-05, 'epoch': 0.15}
{'loss': 7.4567, 'learning_rate': 2.9847073816292523e-05, 'epoch': 0.15}
{'loss': 7.0505, 'learning_rate': 2.98441329281443e-05, 'epoch': 0.16}
{'loss': 7.0207, 'learning_rate': 2.984119203999608e-05, 'epoch': 0.16}
{'loss': 6.9451, 'learning_rate': 2.983825115184786e-05, 'epoch': 0.16}
{'loss': 7.2624, 'learning_rate': 2.9835310263699638e-05, 'epoch': 0.16}
{'loss': 6.7773, 'learning_rate': 2.9832369375551418e-05, 'epoch': 0.16}
{'loss': 7.191, 'learning_rate': 2.9829428487403194e-05, 'epoch': 0.16}
{'loss': 7.0813, 'learning_rate': 2.9826487599254974e-05, 'epoch': 0.16}
{'loss': 7.3464, 'learning_rate': 2.9823546711106757e-05, 'epoch': 0.16}
{'loss': 6.7548, 'learning_rate': 2.9820605822958534e-05, 'epoch': 0.16}
{'loss': 6.9526, 'learning_rate': 2.9817664934810313e-05, 'epoch': 0.16}
{'loss': 6.916, 'learning_rate': 2.9814724046662093e-05, 'epoch': 0.16}
{'loss': 7.138, 'learning_rate': 2.9811783158513873e-05, 'epoch': 0.16}
{'loss': 6.8462, 'learning_rate': 2.9808842270365652e-05, 'epoch': 0.16}
{'loss': 7.1181, 'learning_rate': 2.9805901382217432e-05, 'epoch': 0.16}
{'loss': 6.6915, 'learning_rate': 2.980296049406921e-05, 'epoch': 0.16}
{'loss': 7.1789, 'learning_rate': 2.9800019605920988e-05, 'epoch': 0.16}
{'loss': 6.9202, 'learning_rate': 2.9797078717772768e-05, 'epoch': 0.16}
{'loss': 6.8236, 'learning_rate': 2.9794137829624548e-05, 'epoch': 0.16}
{'loss': 7.1802, 'learning_rate': 2.9791196941476327e-05, 'epoch': 0.16}
{'loss': 6.8803, 'learning_rate': 2.9788256053328104e-05, 'epoch': 0.16}
{'loss': 7.1917, 'learning_rate': 2.9785315165179883e-05, 'epoch': 0.16}
{'loss': 6.7614, 'learning_rate': 2.9782374277031667e-05, 'epoch': 0.16}
{'loss': 6.9339, 'learning_rate': 2.9779433388883443e-05, 'epoch': 0.16}
{'loss': 7.2893, 'learning_rate': 2.9776492500735223e-05, 'epoch': 0.16}
{'loss': 7.1829, 'learning_rate': 2.9773551612587002e-05, 'epoch': 0.16}
{'loss': 6.814, 'learning_rate': 2.977061072443878e-05, 'epoch': 0.16}
{'loss': 6.6267, 'learning_rate': 2.9767669836290562e-05, 'epoch': 0.16}
{'loss': 7.215, 'learning_rate': 2.976472894814234e-05, 'epoch': 0.16}
{'loss': 7.2513, 'learning_rate': 2.9761788059994118e-05, 'epoch': 0.16}
{'loss': 7.1997, 'learning_rate': 2.9758847171845898e-05, 'epoch': 0.16}
{'loss': 6.9178, 'learning_rate': 2.9755906283697677e-05, 'epoch': 0.16}
{'loss': 7.0291, 'learning_rate': 2.9752965395549457e-05, 'epoch': 0.16}
{'loss': 6.8371, 'learning_rate': 2.9750024507401237e-05, 'epoch': 0.16}
{'loss': 6.9192, 'learning_rate': 2.9747083619253013e-05, 'epoch': 0.16}
{'loss': 7.481, 'learning_rate': 2.9744142731104793e-05, 'epoch': 0.16}
{'loss': 6.949, 'learning_rate': 2.9741201842956576e-05, 'epoch': 0.16}
{'loss': 7.0198, 'learning_rate': 2.9738260954808352e-05, 'epoch': 0.17}
{'loss': 7.3312, 'learning_rate': 2.9732379178511912e-05, 'epoch': 0.17}
  6%|████▎                                                                        | 594/10701 [27:32<4:07:53,  1.47s/it]
{'loss': 7.0237, 'learning_rate': 2.972649740221547e-05, 'epoch': 0.17}
[WARNING|modeling_utils.py:388] 2022-03-02 10:12:15,938 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8542, 'learning_rate': 2.9720615625919027e-05, 'epoch': 0.17}
{'loss': 6.558, 'learning_rate': 2.9717674737770807e-05, 'epoch': 0.17}
  6%|████▎                                                                        | 599/10701 [27:38<3:27:20,  1.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7865, 'learning_rate': 2.9711792961474366e-05, 'epoch': 0.17}
{'loss': 6.5603, 'learning_rate': 2.9708852073326146e-05, 'epoch': 0.17}
{'loss': 6.15, 'learning_rate': 2.9705911185177923e-05, 'epoch': 0.17}
{'loss': 7.0479, 'learning_rate': 2.9702970297029702e-05, 'epoch': 0.17}
{'loss': 7.0373, 'learning_rate': 2.9700029408881485e-05, 'epoch': 0.17}
{'loss': 7.1107, 'learning_rate': 2.9697088520733262e-05, 'epoch': 0.17}
{'loss': 6.9727, 'learning_rate': 2.969414763258504e-05, 'epoch': 0.17}
{'loss': 6.8387, 'learning_rate': 2.969120674443682e-05, 'epoch': 0.17}
{'loss': 7.112, 'learning_rate': 2.9688265856288597e-05, 'epoch': 0.17}
{'loss': 7.0816, 'learning_rate': 2.968532496814038e-05, 'epoch': 0.17}
{'loss': 7.1308, 'learning_rate': 2.968238407999216e-05, 'epoch': 0.17}
{'loss': 7.0187, 'learning_rate': 2.9679443191843937e-05, 'epoch': 0.17}
{'loss': 6.8654, 'learning_rate': 2.9676502303695716e-05, 'epoch': 0.17}
{'loss': 6.8202, 'learning_rate': 2.9673561415547496e-05, 'epoch': 0.17}
{'loss': 7.1042, 'learning_rate': 2.9670620527399276e-05, 'epoch': 0.17}
{'loss': 7.0754, 'learning_rate': 2.9667679639251056e-05, 'epoch': 0.17}
{'loss': 7.0655, 'learning_rate': 2.9664738751102832e-05, 'epoch': 0.17}
{'loss': 6.9377, 'learning_rate': 2.966179786295461e-05, 'epoch': 0.17}
{'loss': 6.9486, 'learning_rate': 2.9658856974806395e-05, 'epoch': 0.17}
{'loss': 7.1512, 'learning_rate': 2.965591608665817e-05, 'epoch': 0.17}
{'loss': 6.8715, 'learning_rate': 2.965297519850995e-05, 'epoch': 0.17}
{'loss': 6.8978, 'learning_rate': 2.965003431036173e-05, 'epoch': 0.17}
{'loss': 7.0912, 'learning_rate': 2.9647093422213507e-05, 'epoch': 0.17}
  6%|████▍                                                                        | 621/10701 [28:26<5:54:43,  2.11s/it]
{'loss': 7.3701, 'learning_rate': 2.964415253406529e-05, 'epoch': 0.17}
{'loss': 7.0757, 'learning_rate': 2.964121164591707e-05, 'epoch': 0.17}
  6%|████▍                                                                        | 623/10701 [28:30<5:51:51,  2.09s/it]
{'loss': 7.0287, 'learning_rate': 2.9635329869620626e-05, 'epoch': 0.17}
{'loss': 6.9228, 'learning_rate': 2.9632388981472406e-05, 'epoch': 0.18}
  6%|████▌                                                                        | 626/10701 [28:36<5:47:36,  2.07s/it]
{'loss': 7.0523, 'learning_rate': 2.9626507205175965e-05, 'epoch': 0.18}
{'loss': 6.9293, 'learning_rate': 2.962356631702774e-05, 'epoch': 0.18}
{'loss': 7.1758, 'learning_rate': 2.962062542887952e-05, 'epoch': 0.18}
{'loss': 7.0293, 'learning_rate': 2.96176845407313e-05, 'epoch': 0.18}
{'loss': 7.2668, 'learning_rate': 2.961474365258308e-05, 'epoch': 0.18}
{'loss': 6.9514, 'learning_rate': 2.961180276443486e-05, 'epoch': 0.18}
{'loss': 7.0397, 'learning_rate': 2.960886187628664e-05, 'epoch': 0.18}
{'loss': 7.408, 'learning_rate': 2.9605920988138416e-05, 'epoch': 0.18}
{'loss': 6.8306, 'learning_rate': 2.96029800999902e-05, 'epoch': 0.18}
{'loss': 6.8386, 'learning_rate': 2.960003921184198e-05, 'epoch': 0.18}
{'loss': 6.9348, 'learning_rate': 2.9597098323693755e-05, 'epoch': 0.18}
{'loss': 6.6761, 'learning_rate': 2.9594157435545535e-05, 'epoch': 0.18}
{'loss': 7.1253, 'learning_rate': 2.9591216547397315e-05, 'epoch': 0.18}
{'loss': 6.8304, 'learning_rate': 2.9588275659249095e-05, 'epoch': 0.18}
{'loss': 6.8803, 'learning_rate': 2.9585334771100874e-05, 'epoch': 0.18}
{'loss': 6.889, 'learning_rate': 2.957945299480443e-05, 'epoch': 0.18}
  6%|████▋                                                                        | 646/10701 [29:12<4:11:20,  1.50s/it]
{'loss': 6.9895, 'learning_rate': 2.957357121850799e-05, 'epoch': 0.18}
[WARNING|modeling_utils.py:388] 2022-03-02 10:13:56,213 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.5328, 'learning_rate': 2.956768944221155e-05, 'epoch': 0.18}
{'loss': 6.854, 'learning_rate': 2.9564748554063326e-05, 'epoch': 0.18}
{'loss': 6.1713, 'learning_rate': 2.956180766591511e-05, 'epoch': 0.18}
{'loss': 5.797, 'learning_rate': 2.955886677776689e-05, 'epoch': 0.18}
{'loss': 6.9723, 'learning_rate': 2.9555925889618665e-05, 'epoch': 0.18}
{'loss': 6.908, 'learning_rate': 2.9552985001470445e-05, 'epoch': 0.18}
{'loss': 6.9381, 'learning_rate': 2.9550044113322224e-05, 'epoch': 0.18}
{'loss': 6.6871, 'learning_rate': 2.9547103225174004e-05, 'epoch': 0.18}
{'loss': 6.989, 'learning_rate': 2.9544162337025784e-05, 'epoch': 0.18}
{'loss': 7.0785, 'learning_rate': 2.9541221448877563e-05, 'epoch': 0.18}
{'loss': 6.968, 'learning_rate': 2.953828056072934e-05, 'epoch': 0.18}
{'loss': 7.0075, 'learning_rate': 2.953533967258112e-05, 'epoch': 0.18}
{'loss': 7.1008, 'learning_rate': 2.95323987844329e-05, 'epoch': 0.18}
{'loss': 7.3269, 'learning_rate': 2.952945789628468e-05, 'epoch': 0.19}
{'loss': 6.8476, 'learning_rate': 2.952651700813646e-05, 'epoch': 0.19}
{'loss': 7.0651, 'learning_rate': 2.9523576119988235e-05, 'epoch': 0.19}
{'loss': 6.9271, 'learning_rate': 2.9520635231840015e-05, 'epoch': 0.19}
{'loss': 6.9544, 'learning_rate': 2.9517694343691798e-05, 'epoch': 0.19}
{'loss': 7.0839, 'learning_rate': 2.9514753455543574e-05, 'epoch': 0.19}
{'loss': 6.8497, 'learning_rate': 2.9511812567395354e-05, 'epoch': 0.19}
{'loss': 6.8595, 'learning_rate': 2.9508871679247134e-05, 'epoch': 0.19}
{'loss': 6.7832, 'learning_rate': 2.9505930791098913e-05, 'epoch': 0.19}
{'loss': 7.1775, 'learning_rate': 2.9502989902950693e-05, 'epoch': 0.19}
{'loss': 6.9686, 'learning_rate': 2.9500049014802473e-05, 'epoch': 0.19}
{'loss': 7.041, 'learning_rate': 2.949710812665425e-05, 'epoch': 0.19}
[WARNING|modeling_utils.py:388] 2022-03-02 10:13:56,213 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.406, 'learning_rate': 2.949416723850603e-05, 'epoch': 0.19}
[WARNING|modeling_utils.py:388] 2022-03-02 10:13:56,213 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0021, 'learning_rate': 2.949122635035781e-05, 'epoch': 0.19}
[WARNING|modeling_utils.py:388] 2022-03-02 10:13:56,213 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  6%|████▊                                                                        | 675/10701 [30:12<5:45:37,  2.07s/it]
{'loss': 6.8713, 'learning_rate': 2.9485344574061368e-05, 'epoch': 0.19}
{'loss': 6.7654, 'learning_rate': 2.9482403685913144e-05, 'epoch': 0.19}
{'loss': 7.0367, 'learning_rate': 2.9479462797764924e-05, 'epoch': 0.19}
{'loss': 6.8912, 'learning_rate': 2.9476521909616707e-05, 'epoch': 0.19}
{'loss': 6.8103, 'learning_rate': 2.9473581021468484e-05, 'epoch': 0.19}
  6%|████▉                                                                        | 681/10701 [30:24<5:35:20,  2.01s/it]
{'loss': 6.8717, 'learning_rate': 2.9467699245172043e-05, 'epoch': 0.19}
{'loss': 6.9325, 'learning_rate': 2.946475835702382e-05, 'epoch': 0.19}
{'loss': 7.1206, 'learning_rate': 2.9461817468875602e-05, 'epoch': 0.19}
{'loss': 6.8435, 'learning_rate': 2.9458876580727382e-05, 'epoch': 0.19}
{'loss': 7.1285, 'learning_rate': 2.945593569257916e-05, 'epoch': 0.19}
{'loss': 7.1141, 'learning_rate': 2.9452994804430938e-05, 'epoch': 0.19}
{'loss': 7.0024, 'learning_rate': 2.9450053916282718e-05, 'epoch': 0.19}
{'loss': 6.6109, 'learning_rate': 2.9447113028134498e-05, 'epoch': 0.19}
  6%|████▉                                                                        | 691/10701 [30:42<4:50:01,  1.74s/it]
{'loss': 6.7631, 'learning_rate': 2.9441231251838054e-05, 'epoch': 0.19}
{'loss': 6.9072, 'learning_rate': 2.9438290363689834e-05, 'epoch': 0.19}
{'loss': 7.2144, 'learning_rate': 2.9435349475541617e-05, 'epoch': 0.19}
  6%|█████                                                                        | 695/10701 [30:48<4:10:41,  1.50s/it]
{'loss': 6.7613, 'learning_rate': 2.9429467699245173e-05, 'epoch': 0.19}
[WARNING|modeling_utils.py:388] 2022-03-02 10:15:32,155 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8497, 'learning_rate': 2.942358592294873e-05, 'epoch': 0.2}
{'loss': 7.1595, 'learning_rate': 2.9420645034800512e-05, 'epoch': 0.2}
  7%|█████                                                                        | 700/10701 [30:55<3:47:35,  1.37s/it]
{'loss': 6.4005, 'learning_rate': 2.9414763258504068e-05, 'epoch': 0.2}
{'loss': 6.4586, 'learning_rate': 2.9411822370355848e-05, 'epoch': 0.2}
{'loss': 7.0455, 'learning_rate': 2.9408881482207627e-05, 'epoch': 0.2}
{'loss': 7.1149, 'learning_rate': 2.9405940594059407e-05, 'epoch': 0.2}
{'loss': 7.1698, 'learning_rate': 2.9402999705911187e-05, 'epoch': 0.2}
{'loss': 6.975, 'learning_rate': 2.9400058817762963e-05, 'epoch': 0.2}
{'loss': 6.8382, 'learning_rate': 2.9397117929614743e-05, 'epoch': 0.2}
{'loss': 7.0203, 'learning_rate': 2.9394177041466526e-05, 'epoch': 0.2}
{'loss': 7.0721, 'learning_rate': 2.9391236153318302e-05, 'epoch': 0.2}
{'loss': 7.086, 'learning_rate': 2.9388295265170082e-05, 'epoch': 0.2}
{'loss': 7.2285, 'learning_rate': 2.9385354377021862e-05, 'epoch': 0.2}
{'loss': 7.2631, 'learning_rate': 2.9382413488873638e-05, 'epoch': 0.2}
{'loss': 6.8899, 'learning_rate': 2.937947260072542e-05, 'epoch': 0.2}
{'loss': 6.7186, 'learning_rate': 2.93765317125772e-05, 'epoch': 0.2}
{'loss': 6.8755, 'learning_rate': 2.9373590824428977e-05, 'epoch': 0.2}
{'loss': 6.8248, 'learning_rate': 2.9370649936280757e-05, 'epoch': 0.2}
{'loss': 6.8904, 'learning_rate': 2.9367709048132537e-05, 'epoch': 0.2}
{'loss': 6.8052, 'learning_rate': 2.9364768159984317e-05, 'epoch': 0.2}
{'loss': 6.9436, 'learning_rate': 2.9361827271836096e-05, 'epoch': 0.2}
{'loss': 7.1089, 'learning_rate': 2.9358886383687873e-05, 'epoch': 0.2}
{'loss': 7.078, 'learning_rate': 2.9355945495539652e-05, 'epoch': 0.2}
{'loss': 7.1416, 'learning_rate': 2.9353004607391435e-05, 'epoch': 0.2}
{'loss': 7.0905, 'learning_rate': 2.9350063719243212e-05, 'epoch': 0.2}
{'loss': 6.972, 'learning_rate': 2.934712283109499e-05, 'epoch': 0.2}
{'loss': 6.5922, 'learning_rate': 2.934418194294677e-05, 'epoch': 0.2}
{'loss': 7.2184, 'learning_rate': 2.9341241054798548e-05, 'epoch': 0.2}
{'loss': 6.9624, 'learning_rate': 2.933830016665033e-05, 'epoch': 0.2}
{'loss': 6.9016, 'learning_rate': 2.933535927850211e-05, 'epoch': 0.2}
{'loss': 6.682, 'learning_rate': 2.9332418390353887e-05, 'epoch': 0.2}
{'loss': 6.846, 'learning_rate': 2.9329477502205666e-05, 'epoch': 0.2}
{'loss': 7.1007, 'learning_rate': 2.9326536614057446e-05, 'epoch': 0.2}
{'loss': 6.9525, 'learning_rate': 2.9323595725909226e-05, 'epoch': 0.2}
{'loss': 6.9767, 'learning_rate': 2.9320654837761006e-05, 'epoch': 0.2}
{'loss': 7.1675, 'learning_rate': 2.9317713949612782e-05, 'epoch': 0.21}
{'loss': 6.8126, 'learning_rate': 2.931477306146456e-05, 'epoch': 0.21}
{'loss': 6.8488, 'learning_rate': 2.931183217331634e-05, 'epoch': 0.21}
{'loss': 6.7587, 'learning_rate': 2.930889128516812e-05, 'epoch': 0.21}
{'loss': 6.8331, 'learning_rate': 2.93059503970199e-05, 'epoch': 0.21}
{'loss': 7.0074, 'learning_rate': 2.930300950887168e-05, 'epoch': 0.21}
{'loss': 7.3274, 'learning_rate': 2.9300068620723457e-05, 'epoch': 0.21}
{'loss': 7.1225, 'learning_rate': 2.929712773257524e-05, 'epoch': 0.21}
{'loss': 6.7227, 'learning_rate': 2.929418684442702e-05, 'epoch': 0.21}
{'loss': 6.9841, 'learning_rate': 2.9291245956278796e-05, 'epoch': 0.21}
{'loss': 7.1828, 'learning_rate': 2.9288305068130576e-05, 'epoch': 0.21}
{'loss': 6.9759, 'learning_rate': 2.9285364179982356e-05, 'epoch': 0.21}
  7%|█████▎                                                                       | 744/10701 [32:26<4:42:59,  1.71s/it]
{'loss': 6.9944, 'learning_rate': 2.9279482403685915e-05, 'epoch': 0.21}
{'loss': 6.5626, 'learning_rate': 2.9276541515537695e-05, 'epoch': 0.21}
{'loss': 6.8393, 'learning_rate': 2.927360062738947e-05, 'epoch': 0.21}
{'loss': 7.0952, 'learning_rate': 2.927065973924125e-05, 'epoch': 0.21}
{'loss': 6.5828, 'learning_rate': 2.926771885109303e-05, 'epoch': 0.21}
{'loss': 7.1529, 'learning_rate': 2.926183707479659e-05, 'epoch': 0.21}
{'loss': 6.966, 'learning_rate': 2.9258896186648366e-05, 'epoch': 0.21}
{'loss': 6.9793, 'learning_rate': 2.9255955298500146e-05, 'epoch': 0.21}
{'loss': 6.8722, 'learning_rate': 2.925301441035193e-05, 'epoch': 0.21}
{'loss': 6.9344, 'learning_rate': 2.9250073522203705e-05, 'epoch': 0.21}
{'loss': 7.136, 'learning_rate': 2.9247132634055485e-05, 'epoch': 0.21}
{'loss': 6.9407, 'learning_rate': 2.9244191745907265e-05, 'epoch': 0.21}
                                                                                                                        g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1203, 'learning_rate': 2.9241250857759045e-05, 'epoch': 0.21}
                                                                                                                        g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9956, 'learning_rate': 2.9238309969610824e-05, 'epoch': 0.21}
                                                                                                                        g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8184, 'learning_rate': 2.9235369081462604e-05, 'epoch': 0.21}
                                                                                                                        g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7034, 'learning_rate': 2.923242819331438e-05, 'epoch': 0.21}
{'loss': 7.0343, 'learning_rate': 2.922948730516616e-05, 'epoch': 0.21}
{'loss': 6.7628, 'learning_rate': 2.922654641701794e-05, 'epoch': 0.21}
{'loss': 7.1711, 'learning_rate': 2.922360552886972e-05, 'epoch': 0.21}
{'loss': 6.8651, 'learning_rate': 2.92206646407215e-05, 'epoch': 0.21}
{'loss': 6.6115, 'learning_rate': 2.9217723752573276e-05, 'epoch': 0.21}
{'loss': 7.0041, 'learning_rate': 2.9214782864425055e-05, 'epoch': 0.21}
{'loss': 6.7886, 'learning_rate': 2.921184197627684e-05, 'epoch': 0.22}
{'loss': 7.035, 'learning_rate': 2.9208901088128615e-05, 'epoch': 0.22}
{'loss': 6.897, 'learning_rate': 2.9205960199980395e-05, 'epoch': 0.22}
{'loss': 6.9433, 'learning_rate': 2.9203019311832174e-05, 'epoch': 0.22}
{'loss': 6.9897, 'learning_rate': 2.9200078423683954e-05, 'epoch': 0.22}
{'loss': 6.8666, 'learning_rate': 2.9197137535535734e-05, 'epoch': 0.22}
{'loss': 7.0424, 'learning_rate': 2.9194196647387514e-05, 'epoch': 0.22}
{'loss': 6.6545, 'learning_rate': 2.919125575923929e-05, 'epoch': 0.22}
{'loss': 6.8368, 'learning_rate': 2.918831487109107e-05, 'epoch': 0.22}
{'loss': 7.1495, 'learning_rate': 2.918537398294285e-05, 'epoch': 0.22}
{'loss': 6.9931, 'learning_rate': 2.918243309479463e-05, 'epoch': 0.22}
{'loss': 6.6987, 'learning_rate': 2.917949220664641e-05, 'epoch': 0.22}
{'loss': 7.0024, 'learning_rate': 2.9176551318498185e-05, 'epoch': 0.22}
{'loss': 6.9595, 'learning_rate': 2.9173610430349965e-05, 'epoch': 0.22}
{'loss': 7.1656, 'learning_rate': 2.9170669542201748e-05, 'epoch': 0.22}
{'loss': 6.9947, 'learning_rate': 2.9167728654053524e-05, 'epoch': 0.22}
{'loss': 7.0751, 'learning_rate': 2.9164787765905304e-05, 'epoch': 0.22}
{'loss': 7.1707, 'learning_rate': 2.9161846877757084e-05, 'epoch': 0.22}
{'loss': 6.663, 'learning_rate': 2.915890598960886e-05, 'epoch': 0.22}
{'loss': 6.6864, 'learning_rate': 2.9155965101460643e-05, 'epoch': 0.22}
{'loss': 7.0355, 'learning_rate': 2.9153024213312423e-05, 'epoch': 0.22}
{'loss': 6.7002, 'learning_rate': 2.91500833251642e-05, 'epoch': 0.22}
{'loss': 7.1526, 'learning_rate': 2.914714243701598e-05, 'epoch': 0.22}
{'loss': 7.3922, 'learning_rate': 2.9144201548867762e-05, 'epoch': 0.22}
  7%|█████▋                                                                       | 793/10701 [34:05<4:23:30,  1.60s/it]
{'loss': 6.703, 'learning_rate': 2.9138319772571318e-05, 'epoch': 0.22}
{'loss': 6.8023, 'learning_rate': 2.9135378884423094e-05, 'epoch': 0.22}
  7%|█████▋                                                                       | 796/10701 [34:09<3:49:19,  1.39s/it]
{'loss': 6.2144, 'learning_rate': 2.9129497108126657e-05, 'epoch': 0.22}
[WARNING|modeling_utils.py:388] 2022-03-02 10:18:52,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.713, 'learning_rate': 2.9123615331830213e-05, 'epoch': 0.22}
{'loss': 6.7257, 'learning_rate': 2.9120674443681993e-05, 'epoch': 0.22}
{'loss': 6.6457, 'learning_rate': 2.911773355553377e-05, 'epoch': 0.22}
{'loss': 7.3509, 'learning_rate': 2.9114792667385553e-05, 'epoch': 0.22}
{'loss': 6.7318, 'learning_rate': 2.9111851779237332e-05, 'epoch': 0.22}
{'loss': 7.1466, 'learning_rate': 2.910891089108911e-05, 'epoch': 0.23}
{'loss': 7.1656, 'learning_rate': 2.910597000294089e-05, 'epoch': 0.23}
{'loss': 7.0229, 'learning_rate': 2.9103029114792668e-05, 'epoch': 0.23}
{'loss': 6.9039, 'learning_rate': 2.9100088226644448e-05, 'epoch': 0.23}
{'loss': 6.9162, 'learning_rate': 2.9097147338496228e-05, 'epoch': 0.23}
{'loss': 7.11, 'learning_rate': 2.9094206450348004e-05, 'epoch': 0.23}
{'loss': 6.8498, 'learning_rate': 2.9091265562199784e-05, 'epoch': 0.23}
{'loss': 6.7362, 'learning_rate': 2.9088324674051567e-05, 'epoch': 0.23}
{'loss': 6.9845, 'learning_rate': 2.9085383785903343e-05, 'epoch': 0.23}
{'loss': 6.6237, 'learning_rate': 2.9082442897755123e-05, 'epoch': 0.23}
{'loss': 6.5303, 'learning_rate': 2.9079502009606902e-05, 'epoch': 0.23}
{'loss': 7.039, 'learning_rate': 2.907656112145868e-05, 'epoch': 0.23}
{'loss': 6.7634, 'learning_rate': 2.9073620233310462e-05, 'epoch': 0.23}
{'loss': 6.8981, 'learning_rate': 2.907067934516224e-05, 'epoch': 0.23}
{'loss': 6.8593, 'learning_rate': 2.9067738457014018e-05, 'epoch': 0.23}
{'loss': 6.7928, 'learning_rate': 2.9064797568865798e-05, 'epoch': 0.23}
{'loss': 6.9787, 'learning_rate': 2.9061856680717577e-05, 'epoch': 0.23}
{'loss': 6.9527, 'learning_rate': 2.9058915792569357e-05, 'epoch': 0.23}
{'loss': 6.7761, 'learning_rate': 2.9055974904421137e-05, 'epoch': 0.23}
{'loss': 6.7278, 'learning_rate': 2.9053034016272913e-05, 'epoch': 0.23}
{'loss': 7.1988, 'learning_rate': 2.9050093128124693e-05, 'epoch': 0.23}
{'loss': 7.1557, 'learning_rate': 2.9047152239976476e-05, 'epoch': 0.23}
{'loss': 7.2547, 'learning_rate': 2.9044211351828252e-05, 'epoch': 0.23}
{'loss': 7.0175, 'learning_rate': 2.9041270463680032e-05, 'epoch': 0.23}
{'loss': 7.2074, 'learning_rate': 2.9038329575531812e-05, 'epoch': 0.23}
{'loss': 6.882, 'learning_rate': 2.9035388687383588e-05, 'epoch': 0.23}
{'loss': 6.9756, 'learning_rate': 2.903244779923537e-05, 'epoch': 0.23}
{'loss': 6.758, 'learning_rate': 2.902950691108715e-05, 'epoch': 0.23}
{'loss': 7.1634, 'learning_rate': 2.9026566022938927e-05, 'epoch': 0.23}
{'loss': 7.0484, 'learning_rate': 2.9023625134790707e-05, 'epoch': 0.23}
{'loss': 6.9883, 'learning_rate': 2.9020684246642487e-05, 'epoch': 0.23}
{'loss': 6.7861, 'learning_rate': 2.9017743358494267e-05, 'epoch': 0.23}
{'loss': 6.9701, 'learning_rate': 2.9014802470346046e-05, 'epoch': 0.23}
{'loss': 6.6094, 'learning_rate': 2.9011861582197826e-05, 'epoch': 0.23}
  8%|██████                                                                       | 838/10701 [35:35<5:03:44,  1.85s/it]
{'loss': 6.7008, 'learning_rate': 2.9005979805901382e-05, 'epoch': 0.23}
{'loss': 6.9382, 'learning_rate': 2.9003038917753162e-05, 'epoch': 0.24}
{'loss': 6.8527, 'learning_rate': 2.900009802960494e-05, 'epoch': 0.24}
  8%|██████                                                                       | 843/10701 [35:43<4:13:46,  1.54s/it]
{'loss': 6.6495, 'learning_rate': 2.8994216253308498e-05, 'epoch': 0.24}
{'loss': 6.8258, 'learning_rate': 2.899127536516028e-05, 'epoch': 0.24}
  8%|██████                                                                       | 846/10701 [35:47<3:48:25,  1.39s/it]
{'loss': 6.6825, 'learning_rate': 2.8985393588863837e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8729, 'learning_rate': 2.8979511812567396e-05, 'epoch': 0.24}
{'loss': 6.9313, 'learning_rate': 2.8976570924419176e-05, 'epoch': 0.24}
{'loss': 6.7153, 'learning_rate': 2.8973630036270956e-05, 'epoch': 0.24}
{'loss': 6.5908, 'learning_rate': 2.8970689148122735e-05, 'epoch': 0.24}
{'loss': 7.0886, 'learning_rate': 2.8967748259974512e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9021, 'learning_rate': 2.896480737182629e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5712, 'learning_rate': 2.896186648367807e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0866, 'learning_rate': 2.895892559552985e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8419, 'learning_rate': 2.895598470738163e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1238, 'learning_rate': 2.8953043819233407e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8553, 'learning_rate': 2.8950102931085187e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7638, 'learning_rate': 2.894716204293697e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8507, 'learning_rate': 2.8944221154788746e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9144, 'learning_rate': 2.8941280266640526e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8982, 'learning_rate': 2.8938339378492306e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1053, 'learning_rate': 2.8935398490344085e-05, 'epoch': 0.24}
{'loss': 7.3093, 'learning_rate': 2.8932457602195865e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8059, 'learning_rate': 2.8929516714047645e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0714, 'learning_rate': 2.892657582589942e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.044, 'learning_rate': 2.89236349377512e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8827, 'learning_rate': 2.892069404960298e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1062, 'learning_rate': 2.891775316145476e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7392, 'learning_rate': 2.891481227330654e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0121, 'learning_rate': 2.8911871385158316e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9921, 'learning_rate': 2.8908930497010096e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.3379, 'learning_rate': 2.890598960886188e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1774, 'learning_rate': 2.8903048720713656e-05, 'epoch': 0.24}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6636, 'learning_rate': 2.8900107832565435e-05, 'epoch': 0.24}
{'loss': 6.966, 'learning_rate': 2.8897166944417215e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.92, 'learning_rate': 2.8894226056268995e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7019, 'learning_rate': 2.8891285168120774e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7833, 'learning_rate': 2.8888344279972554e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0254, 'learning_rate': 2.888540339182433e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9994, 'learning_rate': 2.888246250367611e-05, 'epoch': 0.25}
{'loss': 6.917, 'learning_rate': 2.887952161552789e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6927, 'learning_rate': 2.887658072737967e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9758, 'learning_rate': 2.887363983923145e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9077, 'learning_rate': 2.8870698951083226e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0692, 'learning_rate': 2.8867758062935005e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8724, 'learning_rate': 2.886481717478679e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8829, 'learning_rate': 2.8861876286638565e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2409, 'learning_rate': 2.8858935398490345e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0587, 'learning_rate': 2.8855994510342124e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9923, 'learning_rate': 2.88530536221939e-05, 'epoch': 0.25}
{'loss': 6.9849, 'learning_rate': 2.8850112734045684e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9882, 'learning_rate': 2.8847171845897464e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9661, 'learning_rate': 2.884423095774924e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:20:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  8%|██████▍                                                                      | 895/10701 [37:28<4:11:18,  1.54s/it]
  8%|██████▍                                                                      | 895/10701 [37:28<4:11:18,  1.54s/it]
{'loss': 6.5958, 'learning_rate': 2.8838349181452803e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:22:11,387 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:22:11,387 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9146, 'learning_rate': 2.883246740515636e-05, 'epoch': 0.25}
{'loss': 6.7661, 'learning_rate': 2.8829526517008135e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 10:22:11,387 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.19, 'learning_rate': 2.8823644740711698e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.8924, 'learning_rate': 2.8820703852563474e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.9885, 'learning_rate': 2.8817762964415254e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 7.1423, 'learning_rate': 2.8814822076267034e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.8024, 'learning_rate': 2.881188118811881e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.8291, 'learning_rate': 2.8808940299970593e-05, 'epoch': 0.25}
{'loss': 7.2142, 'learning_rate': 2.8805999411822373e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.923, 'learning_rate': 2.880305852367415e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.7395, 'learning_rate': 2.880011763552593e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.7931, 'learning_rate': 2.879717674737771e-05, 'epoch': 0.25}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.7536, 'learning_rate': 2.879423585922949e-05, 'epoch': 0.26}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 7.2118, 'learning_rate': 2.8791294971081268e-05, 'epoch': 0.26}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.8422, 'learning_rate': 2.8788354082933045e-05, 'epoch': 0.26}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.9102, 'learning_rate': 2.8785413194784824e-05, 'epoch': 0.26}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.8993, 'learning_rate': 2.8782472306636607e-05, 'epoch': 0.26}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.7016, 'learning_rate': 2.8779531418488384e-05, 'epoch': 0.26}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
{'loss': 6.9275, 'learning_rate': 2.8776590530340163e-05, 'epoch': 0.26}
  8%|██████▍                                                                      | 900/10701 [37:34<3:44:24,  1.37s/it]
  9%|██████▌                                                                      | 918/10701 [38:14<5:50:28,  2.15s/it]
  9%|██████▌                                                                      | 918/10701 [38:14<5:50:28,  2.15s/it]
{'loss': 6.7574, 'learning_rate': 2.877070875404372e-05, 'epoch': 0.26}
  9%|██████▌                                                                      | 918/10701 [38:14<5:50:28,  2.15s/it]
{'loss': 6.9643, 'learning_rate': 2.8767767865895503e-05, 'epoch': 0.26}
  9%|██████▌                                                                      | 918/10701 [38:14<5:50:28,  2.15s/it]
{'loss': 6.8848, 'learning_rate': 2.8764826977747282e-05, 'epoch': 0.26}
  9%|██████▌                                                                      | 918/10701 [38:14<5:50:28,  2.15s/it]
{'loss': 6.7683, 'learning_rate': 2.876188608959906e-05, 'epoch': 0.26}
  9%|██████▌                                                                      | 918/10701 [38:14<5:50:28,  2.15s/it]
{'loss': 7.0178, 'learning_rate': 2.875894520145084e-05, 'epoch': 0.26}
  9%|██████▌                                                                      | 918/10701 [38:14<5:50:28,  2.15s/it]
{'loss': 6.7387, 'learning_rate': 2.8756004313302618e-05, 'epoch': 0.26}
  9%|██████▌                                                                      | 918/10701 [38:14<5:50:28,  2.15s/it]
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 7.0349, 'learning_rate': 2.8750122537006178e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 6.9345, 'learning_rate': 2.8747181648857954e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 6.7551, 'learning_rate': 2.8744240760709734e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 7.1174, 'learning_rate': 2.8741299872561513e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 7.0309, 'learning_rate': 2.8738358984413293e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 6.9585, 'learning_rate': 2.8735418096265073e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 6.985, 'learning_rate': 2.8732477208116853e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 6.9548, 'learning_rate': 2.872953631996863e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 7.3116, 'learning_rate': 2.8726595431820412e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 7.0393, 'learning_rate': 2.8723654543672192e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 6.9932, 'learning_rate': 2.8720713655523968e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 7.1264, 'learning_rate': 2.8717772767375748e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 6.8354, 'learning_rate': 2.8714831879227527e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 7.0786, 'learning_rate': 2.8711890991079307e-05, 'epoch': 0.26}
{'loss': 6.8404, 'learning_rate': 2.8708950102931087e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]
{'loss': 6.9164, 'learning_rate': 2.8706009214782867e-05, 'epoch': 0.26}
  9%|██████▋                                                                      | 925/10701 [38:28<5:35:50,  2.06s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  9%|██████▊                                                                      | 942/10701 [39:00<4:36:20,  1.70s/it]
{'loss': 6.9983, 'learning_rate': 2.8700127438486423e-05, 'epoch': 0.26}
{'loss': 7.0937, 'learning_rate': 2.8697186550338202e-05, 'epoch': 0.26}
  9%|██████▊                                                                      | 945/10701 [39:05<4:05:27,  1.51s/it]
{'loss': 6.8847, 'learning_rate': 2.8691304774041762e-05, 'epoch': 0.26}
{'loss': 7.051, 'learning_rate': 2.8688363885893538e-05, 'epoch': 0.27}
  9%|██████▊                                                                      | 948/10701 [39:08<3:35:40,  1.33s/it]
{'loss': 6.8548, 'learning_rate': 2.86824821095971e-05, 'epoch': 0.27}
{'loss': 6.4982, 'learning_rate': 2.8679541221448877e-05, 'epoch': 0.27}
{'loss': 6.5017, 'learning_rate': 2.8676600333300657e-05, 'epoch': 0.27}
{'loss': 6.7995, 'learning_rate': 2.8673659445152437e-05, 'epoch': 0.27}
{'loss': 7.0307, 'learning_rate': 2.8670718557004217e-05, 'epoch': 0.27}
{'loss': 7.2087, 'learning_rate': 2.8667777668855996e-05, 'epoch': 0.27}
{'loss': 6.8219, 'learning_rate': 2.8664836780707776e-05, 'epoch': 0.27}
{'loss': 6.9364, 'learning_rate': 2.8661895892559552e-05, 'epoch': 0.27}
{'loss': 6.9026, 'learning_rate': 2.8658955004411332e-05, 'epoch': 0.27}
{'loss': 6.8164, 'learning_rate': 2.8656014116263112e-05, 'epoch': 0.27}
{'loss': 6.6548, 'learning_rate': 2.865307322811489e-05, 'epoch': 0.27}
{'loss': 7.1089, 'learning_rate': 2.865013233996667e-05, 'epoch': 0.27}
{'loss': 6.7719, 'learning_rate': 2.8647191451818448e-05, 'epoch': 0.27}
{'loss': 7.1054, 'learning_rate': 2.8644250563670227e-05, 'epoch': 0.27}
{'loss': 6.8203, 'learning_rate': 2.864130967552201e-05, 'epoch': 0.27}
{'loss': 6.7181, 'learning_rate': 2.8638368787373787e-05, 'epoch': 0.27}
{'loss': 6.6164, 'learning_rate': 2.8635427899225567e-05, 'epoch': 0.27}
{'loss': 6.7956, 'learning_rate': 2.8632487011077346e-05, 'epoch': 0.27}
{'loss': 6.9461, 'learning_rate': 2.8629546122929126e-05, 'epoch': 0.27}
{'loss': 6.7671, 'learning_rate': 2.8626605234780906e-05, 'epoch': 0.27}
{'loss': 7.1773, 'learning_rate': 2.8623664346632685e-05, 'epoch': 0.27}
{'loss': 6.9282, 'learning_rate': 2.8620723458484462e-05, 'epoch': 0.27}
{'loss': 7.0185, 'learning_rate': 2.861778257033624e-05, 'epoch': 0.27}
{'loss': 6.8419, 'learning_rate': 2.861484168218802e-05, 'epoch': 0.27}
{'loss': 6.9861, 'learning_rate': 2.86119007940398e-05, 'epoch': 0.27}
{'loss': 7.0673, 'learning_rate': 2.860895990589158e-05, 'epoch': 0.27}
{'loss': 6.5157, 'learning_rate': 2.8606019017743357e-05, 'epoch': 0.27}
  9%|███████                                                                      | 976/10701 [40:09<5:41:50,  2.11s/it]
{'loss': 6.9778, 'learning_rate': 2.860013724144692e-05, 'epoch': 0.27}
  9%|███████                                                                      | 978/10701 [40:13<5:38:15,  2.09s/it]
{'loss': 6.7988, 'learning_rate': 2.8594255465150476e-05, 'epoch': 0.27}
{'loss': 6.7827, 'learning_rate': 2.8591314577002256e-05, 'epoch': 0.27}
{'loss': 6.842, 'learning_rate': 2.8588373688854035e-05, 'epoch': 0.27}
{'loss': 7.0801, 'learning_rate': 2.8585432800705815e-05, 'epoch': 0.27}
{'loss': 7.3692, 'learning_rate': 2.8582491912557595e-05, 'epoch': 0.28}
{'loss': 6.9591, 'learning_rate': 2.857955102440937e-05, 'epoch': 0.28}
{'loss': 6.932, 'learning_rate': 2.857661013626115e-05, 'epoch': 0.28}
{'loss': 6.7507, 'learning_rate': 2.8573669248112934e-05, 'epoch': 0.28}
{'loss': 6.8008, 'learning_rate': 2.857072835996471e-05, 'epoch': 0.28}
{'loss': 6.7253, 'learning_rate': 2.856778747181649e-05, 'epoch': 0.28}
{'loss': 7.0657, 'learning_rate': 2.8564846583668266e-05, 'epoch': 0.28}
{'loss': 6.7518, 'learning_rate': 2.8561905695520046e-05, 'epoch': 0.28}
{'loss': 6.8717, 'learning_rate': 2.855896480737183e-05, 'epoch': 0.28}
{'loss': 6.5191, 'learning_rate': 2.8556023919223606e-05, 'epoch': 0.28}
{'loss': 7.1056, 'learning_rate': 2.8553083031075385e-05, 'epoch': 0.28}
{'loss': 6.777, 'learning_rate': 2.8550142142927165e-05, 'epoch': 0.28}
{'loss': 6.6553, 'learning_rate': 2.8544260366630724e-05, 'epoch': 0.28}
{'loss': 6.7803, 'learning_rate': 2.8541319478482504e-05, 'epoch': 0.28}
  9%|███████▏                                                                     | 998/10701 [40:47<3:34:47,  1.33s/it]
{'loss': 6.7736, 'learning_rate': 2.853543770218606e-05, 'epoch': 0.28}
{'loss': 6.8906, 'learning_rate': 2.8532496814037843e-05, 'epoch': 0.28}
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 10:25:31,132 >>   Num examples = 2642           | 998/10701 [40:47<3:34:47,  1.33s/it]g-point operations will not be computed-02 10:12:20,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
03/02/2022 10:45:15 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
{'eval_loss': 6.8280415534973145, 'eval_wer': 1.2167986189654145, 'eval_runtime': 1183.9949, 'eval_samples_per_second': 2.231, 'eval_steps_per_second': 0.558, 'epoch': 0.28}
{'loss': 6.8725, 'learning_rate': 2.85266150377414e-05, 'epoch': 0.28}
{'loss': 7.1829, 'learning_rate': 2.8523674149593176e-05, 'epoch': 0.28}
{'loss': 6.8269, 'learning_rate': 2.8520733261444956e-05, 'epoch': 0.28}
{'loss': 6.887, 'learning_rate': 2.851779237329674e-05, 'epoch': 0.28}
{'loss': 6.7283, 'learning_rate': 2.8514851485148515e-05, 'epoch': 0.28}
{'loss': 7.0614, 'learning_rate': 2.8511910597000295e-05, 'epoch': 0.28}
{'loss': 7.0936, 'learning_rate': 2.8508969708852074e-05, 'epoch': 0.28}
{'loss': 6.8707, 'learning_rate': 2.850602882070385e-05, 'epoch': 0.28}
{'loss': 6.9546, 'learning_rate': 2.8503087932555634e-05, 'epoch': 0.28}
{'loss': 7.0846, 'learning_rate': 2.8500147044407414e-05, 'epoch': 0.28}
{'loss': 6.6177, 'learning_rate': 2.849720615625919e-05, 'epoch': 0.28}
{'loss': 6.9156, 'learning_rate': 2.849426526811097e-05, 'epoch': 0.28}
{'loss': 6.8354, 'learning_rate': 2.849132437996275e-05, 'epoch': 0.28}
{'loss': 6.937, 'learning_rate': 2.848838349181453e-05, 'epoch': 0.28}
{'loss': 6.9399, 'learning_rate': 2.848544260366631e-05, 'epoch': 0.28}
{'loss': 6.8507, 'learning_rate': 2.8482501715518085e-05, 'epoch': 0.28}
{'loss': 7.1999, 'learning_rate': 2.8479560827369865e-05, 'epoch': 0.29}
{'loss': 7.0791, 'learning_rate': 2.8476619939221648e-05, 'epoch': 0.29}
{'loss': 6.9274, 'learning_rate': 2.8473679051073424e-05, 'epoch': 0.29}
{'loss': 6.7731, 'learning_rate': 2.8470738162925204e-05, 'epoch': 0.29}
{'loss': 6.6717, 'learning_rate': 2.8467797274776984e-05, 'epoch': 0.29}
{'loss': 6.973, 'learning_rate': 2.846485638662876e-05, 'epoch': 0.29}
{'loss': 6.9916, 'learning_rate': 2.8461915498480543e-05, 'epoch': 0.29}
{'loss': 6.9494, 'learning_rate': 2.8458974610332323e-05, 'epoch': 0.29}
{'loss': 6.6589, 'learning_rate': 2.84560337221841e-05, 'epoch': 0.29}
{'loss': 7.0333, 'learning_rate': 2.845309283403588e-05, 'epoch': 0.29}
{'loss': 7.0252, 'learning_rate': 2.845015194588766e-05, 'epoch': 0.29}
{'loss': 6.9196, 'learning_rate': 2.844721105773944e-05, 'epoch': 0.29}
{'loss': 6.9864, 'learning_rate': 2.8444270169591218e-05, 'epoch': 0.29}
{'loss': 6.8505, 'learning_rate': 2.8441329281442998e-05, 'epoch': 0.29}
{'loss': 7.0012, 'learning_rate': 2.8438388393294774e-05, 'epoch': 0.29}
{'loss': 6.9234, 'learning_rate': 2.8435447505146554e-05, 'epoch': 0.29}
{'loss': 6.8851, 'learning_rate': 2.8432506616998334e-05, 'epoch': 0.29}
{'loss': 6.6961, 'learning_rate': 2.8429565728850113e-05, 'epoch': 0.29}
{'loss': 7.1615, 'learning_rate': 2.8426624840701893e-05, 'epoch': 0.29}
{'loss': 6.8842, 'learning_rate': 2.842368395255367e-05, 'epoch': 0.29}
{'loss': 6.8317, 'learning_rate': 2.8420743064405453e-05, 'epoch': 0.29}
{'loss': 6.8674, 'learning_rate': 2.8417802176257232e-05, 'epoch': 0.29}
{'loss': 7.04, 'learning_rate': 2.841486128810901e-05, 'epoch': 0.29}
{'loss': 7.0384, 'learning_rate': 2.841192039996079e-05, 'epoch': 0.29}
 10%|███████▏                                                                  | 1042/10701 [1:04:09<4:41:07,  1.75s/it]
{'loss': 7.2544, 'learning_rate': 2.8406038623664348e-05, 'epoch': 0.29}
{'loss': 6.7397, 'learning_rate': 2.8403097735516128e-05, 'epoch': 0.29}
{'loss': 6.7008, 'learning_rate': 2.8400156847367907e-05, 'epoch': 0.29}
 10%|███████▏                                                                  | 1046/10701 [1:04:14<3:59:04,  1.49s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 10:48:58,349 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0749, 'learning_rate': 2.8391334182923243e-05, 'epoch': 0.29}
{'loss': 6.8583, 'learning_rate': 2.8388393294775023e-05, 'epoch': 0.29}
{'loss': 6.3708, 'learning_rate': 2.8385452406626803e-05, 'epoch': 0.29}
{'loss': 6.7395, 'learning_rate': 2.838251151847858e-05, 'epoch': 0.29}
{'loss': 7.0559, 'learning_rate': 2.8379570630330362e-05, 'epoch': 0.29}
{'loss': 6.8899, 'learning_rate': 2.8376629742182142e-05, 'epoch': 0.29}
{'loss': 6.9247, 'learning_rate': 2.8373688854033918e-05, 'epoch': 0.3}
{'loss': 7.2518, 'learning_rate': 2.8370747965885698e-05, 'epoch': 0.3}
{'loss': 6.8164, 'learning_rate': 2.8367807077737478e-05, 'epoch': 0.3}
{'loss': 6.9069, 'learning_rate': 2.8364866189589257e-05, 'epoch': 0.3}
{'loss': 7.0456, 'learning_rate': 2.8361925301441037e-05, 'epoch': 0.3}
{'loss': 6.9897, 'learning_rate': 2.8358984413292817e-05, 'epoch': 0.3}
{'loss': 7.3994, 'learning_rate': 2.8356043525144593e-05, 'epoch': 0.3}
{'loss': 6.9956, 'learning_rate': 2.8353102636996373e-05, 'epoch': 0.3}
{'loss': 6.8172, 'learning_rate': 2.8350161748848153e-05, 'epoch': 0.3}
{'loss': 6.8261, 'learning_rate': 2.8347220860699932e-05, 'epoch': 0.3}
{'loss': 6.6547, 'learning_rate': 2.8344279972551712e-05, 'epoch': 0.3}
{'loss': 6.8997, 'learning_rate': 2.8341339084403488e-05, 'epoch': 0.3}
{'loss': 6.839, 'learning_rate': 2.8338398196255268e-05, 'epoch': 0.3}
{'loss': 7.1495, 'learning_rate': 2.833545730810705e-05, 'epoch': 0.3}
{'loss': 7.0026, 'learning_rate': 2.8332516419958827e-05, 'epoch': 0.3}
{'loss': 6.9255, 'learning_rate': 2.8329575531810607e-05, 'epoch': 0.3}
{'loss': 7.0578, 'learning_rate': 2.8326634643662387e-05, 'epoch': 0.3}
{'loss': 6.8098, 'learning_rate': 2.8323693755514167e-05, 'epoch': 0.3}
{'loss': 6.9858, 'learning_rate': 2.8320752867365946e-05, 'epoch': 0.3}
{'loss': 6.7021, 'learning_rate': 2.8317811979217726e-05, 'epoch': 0.3}
{'loss': 7.118, 'learning_rate': 2.8314871091069502e-05, 'epoch': 0.3}
{'loss': 6.7644, 'learning_rate': 2.8311930202921282e-05, 'epoch': 0.3}
 10%|███████▍                                                                  | 1076/10701 [1:05:18<5:28:31,  2.05s/it]
{'loss': 6.821, 'learning_rate': 2.830604842662484e-05, 'epoch': 0.3}
{'loss': 6.7665, 'learning_rate': 2.830310753847662e-05, 'epoch': 0.3}
{'loss': 7.0162, 'learning_rate': 2.8300166650328398e-05, 'epoch': 0.3}
{'loss': 6.9384, 'learning_rate': 2.8297225762180177e-05, 'epoch': 0.3}
{'loss': 6.61, 'learning_rate': 2.829428487403196e-05, 'epoch': 0.3}
{'loss': 6.8063, 'learning_rate': 2.8291343985883737e-05, 'epoch': 0.3}
{'loss': 6.9943, 'learning_rate': 2.8288403097735517e-05, 'epoch': 0.3}
{'loss': 6.732, 'learning_rate': 2.8285462209587296e-05, 'epoch': 0.3}
{'loss': 7.1713, 'learning_rate': 2.8282521321439073e-05, 'epoch': 0.3}
{'loss': 6.6698, 'learning_rate': 2.8279580433290856e-05, 'epoch': 0.3}
{'loss': 6.7547, 'learning_rate': 2.8276639545142635e-05, 'epoch': 0.3}
{'loss': 6.9489, 'learning_rate': 2.8273698656994412e-05, 'epoch': 0.3}
{'loss': 6.7395, 'learning_rate': 2.827075776884619e-05, 'epoch': 0.3}
{'loss': 6.9066, 'learning_rate': 2.8267816880697975e-05, 'epoch': 0.31}
 10%|███████▌                                                                  | 1090/10701 [1:05:44<4:41:54,  1.76s/it]
{'loss': 6.9874, 'learning_rate': 2.826193510440153e-05, 'epoch': 0.31}
{'loss': 6.9788, 'learning_rate': 2.8258994216253307e-05, 'epoch': 0.31}
{'loss': 6.9526, 'learning_rate': 2.8256053328105087e-05, 'epoch': 0.31}
 10%|███████▌                                                                  | 1094/10701 [1:05:50<4:06:08,  1.54s/it]
{'loss': 6.6802, 'learning_rate': 2.8250171551808646e-05, 'epoch': 0.31}
{'loss': 6.4824, 'learning_rate': 2.8247230663660426e-05, 'epoch': 0.31}
 10%|███████▌                                                                  | 1097/10701 [1:05:54<3:34:36,  1.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:36,088 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 10%|███████▌                                                                  | 1099/10701 [1:05:56<3:11:48,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.0083, 'learning_rate': 2.8238407999215765e-05, 'epoch': 0.31}
{'loss': 5.9353, 'learning_rate': 2.8235467111067545e-05, 'epoch': 0.31}
{'loss': 7.2543, 'learning_rate': 2.823252622291932e-05, 'epoch': 0.31}
{'loss': 7.0264, 'learning_rate': 2.82295853347711e-05, 'epoch': 0.31}
{'loss': 7.0066, 'learning_rate': 2.822664444662288e-05, 'epoch': 0.31}
{'loss': 6.9665, 'learning_rate': 2.822370355847466e-05, 'epoch': 0.31}
{'loss': 7.0936, 'learning_rate': 2.822076267032644e-05, 'epoch': 0.31}
{'loss': 7.0666, 'learning_rate': 2.8217821782178216e-05, 'epoch': 0.31}
{'loss': 7.3494, 'learning_rate': 2.8214880894029996e-05, 'epoch': 0.31}
{'loss': 6.8571, 'learning_rate': 2.821194000588178e-05, 'epoch': 0.31}
{'loss': 7.0628, 'learning_rate': 2.8208999117733556e-05, 'epoch': 0.31}
{'loss': 6.8597, 'learning_rate': 2.8206058229585335e-05, 'epoch': 0.31}
{'loss': 6.8399, 'learning_rate': 2.8203117341437115e-05, 'epoch': 0.31}
{'loss': 7.0552, 'learning_rate': 2.820017645328889e-05, 'epoch': 0.31}
{'loss': 6.9788, 'learning_rate': 2.8197235565140675e-05, 'epoch': 0.31}
{'loss': 6.9705, 'learning_rate': 2.8194294676992454e-05, 'epoch': 0.31}
{'loss': 6.7258, 'learning_rate': 2.819135378884423e-05, 'epoch': 0.31}
{'loss': 6.913, 'learning_rate': 2.818841290069601e-05, 'epoch': 0.31}
{'loss': 6.743, 'learning_rate': 2.818547201254779e-05, 'epoch': 0.31}
{'loss': 6.9681, 'learning_rate': 2.818253112439957e-05, 'epoch': 0.31}
{'loss': 6.8862, 'learning_rate': 2.817959023625135e-05, 'epoch': 0.31}
{'loss': 6.815, 'learning_rate': 2.8176649348103126e-05, 'epoch': 0.31}
{'loss': 6.8461, 'learning_rate': 2.8173708459954906e-05, 'epoch': 0.31}
{'loss': 6.919, 'learning_rate': 2.817076757180669e-05, 'epoch': 0.31}
{'loss': 6.683, 'learning_rate': 2.8167826683658465e-05, 'epoch': 0.31}
{'loss': 6.894, 'learning_rate': 2.8164885795510245e-05, 'epoch': 0.32}
{'loss': 6.6657, 'learning_rate': 2.8161944907362024e-05, 'epoch': 0.32}
{'loss': 6.7679, 'learning_rate': 2.81590040192138e-05, 'epoch': 0.32}
{'loss': 6.8771, 'learning_rate': 2.8156063131065584e-05, 'epoch': 0.32}
{'loss': 6.9711, 'learning_rate': 2.8153122242917364e-05, 'epoch': 0.32}
{'loss': 7.0036, 'learning_rate': 2.815018135476914e-05, 'epoch': 0.32}
{'loss': 6.8358, 'learning_rate': 2.814724046662092e-05, 'epoch': 0.32}
{'loss': 7.0496, 'learning_rate': 2.81442995784727e-05, 'epoch': 0.32}
{'loss': 6.6035, 'learning_rate': 2.814135869032448e-05, 'epoch': 0.32}
{'loss': 6.8085, 'learning_rate': 2.813841780217626e-05, 'epoch': 0.32}
{'loss': 7.068, 'learning_rate': 2.813547691402804e-05, 'epoch': 0.32}
 10%|███████▌                                                                  | 1099/10701 [1:05:56<3:11:48,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9115, 'learning_rate': 2.8132536025879815e-05, 'epoch': 0.32}
 10%|███████▌                                                                  | 1099/10701 [1:05:56<3:11:48,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.879, 'learning_rate': 2.8129595137731595e-05, 'epoch': 0.32}
 10%|███████▌                                                                  | 1099/10701 [1:05:56<3:11:48,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.247, 'learning_rate': 2.8126654249583374e-05, 'epoch': 0.32}
 10%|███████▌                                                                  | 1099/10701 [1:05:56<3:11:48,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1164, 'learning_rate': 2.8123713361435154e-05, 'epoch': 0.32}
 10%|███████▌                                                                  | 1099/10701 [1:05:56<3:11:48,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.127, 'learning_rate': 2.8120772473286934e-05, 'epoch': 0.32}
 10%|███████▌                                                                  | 1099/10701 [1:05:56<3:11:48,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0315, 'learning_rate': 2.811783158513871e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1142/10701 [1:07:23<4:16:51,  1.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|███████▉                                                                  | 1142/10701 [1:07:23<4:16:51,  1.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8284, 'learning_rate': 2.8111949808842273e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1142/10701 [1:07:23<4:16:51,  1.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6099, 'learning_rate': 2.810900892069405e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1145/10701 [1:07:27<3:44:34,  1.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|███████▉                                                                  | 1145/10701 [1:07:27<3:44:34,  1.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:50:38,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:52:10,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:52:10,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5098, 'learning_rate': 2.810018625624939e-05, 'epoch': 0.32}
{'loss': 6.5261, 'learning_rate': 2.8097245368101168e-05, 'epoch': 0.32}
[WARNING|modeling_utils.py:388] 2022-03-02 10:52:10,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8188, 'learning_rate': 2.8094304479952948e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.3335, 'learning_rate': 2.8088422703656504e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 7.1317, 'learning_rate': 2.8085481815508284e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 7.3101, 'learning_rate': 2.8082540927360064e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.7636, 'learning_rate': 2.8079600039211843e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.9544, 'learning_rate': 2.807665915106362e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 7.0279, 'learning_rate': 2.8073718262915403e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.7599, 'learning_rate': 2.8070777374767182e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 7.1251, 'learning_rate': 2.806783648661896e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.9719, 'learning_rate': 2.806489559847074e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.643, 'learning_rate': 2.8061954710322518e-05, 'epoch': 0.32}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.7251, 'learning_rate': 2.8059013822174298e-05, 'epoch': 0.33}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.7166, 'learning_rate': 2.8056072934026078e-05, 'epoch': 0.33}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.9277, 'learning_rate': 2.8053132045877857e-05, 'epoch': 0.33}
 11%|███████▉                                                                  | 1150/10701 [1:07:33<3:28:22,  1.31s/it]
{'loss': 6.8704, 'learning_rate': 2.8047250269581413e-05, 'epoch': 0.33}
{'loss': 7.3037, 'learning_rate': 2.8044309381433193e-05, 'epoch': 0.33}
{'loss': 7.0728, 'learning_rate': 2.8041368493284973e-05, 'epoch': 0.33}
{'loss': 7.1187, 'learning_rate': 2.8038427605136753e-05, 'epoch': 0.33}
 11%|████████                                                                  | 1169/10701 [1:08:15<5:39:07,  2.13s/it]
 11%|████████                                                                  | 1169/10701 [1:08:15<5:39:07,  2.13s/it]
{'loss': 6.9833, 'learning_rate': 2.803254582884031e-05, 'epoch': 0.33}
 11%|████████                                                                  | 1169/10701 [1:08:15<5:39:07,  2.13s/it]
{'loss': 7.0103, 'learning_rate': 2.8029604940692092e-05, 'epoch': 0.33}
 11%|████████                                                                  | 1169/10701 [1:08:15<5:39:07,  2.13s/it]
{'loss': 7.1411, 'learning_rate': 2.8026664052543868e-05, 'epoch': 0.33}
 11%|████████                                                                  | 1169/10701 [1:08:15<5:39:07,  2.13s/it]
{'loss': 6.8121, 'learning_rate': 2.8023723164395648e-05, 'epoch': 0.33}
 11%|████████                                                                  | 1169/10701 [1:08:15<5:39:07,  2.13s/it]
{'loss': 6.6768, 'learning_rate': 2.8020782276247428e-05, 'epoch': 0.33}
 11%|████████                                                                  | 1169/10701 [1:08:15<5:39:07,  2.13s/it]
{'loss': 6.7375, 'learning_rate': 2.8017841388099207e-05, 'epoch': 0.33}
 11%|████████                                                                  | 1169/10701 [1:08:15<5:39:07,  2.13s/it]
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 7.0318, 'learning_rate': 2.8011959611802767e-05, 'epoch': 0.33}
{'loss': 7.0415, 'learning_rate': 2.8009018723654543e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.9432, 'learning_rate': 2.8006077835506323e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.9082, 'learning_rate': 2.8003136947358106e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.9504, 'learning_rate': 2.8000196059209882e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.9385, 'learning_rate': 2.7997255171061662e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.8954, 'learning_rate': 2.799431428291344e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.7933, 'learning_rate': 2.7991373394765218e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.8402, 'learning_rate': 2.7988432506617e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 7.0305, 'learning_rate': 2.7985491618468778e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.7114, 'learning_rate': 2.7982550730320557e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.7692, 'learning_rate': 2.7979609842172337e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 6.7189, 'learning_rate': 2.7976668954024113e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1176/10701 [1:08:30<5:29:47,  2.08s/it]
{'loss': 7.0215, 'learning_rate': 2.7973728065875896e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1190/10701 [1:08:56<4:36:18,  1.74s/it]
 11%|████████▏                                                                 | 1190/10701 [1:08:56<4:36:18,  1.74s/it]
{'loss': 6.9663, 'learning_rate': 2.7967846289579452e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1190/10701 [1:08:56<4:36:18,  1.74s/it]
{'loss': 6.9699, 'learning_rate': 2.7964905401431232e-05, 'epoch': 0.33}
 11%|████████▏                                                                 | 1190/10701 [1:08:56<4:36:18,  1.74s/it]
 11%|████████▎                                                                 | 1194/10701 [1:09:02<3:59:39,  1.51s/it]
 11%|████████▎                                                                 | 1194/10701 [1:09:02<3:59:39,  1.51s/it]
{'loss': 6.814, 'learning_rate': 2.795902362513479e-05, 'epoch': 0.33}
{'loss': 6.8318, 'learning_rate': 2.795608273698657e-05, 'epoch': 0.33}
 11%|████████▎                                                                 | 1194/10701 [1:09:02<3:59:39,  1.51s/it]
{'loss': 6.6244, 'learning_rate': 2.7953141848838348e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1197/10701 [1:09:06<3:28:07,  1.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:47,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▎                                                                 | 1197/10701 [1:09:06<3:28:07,  1.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:47,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.085, 'learning_rate': 2.794726007254191e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9519, 'learning_rate': 2.7941378296245467e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.648, 'learning_rate': 2.7938437408097246e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9385, 'learning_rate': 2.7935496519949023e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1416, 'learning_rate': 2.7932555631800806e-05, 'epoch': 0.34}
{'loss': 6.8488, 'learning_rate': 2.7929614743652586e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7119, 'learning_rate': 2.7926673855504362e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1416, 'learning_rate': 2.792373296735614e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0268, 'learning_rate': 2.792079207920792e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1531, 'learning_rate': 2.79178511910597e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0705, 'learning_rate': 2.791491030291148e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2293, 'learning_rate': 2.7911969414763257e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0897, 'learning_rate': 2.7909028526615037e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8597, 'learning_rate': 2.790608763846682e-05, 'epoch': 0.34}
{'loss': 7.0049, 'learning_rate': 2.7903146750318596e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5824, 'learning_rate': 2.7900205862170376e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2577, 'learning_rate': 2.7897264974022156e-05, 'epoch': 0.34}
 11%|████████▎                                                                 | 1199/10701 [1:09:08<3:07:30,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▍                                                                 | 1217/10701 [1:09:47<5:38:37,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▍                                                                 | 1217/10701 [1:09:47<5:38:37,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7937, 'learning_rate': 2.7891383197725715e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1217/10701 [1:09:47<5:38:37,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▍                                                                 | 1217/10701 [1:09:47<5:38:37,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6766, 'learning_rate': 2.7888442309577495e-05, 'epoch': 0.34}
{'loss': 6.9057, 'learning_rate': 2.788550142142927e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1217/10701 [1:09:47<5:38:37,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.986, 'learning_rate': 2.788256053328105e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1217/10701 [1:09:47<5:38:37,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0165, 'learning_rate': 2.787961964513283e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1217/10701 [1:09:47<5:38:37,  2.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7172, 'learning_rate': 2.787667875698461e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1223/10701 [1:10:00<5:32:37,  2.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▍                                                                 | 1223/10701 [1:10:00<5:32:37,  2.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▍                                                                 | 1223/10701 [1:10:00<5:32:37,  2.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0382, 'learning_rate': 2.7867856092539946e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9607, 'learning_rate': 2.786491520439173e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7989, 'learning_rate': 2.7861974316243506e-05, 'epoch': 0.34}
{'loss': 7.0992, 'learning_rate': 2.7859033428095285e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1111, 'learning_rate': 2.7856092539947065e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6675, 'learning_rate': 2.785315165179884e-05, 'epoch': 0.34}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1762, 'learning_rate': 2.7850210763650625e-05, 'epoch': 0.35}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6834, 'learning_rate': 2.7847269875502404e-05, 'epoch': 0.35}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9081, 'learning_rate': 2.784432898735418e-05, 'epoch': 0.35}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8136, 'learning_rate': 2.784138809920596e-05, 'epoch': 0.35}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8606, 'learning_rate': 2.783844721105774e-05, 'epoch': 0.35}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7486, 'learning_rate': 2.783550632290952e-05, 'epoch': 0.35}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9021, 'learning_rate': 2.78325654347613e-05, 'epoch': 0.35}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8725, 'learning_rate': 2.782962454661308e-05, 'epoch': 0.35}
 11%|████████▍                                                                 | 1225/10701 [1:10:04<5:25:52,  2.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 12%|████████▌                                                                 | 1240/10701 [1:10:32<4:38:34,  1.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 12%|████████▌                                                                 | 1240/10701 [1:10:32<4:38:34,  1.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8018, 'learning_rate': 2.7823742770316635e-05, 'epoch': 0.35}
{'loss': 6.6303, 'learning_rate': 2.7820801882168415e-05, 'epoch': 0.35}
 12%|████████▌                                                                 | 1240/10701 [1:10:32<4:38:34,  1.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7952, 'learning_rate': 2.7817860994020195e-05, 'epoch': 0.35}
 12%|████████▌                                                                 | 1240/10701 [1:10:32<4:38:34,  1.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8232, 'learning_rate': 2.781197921772375e-05, 'epoch': 0.35}
{'loss': 6.601, 'learning_rate': 2.7809038329575534e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.2624, 'learning_rate': 2.7806097441427314e-05, 'epoch': 0.35}
 12%|████████▌                                                                 | 1247/10701 [1:10:42<3:38:35,  1.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 12%|████████▌                                                                 | 1247/10701 [1:10:42<3:38:35,  1.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:53:49,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8366, 'learning_rate': 2.779727477698265e-05, 'epoch': 0.35}
{'loss': 6.2138, 'learning_rate': 2.779433388883443e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1606, 'learning_rate': 2.779139300068621e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0448, 'learning_rate': 2.778845211253799e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5988, 'learning_rate': 2.7785511224389765e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0043, 'learning_rate': 2.7782570336241545e-05, 'epoch': 0.35}
{'loss': 6.6703, 'learning_rate': 2.7779629448093324e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7234, 'learning_rate': 2.7776688559945104e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0496, 'learning_rate': 2.7773747671796884e-05, 'epoch': 0.35}
{'loss': 7.0381, 'learning_rate': 2.777080678364866e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7609, 'learning_rate': 2.776786589550044e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.503, 'learning_rate': 2.7764925007352223e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8785, 'learning_rate': 2.7761984119204e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9194, 'learning_rate': 2.775904323105578e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0796, 'learning_rate': 2.775610234290756e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8273, 'learning_rate': 2.775316145475934e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9619, 'learning_rate': 2.775022056661112e-05, 'epoch': 0.35}
[WARNING|modeling_utils.py:388] 2022-03-02 10:55:25,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
{'loss': 6.8751, 'learning_rate': 2.7744338790314674e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
{'loss': 6.7993, 'learning_rate': 2.7741397902166454e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
{'loss': 6.889, 'learning_rate': 2.7738457014018237e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
{'loss': 6.7536, 'learning_rate': 2.7735516125870014e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
{'loss': 6.94, 'learning_rate': 2.7732575237721793e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
{'loss': 6.9895, 'learning_rate': 2.772963434957357e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
{'loss': 6.9397, 'learning_rate': 2.772669346142535e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1267/10701 [1:11:24<5:36:38,  2.14s/it]
 12%|████████▊                                                                 | 1275/10701 [1:11:40<5:22:36,  2.05s/it]
 12%|████████▊                                                                 | 1275/10701 [1:11:40<5:22:36,  2.05s/it]
{'loss': 6.7816, 'learning_rate': 2.772081168512891e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1275/10701 [1:11:40<5:22:36,  2.05s/it]
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 6.7717, 'learning_rate': 2.7714929908832468e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 7.102, 'learning_rate': 2.7711989020684248e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 6.99, 'learning_rate': 2.7709048132536028e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 7.0399, 'learning_rate': 2.7706107244387807e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 6.5584, 'learning_rate': 2.7703166356239584e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 6.7691, 'learning_rate': 2.7700225468091363e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 6.7683, 'learning_rate': 2.7697284579943147e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 7.0532, 'learning_rate': 2.7694343691794923e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 6.9148, 'learning_rate': 2.7691402803646703e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 6.587, 'learning_rate': 2.768846191549848e-05, 'epoch': 0.36}
{'loss': 6.9412, 'learning_rate': 2.768552102735026e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
{'loss': 6.8656, 'learning_rate': 2.7682580139202042e-05, 'epoch': 0.36}
 12%|████████▊                                                                 | 1277/10701 [1:11:45<5:21:08,  2.04s/it]
 12%|████████▉                                                                 | 1290/10701 [1:12:08<4:25:15,  1.69s/it]
 12%|████████▉                                                                 | 1290/10701 [1:12:08<4:25:15,  1.69s/it]
{'loss': 7.0996, 'learning_rate': 2.7676698362905598e-05, 'epoch': 0.36}
 12%|████████▉                                                                 | 1290/10701 [1:12:08<4:25:15,  1.69s/it]
{'loss': 6.551, 'learning_rate': 2.7673757474757378e-05, 'epoch': 0.36}
{'loss': 6.9375, 'learning_rate': 2.7670816586609154e-05, 'epoch': 0.36}
 12%|████████▉                                                                 | 1290/10701 [1:12:08<4:25:15,  1.69s/it]
 12%|████████▉                                                                 | 1294/10701 [1:12:14<3:55:33,  1.50s/it]
 12%|████████▉                                                                 | 1294/10701 [1:12:14<3:55:33,  1.50s/it]
{'loss': 6.7164, 'learning_rate': 2.7664934810312717e-05, 'epoch': 0.36}
[WARNING|modeling_utils.py:388] 2022-03-02 10:56:58,284 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 10:56:58,284 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7327, 'learning_rate': 2.7659053034016273e-05, 'epoch': 0.36}
{'loss': 6.7551, 'learning_rate': 2.7656112145868056e-05, 'epoch': 0.36}
[WARNING|modeling_utils.py:388] 2022-03-02 10:56:58,284 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9765, 'learning_rate': 2.7650230369571612e-05, 'epoch': 0.36}
{'loss': 6.4621, 'learning_rate': 2.764728948142339e-05, 'epoch': 0.36}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1223, 'learning_rate': 2.7644348593275168e-05, 'epoch': 0.36}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7953, 'learning_rate': 2.764140770512695e-05, 'epoch': 0.36}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0716, 'learning_rate': 2.7638466816978728e-05, 'epoch': 0.37}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2237, 'learning_rate': 2.7635525928830507e-05, 'epoch': 0.37}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7379, 'learning_rate': 2.7632585040682287e-05, 'epoch': 0.37}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8511, 'learning_rate': 2.7629644152534063e-05, 'epoch': 0.37}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0384, 'learning_rate': 2.7626703264385846e-05, 'epoch': 0.37}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9094, 'learning_rate': 2.7623762376237626e-05, 'epoch': 0.37}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8634, 'learning_rate': 2.7620821488089403e-05, 'epoch': 0.37}
{'loss': 7.0967, 'learning_rate': 2.7617880599941182e-05, 'epoch': 0.37}
 12%|████████▉                                                                 | 1299/10701 [1:12:20<3:14:18,  1.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7888, 'learning_rate': 2.7614939711792962e-05, 'epoch': 0.37}
{'loss': 6.8265, 'learning_rate': 2.7611998823644742e-05, 'epoch': 0.37}
{'loss': 6.8544, 'learning_rate': 2.760905793549652e-05, 'epoch': 0.37}
{'loss': 6.5861, 'learning_rate': 2.7606117047348298e-05, 'epoch': 0.37}
{'loss': 6.8803, 'learning_rate': 2.7603176159200078e-05, 'epoch': 0.37}
{'loss': 7.0343, 'learning_rate': 2.760023527105186e-05, 'epoch': 0.37}
{'loss': 6.9904, 'learning_rate': 2.7597294382903637e-05, 'epoch': 0.37}
{'loss': 6.7591, 'learning_rate': 2.7594353494755417e-05, 'epoch': 0.37}
{'loss': 6.8897, 'learning_rate': 2.7591412606607196e-05, 'epoch': 0.37}
{'loss': 6.8319, 'learning_rate': 2.7588471718458973e-05, 'epoch': 0.37}
{'loss': 6.9399, 'learning_rate': 2.7585530830310756e-05, 'epoch': 0.37}
{'loss': 6.8966, 'learning_rate': 2.7582589942162536e-05, 'epoch': 0.37}
{'loss': 6.8801, 'learning_rate': 2.7579649054014312e-05, 'epoch': 0.37}
{'loss': 6.768, 'learning_rate': 2.757670816586609e-05, 'epoch': 0.37}
 12%|█████████▏                                                                | 1326/10701 [1:13:19<5:28:47,  2.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8372, 'learning_rate': 2.757082638956965e-05, 'epoch': 0.37}
{'loss': 6.8298, 'learning_rate': 2.756788550142143e-05, 'epoch': 0.37}
 12%|█████████▏                                                                | 1329/10701 [1:13:25<5:20:22,  2.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9296, 'learning_rate': 2.7562003725124987e-05, 'epoch': 0.37}
{'loss': 6.8162, 'learning_rate': 2.755906283697677e-05, 'epoch': 0.37}
{'loss': 6.8286, 'learning_rate': 2.7556121948828546e-05, 'epoch': 0.37}
{'loss': 7.025, 'learning_rate': 2.7553181060680326e-05, 'epoch': 0.37}
{'loss': 6.9322, 'learning_rate': 2.7550240172532106e-05, 'epoch': 0.37}
{'loss': 7.0027, 'learning_rate': 2.7547299284383882e-05, 'epoch': 0.37}
{'loss': 7.0785, 'learning_rate': 2.7544358396235665e-05, 'epoch': 0.37}
{'loss': 6.8011, 'learning_rate': 2.7541417508087445e-05, 'epoch': 0.37}
{'loss': 6.836, 'learning_rate': 2.753847661993922e-05, 'epoch': 0.37}
{'loss': 6.7265, 'learning_rate': 2.7535535731791e-05, 'epoch': 0.38}
{'loss': 7.0186, 'learning_rate': 2.753259484364278e-05, 'epoch': 0.38}
[WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7846, 'learning_rate': 2.752671306734634e-05, 'epoch': 0.38}
{'loss': 6.633, 'learning_rate': 2.752377217919812e-05, 'epoch': 0.38}
{'loss': 6.9476, 'learning_rate': 2.7520831291049896e-05, 'epoch': 0.38}
 13%|█████████▎                                                                | 1345/10701 [1:13:53<3:49:32,  1.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 10:57:02,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5607, 'learning_rate': 2.7514949514753456e-05, 'epoch': 0.38}
[WARNING|modeling_utils.py:388] 2022-03-02 10:58:36,319 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5331, 'learning_rate': 2.7509067738457015e-05, 'epoch': 0.38}
[WARNING|modeling_utils.py:388] 2022-03-02 10:58:38,417 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4266, 'learning_rate': 2.7503185962160575e-05, 'epoch': 0.38}
{'loss': 6.5799, 'learning_rate': 2.7500245074012354e-05, 'epoch': 0.38}
{'loss': 6.9894, 'learning_rate': 2.749730418586413e-05, 'epoch': 0.38}
{'loss': 6.7707, 'learning_rate': 2.749436329771591e-05, 'epoch': 0.38}
{'loss': 6.927, 'learning_rate': 2.749142240956769e-05, 'epoch': 0.38}
{'loss': 6.7503, 'learning_rate': 2.748848152141947e-05, 'epoch': 0.38}
{'loss': 6.9562, 'learning_rate': 2.748554063327125e-05, 'epoch': 0.38}
{'loss': 6.7761, 'learning_rate': 2.748259974512303e-05, 'epoch': 0.38}
{'loss': 6.6871, 'learning_rate': 2.7479658856974806e-05, 'epoch': 0.38}
{'loss': 6.7656, 'learning_rate': 2.7476717968826585e-05, 'epoch': 0.38}
{'loss': 7.1848, 'learning_rate': 2.7473777080678365e-05, 'epoch': 0.38}
{'loss': 6.9756, 'learning_rate': 2.7470836192530145e-05, 'epoch': 0.38}
{'loss': 7.1096, 'learning_rate': 2.7467895304381925e-05, 'epoch': 0.38}
{'loss': 6.8882, 'learning_rate': 2.74649544162337e-05, 'epoch': 0.38}
{'loss': 6.9257, 'learning_rate': 2.746201352808548e-05, 'epoch': 0.38}
{'loss': 7.0702, 'learning_rate': 2.7459072639937264e-05, 'epoch': 0.38}
{'loss': 7.0521, 'learning_rate': 2.745613175178904e-05, 'epoch': 0.38}
{'loss': 6.8883, 'learning_rate': 2.745319086364082e-05, 'epoch': 0.38}
{'loss': 6.81, 'learning_rate': 2.74502499754926e-05, 'epoch': 0.38}
{'loss': 6.9115, 'learning_rate': 2.744730908734438e-05, 'epoch': 0.38}
{'loss': 6.8577, 'learning_rate': 2.744436819919616e-05, 'epoch': 0.38}
{'loss': 6.6356, 'learning_rate': 2.744142731104794e-05, 'epoch': 0.38}
{'loss': 7.2588, 'learning_rate': 2.7438486422899715e-05, 'epoch': 0.38}
{'loss': 6.9462, 'learning_rate': 2.7435545534751495e-05, 'epoch': 0.38}
{'loss': 6.585, 'learning_rate': 2.7432604646603278e-05, 'epoch': 0.38}
{'loss': 6.8127, 'learning_rate': 2.7429663758455054e-05, 'epoch': 0.39}
{'loss': 6.9275, 'learning_rate': 2.7426722870306834e-05, 'epoch': 0.39}
{'loss': 6.9738, 'learning_rate': 2.742378198215861e-05, 'epoch': 0.39}
{'loss': 6.7099, 'learning_rate': 2.742084109401039e-05, 'epoch': 0.39}
{'loss': 6.8177, 'learning_rate': 2.7417900205862173e-05, 'epoch': 0.39}
{'loss': 7.0233, 'learning_rate': 2.741495931771395e-05, 'epoch': 0.39}
{'loss': 6.9008, 'learning_rate': 2.741201842956573e-05, 'epoch': 0.39}
{'loss': 6.8685, 'learning_rate': 2.740907754141751e-05, 'epoch': 0.39}
{'loss': 6.9283, 'learning_rate': 2.740613665326929e-05, 'epoch': 0.39}
{'loss': 6.6742, 'learning_rate': 2.740319576512107e-05, 'epoch': 0.39}
{'loss': 6.809, 'learning_rate': 2.7400254876972848e-05, 'epoch': 0.39}
{'loss': 7.0134, 'learning_rate': 2.7397313988824624e-05, 'epoch': 0.39}
{'loss': 7.0448, 'learning_rate': 2.7394373100676404e-05, 'epoch': 0.39}
{'loss': 6.5904, 'learning_rate': 2.7391432212528187e-05, 'epoch': 0.39}
{'loss': 7.054, 'learning_rate': 2.7388491324379964e-05, 'epoch': 0.39}
 13%|█████████▌                                                                | 1391/10701 [1:15:24<4:46:54,  1.85s/it]
{'loss': 6.6593, 'learning_rate': 2.738260954808352e-05, 'epoch': 0.39}
{'loss': 6.9682, 'learning_rate': 2.73796686599353e-05, 'epoch': 0.39}
{'loss': 6.8843, 'learning_rate': 2.7376727771787083e-05, 'epoch': 0.39}
{'loss': 7.133, 'learning_rate': 2.737378688363886e-05, 'epoch': 0.39}
{'loss': 6.9505, 'learning_rate': 2.737084599549064e-05, 'epoch': 0.39}
{'loss': 7.019, 'learning_rate': 2.736790510734242e-05, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8108, 'learning_rate': 2.7362023331045978e-05, 'epoch': 0.39}
{'loss': 6.6232, 'learning_rate': 2.7359082442897757e-05, 'epoch': 0.39}
{'loss': 6.9843, 'learning_rate': 2.7356141554749534e-05, 'epoch': 0.39}
{'loss': 6.7643, 'learning_rate': 2.7353200666601314e-05, 'epoch': 0.39}
{'loss': 6.733, 'learning_rate': 2.7350259778453097e-05, 'epoch': 0.39}
{'loss': 6.8906, 'learning_rate': 2.7347318890304873e-05, 'epoch': 0.39}
{'loss': 7.0806, 'learning_rate': 2.7344378002156653e-05, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0235, 'learning_rate': 2.734143711400843e-05, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8949, 'learning_rate': 2.733849622586021e-05, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1112, 'learning_rate': 2.7335555337711992e-05, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7836, 'learning_rate': 2.7332614449563768e-05, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0777, 'learning_rate': 2.7329673561415548e-05, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9202, 'learning_rate': 2.7326732673267328e-05, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9537, 'learning_rate': 2.7323791785119104e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7935, 'learning_rate': 2.7320850896970887e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9252, 'learning_rate': 2.7317910008822667e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0937, 'learning_rate': 2.7314969120674443e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8879, 'learning_rate': 2.7312028232526223e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8059, 'learning_rate': 2.7309087344378003e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8497, 'learning_rate': 2.7306146456229782e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7187, 'learning_rate': 2.7303205568081562e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8428, 'learning_rate': 2.7300264679933342e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6953, 'learning_rate': 2.7297323791785118e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7316, 'learning_rate': 2.72943829036369e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.3282, 'learning_rate': 2.7291442015488678e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9646, 'learning_rate': 2.7288501127340457e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1839, 'learning_rate': 2.7285560239192237e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8855, 'learning_rate': 2.7282619351044013e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7435, 'learning_rate': 2.7279678462895797e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5975, 'learning_rate': 2.7276737574747576e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4534, 'learning_rate': 2.7273796686599353e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8189, 'learning_rate': 2.7270855798451132e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7664, 'learning_rate': 2.7267914910302912e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8494, 'learning_rate': 2.7264974022154692e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7743, 'learning_rate': 2.726203313400647e-05, 'epoch': 0.4}
{'loss': 6.8222, 'learning_rate': 2.725909224585825e-05, 'epoch': 0.4}
[WARNING|modeling_utils.py:388] 2022-03-02 11:00:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
{'loss': 6.7199, 'learning_rate': 2.7253210469561807e-05, 'epoch': 0.4}
{'loss': 6.8603, 'learning_rate': 2.7250269581413587e-05, 'epoch': 0.4}
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
{'loss': 6.726, 'learning_rate': 2.7247328693265367e-05, 'epoch': 0.4}
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
{'loss': 7.0833, 'learning_rate': 2.7244387805117146e-05, 'epoch': 0.4}
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
{'loss': 6.7333, 'learning_rate': 2.7241446916968923e-05, 'epoch': 0.4}
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
{'loss': 7.0335, 'learning_rate': 2.7238506028820706e-05, 'epoch': 0.4}
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
{'loss': 6.9929, 'learning_rate': 2.7235565140672486e-05, 'epoch': 0.4}
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
{'loss': 6.863, 'learning_rate': 2.7232624252524262e-05, 'epoch': 0.4}
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
{'loss': 6.9386, 'learning_rate': 2.7229683364376042e-05, 'epoch': 0.4}
 13%|█████████▉                                                                | 1434/10701 [1:16:54<5:21:46,  2.08s/it]
 14%|█████████▉                                                                | 1445/10701 [1:17:14<4:26:57,  1.73s/it]
 14%|█████████▉                                                                | 1445/10701 [1:17:14<4:26:57,  1.73s/it]
{'loss': 7.1915, 'learning_rate': 2.72238015880796e-05, 'epoch': 0.4}
{'loss': 6.8256, 'learning_rate': 2.722086069993138e-05, 'epoch': 0.41}
 14%|█████████▉                                                                | 1445/10701 [1:17:14<4:26:57,  1.73s/it]
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4665, 'learning_rate': 2.7214978923634937e-05, 'epoch': 0.41}
{'loss': 6.6596, 'learning_rate': 2.7212038035486717e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.2023, 'learning_rate': 2.7209097147338496e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4564, 'learning_rate': 2.7206156259190276e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7539, 'learning_rate': 2.7203215371042056e-05, 'epoch': 0.41}
{'loss': 6.9169, 'learning_rate': 2.7200274482893832e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7805, 'learning_rate': 2.7197333594745615e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7913, 'learning_rate': 2.7194392706597395e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.737, 'learning_rate': 2.719145181844917e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0632, 'learning_rate': 2.718851093030095e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9273, 'learning_rate': 2.718557004215273e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0976, 'learning_rate': 2.718262915400451e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0011, 'learning_rate': 2.717968826585629e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7753, 'learning_rate': 2.717674737770807e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0045, 'learning_rate': 2.7173806489559846e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.81, 'learning_rate': 2.7170865601411626e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6725, 'learning_rate': 2.716792471326341e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0726, 'learning_rate': 2.7164983825115186e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8488, 'learning_rate': 2.7162042936966965e-05, 'epoch': 0.41}
{'loss': 7.0543, 'learning_rate': 2.715910204881874e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9015, 'learning_rate': 2.715616116067052e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0122, 'learning_rate': 2.7153220272522304e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.798, 'learning_rate': 2.715027938437408e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0462, 'learning_rate': 2.714733849622586e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8871, 'learning_rate': 2.714439760807764e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9782, 'learning_rate': 2.714145671992942e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9502, 'learning_rate': 2.71385158317812e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0882, 'learning_rate': 2.713557494363298e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9422, 'learning_rate': 2.7132634055484756e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9081, 'learning_rate': 2.7129693167336535e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7146, 'learning_rate': 2.712675227918832e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9938, 'learning_rate': 2.7123811391040095e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8343, 'learning_rate': 2.7120870502891875e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.133, 'learning_rate': 2.711792961474365e-05, 'epoch': 0.41}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6559, 'learning_rate': 2.711498872659543e-05, 'epoch': 0.42}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2087, 'learning_rate': 2.7112047838447214e-05, 'epoch': 0.42}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8306, 'learning_rate': 2.710910695029899e-05, 'epoch': 0.42}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1284, 'learning_rate': 2.710616606215077e-05, 'epoch': 0.42}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7169, 'learning_rate': 2.710322517400255e-05, 'epoch': 0.42}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5723, 'learning_rate': 2.7100284285854326e-05, 'epoch': 0.42}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8379, 'learning_rate': 2.709734339770611e-05, 'epoch': 0.42}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8784, 'learning_rate': 2.709440250955789e-05, 'epoch': 0.42}
 14%|██████████                                                                | 1448/10701 [1:17:18<3:35:58,  1.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7025, 'learning_rate': 2.7091461621409665e-05, 'epoch': 0.42}
{'loss': 7.0435, 'learning_rate': 2.7088520733261445e-05, 'epoch': 0.42}
{'loss': 6.9039, 'learning_rate': 2.7085579845113228e-05, 'epoch': 0.42}
 14%|██████████▎                                                               | 1493/10701 [1:18:53<4:23:11,  1.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0233, 'learning_rate': 2.7079698068816784e-05, 'epoch': 0.42}
{'loss': 6.8674, 'learning_rate': 2.707675718066856e-05, 'epoch': 0.42}
{'loss': 6.8969, 'learning_rate': 2.707381629252034e-05, 'epoch': 0.42}
{'loss': 6.3101, 'learning_rate': 2.7070875404372123e-05, 'epoch': 0.42}
{'loss': 6.8164, 'learning_rate': 2.70679345162239e-05, 'epoch': 0.42}
[WARNING|modeling_utils.py:388] 2022-03-02 11:03:42,225 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2364] 2022-03-02 11:03:44,122 >> ***** Running Evaluation *****
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
  0%|                                                                                           | 0/661 [00:00<?, ?it/s]g-point operations will not be computed-02 11:02:00,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
03/02/2022 11:29:35 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
{'eval_loss': 6.791457653045654, 'eval_wer': 1.820663928408437, 'eval_runtime': 1551.0153, 'eval_samples_per_second': 1.703, 'eval_steps_per_second': 0.426, 'epoch': 0.42}
{'loss': 6.9962, 'learning_rate': 2.705617096363102e-05, 'epoch': 0.42}
{'loss': 6.71, 'learning_rate': 2.7053230075482798e-05, 'epoch': 0.42}
{'loss': 6.8531, 'learning_rate': 2.7050289187334574e-05, 'epoch': 0.42}
{'loss': 6.789, 'learning_rate': 2.7047348299186354e-05, 'epoch': 0.42}
{'loss': 7.0406, 'learning_rate': 2.7044407411038134e-05, 'epoch': 0.42}
{'loss': 6.8864, 'learning_rate': 2.7041466522889914e-05, 'epoch': 0.42}
{'loss': 6.8469, 'learning_rate': 2.7038525634741693e-05, 'epoch': 0.42}
{'loss': 7.1525, 'learning_rate': 2.7035584746593473e-05, 'epoch': 0.42}
{'loss': 6.7669, 'learning_rate': 2.703264385844525e-05, 'epoch': 0.42}
{'loss': 6.6609, 'learning_rate': 2.7029702970297033e-05, 'epoch': 0.42}
{'loss': 6.9273, 'learning_rate': 2.702676208214881e-05, 'epoch': 0.42}
{'loss': 6.4918, 'learning_rate': 2.702382119400059e-05, 'epoch': 0.42}
{'loss': 6.9102, 'learning_rate': 2.702088030585237e-05, 'epoch': 0.42}
{'loss': 6.9306, 'learning_rate': 2.7017939417704145e-05, 'epoch': 0.42}
{'loss': 6.703, 'learning_rate': 2.7014998529555928e-05, 'epoch': 0.42}
{'loss': 6.6673, 'learning_rate': 2.7012057641407708e-05, 'epoch': 0.42}
{'loss': 6.8153, 'learning_rate': 2.7009116753259484e-05, 'epoch': 0.43}
{'loss': 6.7086, 'learning_rate': 2.7006175865111264e-05, 'epoch': 0.43}
{'loss': 6.7924, 'learning_rate': 2.7003234976963043e-05, 'epoch': 0.43}
{'loss': 6.9113, 'learning_rate': 2.7000294088814823e-05, 'epoch': 0.43}
{'loss': 6.6635, 'learning_rate': 2.6997353200666603e-05, 'epoch': 0.43}
{'loss': 7.1254, 'learning_rate': 2.6994412312518383e-05, 'epoch': 0.43}
{'loss': 6.4814, 'learning_rate': 2.699147142437016e-05, 'epoch': 0.43}
{'loss': 7.162, 'learning_rate': 2.6988530536221942e-05, 'epoch': 0.43}
{'loss': 6.7503, 'learning_rate': 2.6985589648073718e-05, 'epoch': 0.43}
{'loss': 6.8683, 'learning_rate': 2.6982648759925498e-05, 'epoch': 0.43}
{'loss': 6.7824, 'learning_rate': 2.6979707871777278e-05, 'epoch': 0.43}
{'loss': 6.8141, 'learning_rate': 2.6976766983629054e-05, 'epoch': 0.43}
{'loss': 7.0538, 'learning_rate': 2.6973826095480837e-05, 'epoch': 0.43}
{'loss': 6.8724, 'learning_rate': 2.6970885207332617e-05, 'epoch': 0.43}
{'loss': 6.7838, 'learning_rate': 2.6967944319184393e-05, 'epoch': 0.43}
{'loss': 6.9947, 'learning_rate': 2.6965003431036173e-05, 'epoch': 0.43}
{'loss': 7.1371, 'learning_rate': 2.6962062542887953e-05, 'epoch': 0.43}
{'loss': 6.7519, 'learning_rate': 2.6959121654739732e-05, 'epoch': 0.43}
{'loss': 7.0739, 'learning_rate': 2.6956180766591512e-05, 'epoch': 0.43}
 14%|██████████▋                                                               | 1537/10701 [1:48:14<4:43:51,  1.86s/it]
{'loss': 6.9752, 'learning_rate': 2.6950298990295068e-05, 'epoch': 0.43}
{'loss': 6.9095, 'learning_rate': 2.6947358102146848e-05, 'epoch': 0.43}
{'loss': 6.8314, 'learning_rate': 2.6944417213998628e-05, 'epoch': 0.43}
{'loss': 6.9618, 'learning_rate': 2.6941476325850407e-05, 'epoch': 0.43}
{'loss': 6.6797, 'learning_rate': 2.6938535437702187e-05, 'epoch': 0.43}
{'loss': 6.8252, 'learning_rate': 2.6935594549553963e-05, 'epoch': 0.43}
{'loss': 6.5774, 'learning_rate': 2.6932653661405747e-05, 'epoch': 0.43}
{'loss': 6.9778, 'learning_rate': 2.6929712773257526e-05, 'epoch': 0.43}
 14%|██████████▋                                                               | 1546/10701 [1:48:28<3:44:05,  1.47s/it]
{'loss': 6.5446, 'learning_rate': 2.6923830996961082e-05, 'epoch': 0.43}
{'loss': 6.3486, 'learning_rate': 2.6920890108812862e-05, 'epoch': 0.43}
 14%|██████████▋                                                               | 1549/10701 [1:48:32<3:20:31,  1.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:33:13,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7098, 'learning_rate': 2.691500833251642e-05, 'epoch': 0.43}
{'loss': 6.7221, 'learning_rate': 2.69120674443682e-05, 'epoch': 0.43}
{'loss': 6.9709, 'learning_rate': 2.6909126556219978e-05, 'epoch': 0.43}
{'loss': 6.9185, 'learning_rate': 2.6906185668071757e-05, 'epoch': 0.44}
{'loss': 6.8465, 'learning_rate': 2.6903244779923537e-05, 'epoch': 0.44}
{'loss': 6.9248, 'learning_rate': 2.6900303891775317e-05, 'epoch': 0.44}
{'loss': 6.6583, 'learning_rate': 2.6897363003627097e-05, 'epoch': 0.44}
{'loss': 7.0092, 'learning_rate': 2.6894422115478873e-05, 'epoch': 0.44}
{'loss': 6.9584, 'learning_rate': 2.6891481227330656e-05, 'epoch': 0.44}
{'loss': 6.9427, 'learning_rate': 2.6888540339182436e-05, 'epoch': 0.44}
{'loss': 6.6341, 'learning_rate': 2.6885599451034212e-05, 'epoch': 0.44}
{'loss': 7.0647, 'learning_rate': 2.6882658562885992e-05, 'epoch': 0.44}
{'loss': 6.7306, 'learning_rate': 2.687971767473777e-05, 'epoch': 0.44}
{'loss': 6.9654, 'learning_rate': 2.687677678658955e-05, 'epoch': 0.44}
{'loss': 6.9406, 'learning_rate': 2.687383589844133e-05, 'epoch': 0.44}
{'loss': 6.7688, 'learning_rate': 2.687089501029311e-05, 'epoch': 0.44}
{'loss': 6.8931, 'learning_rate': 2.6867954122144887e-05, 'epoch': 0.44}
{'loss': 6.801, 'learning_rate': 2.6865013233996667e-05, 'epoch': 0.44}
{'loss': 6.9394, 'learning_rate': 2.686207234584845e-05, 'epoch': 0.44}
{'loss': 6.8028, 'learning_rate': 2.6859131457700226e-05, 'epoch': 0.44}
{'loss': 7.0527, 'learning_rate': 2.6856190569552006e-05, 'epoch': 0.44}
{'loss': 6.7991, 'learning_rate': 2.6853249681403782e-05, 'epoch': 0.44}
{'loss': 6.7984, 'learning_rate': 2.6850308793255562e-05, 'epoch': 0.44}
{'loss': 7.116, 'learning_rate': 2.6847367905107345e-05, 'epoch': 0.44}
{'loss': 6.8148, 'learning_rate': 2.684442701695912e-05, 'epoch': 0.44}
{'loss': 6.7748, 'learning_rate': 2.68414861288109e-05, 'epoch': 0.44}
{'loss': 6.857, 'learning_rate': 2.683854524066268e-05, 'epoch': 0.44}
{'loss': 6.7113, 'learning_rate': 2.683560435251446e-05, 'epoch': 0.44}
{'loss': 7.0578, 'learning_rate': 2.683266346436624e-05, 'epoch': 0.44}
{'loss': 6.9016, 'learning_rate': 2.682972257621802e-05, 'epoch': 0.44}
{'loss': 6.812, 'learning_rate': 2.6826781688069796e-05, 'epoch': 0.44}
{'loss': 6.9172, 'learning_rate': 2.6823840799921576e-05, 'epoch': 0.44}
{'loss': 6.8404, 'learning_rate': 2.682089991177336e-05, 'epoch': 0.44}
{'loss': 6.9054, 'learning_rate': 2.6817959023625136e-05, 'epoch': 0.44}
{'loss': 6.7881, 'learning_rate': 2.6815018135476915e-05, 'epoch': 0.44}
{'loss': 6.8758, 'learning_rate': 2.681207724732869e-05, 'epoch': 0.44}
{'loss': 6.7941, 'learning_rate': 2.680913635918047e-05, 'epoch': 0.44}
{'loss': 6.6414, 'learning_rate': 2.6806195471032254e-05, 'epoch': 0.44}
{'loss': 6.8484, 'learning_rate': 2.680325458288403e-05, 'epoch': 0.44}
{'loss': 6.5479, 'learning_rate': 2.680031369473581e-05, 'epoch': 0.45}
{'loss': 6.9739, 'learning_rate': 2.679737280658759e-05, 'epoch': 0.45}
{'loss': 6.9905, 'learning_rate': 2.6794431918439367e-05, 'epoch': 0.45}
{'loss': 6.9687, 'learning_rate': 2.679149103029115e-05, 'epoch': 0.45}
{'loss': 6.545, 'learning_rate': 2.678855014214293e-05, 'epoch': 0.45}
{'loss': 7.0376, 'learning_rate': 2.6785609253994706e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1595/10701 [1:50:08<4:02:17,  1.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:33:13,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████                                                               | 1595/10701 [1:50:08<4:02:17,  1.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:33:13,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6054, 'learning_rate': 2.677972747769827e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1595/10701 [1:50:08<4:02:17,  1.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:33:13,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8554, 'learning_rate': 2.6776786589550045e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9702, 'learning_rate': 2.67709048132536e-05, 'epoch': 0.45}
{'loss': 6.6804, 'learning_rate': 2.676796392510538e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4009, 'learning_rate': 2.6765023036957164e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9914, 'learning_rate': 2.676208214880894e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2238, 'learning_rate': 2.675914126066072e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9428, 'learning_rate': 2.67562003725125e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8564, 'learning_rate': 2.6753259484364276e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1067, 'learning_rate': 2.675031859621606e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9942, 'learning_rate': 2.674737770806784e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8358, 'learning_rate': 2.6744436819919615e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0237, 'learning_rate': 2.6741495931771395e-05, 'epoch': 0.45}
 15%|███████████                                                               | 1598/10701 [1:50:12<3:24:04,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.787, 'learning_rate': 2.6735614155474954e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8887, 'learning_rate': 2.6732673267326734e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8753, 'learning_rate': 2.6729732379178514e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7789, 'learning_rate': 2.672679149103029e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9999, 'learning_rate': 2.6723850602882073e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8266, 'learning_rate': 2.672090971473385e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0016, 'learning_rate': 2.671796882658563e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8707, 'learning_rate': 2.671502793843741e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.82, 'learning_rate': 2.6712087050289185e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.082, 'learning_rate': 2.670914616214097e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0073, 'learning_rate': 2.6706205273992748e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1281, 'learning_rate': 2.6703264385844525e-05, 'epoch': 0.45}
{'loss': 6.9391, 'learning_rate': 2.6700323497696304e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2243, 'learning_rate': 2.6697382609548084e-05, 'epoch': 0.45}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8037, 'learning_rate': 2.6694441721399864e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9303, 'learning_rate': 2.6691500833251643e-05, 'epoch': 0.46}
{'loss': 6.7596, 'learning_rate': 2.6688559945103423e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0072, 'learning_rate': 2.66856190569552e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6211, 'learning_rate': 2.6682678168806983e-05, 'epoch': 0.46}
{'loss': 6.8436, 'learning_rate': 2.667973728065876e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9587, 'learning_rate': 2.667679639251054e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7781, 'learning_rate': 2.667385550436232e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0024, 'learning_rate': 2.6670914616214095e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.921, 'learning_rate': 2.6667973728065878e-05, 'epoch': 0.46}
{'loss': 6.6917, 'learning_rate': 2.6665032839917658e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5651, 'learning_rate': 2.6662091951769434e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9405, 'learning_rate': 2.6659151063621214e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5105, 'learning_rate': 2.6656210175472993e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1642, 'learning_rate': 2.6653269287324773e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7208, 'learning_rate': 2.6650328399176553e-05, 'epoch': 0.46}
 15%|███████████▏                                                              | 1610/10701 [1:50:37<5:32:10,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████▎                                                              | 1641/10701 [1:51:41<4:35:42,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 15%|███████████▎                                                              | 1641/10701 [1:51:41<4:35:42,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.048, 'learning_rate': 2.664444662288011e-05, 'epoch': 0.46}
{'loss': 6.9429, 'learning_rate': 2.664150573473189e-05, 'epoch': 0.46}
 15%|███████████▎                                                              | 1641/10701 [1:51:41<4:35:42,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.959, 'learning_rate': 2.663856484658367e-05, 'epoch': 0.46}
 15%|███████████▎                                                              | 1641/10701 [1:51:41<4:35:42,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7317, 'learning_rate': 2.6635623958435448e-05, 'epoch': 0.46}
{'loss': 7.2339, 'learning_rate': 2.6632683070287228e-05, 'epoch': 0.46}
 15%|███████████▎                                                              | 1641/10701 [1:51:41<4:35:42,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.594, 'learning_rate': 2.6629742182139004e-05, 'epoch': 0.46}
 15%|███████████▎                                                              | 1641/10701 [1:51:41<4:35:42,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8891, 'learning_rate': 2.6626801293990787e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6531, 'learning_rate': 2.6620919517694343e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4846, 'learning_rate': 2.6617978629546123e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1359, 'learning_rate': 2.6615037741397903e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0511, 'learning_rate': 2.6612096853249682e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9024, 'learning_rate': 2.6609155965101462e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7885, 'learning_rate': 2.6606215076953242e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0845, 'learning_rate': 2.6603274188805018e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2311, 'learning_rate': 2.6600333300656798e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7279, 'learning_rate': 2.659739241250858e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7327, 'learning_rate': 2.6594451524360357e-05, 'epoch': 0.46}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8692, 'learning_rate': 2.6591510636212137e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0406, 'learning_rate': 2.6588569748063914e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8203, 'learning_rate': 2.6585628859915693e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0237, 'learning_rate': 2.6582687971767476e-05, 'epoch': 0.47}
{'loss': 7.474, 'learning_rate': 2.6579747083619253e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8786, 'learning_rate': 2.6576806195471032e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8215, 'learning_rate': 2.6573865307322812e-05, 'epoch': 0.47}
{'loss': 6.6645, 'learning_rate': 2.6570924419174592e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9194, 'learning_rate': 2.656798353102637e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0328, 'learning_rate': 2.656504264287815e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7424, 'learning_rate': 2.6562101754729928e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9089, 'learning_rate': 2.6559160866581707e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0994, 'learning_rate': 2.655621997843349e-05, 'epoch': 0.47}
{'loss': 6.7106, 'learning_rate': 2.6553279090285267e-05, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 11:36:32,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 16%|███████████▌                                                              | 1673/10701 [1:52:45<5:16:34,  2.10s/it]
 16%|███████████▌                                                              | 1673/10701 [1:52:45<5:16:34,  2.10s/it]
 16%|███████████▌                                                              | 1673/10701 [1:52:45<5:16:34,  2.10s/it]
{'loss': 6.775, 'learning_rate': 2.6547397313988823e-05, 'epoch': 0.47}
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
{'loss': 7.2638, 'learning_rate': 2.6541515537692386e-05, 'epoch': 0.47}
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
{'loss': 6.6513, 'learning_rate': 2.6538574649544162e-05, 'epoch': 0.47}
{'loss': 6.8855, 'learning_rate': 2.6535633761395942e-05, 'epoch': 0.47}
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
{'loss': 6.8517, 'learning_rate': 2.653269287324772e-05, 'epoch': 0.47}
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
{'loss': 7.2724, 'learning_rate': 2.65297519850995e-05, 'epoch': 0.47}
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
{'loss': 6.9118, 'learning_rate': 2.652681109695128e-05, 'epoch': 0.47}
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
{'loss': 6.9854, 'learning_rate': 2.652387020880306e-05, 'epoch': 0.47}
 16%|███████████▌                                                              | 1675/10701 [1:52:49<5:12:04,  2.07s/it]
{'loss': 6.6728, 'learning_rate': 2.6520929320654837e-05, 'epoch': 0.47}
{'loss': 6.7321, 'learning_rate': 2.6517988432506617e-05, 'epoch': 0.47}
{'loss': 6.9385, 'learning_rate': 2.65150475443584e-05, 'epoch': 0.47}
{'loss': 6.5241, 'learning_rate': 2.6512106656210176e-05, 'epoch': 0.47}
{'loss': 6.9102, 'learning_rate': 2.6509165768061956e-05, 'epoch': 0.47}
{'loss': 6.8152, 'learning_rate': 2.6506224879913732e-05, 'epoch': 0.47}
{'loss': 6.8767, 'learning_rate': 2.6503283991765512e-05, 'epoch': 0.47}
{'loss': 6.6585, 'learning_rate': 2.6500343103617295e-05, 'epoch': 0.47}
{'loss': 6.6551, 'learning_rate': 2.649740221546907e-05, 'epoch': 0.47}
{'loss': 6.8221, 'learning_rate': 2.649446132732085e-05, 'epoch': 0.47}
{'loss': 6.9805, 'learning_rate': 2.649152043917263e-05, 'epoch': 0.47}
{'loss': 6.5544, 'learning_rate': 2.6488579551024407e-05, 'epoch': 0.47}
 16%|███████████▋                                                              | 1695/10701 [1:53:25<3:45:11,  1.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:34:53,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6131, 'learning_rate': 2.648269777472797e-05, 'epoch': 0.48}
{'loss': 6.8125, 'learning_rate': 2.6479756886579746e-05, 'epoch': 0.48}
 16%|███████████▋                                                              | 1698/10701 [1:53:29<3:14:07,  1.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:38:10,584 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6354, 'learning_rate': 2.647387511028331e-05, 'epoch': 0.48}
{'loss': 6.3913, 'learning_rate': 2.6470934222135086e-05, 'epoch': 0.48}
{'loss': 6.6814, 'learning_rate': 2.6467993333986865e-05, 'epoch': 0.48}
{'loss': 7.0953, 'learning_rate': 2.6465052445838645e-05, 'epoch': 0.48}
{'loss': 6.7125, 'learning_rate': 2.646211155769042e-05, 'epoch': 0.48}
{'loss': 7.0439, 'learning_rate': 2.6459170669542205e-05, 'epoch': 0.48}
{'loss': 6.8435, 'learning_rate': 2.645622978139398e-05, 'epoch': 0.48}
{'loss': 6.967, 'learning_rate': 2.645328889324576e-05, 'epoch': 0.48}
{'loss': 7.0536, 'learning_rate': 2.645034800509754e-05, 'epoch': 0.48}
{'loss': 6.8638, 'learning_rate': 2.6447407116949317e-05, 'epoch': 0.48}
{'loss': 6.9726, 'learning_rate': 2.64444662288011e-05, 'epoch': 0.48}
{'loss': 6.5765, 'learning_rate': 2.644152534065288e-05, 'epoch': 0.48}
{'loss': 7.0056, 'learning_rate': 2.6438584452504656e-05, 'epoch': 0.48}
{'loss': 6.6124, 'learning_rate': 2.6435643564356436e-05, 'epoch': 0.48}
{'loss': 6.6363, 'learning_rate': 2.6432702676208215e-05, 'epoch': 0.48}
{'loss': 6.882, 'learning_rate': 2.6429761788059995e-05, 'epoch': 0.48}
{'loss': 6.7963, 'learning_rate': 2.6426820899911775e-05, 'epoch': 0.48}
{'loss': 6.7426, 'learning_rate': 2.6423880011763554e-05, 'epoch': 0.48}
{'loss': 6.4883, 'learning_rate': 2.642093912361533e-05, 'epoch': 0.48}
{'loss': 7.1498, 'learning_rate': 2.6417998235467114e-05, 'epoch': 0.48}
 16%|███████████▉                                                              | 1720/10701 [1:54:15<5:17:53,  2.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:38:10,584 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.015, 'learning_rate': 2.641211645917067e-05, 'epoch': 0.48}
{'loss': 6.9645, 'learning_rate': 2.640917557102245e-05, 'epoch': 0.48}
{'loss': 6.9873, 'learning_rate': 2.6406234682874226e-05, 'epoch': 0.48}
{'loss': 6.8577, 'learning_rate': 2.640329379472601e-05, 'epoch': 0.48}
{'loss': 7.0963, 'learning_rate': 2.640035290657779e-05, 'epoch': 0.48}
{'loss': 6.9173, 'learning_rate': 2.6397412018429565e-05, 'epoch': 0.48}
{'loss': 7.017, 'learning_rate': 2.6394471130281345e-05, 'epoch': 0.48}
 16%|███████████▉                                                              | 1728/10701 [1:54:32<5:03:27,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:38:10,584 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7346, 'learning_rate': 2.6388589353984904e-05, 'epoch': 0.48}
{'loss': 6.6197, 'learning_rate': 2.6385648465836684e-05, 'epoch': 0.48}
{'loss': 6.8883, 'learning_rate': 2.6382707577688464e-05, 'epoch': 0.48}
{'loss': 6.8428, 'learning_rate': 2.637976668954024e-05, 'epoch': 0.49}
{'loss': 6.6392, 'learning_rate': 2.6376825801392023e-05, 'epoch': 0.49}
{'loss': 6.9188, 'learning_rate': 2.63738849132438e-05, 'epoch': 0.49}
{'loss': 6.7021, 'learning_rate': 2.637094402509558e-05, 'epoch': 0.49}
{'loss': 7.0137, 'learning_rate': 2.636800313694736e-05, 'epoch': 0.49}
{'loss': 6.8959, 'learning_rate': 2.6365062248799135e-05, 'epoch': 0.49}
{'loss': 7.0967, 'learning_rate': 2.636212136065092e-05, 'epoch': 0.49}
{'loss': 6.7107, 'learning_rate': 2.6356239584354475e-05, 'epoch': 0.49}
{'loss': 7.1808, 'learning_rate': 2.6353298696206254e-05, 'epoch': 0.49}
{'loss': 6.9626, 'learning_rate': 2.6350357808058034e-05, 'epoch': 0.49}
{'loss': 6.834, 'learning_rate': 2.6347416919909814e-05, 'epoch': 0.49}
{'loss': 6.6956, 'learning_rate': 2.6344476031761593e-05, 'epoch': 0.49}
{'loss': 6.8529, 'learning_rate': 2.6341535143613373e-05, 'epoch': 0.49}
 16%|████████████                                                              | 1746/10701 [1:55:03<3:31:28,  1.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:38:10,584 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6588, 'learning_rate': 2.633565336731693e-05, 'epoch': 0.49}
{'loss': 6.9423, 'learning_rate': 2.6332712479168712e-05, 'epoch': 0.49}
{'loss': 6.6167, 'learning_rate': 2.632977159102049e-05, 'epoch': 0.49}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.3212, 'learning_rate': 2.6323889814724045e-05, 'epoch': 0.49}
{'loss': 6.9511, 'learning_rate': 2.6320948926575828e-05, 'epoch': 0.49}
{'loss': 6.9282, 'learning_rate': 2.6318008038427608e-05, 'epoch': 0.49}
{'loss': 6.8627, 'learning_rate': 2.6315067150279384e-05, 'epoch': 0.49}
{'loss': 6.8535, 'learning_rate': 2.6312126262131164e-05, 'epoch': 0.49}
{'loss': 6.912, 'learning_rate': 2.6309185373982943e-05, 'epoch': 0.49}
{'loss': 6.7815, 'learning_rate': 2.6306244485834723e-05, 'epoch': 0.49}
{'loss': 6.9489, 'learning_rate': 2.6303303597686503e-05, 'epoch': 0.49}
{'loss': 7.0542, 'learning_rate': 2.6300362709538283e-05, 'epoch': 0.49}
{'loss': 6.8281, 'learning_rate': 2.629742182139006e-05, 'epoch': 0.49}
{'loss': 7.3003, 'learning_rate': 2.629448093324184e-05, 'epoch': 0.49}
{'loss': 6.7086, 'learning_rate': 2.6291540045093622e-05, 'epoch': 0.49}
{'loss': 6.9902, 'learning_rate': 2.6288599156945398e-05, 'epoch': 0.49}
{'loss': 6.7896, 'learning_rate': 2.6285658268797178e-05, 'epoch': 0.49}
{'loss': 6.5535, 'learning_rate': 2.6282717380648954e-05, 'epoch': 0.49}
{'loss': 6.6314, 'learning_rate': 2.6279776492500734e-05, 'epoch': 0.49}
{'loss': 6.8584, 'learning_rate': 2.6276835604352517e-05, 'epoch': 0.5}
{'loss': 6.9312, 'learning_rate': 2.6273894716204293e-05, 'epoch': 0.5}
{'loss': 6.9643, 'learning_rate': 2.6270953828056073e-05, 'epoch': 0.5}
{'loss': 6.8578, 'learning_rate': 2.6268012939907853e-05, 'epoch': 0.5}
{'loss': 6.8337, 'learning_rate': 2.6265072051759633e-05, 'epoch': 0.5}
{'loss': 6.9365, 'learning_rate': 2.6262131163611412e-05, 'epoch': 0.5}
{'loss': 6.6765, 'learning_rate': 2.6259190275463192e-05, 'epoch': 0.5}
{'loss': 6.8455, 'learning_rate': 2.625624938731497e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.74, 'learning_rate': 2.6253308499166748e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9146, 'learning_rate': 2.625036761101853e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1684, 'learning_rate': 2.6247426722870308e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2195, 'learning_rate': 2.6244485834722087e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1798, 'learning_rate': 2.6241544946573864e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8269, 'learning_rate': 2.6238604058425643e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8427, 'learning_rate': 2.6235663170277426e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.3122, 'learning_rate': 2.6232722282129203e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.908, 'learning_rate': 2.6229781393980982e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.937, 'learning_rate': 2.6226840505832762e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1261, 'learning_rate': 2.6223899617684542e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8735, 'learning_rate': 2.622095872953632e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1306, 'learning_rate': 2.62180178413881e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0269, 'learning_rate': 2.6215076953239878e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7739, 'learning_rate': 2.6212136065091657e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.725, 'learning_rate': 2.620919517694344e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7392, 'learning_rate': 2.6206254288795217e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5852, 'learning_rate': 2.6203313400646997e-05, 'epoch': 0.5}
{'loss': 6.8317, 'learning_rate': 2.6200372512498773e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8104, 'learning_rate': 2.6197431624350553e-05, 'epoch': 0.5}
 16%|████████████                                                              | 1749/10701 [1:55:06<3:08:20,  1.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 17%|████████████▍                                                             | 1795/10701 [1:56:43<3:52:25,  1.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6442, 'learning_rate': 2.6191549848054112e-05, 'epoch': 0.5}
{'loss': 6.5882, 'learning_rate': 2.6188608959905892e-05, 'epoch': 0.5}
 17%|████████████▍                                                             | 1795/10701 [1:56:43<3:52:25,  1.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:39:48,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.2228, 'learning_rate': 2.618566807175767e-05, 'epoch': 0.5}
 17%|████████████▍                                                             | 1798/10701 [1:56:47<3:16:31,  1.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.415, 'learning_rate': 2.617684540731301e-05, 'epoch': 0.5}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.791, 'learning_rate': 2.6173904519164787e-05, 'epoch': 0.5}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9943, 'learning_rate': 2.6170963631016567e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1975, 'learning_rate': 2.616802274286835e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8821, 'learning_rate': 2.6165081854720126e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9842, 'learning_rate': 2.6162140966571906e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2291, 'learning_rate': 2.6159200078423686e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8906, 'learning_rate': 2.6156259190275462e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0065, 'learning_rate': 2.6153318302127245e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7536, 'learning_rate': 2.615037741397902e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1298, 'learning_rate': 2.61474365258308e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7291, 'learning_rate': 2.614449563768258e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0438, 'learning_rate': 2.6141554749534357e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5945, 'learning_rate': 2.613861386138614e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9657, 'learning_rate': 2.613567297323792e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0949, 'learning_rate': 2.6132732085089696e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5891, 'learning_rate': 2.6129791196941476e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8305, 'learning_rate': 2.6126850308793256e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9648, 'learning_rate': 2.6123909420645036e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0946, 'learning_rate': 2.6120968532496815e-05, 'epoch': 0.51}
{'loss': 6.9264, 'learning_rate': 2.6118027644348595e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5587, 'learning_rate': 2.611508675620037e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5979, 'learning_rate': 2.6112145868052155e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8238, 'learning_rate': 2.610920497990393e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.905, 'learning_rate': 2.610626409175571e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1408, 'learning_rate': 2.610332320360749e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8494, 'learning_rate': 2.6100382315459267e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6422, 'learning_rate': 2.609744142731105e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.854, 'learning_rate': 2.609450053916283e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7383, 'learning_rate': 2.6091559651014606e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7039, 'learning_rate': 2.6088618762866386e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6673, 'learning_rate': 2.6085677874718165e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1636, 'learning_rate': 2.6082736986569945e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.64, 'learning_rate': 2.6079796098421725e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9548, 'learning_rate': 2.6076855210273504e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.977, 'learning_rate': 2.607391432212528e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.583, 'learning_rate': 2.607097343397706e-05, 'epoch': 0.51}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0903, 'learning_rate': 2.606803254582884e-05, 'epoch': 0.51}
{'loss': 6.7082, 'learning_rate': 2.606509165768062e-05, 'epoch': 0.52}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6618, 'learning_rate': 2.60621507695324e-05, 'epoch': 0.52}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9312, 'learning_rate': 2.6059209881384176e-05, 'epoch': 0.52}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.774, 'learning_rate': 2.605626899323596e-05, 'epoch': 0.52}
 17%|████████████▍                                                             | 1800/10701 [1:56:49<3:19:42,  1.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8477, 'learning_rate': 2.605332810508774e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1844/10701 [1:58:20<3:50:01,  1.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:41:28,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 11:43:03,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8462, 'learning_rate': 2.6044505440643075e-05, 'epoch': 0.52}
{'loss': 6.7664, 'learning_rate': 2.6041564552494854e-05, 'epoch': 0.52}
[WARNING|modeling_utils.py:388] 2022-03-02 11:43:03,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.6873, 'learning_rate': 2.6035682776198414e-05, 'epoch': 0.52}
{'loss': 6.5773, 'learning_rate': 2.603274188805019e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.4937, 'learning_rate': 2.602980099990197e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.7531, 'learning_rate': 2.6026860111753753e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.9362, 'learning_rate': 2.602391922360553e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.9697, 'learning_rate': 2.602097833545731e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.9297, 'learning_rate': 2.6018037447309085e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 7.0727, 'learning_rate': 2.601509655916087e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.8941, 'learning_rate': 2.6012155671012648e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.8598, 'learning_rate': 2.6009214782864425e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.7963, 'learning_rate': 2.6006273894716204e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.9727, 'learning_rate': 2.6003333006567984e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.9553, 'learning_rate': 2.6000392118419764e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.8855, 'learning_rate': 2.5997451230271544e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.9693, 'learning_rate': 2.5994510342123323e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 6.573, 'learning_rate': 2.59915694539751e-05, 'epoch': 0.52}
 17%|████████████▊                                                             | 1849/10701 [1:58:26<3:03:25,  1.24s/it]
{'loss': 7.0523, 'learning_rate': 2.598862856582688e-05, 'epoch': 0.52}
 17%|████████████▉                                                             | 1866/10701 [1:59:03<5:21:31,  2.18s/it]
{'loss': 6.8458, 'learning_rate': 2.598274678953044e-05, 'epoch': 0.52}
 17%|████████████▉                                                             | 1869/10701 [1:59:10<5:16:23,  2.15s/it]
{'loss': 6.8093, 'learning_rate': 2.5976865013233995e-05, 'epoch': 0.52}
{'loss': 6.7217, 'learning_rate': 2.5973924125085775e-05, 'epoch': 0.52}
{'loss': 6.5345, 'learning_rate': 2.5970983236937558e-05, 'epoch': 0.52}
{'loss': 7.1134, 'learning_rate': 2.5968042348789334e-05, 'epoch': 0.52}
{'loss': 6.8014, 'learning_rate': 2.5965101460641114e-05, 'epoch': 0.52}
{'loss': 6.6718, 'learning_rate': 2.5962160572492893e-05, 'epoch': 0.53}
{'loss': 7.0966, 'learning_rate': 2.5959219684344673e-05, 'epoch': 0.53}
{'loss': 6.7542, 'learning_rate': 2.5956278796196453e-05, 'epoch': 0.53}
{'loss': 6.8391, 'learning_rate': 2.5953337908048233e-05, 'epoch': 0.53}
{'loss': 7.0162, 'learning_rate': 2.595039701990001e-05, 'epoch': 0.53}
{'loss': 6.6009, 'learning_rate': 2.594745613175179e-05, 'epoch': 0.53}
{'loss': 6.9212, 'learning_rate': 2.5944515243603572e-05, 'epoch': 0.53}
{'loss': 7.1144, 'learning_rate': 2.5941574355455348e-05, 'epoch': 0.53}
{'loss': 7.0602, 'learning_rate': 2.5938633467307128e-05, 'epoch': 0.53}
{'loss': 7.0328, 'learning_rate': 2.5935692579158904e-05, 'epoch': 0.53}
{'loss': 6.7034, 'learning_rate': 2.5932751691010684e-05, 'epoch': 0.53}
{'loss': 6.8623, 'learning_rate': 2.5929810802862467e-05, 'epoch': 0.53}
{'loss': 6.8943, 'learning_rate': 2.5926869914714243e-05, 'epoch': 0.53}
{'loss': 7.0403, 'learning_rate': 2.5923929026566023e-05, 'epoch': 0.53}
{'loss': 6.9519, 'learning_rate': 2.5920988138417803e-05, 'epoch': 0.53}
{'loss': 6.6534, 'learning_rate': 2.591804725026958e-05, 'epoch': 0.53}
{'loss': 6.6456, 'learning_rate': 2.5915106362121362e-05, 'epoch': 0.53}
 18%|█████████████                                                             | 1892/10701 [1:59:54<4:13:39,  1.73s/it]
{'loss': 6.7544, 'learning_rate': 2.590922458582492e-05, 'epoch': 0.53}
{'loss': 7.2367, 'learning_rate': 2.5906283697676698e-05, 'epoch': 0.53}
 18%|█████████████                                                             | 1895/10701 [1:59:59<3:40:33,  1.50s/it]
{'loss': 7.0329, 'learning_rate': 2.5900401921380258e-05, 'epoch': 0.53}
{'loss': 7.0995, 'learning_rate': 2.5897461033232037e-05, 'epoch': 0.53}
 18%|█████████████▏                                                            | 1898/10701 [2:00:02<3:08:27,  1.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.1077, 'learning_rate': 2.5891579256935593e-05, 'epoch': 0.53}
{'loss': 6.997, 'learning_rate': 2.5888638368787376e-05, 'epoch': 0.53}
{'loss': 6.3895, 'learning_rate': 2.5885697480639153e-05, 'epoch': 0.53}
{'loss': 6.3324, 'learning_rate': 2.5882756592490933e-05, 'epoch': 0.53}
{'loss': 7.0491, 'learning_rate': 2.5879815704342712e-05, 'epoch': 0.53}
{'loss': 6.8832, 'learning_rate': 2.587687481619449e-05, 'epoch': 0.53}
{'loss': 6.9032, 'learning_rate': 2.5873933928046272e-05, 'epoch': 0.53}
{'loss': 6.7519, 'learning_rate': 2.587099303989805e-05, 'epoch': 0.53}
{'loss': 6.8001, 'learning_rate': 2.5868052151749828e-05, 'epoch': 0.53}
{'loss': 7.0718, 'learning_rate': 2.5865111263601607e-05, 'epoch': 0.53}
{'loss': 6.9452, 'learning_rate': 2.586217037545339e-05, 'epoch': 0.53}
{'loss': 6.8769, 'learning_rate': 2.5859229487305167e-05, 'epoch': 0.53}
{'loss': 7.1186, 'learning_rate': 2.5856288599156947e-05, 'epoch': 0.54}
{'loss': 6.7519, 'learning_rate': 2.5853347711008726e-05, 'epoch': 0.54}
{'loss': 6.817, 'learning_rate': 2.5850406822860503e-05, 'epoch': 0.54}
{'loss': 6.8367, 'learning_rate': 2.5847465934712286e-05, 'epoch': 0.54}
{'loss': 7.0768, 'learning_rate': 2.5844525046564062e-05, 'epoch': 0.54}
{'loss': 6.8444, 'learning_rate': 2.5841584158415842e-05, 'epoch': 0.54}
{'loss': 6.9229, 'learning_rate': 2.583864327026762e-05, 'epoch': 0.54}
{'loss': 6.7204, 'learning_rate': 2.5835702382119398e-05, 'epoch': 0.54}
{'loss': 6.6724, 'learning_rate': 2.583276149397118e-05, 'epoch': 0.54}
{'loss': 6.7789, 'learning_rate': 2.582982060582296e-05, 'epoch': 0.54}
{'loss': 6.8132, 'learning_rate': 2.5826879717674737e-05, 'epoch': 0.54}
{'loss': 6.7303, 'learning_rate': 2.5823938829526517e-05, 'epoch': 0.54}
{'loss': 7.08, 'learning_rate': 2.5820997941378297e-05, 'epoch': 0.54}
{'loss': 6.7745, 'learning_rate': 2.5818057053230076e-05, 'epoch': 0.54}
{'loss': 6.8486, 'learning_rate': 2.5815116165081856e-05, 'epoch': 0.54}
{'loss': 7.0211, 'learning_rate': 2.5812175276933636e-05, 'epoch': 0.54}
{'loss': 6.7057, 'learning_rate': 2.5809234388785412e-05, 'epoch': 0.54}
{'loss': 7.1332, 'learning_rate': 2.5806293500637195e-05, 'epoch': 0.54}
{'loss': 6.9705, 'learning_rate': 2.580335261248897e-05, 'epoch': 0.54}
{'loss': 6.7513, 'learning_rate': 2.580041172434075e-05, 'epoch': 0.54}
{'loss': 6.7446, 'learning_rate': 2.579747083619253e-05, 'epoch': 0.54}
{'loss': 6.7757, 'learning_rate': 2.5794529948044307e-05, 'epoch': 0.54}
{'loss': 6.5069, 'learning_rate': 2.579158905989609e-05, 'epoch': 0.54}
{'loss': 6.7707, 'learning_rate': 2.578864817174787e-05, 'epoch': 0.54}
{'loss': 7.0178, 'learning_rate': 2.5785707283599647e-05, 'epoch': 0.54}
{'loss': 7.0389, 'learning_rate': 2.5782766395451426e-05, 'epoch': 0.54}
{'loss': 6.7174, 'learning_rate': 2.5779825507303206e-05, 'epoch': 0.54}
{'loss': 6.6284, 'learning_rate': 2.5776884619154986e-05, 'epoch': 0.54}
{'loss': 6.8459, 'learning_rate': 2.5773943731006765e-05, 'epoch': 0.54}
{'loss': 6.8277, 'learning_rate': 2.5771002842858545e-05, 'epoch': 0.54}
{'loss': 7.1642, 'learning_rate': 2.576806195471032e-05, 'epoch': 0.54}
{'loss': 6.7875, 'learning_rate': 2.57651210665621e-05, 'epoch': 0.54}
{'loss': 6.5538, 'learning_rate': 2.5762180178413884e-05, 'epoch': 0.54}
{'loss': 6.7863, 'learning_rate': 2.575923929026566e-05, 'epoch': 0.54}
{'loss': 6.6633, 'learning_rate': 2.575629840211744e-05, 'epoch': 0.54}
[WARNING|modeling_utils.py:388] 2022-03-02 11:46:16,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6408, 'learning_rate': 2.5750416625821e-05, 'epoch': 0.55}
{'loss': 5.9805, 'learning_rate': 2.574747573767278e-05, 'epoch': 0.55}
{'loss': 6.307, 'learning_rate': 2.5744534849524556e-05, 'epoch': 0.55}
 18%|█████████████▍                                                            | 1948/10701 [2:01:39<3:07:42,  1.29s/it]
{'loss': 6.7126, 'learning_rate': 2.5738653073228115e-05, 'epoch': 0.55}
{'loss': 6.3463, 'learning_rate': 2.5735712185079895e-05, 'epoch': 0.55}
{'loss': 7.1986, 'learning_rate': 2.5732771296931675e-05, 'epoch': 0.55}
{'loss': 6.7742, 'learning_rate': 2.5729830408783455e-05, 'epoch': 0.55}
{'loss': 6.9881, 'learning_rate': 2.572688952063523e-05, 'epoch': 0.55}
{'loss': 7.1541, 'learning_rate': 2.572394863248701e-05, 'epoch': 0.55}
{'loss': 6.6856, 'learning_rate': 2.5721007744338794e-05, 'epoch': 0.55}
{'loss': 7.0676, 'learning_rate': 2.571806685619057e-05, 'epoch': 0.55}
{'loss': 6.7071, 'learning_rate': 2.571512596804235e-05, 'epoch': 0.55}
{'loss': 6.8802, 'learning_rate': 2.5712185079894126e-05, 'epoch': 0.55}
{'loss': 6.8981, 'learning_rate': 2.570924419174591e-05, 'epoch': 0.55}
{'loss': 6.6975, 'learning_rate': 2.5703362415449465e-05, 'epoch': 0.55}
{'loss': 6.8373, 'learning_rate': 2.5700421527301245e-05, 'epoch': 0.55}
{'loss': 6.6784, 'learning_rate': 2.5697480639153025e-05, 'epoch': 0.55}
{'loss': 6.7507, 'learning_rate': 2.5694539751004804e-05, 'epoch': 0.55}
{'loss': 6.8517, 'learning_rate': 2.5691598862856584e-05, 'epoch': 0.55}
{'loss': 6.8052, 'learning_rate': 2.5688657974708364e-05, 'epoch': 0.55}
{'loss': 7.0495, 'learning_rate': 2.568571708656014e-05, 'epoch': 0.55}
{'loss': 6.9128, 'learning_rate': 2.568277619841192e-05, 'epoch': 0.55}
{'loss': 7.1664, 'learning_rate': 2.5679835310263703e-05, 'epoch': 0.55}
{'loss': 6.7858, 'learning_rate': 2.567689442211548e-05, 'epoch': 0.55}
{'loss': 6.8448, 'learning_rate': 2.567395353396726e-05, 'epoch': 0.55}
{'loss': 6.8092, 'learning_rate': 2.5671012645819036e-05, 'epoch': 0.55}
{'loss': 7.0978, 'learning_rate': 2.5668071757670815e-05, 'epoch': 0.55}
{'loss': 7.1248, 'learning_rate': 2.56651308695226e-05, 'epoch': 0.55}
{'loss': 6.6199, 'learning_rate': 2.5662189981374375e-05, 'epoch': 0.55}
{'loss': 6.7471, 'learning_rate': 2.5659249093226154e-05, 'epoch': 0.55}
{'loss': 7.2782, 'learning_rate': 2.5656308205077934e-05, 'epoch': 0.55}
{'loss': 7.0666, 'learning_rate': 2.5653367316929714e-05, 'epoch': 0.55}
{'loss': 6.9217, 'learning_rate': 2.5650426428781494e-05, 'epoch': 0.55}
{'loss': 6.7061, 'learning_rate': 2.5647485540633273e-05, 'epoch': 0.56}
{'loss': 6.5039, 'learning_rate': 2.564454465248505e-05, 'epoch': 0.56}
{'loss': 6.6259, 'learning_rate': 2.564160376433683e-05, 'epoch': 0.56}
{'loss': 6.8876, 'learning_rate': 2.5638662876188612e-05, 'epoch': 0.56}
{'loss': 6.536, 'learning_rate': 2.563572198804039e-05, 'epoch': 0.56}
{'loss': 6.9839, 'learning_rate': 2.563278109989217e-05, 'epoch': 0.56}
{'loss': 7.1828, 'learning_rate': 2.5629840211743948e-05, 'epoch': 0.56}
{'loss': 7.0482, 'learning_rate': 2.5626899323595725e-05, 'epoch': 0.56}
 19%|█████████████▋                                                            | 1988/10701 [2:03:02<4:25:51,  1.83s/it]
{'loss': 6.9153, 'learning_rate': 2.5621017547299284e-05, 'epoch': 0.56}
{'loss': 6.7758, 'learning_rate': 2.5618076659151064e-05, 'epoch': 0.56}
{'loss': 6.9849, 'learning_rate': 2.5615135771002844e-05, 'epoch': 0.56}
{'loss': 6.7537, 'learning_rate': 2.561219488285462e-05, 'epoch': 0.56}
{'loss': 7.0098, 'learning_rate': 2.5609253994706403e-05, 'epoch': 0.56}
[WARNING|modeling_utils.py:388] 2022-03-02 11:47:53,737 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8096, 'learning_rate': 2.560337221840996e-05, 'epoch': 0.56}
{'loss': 6.3656, 'learning_rate': 2.560043133026174e-05, 'epoch': 0.56}
{'loss': 6.7421, 'learning_rate': 2.5597490442113522e-05, 'epoch': 0.56}
 19%|█████████████▊                                                            | 1998/10701 [2:03:16<3:04:00,  1.27s/it]
{'loss': 6.6391, 'learning_rate': 2.5591608665817078e-05, 'epoch': 0.56}
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2366] 2022-03-02 11:48:00,169 >>   Num examples = 2642        | 1998/10701 [2:03:16<3:04:00,  1.27s/it]g-point operations will not be computed-02 11:44:44,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
03/02/2022 12:11:02 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
{'eval_loss': 6.748763084411621, 'eval_wer': 1.2782506895251702, 'eval_runtime': 1382.3061, 'eval_samples_per_second': 1.911, 'eval_steps_per_second': 0.478, 'epoch': 0.56}
{'loss': 7.2529, 'learning_rate': 2.5585726889520634e-05, 'epoch': 0.56}
{'loss': 6.7807, 'learning_rate': 2.5582786001372417e-05, 'epoch': 0.56}
{'loss': 6.9268, 'learning_rate': 2.5579845113224193e-05, 'epoch': 0.56}
{'loss': 6.7207, 'learning_rate': 2.5576904225075973e-05, 'epoch': 0.56}
{'loss': 6.9993, 'learning_rate': 2.5573963336927753e-05, 'epoch': 0.56}
{'loss': 7.0005, 'learning_rate': 2.557102244877953e-05, 'epoch': 0.56}
{'loss': 7.0106, 'learning_rate': 2.5568081560631312e-05, 'epoch': 0.56}
{'loss': 6.8497, 'learning_rate': 2.5565140672483092e-05, 'epoch': 0.56}
{'loss': 7.0406, 'learning_rate': 2.556219978433487e-05, 'epoch': 0.56}
{'loss': 6.8372, 'learning_rate': 2.5559258896186648e-05, 'epoch': 0.56}
{'loss': 6.8471, 'learning_rate': 2.5556318008038428e-05, 'epoch': 0.56}
{'loss': 6.8796, 'learning_rate': 2.5553377119890208e-05, 'epoch': 0.56}
{'loss': 6.8546, 'learning_rate': 2.5550436231741987e-05, 'epoch': 0.56}
{'loss': 7.0028, 'learning_rate': 2.5547495343593767e-05, 'epoch': 0.56}
{'loss': 7.0575, 'learning_rate': 2.5544554455445543e-05, 'epoch': 0.56}
{'loss': 6.6909, 'learning_rate': 2.5541613567297327e-05, 'epoch': 0.57}
{'loss': 6.8336, 'learning_rate': 2.5538672679149103e-05, 'epoch': 0.57}
{'loss': 7.0707, 'learning_rate': 2.5535731791000883e-05, 'epoch': 0.57}
{'loss': 7.219, 'learning_rate': 2.5532790902852662e-05, 'epoch': 0.57}
{'loss': 7.0066, 'learning_rate': 2.552985001470444e-05, 'epoch': 0.57}
{'loss': 6.85, 'learning_rate': 2.5526909126556222e-05, 'epoch': 0.57}
{'loss': 6.5039, 'learning_rate': 2.5523968238408e-05, 'epoch': 0.57}
{'loss': 6.626, 'learning_rate': 2.5521027350259778e-05, 'epoch': 0.57}
{'loss': 6.9316, 'learning_rate': 2.5518086462111558e-05, 'epoch': 0.57}
{'loss': 6.4664, 'learning_rate': 2.5515145573963337e-05, 'epoch': 0.57}
{'loss': 6.8161, 'learning_rate': 2.5512204685815117e-05, 'epoch': 0.57}
{'loss': 6.9745, 'learning_rate': 2.5509263797666897e-05, 'epoch': 0.57}
{'loss': 7.1103, 'learning_rate': 2.5506322909518676e-05, 'epoch': 0.57}
{'loss': 6.7426, 'learning_rate': 2.5503382021370453e-05, 'epoch': 0.57}
{'loss': 6.8236, 'learning_rate': 2.5500441133222236e-05, 'epoch': 0.57}
{'loss': 6.5952, 'learning_rate': 2.5497500245074012e-05, 'epoch': 0.57}
{'loss': 6.8875, 'learning_rate': 2.5494559356925792e-05, 'epoch': 0.57}
{'loss': 6.8926, 'learning_rate': 2.549161846877757e-05, 'epoch': 0.57}
{'loss': 6.9122, 'learning_rate': 2.5488677580629348e-05, 'epoch': 0.57}
{'loss': 7.0434, 'learning_rate': 2.548573669248113e-05, 'epoch': 0.57}
{'loss': 6.8169, 'learning_rate': 2.548279580433291e-05, 'epoch': 0.57}
{'loss': 6.5802, 'learning_rate': 2.5479854916184687e-05, 'epoch': 0.57}
{'loss': 6.9485, 'learning_rate': 2.5476914028036467e-05, 'epoch': 0.57}
{'loss': 7.0152, 'learning_rate': 2.5473973139888247e-05, 'epoch': 0.57}
 19%|██████████████                                                            | 2041/10701 [2:29:50<4:18:26,  1.79s/it]
{'loss': 7.1451, 'learning_rate': 2.5468091363591806e-05, 'epoch': 0.57}
{'loss': 6.702, 'learning_rate': 2.5465150475443586e-05, 'epoch': 0.57}
{'loss': 6.728, 'learning_rate': 2.5462209587295362e-05, 'epoch': 0.57}
 19%|██████████████▏                                                           | 2045/10701 [2:29:55<3:40:41,  1.53s/it]
{'loss': 7.0544, 'learning_rate': 2.5456327810998925e-05, 'epoch': 0.57}
{'loss': 6.5764, 'learning_rate': 2.54533869228507e-05, 'epoch': 0.57}
 19%|██████████████▏                                                           | 2048/10701 [2:29:59<3:16:55,  1.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:14:41,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.3314, 'learning_rate': 2.5447505146554257e-05, 'epoch': 0.57}
{'loss': 6.6911, 'learning_rate': 2.544456425840604e-05, 'epoch': 0.57}
{'loss': 7.0426, 'learning_rate': 2.544162337025782e-05, 'epoch': 0.57}
{'loss': 6.8477, 'learning_rate': 2.5438682482109597e-05, 'epoch': 0.57}
{'loss': 7.2061, 'learning_rate': 2.5435741593961376e-05, 'epoch': 0.58}
{'loss': 6.9241, 'learning_rate': 2.5432800705813156e-05, 'epoch': 0.58}
{'loss': 7.0111, 'learning_rate': 2.5429859817664936e-05, 'epoch': 0.58}
{'loss': 7.1695, 'learning_rate': 2.5426918929516715e-05, 'epoch': 0.58}
{'loss': 7.0406, 'learning_rate': 2.5423978041368495e-05, 'epoch': 0.58}
{'loss': 6.6943, 'learning_rate': 2.542103715322027e-05, 'epoch': 0.58}
{'loss': 6.8738, 'learning_rate': 2.541809626507205e-05, 'epoch': 0.58}
{'loss': 6.7525, 'learning_rate': 2.5415155376923834e-05, 'epoch': 0.58}
{'loss': 6.7762, 'learning_rate': 2.541221448877561e-05, 'epoch': 0.58}
{'loss': 6.7266, 'learning_rate': 2.540927360062739e-05, 'epoch': 0.58}
{'loss': 6.9103, 'learning_rate': 2.5406332712479167e-05, 'epoch': 0.58}
{'loss': 7.0276, 'learning_rate': 2.5403391824330947e-05, 'epoch': 0.58}
{'loss': 6.9163, 'learning_rate': 2.540045093618273e-05, 'epoch': 0.58}
{'loss': 7.0119, 'learning_rate': 2.5397510048034506e-05, 'epoch': 0.58}
{'loss': 6.7238, 'learning_rate': 2.5394569159886286e-05, 'epoch': 0.58}
{'loss': 6.8647, 'learning_rate': 2.5391628271738065e-05, 'epoch': 0.58}
{'loss': 6.7229, 'learning_rate': 2.5388687383589845e-05, 'epoch': 0.58}
{'loss': 7.0329, 'learning_rate': 2.5385746495441625e-05, 'epoch': 0.58}
{'loss': 7.0927, 'learning_rate': 2.5382805607293405e-05, 'epoch': 0.58}
{'loss': 6.8634, 'learning_rate': 2.537986471914518e-05, 'epoch': 0.58}
{'loss': 6.6757, 'learning_rate': 2.537692383099696e-05, 'epoch': 0.58}
{'loss': 6.797, 'learning_rate': 2.5373982942848744e-05, 'epoch': 0.58}
 19%|██████████████▎                                                           | 2075/10701 [2:30:59<5:17:18,  2.21s/it]
{'loss': 6.7577, 'learning_rate': 2.53681011665523e-05, 'epoch': 0.58}
{'loss': 7.2069, 'learning_rate': 2.5365160278404076e-05, 'epoch': 0.58}
{'loss': 6.9445, 'learning_rate': 2.5362219390255856e-05, 'epoch': 0.58}
{'loss': 7.0651, 'learning_rate': 2.535927850210764e-05, 'epoch': 0.58}
 19%|██████████████▍                                                           | 2080/10701 [2:31:10<4:57:17,  2.07s/it]
{'loss': 6.9648, 'learning_rate': 2.5353396725811195e-05, 'epoch': 0.58}
{'loss': 7.2632, 'learning_rate': 2.5350455837662975e-05, 'epoch': 0.58}
{'loss': 6.9391, 'learning_rate': 2.5347514949514755e-05, 'epoch': 0.58}
{'loss': 6.5422, 'learning_rate': 2.5344574061366534e-05, 'epoch': 0.58}
{'loss': 6.74, 'learning_rate': 2.5341633173218314e-05, 'epoch': 0.58}
{'loss': 6.6081, 'learning_rate': 2.533869228507009e-05, 'epoch': 0.58}
{'loss': 6.7957, 'learning_rate': 2.533575139692187e-05, 'epoch': 0.58}
{'loss': 6.981, 'learning_rate': 2.5332810508773653e-05, 'epoch': 0.59}
{'loss': 6.6058, 'learning_rate': 2.532986962062543e-05, 'epoch': 0.59}
{'loss': 6.6644, 'learning_rate': 2.532692873247721e-05, 'epoch': 0.59}
 20%|██████████████▍                                                           | 2091/10701 [2:31:30<4:05:41,  1.71s/it]
{'loss': 6.8494, 'learning_rate': 2.5321046956180765e-05, 'epoch': 0.59}
{'loss': 6.648, 'learning_rate': 2.531810606803255e-05, 'epoch': 0.59}
 20%|██████████████▍                                                           | 2094/10701 [2:31:34<3:30:58,  1.47s/it]
{'loss': 6.9315, 'learning_rate': 2.5312224291736104e-05, 'epoch': 0.59}
{'loss': 6.6861, 'learning_rate': 2.5309283403587884e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2097/10701 [2:31:37<3:03:08,  1.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:19,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4268, 'learning_rate': 2.5303401627291444e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.2542, 'learning_rate': 2.5297519850995e-05, 'epoch': 0.59}
{'loss': 6.4378, 'learning_rate': 2.529457896284678e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1437, 'learning_rate': 2.5291638074698563e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8464, 'learning_rate': 2.528869718655034e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0435, 'learning_rate': 2.528575629840212e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8449, 'learning_rate': 2.52828154102539e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7904, 'learning_rate': 2.5279874522105675e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0654, 'learning_rate': 2.5276933633957458e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8283, 'learning_rate': 2.5273992745809234e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0349, 'learning_rate': 2.5271051857661014e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8222, 'learning_rate': 2.5268110969512794e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0891, 'learning_rate': 2.526517008136457e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.766, 'learning_rate': 2.5262229193216353e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0072, 'learning_rate': 2.5259288305068133e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8928, 'learning_rate': 2.525634741691991e-05, 'epoch': 0.59}
 20%|██████████████▌                                                           | 2099/10701 [2:31:40<2:48:37,  1.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 20%|██████████████▋                                                           | 2115/10701 [2:32:15<5:13:27,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8821, 'learning_rate': 2.525046564062347e-05, 'epoch': 0.59}
 20%|██████████████▋                                                           | 2115/10701 [2:32:15<5:13:27,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8985, 'learning_rate': 2.5247524752475248e-05, 'epoch': 0.59}
 20%|██████████████▋                                                           | 2115/10701 [2:32:15<5:13:27,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1426, 'learning_rate': 2.5244583864327028e-05, 'epoch': 0.59}
 20%|██████████████▋                                                           | 2115/10701 [2:32:15<5:13:27,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7592, 'learning_rate': 2.5241642976178808e-05, 'epoch': 0.59}
 20%|██████████████▋                                                           | 2115/10701 [2:32:15<5:13:27,  2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8894, 'learning_rate': 2.5235761199882367e-05, 'epoch': 0.59}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8893, 'learning_rate': 2.5232820311734144e-05, 'epoch': 0.59}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.623, 'learning_rate': 2.5229879423585923e-05, 'epoch': 0.59}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8369, 'learning_rate': 2.5226938535437703e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7556, 'learning_rate': 2.522399764728948e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6919, 'learning_rate': 2.5221056759141262e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9774, 'learning_rate': 2.5218115870993042e-05, 'epoch': 0.6}
{'loss': 6.7836, 'learning_rate': 2.521517498284482e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0906, 'learning_rate': 2.5212234094696598e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8424, 'learning_rate': 2.5209293206548378e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8095, 'learning_rate': 2.5206352318400158e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8061, 'learning_rate': 2.5203411430251937e-05, 'epoch': 0.6}
{'loss': 6.6988, 'learning_rate': 2.5200470542103717e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.477, 'learning_rate': 2.5197529653955493e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9797, 'learning_rate': 2.5194588765807277e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8108, 'learning_rate': 2.5191647877659056e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8487, 'learning_rate': 2.5188706989510833e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2335, 'learning_rate': 2.5185766101362612e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0233, 'learning_rate': 2.518282521321439e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0357, 'learning_rate': 2.5179884325066172e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0419, 'learning_rate': 2.517694343691795e-05, 'epoch': 0.6}
 20%|██████████████▋                                                           | 2120/10701 [2:32:26<5:07:55,  2.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9021, 'learning_rate': 2.5174002548769728e-05, 'epoch': 0.6}
{'loss': 6.6978, 'learning_rate': 2.5171061660621508e-05, 'epoch': 0.6}
{'loss': 6.8562, 'learning_rate': 2.5168120772473287e-05, 'epoch': 0.6}
 20%|██████████████▊                                                           | 2145/10701 [2:33:13<3:28:06,  1.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4105, 'learning_rate': 2.5165179884325067e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:56,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5476, 'learning_rate': 2.5159298108028626e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.3122, 'learning_rate': 2.5153416331732183e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.3256, 'learning_rate': 2.5150475443583966e-05, 'epoch': 0.6}
{'loss': 6.3035, 'learning_rate': 2.5147534555435742e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.187, 'learning_rate': 2.5144593667287522e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0149, 'learning_rate': 2.5141652779139298e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6033, 'learning_rate': 2.513871189099108e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8444, 'learning_rate': 2.513577100284286e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9551, 'learning_rate': 2.5132830114694637e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8481, 'learning_rate': 2.5129889226546417e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1317, 'learning_rate': 2.5126948338398197e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1236, 'learning_rate': 2.5124007450249976e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7455, 'learning_rate': 2.5121066562101756e-05, 'epoch': 0.61}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.9741, 'learning_rate': 2.5118125673953536e-05, 'epoch': 0.61}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8069, 'learning_rate': 2.5115184785805312e-05, 'epoch': 0.61}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7409, 'learning_rate': 2.5112243897657092e-05, 'epoch': 0.61}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.1262, 'learning_rate': 2.5109303009508875e-05, 'epoch': 0.61}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.2317, 'learning_rate': 2.510636212136065e-05, 'epoch': 0.61}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0101, 'learning_rate': 2.510342123321243e-05, 'epoch': 0.61}
[WARNING|modeling_utils.py:388] 2022-03-02 12:17:58,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
 20%|██████████████▉                                                           | 2167/10701 [2:33:57<5:09:05,  2.17s/it]
{'loss': 7.0596, 'learning_rate': 2.5097539456915987e-05, 'epoch': 0.61}
 20%|██████████████▉                                                           | 2167/10701 [2:33:57<5:09:05,  2.17s/it]
{'loss': 6.9852, 'learning_rate': 2.509459856876777e-05, 'epoch': 0.61}
 20%|██████████████▉                                                           | 2167/10701 [2:33:57<5:09:05,  2.17s/it]
{'loss': 6.9108, 'learning_rate': 2.5091657680619547e-05, 'epoch': 0.61}
 20%|██████████████▉                                                           | 2167/10701 [2:33:57<5:09:05,  2.17s/it]
{'loss': 6.9707, 'learning_rate': 2.5088716792471326e-05, 'epoch': 0.61}
 20%|██████████████▉                                                           | 2167/10701 [2:33:57<5:09:05,  2.17s/it]
{'loss': 6.62, 'learning_rate': 2.5085775904323106e-05, 'epoch': 0.61}
 20%|██████████████▉                                                           | 2167/10701 [2:33:57<5:09:05,  2.17s/it]
 20%|███████████████                                                           | 2173/10701 [2:34:10<4:59:53,  2.11s/it]
{'loss': 6.9883, 'learning_rate': 2.5079894128026666e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2173/10701 [2:34:10<4:59:53,  2.11s/it]
{'loss': 6.9878, 'learning_rate': 2.5076953239878445e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2173/10701 [2:34:10<4:59:53,  2.11s/it]
{'loss': 6.9056, 'learning_rate': 2.507401235173022e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2173/10701 [2:34:10<4:59:53,  2.11s/it]
{'loss': 6.9464, 'learning_rate': 2.5071071463582e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2173/10701 [2:34:10<4:59:53,  2.11s/it]
{'loss': 6.6717, 'learning_rate': 2.5068130575433784e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2173/10701 [2:34:10<4:59:53,  2.11s/it]
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 6.7154, 'learning_rate': 2.506224879913734e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 6.8589, 'learning_rate': 2.505930791098912e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 7.0209, 'learning_rate': 2.5056367022840897e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 6.9438, 'learning_rate': 2.505342613469268e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 6.8995, 'learning_rate': 2.5050485246544456e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 7.1835, 'learning_rate': 2.5047544358396236e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 6.8466, 'learning_rate': 2.5044603470248015e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 7.0257, 'learning_rate': 2.5041662582099795e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 6.8457, 'learning_rate': 2.5038721693951575e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]
{'loss': 7.1464, 'learning_rate': 2.5035780805803355e-05, 'epoch': 0.61}
{'loss': 6.9535, 'learning_rate': 2.503283991765513e-05, 'epoch': 0.61}
 20%|███████████████                                                           | 2179/10701 [2:34:22<4:53:35,  2.07s/it]g-point operations will not be computed-02 12:16:21,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7381, 'learning_rate': 2.502989902950691e-05, 'epoch': 0.61}
{'loss': 6.8755, 'learning_rate': 2.5026958141358694e-05, 'epoch': 0.61}
{'loss': 7.1157, 'learning_rate': 2.502401725321047e-05, 'epoch': 0.61}
{'loss': 6.7526, 'learning_rate': 2.502107636506225e-05, 'epoch': 0.61}
 21%|███████████████▏                                                          | 2195/10701 [2:34:50<3:26:24,  1.46s/it]
{'loss': 6.3763, 'learning_rate': 2.5015194588765806e-05, 'epoch': 0.62}
[WARNING|modeling_utils.py:388] 2022-03-02 12:19:34,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7009, 'learning_rate': 2.5009312812469365e-05, 'epoch': 0.62}
{'loss': 6.8125, 'learning_rate': 2.5006371924321145e-05, 'epoch': 0.62}
{'loss': 6.1904, 'learning_rate': 2.50004901480247e-05, 'epoch': 0.62}
 21%|███████████████▏                                                          | 2202/10701 [2:35:01<4:20:43,  1.84s/it]
{'loss': 6.9823, 'learning_rate': 2.4994608371728264e-05, 'epoch': 0.62}
{'loss': 6.9227, 'learning_rate': 2.499166748358004e-05, 'epoch': 0.62}
{'loss': 6.6842, 'learning_rate': 2.498872659543182e-05, 'epoch': 0.62}
{'loss': 6.8098, 'learning_rate': 2.4985785707283603e-05, 'epoch': 0.62}
{'loss': 6.7564, 'learning_rate': 2.498284481913538e-05, 'epoch': 0.62}
{'loss': 6.8285, 'learning_rate': 2.497990393098716e-05, 'epoch': 0.62}
{'loss': 6.8036, 'learning_rate': 2.497696304283894e-05, 'epoch': 0.62}
{'loss': 6.8416, 'learning_rate': 2.4974022154690715e-05, 'epoch': 0.62}
{'loss': 6.7425, 'learning_rate': 2.49710812665425e-05, 'epoch': 0.62}
{'loss': 7.1926, 'learning_rate': 2.4968140378394275e-05, 'epoch': 0.62}
{'loss': 6.8501, 'learning_rate': 2.4965199490246055e-05, 'epoch': 0.62}
{'loss': 6.7603, 'learning_rate': 2.4962258602097834e-05, 'epoch': 0.62}
{'loss': 6.9305, 'learning_rate': 2.495931771394961e-05, 'epoch': 0.62}
{'loss': 6.5075, 'learning_rate': 2.4956376825801394e-05, 'epoch': 0.62}
{'loss': 6.727, 'learning_rate': 2.4953435937653173e-05, 'epoch': 0.62}
{'loss': 6.9947, 'learning_rate': 2.495049504950495e-05, 'epoch': 0.62}
{'loss': 6.9936, 'learning_rate': 2.494755416135673e-05, 'epoch': 0.62}
{'loss': 6.6914, 'learning_rate': 2.494461327320851e-05, 'epoch': 0.62}
{'loss': 6.8348, 'learning_rate': 2.494167238506029e-05, 'epoch': 0.62}
{'loss': 6.772, 'learning_rate': 2.493873149691207e-05, 'epoch': 0.62}
{'loss': 6.7343, 'learning_rate': 2.493579060876385e-05, 'epoch': 0.62}
{'loss': 6.8278, 'learning_rate': 2.4932849720615625e-05, 'epoch': 0.62}
{'loss': 6.4863, 'learning_rate': 2.4929908832467408e-05, 'epoch': 0.62}
{'loss': 6.9877, 'learning_rate': 2.4926967944319188e-05, 'epoch': 0.62}
{'loss': 6.7837, 'learning_rate': 2.4924027056170964e-05, 'epoch': 0.62}
{'loss': 6.6712, 'learning_rate': 2.4921086168022744e-05, 'epoch': 0.62}
{'loss': 6.8164, 'learning_rate': 2.491814527987452e-05, 'epoch': 0.62}
{'loss': 7.2298, 'learning_rate': 2.4915204391726303e-05, 'epoch': 0.62}
{'loss': 6.7221, 'learning_rate': 2.4912263503578083e-05, 'epoch': 0.63}
{'loss': 6.9758, 'learning_rate': 2.490932261542986e-05, 'epoch': 0.63}
{'loss': 7.0163, 'learning_rate': 2.490638172728164e-05, 'epoch': 0.63}
{'loss': 6.8394, 'learning_rate': 2.490344083913342e-05, 'epoch': 0.63}
{'loss': 6.7994, 'learning_rate': 2.49004999509852e-05, 'epoch': 0.63}
{'loss': 6.9099, 'learning_rate': 2.4897559062836978e-05, 'epoch': 0.63}
{'loss': 6.7927, 'learning_rate': 2.4894618174688758e-05, 'epoch': 0.63}
{'loss': 6.6468, 'learning_rate': 2.4891677286540534e-05, 'epoch': 0.63}
 21%|███████████████▍                                                          | 2239/10701 [2:36:19<4:21:03,  1.85s/it]
{'loss': 6.9126, 'learning_rate': 2.4885795510244097e-05, 'epoch': 0.63}
{'loss': 6.8136, 'learning_rate': 2.4882854622095873e-05, 'epoch': 0.63}
{'loss': 6.738, 'learning_rate': 2.4879913733947653e-05, 'epoch': 0.63}
{'loss': 6.8697, 'learning_rate': 2.487697284579943e-05, 'epoch': 0.63}
{'loss': 7.039, 'learning_rate': 2.4874031957651212e-05, 'epoch': 0.63}
{'loss': 6.7648, 'learning_rate': 2.4871091069502992e-05, 'epoch': 0.63}
{'loss': 6.9251, 'learning_rate': 2.486815018135477e-05, 'epoch': 0.63}
{'loss': 6.5631, 'learning_rate': 2.4865209293206548e-05, 'epoch': 0.63}
 21%|███████████████▌                                                          | 2248/10701 [2:36:33<3:18:55,  1.41s/it]
{'loss': 6.6605, 'learning_rate': 2.4859327516910108e-05, 'epoch': 0.63}
{'loss': 6.3269, 'learning_rate': 2.4856386628761887e-05, 'epoch': 0.63}
{'loss': 6.0006, 'learning_rate': 2.4853445740613667e-05, 'epoch': 0.63}
{'loss': 6.888, 'learning_rate': 2.4850504852465443e-05, 'epoch': 0.63}
{'loss': 7.107, 'learning_rate': 2.4847563964317223e-05, 'epoch': 0.63}
{'loss': 6.9466, 'learning_rate': 2.4844623076169006e-05, 'epoch': 0.63}
{'loss': 7.0241, 'learning_rate': 2.4841682188020783e-05, 'epoch': 0.63}
{'loss': 6.9211, 'learning_rate': 2.4838741299872562e-05, 'epoch': 0.63}
{'loss': 6.8263, 'learning_rate': 2.483580041172434e-05, 'epoch': 0.63}
{'loss': 6.8122, 'learning_rate': 2.4832859523576122e-05, 'epoch': 0.63}
{'loss': 6.9481, 'learning_rate': 2.48299186354279e-05, 'epoch': 0.63}
{'loss': 6.8112, 'learning_rate': 2.4826977747279678e-05, 'epoch': 0.63}
{'loss': 7.2194, 'learning_rate': 2.4824036859131458e-05, 'epoch': 0.63}
{'loss': 6.8623, 'learning_rate': 2.4821095970983237e-05, 'epoch': 0.63}
{'loss': 6.6788, 'learning_rate': 2.4818155082835017e-05, 'epoch': 0.63}
{'loss': 6.8644, 'learning_rate': 2.4815214194686797e-05, 'epoch': 0.63}
{'loss': 6.9722, 'learning_rate': 2.4812273306538577e-05, 'epoch': 0.63}
 21%|███████████████▋                                                          | 2266/10701 [2:37:11<5:07:17,  2.19s/it]
{'loss': 7.1041, 'learning_rate': 2.4806391530242133e-05, 'epoch': 0.64}
{'loss': 6.8392, 'learning_rate': 2.4803450642093916e-05, 'epoch': 0.64}
{'loss': 6.7503, 'learning_rate': 2.4800509753945692e-05, 'epoch': 0.64}
{'loss': 6.7762, 'learning_rate': 2.4797568865797472e-05, 'epoch': 0.64}
{'loss': 7.0508, 'learning_rate': 2.4794627977649248e-05, 'epoch': 0.64}
{'loss': 7.1428, 'learning_rate': 2.4791687089501028e-05, 'epoch': 0.64}
{'loss': 6.9214, 'learning_rate': 2.478874620135281e-05, 'epoch': 0.64}
{'loss': 6.6641, 'learning_rate': 2.4785805313204587e-05, 'epoch': 0.64}
{'loss': 6.9814, 'learning_rate': 2.4782864425056367e-05, 'epoch': 0.64}
{'loss': 6.6925, 'learning_rate': 2.4779923536908147e-05, 'epoch': 0.64}
{'loss': 6.7863, 'learning_rate': 2.4776982648759926e-05, 'epoch': 0.64}
{'loss': 7.2637, 'learning_rate': 2.4774041760611706e-05, 'epoch': 0.64}
{'loss': 6.8437, 'learning_rate': 2.4771100872463486e-05, 'epoch': 0.64}
{'loss': 6.968, 'learning_rate': 2.4768159984315262e-05, 'epoch': 0.64}
 21%|███████████████▊                                                          | 2281/10701 [2:37:42<4:44:06,  2.02s/it]
{'loss': 7.0874, 'learning_rate': 2.4762278208018825e-05, 'epoch': 0.64}
{'loss': 6.6583, 'learning_rate': 2.47593373198706e-05, 'epoch': 0.64}
{'loss': 6.8549, 'learning_rate': 2.475639643172238e-05, 'epoch': 0.64}
{'loss': 6.8765, 'learning_rate': 2.475345554357416e-05, 'epoch': 0.64}
{'loss': 6.879, 'learning_rate': 2.4750514655425937e-05, 'epoch': 0.64}
{'loss': 6.6174, 'learning_rate': 2.474757376727772e-05, 'epoch': 0.64}
{'loss': 6.9599, 'learning_rate': 2.4744632879129497e-05, 'epoch': 0.64}
{'loss': 6.5322, 'learning_rate': 2.4741691990981276e-05, 'epoch': 0.64}
{'loss': 7.0712, 'learning_rate': 2.4738751102833056e-05, 'epoch': 0.64}
{'loss': 6.808, 'learning_rate': 2.4735810214684836e-05, 'epoch': 0.64}
{'loss': 7.1345, 'learning_rate': 2.4732869326536616e-05, 'epoch': 0.64}
{'loss': 7.086, 'learning_rate': 2.4729928438388395e-05, 'epoch': 0.64}
{'loss': 6.8686, 'learning_rate': 2.472698755024017e-05, 'epoch': 0.64}
 21%|███████████████▊                                                          | 2295/10701 [2:38:06<3:31:00,  1.51s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 12:22:50,196 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6613, 'learning_rate': 2.471816488579551e-05, 'epoch': 0.64}
[WARNING|modeling_utils.py:388] 2022-03-02 12:22:52,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5487, 'learning_rate': 2.471228310949907e-05, 'epoch': 0.64}
{'loss': 6.5772, 'learning_rate': 2.4709342221350847e-05, 'epoch': 0.64}
{'loss': 6.7512, 'learning_rate': 2.470640133320263e-05, 'epoch': 0.64}
{'loss': 6.6604, 'learning_rate': 2.4703460445054406e-05, 'epoch': 0.64}
{'loss': 7.0851, 'learning_rate': 2.4700519556906186e-05, 'epoch': 0.65}
{'loss': 6.9067, 'learning_rate': 2.4697578668757966e-05, 'epoch': 0.65}
{'loss': 7.1143, 'learning_rate': 2.4694637780609742e-05, 'epoch': 0.65}
{'loss': 6.8595, 'learning_rate': 2.4691696892461525e-05, 'epoch': 0.65}
{'loss': 6.8289, 'learning_rate': 2.4688756004313305e-05, 'epoch': 0.65}
{'loss': 6.8376, 'learning_rate': 2.468581511616508e-05, 'epoch': 0.65}
{'loss': 6.9008, 'learning_rate': 2.468287422801686e-05, 'epoch': 0.65}
{'loss': 6.7559, 'learning_rate': 2.4679933339868644e-05, 'epoch': 0.65}
{'loss': 6.6985, 'learning_rate': 2.467699245172042e-05, 'epoch': 0.65}
{'loss': 6.8373, 'learning_rate': 2.46740515635722e-05, 'epoch': 0.65}
{'loss': 7.2238, 'learning_rate': 2.467111067542398e-05, 'epoch': 0.65}
{'loss': 6.4143, 'learning_rate': 2.4668169787275756e-05, 'epoch': 0.65}
{'loss': 6.8707, 'learning_rate': 2.466522889912754e-05, 'epoch': 0.65}
{'loss': 6.5043, 'learning_rate': 2.4662288010979315e-05, 'epoch': 0.65}
{'loss': 7.1419, 'learning_rate': 2.4659347122831095e-05, 'epoch': 0.65}
{'loss': 6.9513, 'learning_rate': 2.4656406234682875e-05, 'epoch': 0.65}
{'loss': 6.7298, 'learning_rate': 2.465346534653465e-05, 'epoch': 0.65}
{'loss': 6.7767, 'learning_rate': 2.4650524458386434e-05, 'epoch': 0.65}
{'loss': 6.7147, 'learning_rate': 2.4647583570238214e-05, 'epoch': 0.65}
{'loss': 6.8178, 'learning_rate': 2.464464268208999e-05, 'epoch': 0.65}
{'loss': 6.9289, 'learning_rate': 2.464170179394177e-05, 'epoch': 0.65}
{'loss': 7.0849, 'learning_rate': 2.463876090579355e-05, 'epoch': 0.65}
{'loss': 6.775, 'learning_rate': 2.463582001764533e-05, 'epoch': 0.65}
{'loss': 7.0105, 'learning_rate': 2.463287912949711e-05, 'epoch': 0.65}
{'loss': 6.5959, 'learning_rate': 2.462993824134889e-05, 'epoch': 0.65}
{'loss': 6.6418, 'learning_rate': 2.4626997353200665e-05, 'epoch': 0.65}
{'loss': 6.9249, 'learning_rate': 2.462405646505245e-05, 'epoch': 0.65}
{'loss': 6.5713, 'learning_rate': 2.4621115576904228e-05, 'epoch': 0.65}
{'loss': 6.6784, 'learning_rate': 2.4618174688756005e-05, 'epoch': 0.65}
{'loss': 6.9778, 'learning_rate': 2.4615233800607784e-05, 'epoch': 0.65}
{'loss': 7.058, 'learning_rate': 2.461229291245956e-05, 'epoch': 0.65}
{'loss': 6.9259, 'learning_rate': 2.4609352024311344e-05, 'epoch': 0.65}
{'loss': 6.8505, 'learning_rate': 2.4606411136163123e-05, 'epoch': 0.65}
{'loss': 6.8393, 'learning_rate': 2.46034702480149e-05, 'epoch': 0.65}
{'loss': 6.9141, 'learning_rate': 2.460052935986668e-05, 'epoch': 0.65}
{'loss': 6.989, 'learning_rate': 2.459758847171846e-05, 'epoch': 0.66}
{'loss': 6.9925, 'learning_rate': 2.459464758357024e-05, 'epoch': 0.66}
{'loss': 6.8841, 'learning_rate': 2.459170669542202e-05, 'epoch': 0.66}
{'loss': 6.7949, 'learning_rate': 2.45887658072738e-05, 'epoch': 0.66}
{'loss': 6.614, 'learning_rate': 2.4585824919125575e-05, 'epoch': 0.66}
{'loss': 6.6697, 'learning_rate': 2.4582884030977354e-05, 'epoch': 0.66}
{'loss': 6.9234, 'learning_rate': 2.4579943142829138e-05, 'epoch': 0.66}
 22%|████████████████▏                                                         | 2345/10701 [2:39:45<3:35:18,  1.55s/it]
{'loss': 6.5758, 'learning_rate': 2.4574061366532694e-05, 'epoch': 0.66}
[WARNING|modeling_utils.py:388] 2022-03-02 12:24:29,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4825, 'learning_rate': 2.4568179590236253e-05, 'epoch': 0.66}
{'loss': 6.43, 'learning_rate': 2.4565238702088033e-05, 'epoch': 0.66}
 22%|████████████████▎                                                         | 2350/10701 [2:39:52<3:10:37,  1.37s/it]
{'loss': 6.4602, 'learning_rate': 2.455935692579159e-05, 'epoch': 0.66}
{'loss': 7.1339, 'learning_rate': 2.455641603764337e-05, 'epoch': 0.66}
{'loss': 7.078, 'learning_rate': 2.455347514949515e-05, 'epoch': 0.66}
{'loss': 6.9944, 'learning_rate': 2.4550534261346928e-05, 'epoch': 0.66}
{'loss': 6.7597, 'learning_rate': 2.4547593373198708e-05, 'epoch': 0.66}
{'loss': 7.0405, 'learning_rate': 2.4544652485050484e-05, 'epoch': 0.66}
{'loss': 7.2211, 'learning_rate': 2.4541711596902264e-05, 'epoch': 0.66}
{'loss': 7.0248, 'learning_rate': 2.4538770708754047e-05, 'epoch': 0.66}
{'loss': 7.1351, 'learning_rate': 2.4535829820605823e-05, 'epoch': 0.66}
{'loss': 6.7603, 'learning_rate': 2.4532888932457603e-05, 'epoch': 0.66}
{'loss': 6.7422, 'learning_rate': 2.452994804430938e-05, 'epoch': 0.66}
{'loss': 6.9833, 'learning_rate': 2.4527007156161163e-05, 'epoch': 0.66}
{'loss': 6.809, 'learning_rate': 2.4524066268012942e-05, 'epoch': 0.66}
{'loss': 6.6803, 'learning_rate': 2.452112537986472e-05, 'epoch': 0.66}
{'loss': 6.8838, 'learning_rate': 2.4518184491716498e-05, 'epoch': 0.66}
{'loss': 6.81, 'learning_rate': 2.4515243603568278e-05, 'epoch': 0.66}
{'loss': 7.1666, 'learning_rate': 2.4512302715420058e-05, 'epoch': 0.66}
 22%|████████████████▍                                                         | 2368/10701 [2:40:32<5:02:10,  2.18s/it]
{'loss': 6.8121, 'learning_rate': 2.4506420939123617e-05, 'epoch': 0.66}
{'loss': 6.6207, 'learning_rate': 2.4503480050975394e-05, 'epoch': 0.66}
{'loss': 6.8538, 'learning_rate': 2.4500539162827173e-05, 'epoch': 0.66}
{'loss': 6.9726, 'learning_rate': 2.4497598274678956e-05, 'epoch': 0.66}
{'loss': 7.0451, 'learning_rate': 2.4494657386530733e-05, 'epoch': 0.66}
{'loss': 6.6633, 'learning_rate': 2.4491716498382512e-05, 'epoch': 0.67}
{'loss': 7.332, 'learning_rate': 2.4488775610234292e-05, 'epoch': 0.67}
{'loss': 6.7245, 'learning_rate': 2.448583472208607e-05, 'epoch': 0.67}
{'loss': 6.7817, 'learning_rate': 2.448289383393785e-05, 'epoch': 0.67}
{'loss': 6.7503, 'learning_rate': 2.4479952945789628e-05, 'epoch': 0.67}
{'loss': 6.7526, 'learning_rate': 2.4477012057641408e-05, 'epoch': 0.67}
{'loss': 7.1267, 'learning_rate': 2.4474071169493187e-05, 'epoch': 0.67}
{'loss': 6.6105, 'learning_rate': 2.4471130281344967e-05, 'epoch': 0.67}
{'loss': 6.6201, 'learning_rate': 2.4468189393196747e-05, 'epoch': 0.67}
{'loss': 6.6949, 'learning_rate': 2.4465248505048527e-05, 'epoch': 0.67}
{'loss': 6.8304, 'learning_rate': 2.4462307616900303e-05, 'epoch': 0.67}
 22%|████████████████▍                                                         | 2385/10701 [2:41:06<4:22:53,  1.90s/it]
{'loss': 6.5518, 'learning_rate': 2.4456425840603866e-05, 'epoch': 0.67}
{'loss': 6.8829, 'learning_rate': 2.4453484952455642e-05, 'epoch': 0.67}
{'loss': 6.7153, 'learning_rate': 2.4450544064307422e-05, 'epoch': 0.67}
{'loss': 6.9642, 'learning_rate': 2.44476031761592e-05, 'epoch': 0.67}
{'loss': 6.7756, 'learning_rate': 2.4444662288010978e-05, 'epoch': 0.67}
{'loss': 6.8353, 'learning_rate': 2.444172139986276e-05, 'epoch': 0.67}
{'loss': 6.6284, 'learning_rate': 2.4438780511714537e-05, 'epoch': 0.67}
{'loss': 6.8786, 'learning_rate': 2.4435839623566317e-05, 'epoch': 0.67}
{'loss': 6.7151, 'learning_rate': 2.4432898735418097e-05, 'epoch': 0.67}
 22%|████████████████▌                                                         | 2395/10701 [2:41:22<3:20:11,  1.45s/it]
{'loss': 6.6631, 'learning_rate': 2.4427016959121656e-05, 'epoch': 0.67}
[WARNING|modeling_utils.py:388] 2022-03-02 12:26:05,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.5131, 'learning_rate': 2.4421135182825212e-05, 'epoch': 0.67}
{'loss': 6.8408, 'learning_rate': 2.4418194294676992e-05, 'epoch': 0.67}
 22%|████████████████▌                                                         | 2400/10701 [2:41:28<3:02:47,  1.32s/it]
{'loss': 5.8935, 'learning_rate': 2.441231251838055e-05, 'epoch': 0.67}
{'loss': 6.8596, 'learning_rate': 2.440937163023233e-05, 'epoch': 0.67}
{'loss': 6.8626, 'learning_rate': 2.440643074208411e-05, 'epoch': 0.67}
{'loss': 7.1686, 'learning_rate': 2.4403489853935887e-05, 'epoch': 0.67}
{'loss': 6.7454, 'learning_rate': 2.440054896578767e-05, 'epoch': 0.67}
{'loss': 6.8692, 'learning_rate': 2.4397608077639447e-05, 'epoch': 0.67}
{'loss': 6.8024, 'learning_rate': 2.4394667189491226e-05, 'epoch': 0.67}
{'loss': 6.7854, 'learning_rate': 2.4391726301343006e-05, 'epoch': 0.67}
{'loss': 7.3237, 'learning_rate': 2.4388785413194783e-05, 'epoch': 0.67}
{'loss': 6.9281, 'learning_rate': 2.4385844525046566e-05, 'epoch': 0.68}
{'loss': 6.937, 'learning_rate': 2.4382903636898345e-05, 'epoch': 0.68}
{'loss': 6.9007, 'learning_rate': 2.4379962748750122e-05, 'epoch': 0.68}
{'loss': 6.7196, 'learning_rate': 2.43770218606019e-05, 'epoch': 0.68}
{'loss': 6.8709, 'learning_rate': 2.437408097245368e-05, 'epoch': 0.68}
{'loss': 7.0748, 'learning_rate': 2.437114008430546e-05, 'epoch': 0.68}
{'loss': 6.978, 'learning_rate': 2.436819919615724e-05, 'epoch': 0.68}
{'loss': 6.5606, 'learning_rate': 2.436525830800902e-05, 'epoch': 0.68}
 23%|████████████████▋                                                         | 2418/10701 [2:42:09<4:59:36,  2.17s/it]
{'loss': 6.5752, 'learning_rate': 2.435937653171258e-05, 'epoch': 0.68}
{'loss': 6.565, 'learning_rate': 2.435643564356436e-05, 'epoch': 0.68}
{'loss': 7.0503, 'learning_rate': 2.4353494755416136e-05, 'epoch': 0.68}
{'loss': 6.8645, 'learning_rate': 2.4350553867267916e-05, 'epoch': 0.68}
 23%|████████████████▊                                                         | 2423/10701 [2:42:19<4:52:12,  2.12s/it]
{'loss': 6.9208, 'learning_rate': 2.4344672090971475e-05, 'epoch': 0.68}
{'loss': 6.9452, 'learning_rate': 2.4341731202823255e-05, 'epoch': 0.68}
{'loss': 6.8359, 'learning_rate': 2.433879031467503e-05, 'epoch': 0.68}
{'loss': 6.9807, 'learning_rate': 2.433584942652681e-05, 'epoch': 0.68}
{'loss': 6.9309, 'learning_rate': 2.433290853837859e-05, 'epoch': 0.68}
{'loss': 6.825, 'learning_rate': 2.432996765023037e-05, 'epoch': 0.68}
{'loss': 7.2358, 'learning_rate': 2.432702676208215e-05, 'epoch': 0.68}
{'loss': 6.9943, 'learning_rate': 2.432408587393393e-05, 'epoch': 0.68}
{'loss': 6.7143, 'learning_rate': 2.4321144985785706e-05, 'epoch': 0.68}
{'loss': 6.718, 'learning_rate': 2.431820409763749e-05, 'epoch': 0.68}
{'loss': 7.0171, 'learning_rate': 2.431526320948927e-05, 'epoch': 0.68}
{'loss': 6.8243, 'learning_rate': 2.4312322321341045e-05, 'epoch': 0.68}
{'loss': 7.2428, 'learning_rate': 2.4309381433192825e-05, 'epoch': 0.68}
{'loss': 7.0308, 'learning_rate': 2.43064405450446e-05, 'epoch': 0.68}
{'loss': 6.861, 'learning_rate': 2.4303499656896384e-05, 'epoch': 0.68}
{'loss': 6.7746, 'learning_rate': 2.4300558768748164e-05, 'epoch': 0.68}
{'loss': 6.8503, 'learning_rate': 2.429761788059994e-05, 'epoch': 0.68}
{'loss': 7.0433, 'learning_rate': 2.429467699245172e-05, 'epoch': 0.68}
{'loss': 6.7942, 'learning_rate': 2.42917361043035e-05, 'epoch': 0.68}
 23%|████████████████▉                                                         | 2443/10701 [2:42:57<3:50:29,  1.67s/it]
{'loss': 6.8193, 'learning_rate': 2.428585432800706e-05, 'epoch': 0.68}
{'loss': 6.8062, 'learning_rate': 2.428291343985884e-05, 'epoch': 0.69}
 23%|████████████████▉                                                         | 2446/10701 [2:43:01<3:18:21,  1.44s/it]
{'loss': 6.9323, 'learning_rate': 2.4277031663562395e-05, 'epoch': 0.69}
[WARNING|modeling_utils.py:388] 2022-03-02 12:27:45,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6066, 'learning_rate': 2.4271149887265955e-05, 'epoch': 0.69}
{'loss': 6.9802, 'learning_rate': 2.4268208999117734e-05, 'epoch': 0.69}
{'loss': 6.3601, 'learning_rate': 2.426526811096951e-05, 'epoch': 0.69}
{'loss': 7.0317, 'learning_rate': 2.4262327222821294e-05, 'epoch': 0.69}
{'loss': 6.7078, 'learning_rate': 2.4259386334673074e-05, 'epoch': 0.69}
{'loss': 6.6922, 'learning_rate': 2.425644544652485e-05, 'epoch': 0.69}
{'loss': 6.6744, 'learning_rate': 2.425350455837663e-05, 'epoch': 0.69}
{'loss': 6.9788, 'learning_rate': 2.425056367022841e-05, 'epoch': 0.69}
{'loss': 6.7912, 'learning_rate': 2.424762278208019e-05, 'epoch': 0.69}
{'loss': 6.7996, 'learning_rate': 2.424468189393197e-05, 'epoch': 0.69}
{'loss': 6.8141, 'learning_rate': 2.424174100578375e-05, 'epoch': 0.69}
{'loss': 6.8794, 'learning_rate': 2.4238800117635525e-05, 'epoch': 0.69}
{'loss': 6.937, 'learning_rate': 2.4235859229487305e-05, 'epoch': 0.69}
{'loss': 6.6495, 'learning_rate': 2.4232918341339088e-05, 'epoch': 0.69}
{'loss': 6.8937, 'learning_rate': 2.4229977453190864e-05, 'epoch': 0.69}
{'loss': 6.7284, 'learning_rate': 2.4227036565042644e-05, 'epoch': 0.69}
{'loss': 7.0291, 'learning_rate': 2.4224095676894423e-05, 'epoch': 0.69}
{'loss': 6.9037, 'learning_rate': 2.4221154788746203e-05, 'epoch': 0.69}
{'loss': 6.894, 'learning_rate': 2.4218213900597983e-05, 'epoch': 0.69}
{'loss': 6.6869, 'learning_rate': 2.421527301244976e-05, 'epoch': 0.69}
{'loss': 6.7534, 'learning_rate': 2.421233212430154e-05, 'epoch': 0.69}
{'loss': 6.7065, 'learning_rate': 2.420939123615332e-05, 'epoch': 0.69}
{'loss': 7.0475, 'learning_rate': 2.42064503480051e-05, 'epoch': 0.69}
{'loss': 6.8591, 'learning_rate': 2.4203509459856878e-05, 'epoch': 0.69}
{'loss': 6.8997, 'learning_rate': 2.4200568571708658e-05, 'epoch': 0.69}
{'loss': 7.0664, 'learning_rate': 2.4197627683560434e-05, 'epoch': 0.69}
{'loss': 7.2093, 'learning_rate': 2.4194686795412214e-05, 'epoch': 0.69}
{'loss': 6.724, 'learning_rate': 2.4191745907263997e-05, 'epoch': 0.69}
{'loss': 6.5666, 'learning_rate': 2.4188805019115773e-05, 'epoch': 0.69}
{'loss': 6.7341, 'learning_rate': 2.4185864130967553e-05, 'epoch': 0.69}
{'loss': 6.8194, 'learning_rate': 2.4182923242819333e-05, 'epoch': 0.69}
{'loss': 6.8252, 'learning_rate': 2.417998235467111e-05, 'epoch': 0.69}
{'loss': 7.117, 'learning_rate': 2.4177041466522892e-05, 'epoch': 0.7}
{'loss': 6.9878, 'learning_rate': 2.417410057837467e-05, 'epoch': 0.7}
{'loss': 6.6975, 'learning_rate': 2.417115969022645e-05, 'epoch': 0.7}
{'loss': 6.7592, 'learning_rate': 2.4168218802078228e-05, 'epoch': 0.7}
{'loss': 7.0083, 'learning_rate': 2.4165277913930008e-05, 'epoch': 0.7}
{'loss': 6.7403, 'learning_rate': 2.4162337025781788e-05, 'epoch': 0.7}
{'loss': 6.9354, 'learning_rate': 2.4159396137633567e-05, 'epoch': 0.7}
{'loss': 6.6941, 'learning_rate': 2.4156455249485344e-05, 'epoch': 0.7}
{'loss': 6.925, 'learning_rate': 2.4153514361337123e-05, 'epoch': 0.7}
{'loss': 6.8178, 'learning_rate': 2.4150573473188906e-05, 'epoch': 0.7}
{'loss': 6.9246, 'learning_rate': 2.4147632585040683e-05, 'epoch': 0.7}
{'loss': 7.0038, 'learning_rate': 2.4144691696892462e-05, 'epoch': 0.7}
{'loss': 7.0296, 'learning_rate': 2.4141750808744242e-05, 'epoch': 0.7}
 23%|█████████████████▏                                                        | 2494/10701 [2:44:37<3:38:00,  1.59s/it]
{'loss': 6.5555, 'learning_rate': 2.41358690324478e-05, 'epoch': 0.7}
{'loss': 6.5844, 'learning_rate': 2.4132928144299578e-05, 'epoch': 0.7}
 23%|█████████████████▎                                                        | 2497/10701 [2:44:41<3:11:53,  1.40s/it]
{'loss': 6.8514, 'learning_rate': 2.4127046368003137e-05, 'epoch': 0.7}
{'loss': 6.3344, 'learning_rate': 2.4124105479854914e-05, 'epoch': 0.7}
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8859, 'learning_rate': 2.4118223703558477e-05, 'epoch': 0.7}
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 23%|█████████████████▎                                                        | 2500/10701 [2:44:45<3:07:46,  1.37s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
03/02/2022 12:54:38 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
{'eval_loss': 6.7275872230529785, 'eval_wer': 1.4950096235887056, 'eval_runtime': 1512.5692, 'eval_samples_per_second': 1.747, 'eval_steps_per_second': 0.437, 'epoch': 0.7}
{'loss': 6.8606, 'learning_rate': 2.4115282815410253e-05, 'epoch': 0.7}
{'loss': 6.7386, 'learning_rate': 2.4112341927262033e-05, 'epoch': 0.7}
{'loss': 6.7672, 'learning_rate': 2.4109401039113816e-05, 'epoch': 0.7}
{'loss': 6.9722, 'learning_rate': 2.4106460150965592e-05, 'epoch': 0.7}
{'loss': 6.9818, 'learning_rate': 2.4103519262817372e-05, 'epoch': 0.7}
{'loss': 7.0692, 'learning_rate': 2.410057837466915e-05, 'epoch': 0.7}
{'loss': 6.5583, 'learning_rate': 2.4097637486520928e-05, 'epoch': 0.7}
{'loss': 6.77, 'learning_rate': 2.409469659837271e-05, 'epoch': 0.7}
{'loss': 6.7405, 'learning_rate': 2.4091755710224487e-05, 'epoch': 0.7}
{'loss': 6.895, 'learning_rate': 2.4088814822076267e-05, 'epoch': 0.7}
{'loss': 7.0186, 'learning_rate': 2.4085873933928047e-05, 'epoch': 0.7}
{'loss': 6.7219, 'learning_rate': 2.4082933045779823e-05, 'epoch': 0.7}
{'loss': 6.9566, 'learning_rate': 2.4079992157631606e-05, 'epoch': 0.7}
{'loss': 6.8832, 'learning_rate': 2.4077051269483386e-05, 'epoch': 0.7}
{'loss': 7.0101, 'learning_rate': 2.4074110381335162e-05, 'epoch': 0.7}
{'loss': 7.1824, 'learning_rate': 2.4071169493186942e-05, 'epoch': 0.71}
{'loss': 6.6849, 'learning_rate': 2.4068228605038722e-05, 'epoch': 0.71}
{'loss': 7.072, 'learning_rate': 2.40652877168905e-05, 'epoch': 0.71}
{'loss': 6.7548, 'learning_rate': 2.406234682874228e-05, 'epoch': 0.71}
{'loss': 6.9246, 'learning_rate': 2.405940594059406e-05, 'epoch': 0.71}
{'loss': 6.864, 'learning_rate': 2.4056465052445837e-05, 'epoch': 0.71}
{'loss': 6.4563, 'learning_rate': 2.405352416429762e-05, 'epoch': 0.71}
{'loss': 6.6535, 'learning_rate': 2.40505832761494e-05, 'epoch': 0.71}
{'loss': 6.8977, 'learning_rate': 2.4047642388001177e-05, 'epoch': 0.71}
{'loss': 6.9476, 'learning_rate': 2.4044701499852956e-05, 'epoch': 0.71}
{'loss': 6.8524, 'learning_rate': 2.4041760611704733e-05, 'epoch': 0.71}
{'loss': 7.0802, 'learning_rate': 2.4038819723556516e-05, 'epoch': 0.71}
{'loss': 6.9204, 'learning_rate': 2.4035878835408295e-05, 'epoch': 0.71}
{'loss': 7.0114, 'learning_rate': 2.4032937947260072e-05, 'epoch': 0.71}
{'loss': 6.7017, 'learning_rate': 2.402999705911185e-05, 'epoch': 0.71}
{'loss': 6.9901, 'learning_rate': 2.402705617096363e-05, 'epoch': 0.71}
{'loss': 6.6207, 'learning_rate': 2.402411528281541e-05, 'epoch': 0.71}
{'loss': 6.7629, 'learning_rate': 2.402117439466719e-05, 'epoch': 0.71}
{'loss': 6.6962, 'learning_rate': 2.401823350651897e-05, 'epoch': 0.71}
{'loss': 6.6469, 'learning_rate': 2.4015292618370747e-05, 'epoch': 0.71}
{'loss': 6.9103, 'learning_rate': 2.401235173022253e-05, 'epoch': 0.71}
{'loss': 7.0212, 'learning_rate': 2.400941084207431e-05, 'epoch': 0.71}
{'loss': 6.7428, 'learning_rate': 2.4006469953926086e-05, 'epoch': 0.71}
{'loss': 6.7249, 'learning_rate': 2.4003529065777866e-05, 'epoch': 0.71}
{'loss': 6.929, 'learning_rate': 2.4000588177629642e-05, 'epoch': 0.71}
{'loss': 6.6335, 'learning_rate': 2.3997647289481425e-05, 'epoch': 0.71}
{'loss': 6.8545, 'learning_rate': 2.3994706401333205e-05, 'epoch': 0.71}
 24%|█████████████████▌                                                        | 2544/10701 [3:13:33<4:03:27,  1.79s/it][INFO|trainer.py:560] 2022-03-02 12:29:26,129 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.2894, 'learning_rate': 2.398882462503676e-05, 'epoch': 0.71}
{'loss': 6.5203, 'learning_rate': 2.398588373688854e-05, 'epoch': 0.71}
{'loss': 6.6533, 'learning_rate': 2.398294284874032e-05, 'epoch': 0.71}
 24%|█████████████████▌                                                        | 2548/10701 [3:13:39<3:23:50,  1.50s/it]
{'loss': 6.4511, 'learning_rate': 2.397706107244388e-05, 'epoch': 0.71}
{'loss': 6.3122, 'learning_rate': 2.3974120184295656e-05, 'epoch': 0.71}
{'loss': 6.612, 'learning_rate': 2.3971179296147436e-05, 'epoch': 0.71}
{'loss': 7.1896, 'learning_rate': 2.396823840799922e-05, 'epoch': 0.72}
{'loss': 6.861, 'learning_rate': 2.3965297519850995e-05, 'epoch': 0.72}
{'loss': 6.9684, 'learning_rate': 2.3962356631702775e-05, 'epoch': 0.72}
{'loss': 7.1394, 'learning_rate': 2.395941574355455e-05, 'epoch': 0.72}
{'loss': 6.8962, 'learning_rate': 2.3956474855406334e-05, 'epoch': 0.72}
{'loss': 6.8713, 'learning_rate': 2.3953533967258114e-05, 'epoch': 0.72}
{'loss': 6.8345, 'learning_rate': 2.395059307910989e-05, 'epoch': 0.72}
{'loss': 6.9139, 'learning_rate': 2.394765219096167e-05, 'epoch': 0.72}
{'loss': 6.9511, 'learning_rate': 2.394471130281345e-05, 'epoch': 0.72}
{'loss': 7.0635, 'learning_rate': 2.394177041466523e-05, 'epoch': 0.72}
 24%|█████████████████▋                                                        | 2562/10701 [3:14:10<5:12:44,  2.31s/it]
{'loss': 6.881, 'learning_rate': 2.393588863836879e-05, 'epoch': 0.72}
{'loss': 6.8575, 'learning_rate': 2.3932947750220565e-05, 'epoch': 0.72}
{'loss': 6.7461, 'learning_rate': 2.3930006862072345e-05, 'epoch': 0.72}
{'loss': 6.7338, 'learning_rate': 2.392706597392413e-05, 'epoch': 0.72}
{'loss': 6.8163, 'learning_rate': 2.3924125085775905e-05, 'epoch': 0.72}
{'loss': 6.8898, 'learning_rate': 2.3921184197627684e-05, 'epoch': 0.72}
{'loss': 6.6906, 'learning_rate': 2.3918243309479464e-05, 'epoch': 0.72}
{'loss': 6.6961, 'learning_rate': 2.391530242133124e-05, 'epoch': 0.72}
{'loss': 6.5963, 'learning_rate': 2.3912361533183024e-05, 'epoch': 0.72}
{'loss': 6.8234, 'learning_rate': 2.39094206450348e-05, 'epoch': 0.72}
{'loss': 6.841, 'learning_rate': 2.390647975688658e-05, 'epoch': 0.72}
{'loss': 6.5371, 'learning_rate': 2.390353886873836e-05, 'epoch': 0.72}
{'loss': 7.0885, 'learning_rate': 2.390059798059014e-05, 'epoch': 0.72}
{'loss': 6.8453, 'learning_rate': 2.389765709244192e-05, 'epoch': 0.72}
{'loss': 6.8613, 'learning_rate': 2.38947162042937e-05, 'epoch': 0.72}
{'loss': 6.8336, 'learning_rate': 2.3891775316145475e-05, 'epoch': 0.72}
{'loss': 7.2375, 'learning_rate': 2.3888834427997255e-05, 'epoch': 0.72}
{'loss': 6.4281, 'learning_rate': 2.3885893539849038e-05, 'epoch': 0.72}
{'loss': 7.0128, 'learning_rate': 2.3882952651700814e-05, 'epoch': 0.72}
{'loss': 6.9641, 'learning_rate': 2.3880011763552594e-05, 'epoch': 0.72}
{'loss': 6.9362, 'learning_rate': 2.3877070875404373e-05, 'epoch': 0.72}
{'loss': 6.9083, 'learning_rate': 2.387412998725615e-05, 'epoch': 0.72}
{'loss': 6.8365, 'learning_rate': 2.3871189099107933e-05, 'epoch': 0.72}
{'loss': 6.8831, 'learning_rate': 2.386824821095971e-05, 'epoch': 0.72}
{'loss': 7.0533, 'learning_rate': 2.386530732281149e-05, 'epoch': 0.72}
{'loss': 6.6844, 'learning_rate': 2.386236643466327e-05, 'epoch': 0.73}
{'loss': 6.7105, 'learning_rate': 2.385942554651505e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2589/10701 [3:15:06<4:09:53,  1.85s/it]
{'loss': 6.768, 'learning_rate': 2.3853543770218608e-05, 'epoch': 0.73}
{'loss': 6.9582, 'learning_rate': 2.3850602882070384e-05, 'epoch': 0.73}
{'loss': 6.6794, 'learning_rate': 2.3847661993922164e-05, 'epoch': 0.73}
{'loss': 6.5575, 'learning_rate': 2.3844721105773947e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2594/10701 [3:15:13<3:28:15,  1.54s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 12:59:57,303 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.4152, 'learning_rate': 2.3835898441329283e-05, 'epoch': 0.73}
{'loss': 6.4186, 'learning_rate': 2.383295755318106e-05, 'epoch': 0.73}
{'loss': 6.5704, 'learning_rate': 2.3830016665032842e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.0848, 'learning_rate': 2.38241348887364e-05, 'epoch': 0.73}
{'loss': 7.0668, 'learning_rate': 2.3821194000588178e-05, 'epoch': 0.73}
{'loss': 7.2609, 'learning_rate': 2.3818253112439954e-05, 'epoch': 0.73}
{'loss': 6.967, 'learning_rate': 2.3815312224291738e-05, 'epoch': 0.73}
{'loss': 6.8208, 'learning_rate': 2.3812371336143517e-05, 'epoch': 0.73}
{'loss': 6.6882, 'learning_rate': 2.3809430447995294e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9377, 'learning_rate': 2.3806489559847073e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8347, 'learning_rate': 2.3803548671698856e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7557, 'learning_rate': 2.3800607783550633e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8517, 'learning_rate': 2.3797666895402413e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7304, 'learning_rate': 2.3794726007254192e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8754, 'learning_rate': 2.379178511910597e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.774, 'learning_rate': 2.3788844230957752e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0194, 'learning_rate': 2.378590334280953e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7312, 'learning_rate': 2.3782962454661308e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0986, 'learning_rate': 2.3780021566513088e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.5186, 'learning_rate': 2.3777080678364864e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8469, 'learning_rate': 2.3774139790216647e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0811, 'learning_rate': 2.3771198902068427e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.5319, 'learning_rate': 2.3768258013920203e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6521, 'learning_rate': 2.3765317125771983e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6511, 'learning_rate': 2.3762376237623762e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.885, 'learning_rate': 2.3759435349475542e-05, 'epoch': 0.73}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6292, 'learning_rate': 2.3756494461327322e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.2157, 'learning_rate': 2.37535535731791e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9484, 'learning_rate': 2.3750612685030878e-05, 'epoch': 0.74}
{'loss': 6.6367, 'learning_rate': 2.374767179688266e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6773, 'learning_rate': 2.374473090873444e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7081, 'learning_rate': 2.3741790020586217e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.027, 'learning_rate': 2.3738849132437997e-05, 'epoch': 0.74}
{'loss': 6.7326, 'learning_rate': 2.3735908244289773e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8859, 'learning_rate': 2.3732967356141556e-05, 'epoch': 0.74}
{'loss': 7.0325, 'learning_rate': 2.3730026467993336e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9581, 'learning_rate': 2.3727085579845112e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7137, 'learning_rate': 2.3724144691696892e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7129, 'learning_rate': 2.3721203803548672e-05, 'epoch': 0.74}
{'loss': 6.7425, 'learning_rate': 2.371826291540045e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7668, 'learning_rate': 2.371532202725223e-05, 'epoch': 0.74}
 24%|█████████████████▉                                                        | 2599/10701 [3:15:19<2:42:05,  1.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.868, 'learning_rate': 2.371238113910401e-05, 'epoch': 0.74}
 25%|██████████████████▏                                                       | 2639/10701 [3:16:43<4:05:56,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9415, 'learning_rate': 2.370649936280757e-05, 'epoch': 0.74}
 25%|██████████████████▏                                                       | 2639/10701 [3:16:43<4:05:56,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7656, 'learning_rate': 2.370355847465935e-05, 'epoch': 0.74}
 25%|██████████████████▏                                                       | 2639/10701 [3:16:43<4:05:56,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 25%|██████████████████▎                                                       | 2643/10701 [3:16:49<3:36:52,  1.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.5923, 'learning_rate': 2.3697676698362906e-05, 'epoch': 0.74}
{'loss': 6.5312, 'learning_rate': 2.3694735810214683e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2643/10701 [3:16:49<3:36:52,  1.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 25%|██████████████████▎                                                       | 2646/10701 [3:16:53<3:12:16,  1.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.57, 'learning_rate': 2.3688854033918245e-05, 'epoch': 0.74}
{'loss': 6.9269, 'learning_rate': 2.3685913145770022e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2646/10701 [3:16:53<3:12:16,  1.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:00:01,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.1079, 'learning_rate': 2.36829722576218e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2649/10701 [3:16:57<2:47:06,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.0438, 'learning_rate': 2.367709048132536e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2649/10701 [3:16:57<2:47:06,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7649, 'learning_rate': 2.367414959317714e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2649/10701 [3:16:57<2:47:06,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7911, 'learning_rate': 2.3668267816880697e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0483, 'learning_rate': 2.3665326928732476e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0664, 'learning_rate': 2.366238604058426e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0949, 'learning_rate': 2.3659445152436036e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0437, 'learning_rate': 2.3656504264287816e-05, 'epoch': 0.74}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.772, 'learning_rate': 2.3653563376139595e-05, 'epoch': 0.75}
{'loss': 6.8837, 'learning_rate': 2.3650622487991375e-05, 'epoch': 0.75}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8825, 'learning_rate': 2.3647681599843155e-05, 'epoch': 0.75}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.917, 'learning_rate': 2.364474071169493e-05, 'epoch': 0.75}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9344, 'learning_rate': 2.364179982354671e-05, 'epoch': 0.75}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6294, 'learning_rate': 2.363885893539849e-05, 'epoch': 0.75}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6454, 'learning_rate': 2.363591804725027e-05, 'epoch': 0.75}
 25%|██████████████████▎                                                       | 2653/10701 [3:17:05<4:32:29,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.2488, 'learning_rate': 2.363297715910205e-05, 'epoch': 0.75}
{'loss': 6.4929, 'learning_rate': 2.3627095382805606e-05, 'epoch': 0.75}
{'loss': 6.9594, 'learning_rate': 2.3624154494657386e-05, 'epoch': 0.75}
{'loss': 6.8647, 'learning_rate': 2.362121360650917e-05, 'epoch': 0.75}
{'loss': 6.7428, 'learning_rate': 2.3618272718360945e-05, 'epoch': 0.75}
{'loss': 6.6256, 'learning_rate': 2.3615331830212725e-05, 'epoch': 0.75}
 25%|██████████████████▍                                                       | 2673/10701 [3:17:49<4:44:13,  2.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0625, 'learning_rate': 2.360945005391628e-05, 'epoch': 0.75}
{'loss': 6.8355, 'learning_rate': 2.3606509165768064e-05, 'epoch': 0.75}
{'loss': 6.8038, 'learning_rate': 2.360356827761984e-05, 'epoch': 0.75}
{'loss': 6.4689, 'learning_rate': 2.35976865013234e-05, 'epoch': 0.75}
{'loss': 7.0118, 'learning_rate': 2.359474561317518e-05, 'epoch': 0.75}
{'loss': 6.803, 'learning_rate': 2.359180472502696e-05, 'epoch': 0.75}
{'loss': 6.7267, 'learning_rate': 2.358886383687874e-05, 'epoch': 0.75}
 25%|██████████████████▌                                                       | 2682/10701 [3:18:08<4:30:56,  2.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8486, 'learning_rate': 2.3582982060582295e-05, 'epoch': 0.75}
{'loss': 6.6274, 'learning_rate': 2.358004117243408e-05, 'epoch': 0.75}
{'loss': 7.1554, 'learning_rate': 2.3577100284285855e-05, 'epoch': 0.75}
{'loss': 6.8586, 'learning_rate': 2.3574159396137634e-05, 'epoch': 0.75}
{'loss': 6.5775, 'learning_rate': 2.3571218507989414e-05, 'epoch': 0.75}
{'loss': 6.9805, 'learning_rate': 2.356827761984119e-05, 'epoch': 0.75}
{'loss': 7.1246, 'learning_rate': 2.3565336731692974e-05, 'epoch': 0.75}
{'loss': 6.7246, 'learning_rate': 2.356239584354475e-05, 'epoch': 0.75}
{'loss': 7.0059, 'learning_rate': 2.355945495539653e-05, 'epoch': 0.75}
{'loss': 7.0876, 'learning_rate': 2.355651406724831e-05, 'epoch': 0.75}
{'loss': 7.1619, 'learning_rate': 2.355357317910009e-05, 'epoch': 0.75}
{'loss': 6.7, 'learning_rate': 2.355063229095187e-05, 'epoch': 0.75}
{'loss': 6.925, 'learning_rate': 2.354769140280365e-05, 'epoch': 0.76}
 25%|██████████████████▋                                                       | 2696/10701 [3:18:32<3:14:25,  1.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:01:38,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7932, 'learning_rate': 2.3541809626507205e-05, 'epoch': 0.76}
{'loss': 6.8392, 'learning_rate': 2.3538868738358988e-05, 'epoch': 0.76}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.311, 'learning_rate': 2.3532986962062544e-05, 'epoch': 0.76}
{'loss': 6.0855, 'learning_rate': 2.3530046073914324e-05, 'epoch': 0.76}
{'loss': 6.9138, 'learning_rate': 2.35271051857661e-05, 'epoch': 0.76}
{'loss': 7.1443, 'learning_rate': 2.3524164297617883e-05, 'epoch': 0.76}
{'loss': 6.7155, 'learning_rate': 2.352122340946966e-05, 'epoch': 0.76}
{'loss': 7.1171, 'learning_rate': 2.351828252132144e-05, 'epoch': 0.76}
{'loss': 6.8267, 'learning_rate': 2.351534163317322e-05, 'epoch': 0.76}
{'loss': 7.0675, 'learning_rate': 2.3512400745024995e-05, 'epoch': 0.76}
{'loss': 7.0236, 'learning_rate': 2.3509459856876778e-05, 'epoch': 0.76}
{'loss': 6.5944, 'learning_rate': 2.3506518968728558e-05, 'epoch': 0.76}
{'loss': 6.9538, 'learning_rate': 2.3503578080580334e-05, 'epoch': 0.76}
{'loss': 6.8372, 'learning_rate': 2.3500637192432114e-05, 'epoch': 0.76}
{'loss': 7.1368, 'learning_rate': 2.3497696304283897e-05, 'epoch': 0.76}
{'loss': 6.8877, 'learning_rate': 2.3494755416135673e-05, 'epoch': 0.76}
{'loss': 6.918, 'learning_rate': 2.3491814527987453e-05, 'epoch': 0.76}
{'loss': 6.6072, 'learning_rate': 2.3488873639839233e-05, 'epoch': 0.76}
{'loss': 6.8274, 'learning_rate': 2.348593275169101e-05, 'epoch': 0.76}
{'loss': 6.8231, 'learning_rate': 2.3482991863542792e-05, 'epoch': 0.76}
{'loss': 6.9972, 'learning_rate': 2.3480050975394572e-05, 'epoch': 0.76}
{'loss': 6.5907, 'learning_rate': 2.347711008724635e-05, 'epoch': 0.76}
{'loss': 6.8729, 'learning_rate': 2.3474169199098128e-05, 'epoch': 0.76}
{'loss': 6.9611, 'learning_rate': 2.3471228310949905e-05, 'epoch': 0.76}
{'loss': 6.8115, 'learning_rate': 2.3468287422801688e-05, 'epoch': 0.76}
{'loss': 7.1729, 'learning_rate': 2.3465346534653467e-05, 'epoch': 0.76}
{'loss': 6.5809, 'learning_rate': 2.3462405646505244e-05, 'epoch': 0.76}
{'loss': 6.9671, 'learning_rate': 2.3459464758357023e-05, 'epoch': 0.76}
{'loss': 6.8137, 'learning_rate': 2.3456523870208803e-05, 'epoch': 0.76}
{'loss': 6.6775, 'learning_rate': 2.3453582982060583e-05, 'epoch': 0.76}
{'loss': 7.2042, 'learning_rate': 2.3450642093912363e-05, 'epoch': 0.76}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8522, 'learning_rate': 2.3447701205764142e-05, 'epoch': 0.76}
{'loss': 6.8993, 'learning_rate': 2.344476031761592e-05, 'epoch': 0.76}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6326, 'learning_rate': 2.3441819429467702e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0945, 'learning_rate': 2.343887854131948e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.5864, 'learning_rate': 2.3435937653171258e-05, 'epoch': 0.77}
{'loss': 6.8055, 'learning_rate': 2.3432996765023038e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.883, 'learning_rate': 2.3430055876874814e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8956, 'learning_rate': 2.3427114988726597e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9648, 'learning_rate': 2.3424174100578377e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9897, 'learning_rate': 2.3421233212430153e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7348, 'learning_rate': 2.3418292324281933e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0785, 'learning_rate': 2.3415351436133713e-05, 'epoch': 0.77}
 25%|██████████████████▋                                                       | 2699/10701 [3:18:35<2:46:43,  1.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 26%|██████████████████▉                                                       | 2741/10701 [3:20:05<4:02:47,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7194, 'learning_rate': 2.3409469659837272e-05, 'epoch': 0.77}
 26%|██████████████████▉                                                       | 2741/10701 [3:20:05<4:02:47,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6547, 'learning_rate': 2.3406528771689052e-05, 'epoch': 0.77}
 26%|██████████████████▉                                                       | 2741/10701 [3:20:05<4:02:47,  1.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7459, 'learning_rate': 2.3403587883540828e-05, 'epoch': 0.77}
 26%|██████████████████▉                                                       | 2745/10701 [3:20:11<3:26:51,  1.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7476, 'learning_rate': 2.339770610724439e-05, 'epoch': 0.77}
 26%|██████████████████▉                                                       | 2745/10701 [3:20:11<3:26:51,  1.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8123, 'learning_rate': 2.3394765219096167e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.5942, 'learning_rate': 2.3388883442799723e-05, 'epoch': 0.77}
{'loss': 6.51, 'learning_rate': 2.3385942554651506e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.3627, 'learning_rate': 2.3383001666503286e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.2297, 'learning_rate': 2.3380060778355062e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.1871, 'learning_rate': 2.3377119890206842e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7022, 'learning_rate': 2.3374179002058622e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.5546, 'learning_rate': 2.33712381139104e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0547, 'learning_rate': 2.336829722576218e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7496, 'learning_rate': 2.336535633761396e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9656, 'learning_rate': 2.3362415449465737e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.2435, 'learning_rate': 2.3359474561317517e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9227, 'learning_rate': 2.33565336731693e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7228, 'learning_rate': 2.3353592785021077e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.609, 'learning_rate': 2.3350651896872856e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9451, 'learning_rate': 2.3347711008724636e-05, 'epoch': 0.77}
{'loss': 6.7429, 'learning_rate': 2.3344770120576416e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.2265, 'learning_rate': 2.3341829232428196e-05, 'epoch': 0.77}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.5575, 'learning_rate': 2.3338888344279972e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0071, 'learning_rate': 2.333594745613175e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.469, 'learning_rate': 2.333300656798353e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8776, 'learning_rate': 2.333006567983531e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9118, 'learning_rate': 2.332712479168709e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.732, 'learning_rate': 2.332418390353887e-05, 'epoch': 0.78}
{'loss': 6.8946, 'learning_rate': 2.3321243015390647e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8659, 'learning_rate': 2.3318302127242427e-05, 'epoch': 0.78}
{'loss': 6.8897, 'learning_rate': 2.331536123909421e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.0182, 'learning_rate': 2.3312420350945986e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7662, 'learning_rate': 2.3309479462797766e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6283, 'learning_rate': 2.3306538574649545e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8099, 'learning_rate': 2.3303597686501322e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6977, 'learning_rate': 2.3300656798353105e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.5022, 'learning_rate': 2.329771591020488e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8182, 'learning_rate': 2.329477502205666e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8634, 'learning_rate': 2.329183413390844e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8621, 'learning_rate': 2.328889324576022e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9743, 'learning_rate': 2.3285952357612e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6791, 'learning_rate': 2.328301146946378e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8845, 'learning_rate': 2.3280070581315556e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8778, 'learning_rate': 2.3277129693167336e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.6454, 'learning_rate': 2.327418880501912e-05, 'epoch': 0.78}
 26%|███████████████████                                                       | 2748/10701 [3:20:14<2:56:43,  1.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8615, 'learning_rate': 2.3271247916870895e-05, 'epoch': 0.78}
 26%|███████████████████▎                                                      | 2790/10701 [3:21:42<4:04:07,  1.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.757, 'learning_rate': 2.3265366140574455e-05, 'epoch': 0.78}
{'loss': 6.4839, 'learning_rate': 2.326242525242623e-05, 'epoch': 0.78}
{'loss': 6.8202, 'learning_rate': 2.3259484364278014e-05, 'epoch': 0.78}
 26%|███████████████████▎                                                      | 2795/10701 [3:21:50<3:22:43,  1.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.8465, 'learning_rate': 2.325360258798157e-05, 'epoch': 0.78}
[WARNING|modeling_utils.py:388] 2022-03-02 13:06:33,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.7396, 'learning_rate': 2.3247720811685126e-05, 'epoch': 0.78}
{'loss': 6.4441, 'learning_rate': 2.324477992353691e-05, 'epoch': 0.78}
 26%|███████████████████▎                                                      | 2800/10701 [3:21:56<2:59:56,  1.37s/it]
{'loss': 6.3132, 'learning_rate': 2.3238898147240466e-05, 'epoch': 0.78}
{'loss': 6.4199, 'learning_rate': 2.3235957259092245e-05, 'epoch': 0.78}
{'loss': 6.8663, 'learning_rate': 2.323301637094403e-05, 'epoch': 0.79}
{'loss': 6.7951, 'learning_rate': 2.3230075482795805e-05, 'epoch': 0.79}
{'loss': 6.7987, 'learning_rate': 2.3227134594647584e-05, 'epoch': 0.79}
{'loss': 6.9305, 'learning_rate': 2.3224193706499364e-05, 'epoch': 0.79}
{'loss': 6.7359, 'learning_rate': 2.322125281835114e-05, 'epoch': 0.79}
{'loss': 6.9297, 'learning_rate': 2.3218311930202924e-05, 'epoch': 0.79}
{'loss': 7.192, 'learning_rate': 2.3215371042054703e-05, 'epoch': 0.79}
{'loss': 6.8315, 'learning_rate': 2.321243015390648e-05, 'epoch': 0.79}
{'loss': 6.8395, 'learning_rate': 2.320948926575826e-05, 'epoch': 0.79}
{'loss': 6.9153, 'learning_rate': 2.3206548377610036e-05, 'epoch': 0.79}
{'loss': 6.918, 'learning_rate': 2.320360748946182e-05, 'epoch': 0.79}
{'loss': 6.6716, 'learning_rate': 2.32006666013136e-05, 'epoch': 0.79}
{'loss': 6.8052, 'learning_rate': 2.3197725713165375e-05, 'epoch': 0.79}
{'loss': 6.7442, 'learning_rate': 2.3194784825017155e-05, 'epoch': 0.79}
{'loss': 7.009, 'learning_rate': 2.3191843936868938e-05, 'epoch': 0.79}
{'loss': 7.1373, 'learning_rate': 2.3188903048720714e-05, 'epoch': 0.79}
{'loss': 6.9977, 'learning_rate': 2.3185962160572494e-05, 'epoch': 0.79}
{'loss': 6.7622, 'learning_rate': 2.3183021272424274e-05, 'epoch': 0.79}
{'loss': 6.7717, 'learning_rate': 2.318008038427605e-05, 'epoch': 0.79}
{'loss': 6.9384, 'learning_rate': 2.3177139496127833e-05, 'epoch': 0.79}
{'loss': 7.0186, 'learning_rate': 2.3174198607979613e-05, 'epoch': 0.79}
{'loss': 7.0688, 'learning_rate': 2.317125771983139e-05, 'epoch': 0.79}
{'loss': 6.7214, 'learning_rate': 2.316831683168317e-05, 'epoch': 0.79}
{'loss': 7.0195, 'learning_rate': 2.3165375943534945e-05, 'epoch': 0.79}
{'loss': 6.8692, 'learning_rate': 2.3162435055386728e-05, 'epoch': 0.79}
 26%|███████████████████▌                                                      | 2827/10701 [3:22:56<4:34:33,  2.09s/it]
{'loss': 6.8334, 'learning_rate': 2.3156553279090284e-05, 'epoch': 0.79}
{'loss': 6.7098, 'learning_rate': 2.3153612390942064e-05, 'epoch': 0.79}
{'loss': 6.7565, 'learning_rate': 2.3150671502793844e-05, 'epoch': 0.79}
{'loss': 7.0283, 'learning_rate': 2.3147730614645624e-05, 'epoch': 0.79}
{'loss': 6.9885, 'learning_rate': 2.3144789726497403e-05, 'epoch': 0.79}
{'loss': 6.9871, 'learning_rate': 2.3141848838349183e-05, 'epoch': 0.79}
{'loss': 6.8199, 'learning_rate': 2.313890795020096e-05, 'epoch': 0.79}
{'loss': 6.8542, 'learning_rate': 2.3135967062052742e-05, 'epoch': 0.79}
{'loss': 6.831, 'learning_rate': 2.3133026173904522e-05, 'epoch': 0.79}
{'loss': 7.1539, 'learning_rate': 2.31300852857563e-05, 'epoch': 0.79}
{'loss': 6.774, 'learning_rate': 2.3127144397608078e-05, 'epoch': 0.8}
{'loss': 6.9036, 'learning_rate': 2.3124203509459855e-05, 'epoch': 0.8}
{'loss': 6.8835, 'learning_rate': 2.3121262621311638e-05, 'epoch': 0.8}
{'loss': 6.7455, 'learning_rate': 2.3118321733163417e-05, 'epoch': 0.8}
 27%|███████████████████▋                                                      | 2842/10701 [3:23:24<3:45:57,  1.73s/it]
{'loss': 6.7612, 'learning_rate': 2.3112439956866973e-05, 'epoch': 0.8}
{'loss': 6.7518, 'learning_rate': 2.3109499068718753e-05, 'epoch': 0.8}
{'loss': 6.5052, 'learning_rate': 2.3106558180570533e-05, 'epoch': 0.8}
 27%|███████████████████▋                                                      | 2846/10701 [3:23:30<3:12:55,  1.47s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 13:08:13,545 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.2193, 'learning_rate': 2.309773551612587e-05, 'epoch': 0.8}
{'loss': 6.1175, 'learning_rate': 2.309479462797765e-05, 'epoch': 0.8}
{'loss': 6.2758, 'learning_rate': 2.309185373982943e-05, 'epoch': 0.8}
{'loss': 6.309, 'learning_rate': 2.3088912851681208e-05, 'epoch': 0.8}
{'loss': 7.0421, 'learning_rate': 2.3085971963532988e-05, 'epoch': 0.8}
{'loss': 6.8171, 'learning_rate': 2.3083031075384767e-05, 'epoch': 0.8}
{'loss': 6.7008, 'learning_rate': 2.3080090187236547e-05, 'epoch': 0.8}
{'loss': 6.9021, 'learning_rate': 2.3077149299088327e-05, 'epoch': 0.8}
 27%|███████████████████▋                                                      | 2856/10701 [3:23:49<4:43:16,  2.17s/it]
{'loss': 6.9849, 'learning_rate': 2.3071267522791883e-05, 'epoch': 0.8}
{'loss': 7.0281, 'learning_rate': 2.3068326634643663e-05, 'epoch': 0.8}
{'loss': 6.6281, 'learning_rate': 2.3065385746495442e-05, 'epoch': 0.8}
{'loss': 6.8101, 'learning_rate': 2.3062444858347222e-05, 'epoch': 0.8}
{'loss': 6.6716, 'learning_rate': 2.3059503970199002e-05, 'epoch': 0.8}
{'loss': 6.7729, 'learning_rate': 2.3056563082050778e-05, 'epoch': 0.8}
{'loss': 6.5825, 'learning_rate': 2.3053622193902558e-05, 'epoch': 0.8}
{'loss': 6.9806, 'learning_rate': 2.305068130575434e-05, 'epoch': 0.8}
{'loss': 6.9067, 'learning_rate': 2.3047740417606117e-05, 'epoch': 0.8}
{'loss': 6.772, 'learning_rate': 2.3044799529457897e-05, 'epoch': 0.8}
{'loss': 6.9875, 'learning_rate': 2.3041858641309677e-05, 'epoch': 0.8}
{'loss': 7.1841, 'learning_rate': 2.3038917753161456e-05, 'epoch': 0.8}
{'loss': 6.9167, 'learning_rate': 2.3035976865013236e-05, 'epoch': 0.8}
{'loss': 6.8204, 'learning_rate': 2.3033035976865013e-05, 'epoch': 0.8}
{'loss': 6.6498, 'learning_rate': 2.3030095088716792e-05, 'epoch': 0.8}
{'loss': 6.7776, 'learning_rate': 2.3027154200568572e-05, 'epoch': 0.8}
{'loss': 7.024, 'learning_rate': 2.302421331242035e-05, 'epoch': 0.81}
{'loss': 6.7506, 'learning_rate': 2.302127242427213e-05, 'epoch': 0.81}
{'loss': 6.8563, 'learning_rate': 2.301833153612391e-05, 'epoch': 0.81}
{'loss': 6.6354, 'learning_rate': 2.3015390647975687e-05, 'epoch': 0.81}
{'loss': 7.133, 'learning_rate': 2.3012449759827467e-05, 'epoch': 0.81}
{'loss': 6.8071, 'learning_rate': 2.300950887167925e-05, 'epoch': 0.81}
{'loss': 6.8487, 'learning_rate': 2.3006567983531027e-05, 'epoch': 0.81}
{'loss': 6.968, 'learning_rate': 2.3003627095382806e-05, 'epoch': 0.81}
{'loss': 7.0751, 'learning_rate': 2.3000686207234586e-05, 'epoch': 0.81}
{'loss': 7.0885, 'learning_rate': 2.2997745319086362e-05, 'epoch': 0.81}
{'loss': 7.209, 'learning_rate': 2.2994804430938146e-05, 'epoch': 0.81}
{'loss': 6.7331, 'learning_rate': 2.2991863542789922e-05, 'epoch': 0.81}
{'loss': 6.8305, 'learning_rate': 2.29889226546417e-05, 'epoch': 0.81}
{'loss': 6.5465, 'learning_rate': 2.298598176649348e-05, 'epoch': 0.81}
{'loss': 7.0401, 'learning_rate': 2.298304087834526e-05, 'epoch': 0.81}
{'loss': 6.7231, 'learning_rate': 2.298009999019704e-05, 'epoch': 0.81}
{'loss': 6.787, 'learning_rate': 2.297715910204882e-05, 'epoch': 0.81}
{'loss': 6.6321, 'learning_rate': 2.2974218213900597e-05, 'epoch': 0.81}
{'loss': 6.6814, 'learning_rate': 2.2971277325752377e-05, 'epoch': 0.81}
 27%|███████████████████▉                                                      | 2892/10701 [3:25:02<3:40:05,  1.69s/it]
{'loss': 7.0659, 'learning_rate': 2.2965395549455936e-05, 'epoch': 0.81}
{'loss': 7.0175, 'learning_rate': 2.2962454661307716e-05, 'epoch': 0.81}
[WARNING|modeling_utils.py:388] 2022-03-02 13:09:47,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 7.0346, 'learning_rate': 2.2956572885011272e-05, 'epoch': 0.81}
{'loss': 7.1346, 'learning_rate': 2.2953631996863055e-05, 'epoch': 0.81}
 27%|████████████████████                                                      | 2898/10701 [3:25:10<2:52:26,  1.33s/it]
{'loss': 6.4166, 'learning_rate': 2.294775022056661e-05, 'epoch': 0.81}
{'loss': 6.6833, 'learning_rate': 2.294480933241839e-05, 'epoch': 0.81}
{'loss': 6.3395, 'learning_rate': 2.2941868444270167e-05, 'epoch': 0.81}
{'loss': 6.9685, 'learning_rate': 2.293892755612195e-05, 'epoch': 0.81}
{'loss': 6.8702, 'learning_rate': 2.293598666797373e-05, 'epoch': 0.81}
{'loss': 6.8163, 'learning_rate': 2.2933045779825506e-05, 'epoch': 0.81}
{'loss': 6.7801, 'learning_rate': 2.2930104891677286e-05, 'epoch': 0.81}
{'loss': 6.8939, 'learning_rate': 2.292716400352907e-05, 'epoch': 0.81}
{'loss': 6.7216, 'learning_rate': 2.2924223115380845e-05, 'epoch': 0.81}
{'loss': 7.225, 'learning_rate': 2.2921282227232625e-05, 'epoch': 0.81}
{'loss': 6.7786, 'learning_rate': 2.2918341339084405e-05, 'epoch': 0.82}
 27%|████████████████████                                                      | 2898/10701 [3:25:10<2:52:26,  1.33s/it]g-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 7.3385, 'learning_rate': 2.291540045093618e-05, 'epoch': 0.82}
 27%|████████████████████                                                      | 2898/10701 [3:25:10<2:52:26,  1.33s/it]g-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.8075, 'learning_rate': 2.2912459562787964e-05, 'epoch': 0.82}
 27%|████████████████████                                                      | 2898/10701 [3:25:10<2:52:26,  1.33s/it]g-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.9385, 'learning_rate': 2.2909518674639744e-05, 'epoch': 0.82}
 27%|████████████████████                                                      | 2898/10701 [3:25:10<2:52:26,  1.33s/it]g-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
 27%|████████████████████                                                      | 2898/10701 [3:25:10<2:52:26,  1.33s/it]g-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
{'loss': 6.7651, 'learning_rate': 2.290657778649152e-05, 'epoch': 0.82}
{'loss': 6.7632, 'learning_rate': 2.29036368983433e-05, 'epoch': 0.82}
{'loss': 6.7655, 'learning_rate': 2.2900696010195076e-05, 'epoch': 0.82}
{'loss': 6.7538, 'learning_rate': 2.289775512204686e-05, 'epoch': 0.82}
{'loss': 6.8935, 'learning_rate': 2.289481423389864e-05, 'epoch': 0.82}
{'loss': 6.7903, 'learning_rate': 2.2891873345750416e-05, 'epoch': 0.82}
{'loss': 6.9233, 'learning_rate': 2.2888932457602195e-05, 'epoch': 0.82}
{'loss': 6.9007, 'learning_rate': 2.2885991569453975e-05, 'epoch': 0.82}
{'loss': 6.8274, 'learning_rate': 2.2883050681305755e-05, 'epoch': 0.82}
 27%|████████████████████▏                                                     | 2922/10701 [3:26:02<4:41:24,  2.17s/it]
{'loss': 6.9414, 'learning_rate': 2.2877168905009314e-05, 'epoch': 0.82}
{'loss': 6.7455, 'learning_rate': 2.287422801686109e-05, 'epoch': 0.82}
{'loss': 6.6622, 'learning_rate': 2.2871287128712874e-05, 'epoch': 0.82}
 27%|████████████████████▏                                                     | 2926/10701 [3:26:10<4:36:49,  2.14s/it]
{'loss': 6.9901, 'learning_rate': 2.286540535241643e-05, 'epoch': 0.82}
{'loss': 7.0405, 'learning_rate': 2.286246446426821e-05, 'epoch': 0.82}
{'loss': 6.5265, 'learning_rate': 2.2859523576119986e-05, 'epoch': 0.82}
{'loss': 6.6398, 'learning_rate': 2.285658268797177e-05, 'epoch': 0.82}
{'loss': 6.7795, 'learning_rate': 2.285364179982355e-05, 'epoch': 0.82}
{'loss': 6.563, 'learning_rate': 2.2850700911675325e-05, 'epoch': 0.82}
{'loss': 6.9061, 'learning_rate': 2.2847760023527105e-05, 'epoch': 0.82}
{'loss': 7.0885, 'learning_rate': 2.2844819135378884e-05, 'epoch': 0.82}
{'loss': 6.9024, 'learning_rate': 2.2841878247230664e-05, 'epoch': 0.82}
{'loss': 6.7052, 'learning_rate': 2.2838937359082444e-05, 'epoch': 0.82}
{'loss': 6.8712, 'learning_rate': 2.2835996470934224e-05, 'epoch': 0.82}
{'loss': 6.9554, 'learning_rate': 2.2833055582786e-05, 'epoch': 0.82}
{'loss': 6.8205, 'learning_rate': 2.2830114694637783e-05, 'epoch': 0.82}
{'loss': 6.821, 'learning_rate': 2.2827173806489563e-05, 'epoch': 0.82}
[WARNING|modeling_utils.py:388] 2022-03-02 13:11:20,361 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.64, 'learning_rate': 2.282129203019312e-05, 'epoch': 0.82}
{'loss': 6.8709, 'learning_rate': 2.2818351142044895e-05, 'epoch': 0.82}
 28%|████████████████████▎                                                     | 2945/10701 [3:26:45<3:18:47,  1.54s/it]
{'loss': 6.8367, 'learning_rate': 2.2812469365748458e-05, 'epoch': 0.83}
{'loss': 6.7833, 'learning_rate': 2.2809528477600234e-05, 'epoch': 0.83}
{'loss': 7.1066, 'learning_rate': 2.2806587589452014e-05, 'epoch': 0.83}
 28%|████████████████████▍                                                     | 2948/10701 [3:26:49<2:49:32,  1.31s/it]
{'loss': 6.4657, 'learning_rate': 2.2800705813155574e-05, 'epoch': 0.83}
{'loss': 6.5634, 'learning_rate': 2.2797764925007353e-05, 'epoch': 0.83}
{'loss': 6.0922, 'learning_rate': 2.2794824036859133e-05, 'epoch': 0.83}
{'loss': 7.1393, 'learning_rate': 2.279188314871091e-05, 'epoch': 0.83}
{'loss': 6.906, 'learning_rate': 2.278894226056269e-05, 'epoch': 0.83}
{'loss': 6.8192, 'learning_rate': 2.2786001372414472e-05, 'epoch': 0.83}
{'loss': 6.9494, 'learning_rate': 2.278306048426625e-05, 'epoch': 0.83}
{'loss': 6.6233, 'learning_rate': 2.2780119596118028e-05, 'epoch': 0.83}
 28%|████████████████████▍                                                     | 2957/10701 [3:27:08<4:44:19,  2.20s/it]
{'loss': 6.6977, 'learning_rate': 2.2774237819821588e-05, 'epoch': 0.83}
{'loss': 6.7606, 'learning_rate': 2.2771296931673367e-05, 'epoch': 0.83}
{'loss': 6.7873, 'learning_rate': 2.2768356043525144e-05, 'epoch': 0.83}
{'loss': 6.7821, 'learning_rate': 2.2765415155376924e-05, 'epoch': 0.83}
{'loss': 7.0452, 'learning_rate': 2.2762474267228703e-05, 'epoch': 0.83}
{'loss': 6.9136, 'learning_rate': 2.2759533379080483e-05, 'epoch': 0.83}
{'loss': 6.6759, 'learning_rate': 2.2756592490932263e-05, 'epoch': 0.83}
{'loss': 6.8175, 'learning_rate': 2.2753651602784042e-05, 'epoch': 0.83}
{'loss': 6.8422, 'learning_rate': 2.275071071463582e-05, 'epoch': 0.83}
{'loss': 7.1262, 'learning_rate': 2.27477698264876e-05, 'epoch': 0.83}
{'loss': 6.705, 'learning_rate': 2.274482893833938e-05, 'epoch': 0.83}
{'loss': 6.3581, 'learning_rate': 2.2741888050191158e-05, 'epoch': 0.83}
{'loss': 6.5943, 'learning_rate': 2.2738947162042938e-05, 'epoch': 0.83}
{'loss': 7.0113, 'learning_rate': 2.2736006273894717e-05, 'epoch': 0.83}
{'loss': 6.6784, 'learning_rate': 2.2733065385746494e-05, 'epoch': 0.83}
{'loss': 6.7618, 'learning_rate': 2.2730124497598277e-05, 'epoch': 0.83}
{'loss': 6.5495, 'learning_rate': 2.2727183609450053e-05, 'epoch': 0.83}
{'loss': 6.8643, 'learning_rate': 2.2724242721301833e-05, 'epoch': 0.83}
{'loss': 7.1452, 'learning_rate': 2.2721301833153613e-05, 'epoch': 0.83}
{'loss': 7.1207, 'learning_rate': 2.2718360945005392e-05, 'epoch': 0.83}
{'loss': 6.7594, 'learning_rate': 2.2715420056857172e-05, 'epoch': 0.83}
{'loss': 6.8055, 'learning_rate': 2.2712479168708952e-05, 'epoch': 0.83}
{'loss': 6.9028, 'learning_rate': 2.2709538280560728e-05, 'epoch': 0.84}
{'loss': 6.9214, 'learning_rate': 2.2706597392412508e-05, 'epoch': 0.84}
{'loss': 6.7777, 'learning_rate': 2.270365650426429e-05, 'epoch': 0.84}
{'loss': 6.7552, 'learning_rate': 2.2700715616116067e-05, 'epoch': 0.84}
{'loss': 6.8938, 'learning_rate': 2.2697774727967847e-05, 'epoch': 0.84}
{'loss': 6.6218, 'learning_rate': 2.2694833839819627e-05, 'epoch': 0.84}
{'loss': 6.9991, 'learning_rate': 2.2691892951671403e-05, 'epoch': 0.84}
{'loss': 6.9063, 'learning_rate': 2.2688952063523186e-05, 'epoch': 0.84}
{'loss': 6.687, 'learning_rate': 2.2686011175374963e-05, 'epoch': 0.84}
 28%|████████████████████▋                                                     | 2989/10701 [3:28:14<3:56:32,  1.84s/it]
{'loss': 6.9142, 'learning_rate': 2.2680129399078522e-05, 'epoch': 0.84}
{'loss': 6.669, 'learning_rate': 2.2677188510930302e-05, 'epoch': 0.84}
{'loss': 6.9802, 'learning_rate': 2.267424762278208e-05, 'epoch': 0.84}
{'loss': 7.0606, 'learning_rate': 2.267130673463386e-05, 'epoch': 0.84}
{'loss': 6.7634, 'learning_rate': 2.2668365846485638e-05, 'epoch': 0.84}
{'loss': 6.6353, 'learning_rate': 2.2665424958337417e-05, 'epoch': 0.84}
{'loss': 6.7694, 'learning_rate': 2.26624840701892e-05, 'epoch': 0.84}
[WARNING|modeling_utils.py:388] 2022-03-02 13:13:07,334 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 6.6915, 'learning_rate': 2.2656602293892756e-05, 'epoch': 0.84}
{'loss': 6.3028, 'learning_rate': 2.2653661405744536e-05, 'epoch': 0.84}
{'loss': 6.3266, 'learning_rate': 2.2650720517596312e-05, 'epoch': 0.84}
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
[INFO|trainer.py:2369] 2022-03-02 13:13:11,567 >>   Batch size = 4ot estimate the number of tokens of the input, floating-point operations will not be computed-02 13:03:17,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.