yoshitomo-matsubara
committed on
Commit 527a6a3 • 1 Parent(s): 9ea3db7
tuned hyperparameters
- pytorch_model.bin +1 -1
- tokenizer.json +0 -0
- training.log +43 -43
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0420c6d7099dd98a7abc6589eecf192a0c63b4fc661bd591c9bf00b4836b74af
 size 1340746825
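The three lines above are a Git LFS pointer: a spec version, an object id (`oid`) given as a SHA-256 digest, and the blob size in bytes. As a rough illustration of how such a pointer could be read and checked against a downloaded file, here is a minimal sketch. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or the Hugging Face tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and SHA-256 digest recorded in the pointer."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer content copied from the '+' side of the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:0420c6d7099dd98a7abc6589eecf192a0c63b4fc661bd591c9bf00b4836b74af\n"
    "size 1340746825\n"
)
```

Because the size is unchanged (1340746825 bytes) and only the digest differs, the commit swaps in new weights of the same shape; in practice one would hash the file in chunks rather than load about 1.3 GB into memory at once.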
tokenizer.json
CHANGED
The diff for this file is too large to render.
training.log
CHANGED
@@ -1,49 +1,49 @@
|
|
1 |
-
2021-05-
|
2 |
-
2021-05-
|
3 |
Num processes: 1
|
4 |
Process index: 0
|
5 |
Local process index: 0
|
6 |
Device: cuda
|
7 |
Use FP16 precision: True
|
8 |
|
9 |
-
2021-05-
|
10 |
-
2021-05-
|
11 |
-
2021-05-
|
12 |
-
2021-05-
|
13 |
-
2021-05-
|
14 |
-
2021-05-
|
15 |
-
2021-05-
|
16 |
-
2021-05-
|
17 |
-
2021-05-
|
18 |
-
2021-05-
|
19 |
-
2021-05-
|
20 |
-
2021-05-
|
21 |
-
2021-05-
|
22 |
-
2021-05-
|
23 |
-
2021-05-
|
24 |
-
2021-05-
|
25 |
-
2021-05-
|
26 |
-
2021-05-
|
27 |
-
2021-05-
|
28 |
-
2021-05-
|
29 |
-
2021-05-
|
30 |
-
2021-05-
|
31 |
-
2021-05-
|
32 |
-
2021-05-
|
33 |
-
2021-05-
|
34 |
-
2021-05-
|
35 |
-
2021-05-
|
36 |
-
2021-05-
|
37 |
-
2021-05-
|
38 |
-
2021-05-
|
39 |
-
2021-05-
|
40 |
-
2021-05-
|
41 |
-
2021-05-
|
42 |
-
2021-05-
|
43 |
-
2021-05-
|
44 |
-
2021-05-
|
45 |
-
2021-05-
|
46 |
-
2021-05-
|
47 |
-
2021-05-
|
48 |
-
2021-05-
|
49 |
-
2021-05-
|
|
|
1 |
+
2021-05-27 03:06:57,518 INFO __main__ Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/rte/ce/bert_large_uncased.yaml', log='log/glue/rte/ce/bert_large_uncased.txt', private_output='leaderboard/glue/standard/bert_large_uncased/', seed=None, student_only=False, task_name='rte', test_only=False, world_size=1)
|
2 |
+
2021-05-27 03:06:57,550 INFO __main__ Distributed environment: NO
|
3 |
Num processes: 1
|
4 |
Process index: 0
|
5 |
Local process index: 0
|
6 |
Device: cuda
|
7 |
Use FP16 precision: True
|
8 |
|
9 |
+
2021-05-27 03:07:10,458 INFO __main__ Start training
|
10 |
+
2021-05-27 03:07:10,459 INFO torchdistill.models.util [student model]
|
11 |
+
2021-05-27 03:07:10,459 INFO torchdistill.models.util Using the original student model
|
12 |
+
2021-05-27 03:07:10,459 INFO torchdistill.core.training Loss = 1.0 * OrgLoss
|
13 |
+
2021-05-27 03:07:13,632 INFO torchdistill.misc.log Epoch: [0] [ 0/312] eta: 0:02:13 lr: 1.997863247863248e-05 sample/s: 9.410467095760941 loss: 0.6985 (0.6985) time: 0.4283 data: 0.0032 max mem: 5355
|
14 |
+
2021-05-27 03:07:34,141 INFO torchdistill.misc.log Epoch: [0] [ 50/312] eta: 0:01:47 lr: 1.891025641025641e-05 sample/s: 9.260763560189773 loss: 0.6855 (0.7316) time: 0.4141 data: 0.0017 max mem: 7365
|
15 |
+
2021-05-27 03:07:54,769 INFO torchdistill.misc.log Epoch: [0] [100/312] eta: 0:01:27 lr: 1.7841880341880344e-05 sample/s: 9.300060421620962 loss: 0.6784 (0.7141) time: 0.4156 data: 0.0018 max mem: 7365
|
16 |
+
2021-05-27 03:08:15,006 INFO torchdistill.misc.log Epoch: [0] [150/312] eta: 0:01:06 lr: 1.6773504273504274e-05 sample/s: 9.304485750349947 loss: 0.6591 (0.7030) time: 0.4016 data: 0.0017 max mem: 7365
|
17 |
+
2021-05-27 03:08:35,380 INFO torchdistill.misc.log Epoch: [0] [200/312] eta: 0:00:45 lr: 1.5705128205128205e-05 sample/s: 9.304217428726076 loss: 0.6689 (0.6939) time: 0.4085 data: 0.0017 max mem: 7365
|
18 |
+
2021-05-27 03:08:55,663 INFO torchdistill.misc.log Epoch: [0] [250/312] eta: 0:00:25 lr: 1.4636752136752137e-05 sample/s: 9.296092555242803 loss: 0.6506 (0.6908) time: 0.4118 data: 0.0018 max mem: 7365
|
19 |
+
2021-05-27 03:09:16,350 INFO torchdistill.misc.log Epoch: [0] [300/312] eta: 0:00:04 lr: 1.356837606837607e-05 sample/s: 9.244190153358904 loss: 0.6405 (0.6823) time: 0.4173 data: 0.0017 max mem: 7365
|
20 |
+
2021-05-27 03:09:20,776 INFO torchdistill.misc.log Epoch: [0] Total time: 0:02:07
|
21 |
+
2021-05-27 03:09:24,609 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/rte/default_experiment-1-0.arrow
|
22 |
+
2021-05-27 03:09:24,610 INFO __main__ Validation: accuracy = 0.6173285198555957
|
23 |
+
2021-05-27 03:09:24,610 INFO __main__ Updating ckpt at ./resource/ckpt/glue/rte/ce/rte-bert-large-uncased
|
24 |
+
2021-05-27 03:09:29,657 INFO torchdistill.misc.log Epoch: [1] [ 0/312] eta: 0:02:19 lr: 1.3311965811965812e-05 sample/s: 9.012096369155467 loss: 0.6355 (0.6355) time: 0.4460 data: 0.0021 max mem: 7365
|
25 |
+
2021-05-27 03:09:50,469 INFO torchdistill.misc.log Epoch: [1] [ 50/312] eta: 0:01:49 lr: 1.2243589743589746e-05 sample/s: 9.16842369299592 loss: 0.5310 (0.5850) time: 0.4242 data: 0.0017 max mem: 7365
|
26 |
+
2021-05-27 03:10:10,381 INFO torchdistill.misc.log Epoch: [1] [100/312] eta: 0:01:26 lr: 1.1175213675213676e-05 sample/s: 11.36395200609068 loss: 0.4777 (0.5638) time: 0.4081 data: 0.0017 max mem: 7365
|
27 |
+
2021-05-27 03:10:30,677 INFO torchdistill.misc.log Epoch: [1] [150/312] eta: 0:01:05 lr: 1.0106837606837608e-05 sample/s: 9.29443941850809 loss: 0.5052 (0.5646) time: 0.4139 data: 0.0017 max mem: 7365
|
28 |
+
2021-05-27 03:10:51,658 INFO torchdistill.misc.log Epoch: [1] [200/312] eta: 0:00:45 lr: 9.03846153846154e-06 sample/s: 9.300792527973533 loss: 0.4903 (0.5516) time: 0.4198 data: 0.0017 max mem: 7365
|
29 |
+
2021-05-27 03:11:12,170 INFO torchdistill.misc.log Epoch: [1] [250/312] eta: 0:00:25 lr: 7.970085470085472e-06 sample/s: 9.282976357127444 loss: 0.4948 (0.5433) time: 0.4087 data: 0.0017 max mem: 7365
|
30 |
+
2021-05-27 03:11:32,902 INFO torchdistill.misc.log Epoch: [1] [300/312] eta: 0:00:04 lr: 6.901709401709402e-06 sample/s: 9.302071412730095 loss: 0.4903 (0.5442) time: 0.4143 data: 0.0017 max mem: 7365
|
31 |
+
2021-05-27 03:11:36,971 INFO torchdistill.misc.log Epoch: [1] Total time: 0:02:07
|
32 |
+
2021-05-27 03:11:40,801 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/rte/default_experiment-1-0.arrow
|
33 |
+
2021-05-27 03:11:40,801 INFO __main__ Validation: accuracy = 0.740072202166065
|
34 |
+
2021-05-27 03:11:40,802 INFO __main__ Updating ckpt at ./resource/ckpt/glue/rte/ce/rte-bert-large-uncased
|
35 |
+
2021-05-27 03:11:46,120 INFO torchdistill.misc.log Epoch: [2] [ 0/312] eta: 0:02:19 lr: 6.645299145299145e-06 sample/s: 9.060380879964056 loss: 0.7969 (0.7969) time: 0.4475 data: 0.0060 max mem: 7365
|
36 |
+
2021-05-27 03:12:06,781 INFO torchdistill.misc.log Epoch: [2] [ 50/312] eta: 0:01:48 lr: 5.576923076923077e-06 sample/s: 10.741735558355614 loss: 0.3631 (0.3967) time: 0.4164 data: 0.0017 max mem: 7365
|
37 |
+
2021-05-27 03:12:27,317 INFO torchdistill.misc.log Epoch: [2] [100/312] eta: 0:01:27 lr: 4.508547008547009e-06 sample/s: 9.16765717108441 loss: 0.3599 (0.3882) time: 0.4054 data: 0.0017 max mem: 7365
|
38 |
+
2021-05-27 03:12:47,698 INFO torchdistill.misc.log Epoch: [2] [150/312] eta: 0:01:06 lr: 3.4401709401709403e-06 sample/s: 13.347942659561435 loss: 0.2746 (0.3688) time: 0.3992 data: 0.0017 max mem: 7365
|
39 |
+
2021-05-27 03:13:08,144 INFO torchdistill.misc.log Epoch: [2] [200/312] eta: 0:00:45 lr: 2.371794871794872e-06 sample/s: 9.294048107726917 loss: 0.2737 (0.3552) time: 0.4062 data: 0.0017 max mem: 7365
|
40 |
+
2021-05-27 03:13:29,127 INFO torchdistill.misc.log Epoch: [2] [250/312] eta: 0:00:25 lr: 1.3034188034188036e-06 sample/s: 9.276037203192194 loss: 0.2493 (0.3468) time: 0.4134 data: 0.0017 max mem: 7365
|
41 |
+
2021-05-27 03:13:49,840 INFO torchdistill.misc.log Epoch: [2] [300/312] eta: 0:00:04 lr: 2.3504273504273505e-07 sample/s: 9.257023235231964 loss: 0.1451 (0.3412) time: 0.4061 data: 0.0018 max mem: 7365
|
42 |
+
2021-05-27 03:13:54,183 INFO torchdistill.misc.log Epoch: [2] Total time: 0:02:08
|
43 |
+
2021-05-27 03:13:58,014 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/rte/default_experiment-1-0.arrow
|
44 |
+
2021-05-27 03:13:58,014 INFO __main__ Validation: accuracy = 0.7220216606498195
|
45 |
+
2021-05-27 03:14:04,515 INFO __main__ [Student: bert-large-uncased]
|
46 |
+
2021-05-27 03:14:08,368 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/rte/default_experiment-1-0.arrow
|
47 |
+
2021-05-27 03:14:08,368 INFO __main__ Test: accuracy = 0.740072202166065
|
48 |
+
2021-05-27 03:14:08,368 INFO __main__ Start prediction for private dataset(s)
|
49 |
+
2021-05-27 03:14:08,369 INFO __main__ rte/test: 3000 samples
|
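The `lr` values in the new training.log are consistent with a linear decay to zero over 3 epochs of 312 iterations, with no warmup. The sketch below checks that reading against a few logged values and replays the three validation accuracies to see which epochs would trigger a checkpoint update. Two things here are assumptions rather than facts from the log: the base learning rate of 2e-5 is inferred from the logged numbers, and the "save only on strict improvement" rule is a guess that happens to match where `Updating ckpt` appears. The function names are hypothetical and are not torchdistill APIs.

```python
import math

BASE_LR = 2e-5          # assumption: inferred from the logged lr values, not stated in the log
ITERS_PER_EPOCH = 312   # from the "[  0/312]" progress counters
NUM_EPOCHS = 3          # epochs [0], [1], [2] appear in the log
TOTAL_STEPS = ITERS_PER_EPOCH * NUM_EPOCHS


def linear_decay_lr(epoch: int, iteration: int) -> float:
    """Learning rate after the optimizer step at (epoch, iteration), decaying linearly to 0."""
    steps_done = epoch * ITERS_PER_EPOCH + iteration + 1
    return BASE_LR * (1 - steps_done / TOTAL_STEPS)


# (epoch, iteration) -> lr, copied from the log lines above.
LOGGED_LR = {
    (0, 0): 1.997863247863248e-05,
    (0, 50): 1.891025641025641e-05,
    (1, 0): 1.3311965811965812e-05,
    (2, 300): 2.3504273504273505e-07,
}
lr_matches = all(
    math.isclose(linear_decay_lr(epoch, it), lr, rel_tol=1e-9)
    for (epoch, it), lr in LOGGED_LR.items()
)

# Validation accuracies after epochs 0, 1, 2, copied from the log.
VAL_ACCURACY = [0.6173285198555957, 0.740072202166065, 0.7220216606498195]


def epochs_that_update_checkpoint(accuracies):
    """Return (epochs where accuracy strictly improved, best accuracy seen)."""
    best = float("-inf")
    updated = []
    for epoch, acc in enumerate(accuracies):
        if acc > best:  # assumption: checkpoint is saved only on strict improvement
            best = acc
            updated.append(epoch)
    return updated, best


updated_epochs, best_accuracy = epochs_that_update_checkpoint(VAL_ACCURACY)
```

Under these assumptions the checkpoint is updated after epochs 0 and 1 but not after epoch 2, which matches the log, and the final `Test: accuracy = 0.740072202166065` equals the best validation accuracy, suggesting the epoch 1 checkpoint was the one evaluated.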