stefan-it committed on
Commit
ed09391
1 Parent(s): 360ae8f

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +245 -0
training.log ADDED
@@ -0,0 +1,245 @@
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(30001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
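The model repr above ends in a `Linear(in_features=768, out_features=17)` classification head. Those 17 output classes correspond to the BIOES tag dictionary printed later in the log: `O` plus S/B/E/I variants of the four entity types. A minimal, purely illustrative sketch reconstructing that tag set:

```python
# Reconstruct the 17-class BIOES tag set implied by the final
# Linear(768 -> 17) head: one "O" tag plus S/B/E/I variants of
# each of the four entity types named in the log.
entity_types = ["Unternehmen", "Auslagerung", "Ort", "Software"]

tags = ["O"]
for etype in entity_types:
    for prefix in "SBEI":  # single, begin, end, inside (order as logged)
        tags.append(f"{prefix}-{etype}")

print(len(tags))  # 1 + 4 * 4 = 17 classes
```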
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Corpus: 758 train + 94 dev + 96 test sentences
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Train: 758 sentences
+ 2024-03-26 11:04:36,871 (train_with_dev=False, train_with_test=False)
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Training Params:
+ 2024-03-26 11:04:36,871 - learning_rate: "3e-05"
+ 2024-03-26 11:04:36,871 - mini_batch_size: "8"
+ 2024-03-26 11:04:36,871 - max_epochs: "10"
+ 2024-03-26 11:04:36,871 - shuffle: "True"
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Plugins:
+ 2024-03-26 11:04:36,871 - TensorboardLogger
+ 2024-03-26 11:04:36,871 - LinearScheduler | warmup_fraction: '0.1'
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Final evaluation on model from best epoch (best-model.pt)
+ 2024-03-26 11:04:36,871 - metric: "('micro avg', 'f1-score')"
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Computation:
+ 2024-03-26 11:04:36,871 - compute on device: cuda:0
+ 2024-03-26 11:04:36,871 - embedding storage: none
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Model training base path: "flair-co-funer-german_bert_base-bs8-e10-lr3e-05-1"
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:36,871 Logging anything other than scalars to TensorBoard is currently not supported.
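The `lr` column in the epoch logs follows the `LinearScheduler` plugin: linear warmup over the first 10% of optimizer steps up to the peak learning rate of 3e-05, then linear decay to zero. A small sketch of that schedule; the step counts (95 batches/epoch × 10 epochs) are inferred from the `iter .../95` lines, and the exact flair implementation may differ in off-by-one details:

```python
# Sketch of the warmup + decay schedule implied by
# "LinearScheduler | warmup_fraction: '0.1'" with learning_rate 3e-05.
# 950 total steps and 95 warmup steps are inferred from the log.
PEAK_LR = 3e-05
STEPS_PER_EPOCH = 95
TOTAL_STEPS = STEPS_PER_EPOCH * 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 95

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        # linear warmup from 0 to the peak learning rate
        return PEAK_LR * step / WARMUP_STEPS
    # linear decay from the peak down to 0 at the final step
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Matches the logged values: ~0.000003 at epoch 1 iter 9,
# ~0.000030 at epoch 2 iter 9, ~0.000000 at epoch 10 iter 90.
for step in (9, STEPS_PER_EPOCH + 9, 9 * STEPS_PER_EPOCH + 90):
    print(f"step {step}: lr {lr_at(step):.6f}")
```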
+ 2024-03-26 11:04:38,523 epoch 1 - iter 9/95 - loss 3.40933677 - time (sec): 1.65 - samples/sec: 1864.22 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 11:04:40,110 epoch 1 - iter 18/95 - loss 3.26210809 - time (sec): 3.24 - samples/sec: 1930.43 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 11:04:42,628 epoch 1 - iter 27/95 - loss 3.07575274 - time (sec): 5.76 - samples/sec: 1778.83 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 11:04:44,926 epoch 1 - iter 36/95 - loss 2.85331391 - time (sec): 8.05 - samples/sec: 1735.76 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 11:04:46,899 epoch 1 - iter 45/95 - loss 2.65661608 - time (sec): 10.03 - samples/sec: 1741.50 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 11:04:48,188 epoch 1 - iter 54/95 - loss 2.51764366 - time (sec): 11.32 - samples/sec: 1779.71 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 11:04:49,969 epoch 1 - iter 63/95 - loss 2.36976031 - time (sec): 13.10 - samples/sec: 1776.34 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 11:04:51,327 epoch 1 - iter 72/95 - loss 2.25457005 - time (sec): 14.46 - samples/sec: 1802.05 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 11:04:53,398 epoch 1 - iter 81/95 - loss 2.11328222 - time (sec): 16.53 - samples/sec: 1792.24 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 11:04:54,764 epoch 1 - iter 90/95 - loss 2.00924946 - time (sec): 17.89 - samples/sec: 1813.13 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 11:04:56,048 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:04:56,048 EPOCH 1 done: loss 1.9314 - lr: 0.000028
+ 2024-03-26 11:04:56,903 DEV : loss 0.5613349676132202 - f1-score (micro avg) 0.6334
+ 2024-03-26 11:04:56,904 saving best model
+ 2024-03-26 11:04:57,197 ----------------------------------------------------------------------------------------------------
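One reading of the `samples/sec` column, checked against the first logged step of epoch 1: at 9 mini-batches of 8 sentences in 1.65 s, a sentence rate would be ~44/sec, so 1864.22 can only be a token (likely subword) rate. This interpretation is an inference from the numbers, not documented flair behaviour:

```python
# Sanity-check the unit of "samples/sec" using epoch 1, iter 9/95:
# 9 batches * 8 sentences in 1.65 s, reported as 1864.22 samples/sec.
elapsed, samples_per_sec = 1.65, 1864.22
sentences = 9 * 8  # mini_batch_size is 8

tokens_processed = samples_per_sec * elapsed     # ~3076 "samples"
tokens_per_sentence = tokens_processed / sentences

# ~42.7 tokens/sentence is plausible; 1864 sentences/sec is not,
# so "samples" here almost certainly means tokens.
print(round(tokens_processed), round(tokens_per_sentence, 1))
```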
+ 2024-03-26 11:04:59,380 epoch 2 - iter 9/95 - loss 0.57723266 - time (sec): 2.18 - samples/sec: 1692.36 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 11:05:01,132 epoch 2 - iter 18/95 - loss 0.59299330 - time (sec): 3.93 - samples/sec: 1843.91 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 11:05:02,998 epoch 2 - iter 27/95 - loss 0.56105796 - time (sec): 5.80 - samples/sec: 1777.29 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 11:05:04,798 epoch 2 - iter 36/95 - loss 0.52648710 - time (sec): 7.60 - samples/sec: 1759.18 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 11:05:06,726 epoch 2 - iter 45/95 - loss 0.49076722 - time (sec): 9.53 - samples/sec: 1777.21 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 11:05:08,976 epoch 2 - iter 54/95 - loss 0.45479843 - time (sec): 11.78 - samples/sec: 1752.91 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 11:05:10,325 epoch 2 - iter 63/95 - loss 0.44881549 - time (sec): 13.13 - samples/sec: 1794.48 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 11:05:11,692 epoch 2 - iter 72/95 - loss 0.43506525 - time (sec): 14.49 - samples/sec: 1824.98 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 11:05:13,510 epoch 2 - iter 81/95 - loss 0.42206134 - time (sec): 16.31 - samples/sec: 1814.64 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 11:05:15,178 epoch 2 - iter 90/95 - loss 0.41737343 - time (sec): 17.98 - samples/sec: 1815.09 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 11:05:16,110 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:05:16,110 EPOCH 2 done: loss 0.4112 - lr: 0.000027
+ 2024-03-26 11:05:17,069 DEV : loss 0.31459930539131165 - f1-score (micro avg) 0.807
+ 2024-03-26 11:05:17,070 saving best model
+ 2024-03-26 11:05:17,555 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:05:19,551 epoch 3 - iter 9/95 - loss 0.31440666 - time (sec): 2.00 - samples/sec: 1681.68 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 11:05:21,633 epoch 3 - iter 18/95 - loss 0.28335954 - time (sec): 4.08 - samples/sec: 1650.51 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 11:05:23,006 epoch 3 - iter 27/95 - loss 0.27402111 - time (sec): 5.45 - samples/sec: 1755.03 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 11:05:25,527 epoch 3 - iter 36/95 - loss 0.26965058 - time (sec): 7.97 - samples/sec: 1696.07 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 11:05:27,809 epoch 3 - iter 45/95 - loss 0.25296598 - time (sec): 10.25 - samples/sec: 1731.51 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 11:05:29,018 epoch 3 - iter 54/95 - loss 0.24175864 - time (sec): 11.46 - samples/sec: 1787.67 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 11:05:31,057 epoch 3 - iter 63/95 - loss 0.22898676 - time (sec): 13.50 - samples/sec: 1764.24 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 11:05:32,704 epoch 3 - iter 72/95 - loss 0.21672189 - time (sec): 15.15 - samples/sec: 1773.05 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 11:05:34,527 epoch 3 - iter 81/95 - loss 0.21563889 - time (sec): 16.97 - samples/sec: 1762.62 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 11:05:36,799 epoch 3 - iter 90/95 - loss 0.20715295 - time (sec): 19.24 - samples/sec: 1731.07 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 11:05:37,291 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:05:37,291 EPOCH 3 done: loss 0.2084 - lr: 0.000024
+ 2024-03-26 11:05:38,237 DEV : loss 0.24972011148929596 - f1-score (micro avg) 0.8552
+ 2024-03-26 11:05:38,238 saving best model
+ 2024-03-26 11:05:38,713 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:05:40,338 epoch 4 - iter 9/95 - loss 0.16002693 - time (sec): 1.62 - samples/sec: 1983.32 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 11:05:42,433 epoch 4 - iter 18/95 - loss 0.13594050 - time (sec): 3.72 - samples/sec: 1734.08 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 11:05:44,241 epoch 4 - iter 27/95 - loss 0.14400724 - time (sec): 5.53 - samples/sec: 1762.62 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 11:05:46,829 epoch 4 - iter 36/95 - loss 0.12468849 - time (sec): 8.11 - samples/sec: 1697.88 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 11:05:48,532 epoch 4 - iter 45/95 - loss 0.13305017 - time (sec): 9.82 - samples/sec: 1719.67 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 11:05:50,109 epoch 4 - iter 54/95 - loss 0.14059580 - time (sec): 11.40 - samples/sec: 1769.99 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 11:05:52,067 epoch 4 - iter 63/95 - loss 0.14104997 - time (sec): 13.35 - samples/sec: 1782.75 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 11:05:53,365 epoch 4 - iter 72/95 - loss 0.14056154 - time (sec): 14.65 - samples/sec: 1813.54 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 11:05:55,134 epoch 4 - iter 81/95 - loss 0.13741921 - time (sec): 16.42 - samples/sec: 1802.26 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 11:05:56,677 epoch 4 - iter 90/95 - loss 0.13423455 - time (sec): 17.96 - samples/sec: 1821.25 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 11:05:57,609 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:05:57,609 EPOCH 4 done: loss 0.1333 - lr: 0.000020
+ 2024-03-26 11:05:58,553 DEV : loss 0.2319207489490509 - f1-score (micro avg) 0.8966
+ 2024-03-26 11:05:58,554 saving best model
+ 2024-03-26 11:05:59,032 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:06:00,701 epoch 5 - iter 9/95 - loss 0.09106046 - time (sec): 1.67 - samples/sec: 1896.96 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 11:06:02,974 epoch 5 - iter 18/95 - loss 0.11214134 - time (sec): 3.94 - samples/sec: 1700.70 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 11:06:04,590 epoch 5 - iter 27/95 - loss 0.10946660 - time (sec): 5.56 - samples/sec: 1745.51 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 11:06:06,311 epoch 5 - iter 36/95 - loss 0.10398039 - time (sec): 7.28 - samples/sec: 1733.22 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 11:06:08,070 epoch 5 - iter 45/95 - loss 0.11325312 - time (sec): 9.04 - samples/sec: 1775.03 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 11:06:09,736 epoch 5 - iter 54/95 - loss 0.11530607 - time (sec): 10.70 - samples/sec: 1817.06 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 11:06:11,620 epoch 5 - iter 63/95 - loss 0.10923260 - time (sec): 12.59 - samples/sec: 1799.95 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 11:06:13,849 epoch 5 - iter 72/95 - loss 0.10006288 - time (sec): 14.82 - samples/sec: 1831.56 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 11:06:15,138 epoch 5 - iter 81/95 - loss 0.10120984 - time (sec): 16.11 - samples/sec: 1849.01 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 11:06:17,349 epoch 5 - iter 90/95 - loss 0.09734972 - time (sec): 18.32 - samples/sec: 1807.97 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 11:06:17,987 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:06:17,987 EPOCH 5 done: loss 0.0979 - lr: 0.000017
+ 2024-03-26 11:06:18,961 DEV : loss 0.2127046436071396 - f1-score (micro avg) 0.8974
+ 2024-03-26 11:06:18,963 saving best model
+ 2024-03-26 11:06:19,438 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:06:21,034 epoch 6 - iter 9/95 - loss 0.04200073 - time (sec): 1.59 - samples/sec: 1813.67 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 11:06:23,107 epoch 6 - iter 18/95 - loss 0.05948374 - time (sec): 3.67 - samples/sec: 1788.89 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 11:06:24,828 epoch 6 - iter 27/95 - loss 0.06671877 - time (sec): 5.39 - samples/sec: 1824.30 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 11:06:26,526 epoch 6 - iter 36/95 - loss 0.06392011 - time (sec): 7.09 - samples/sec: 1788.22 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 11:06:28,148 epoch 6 - iter 45/95 - loss 0.07242741 - time (sec): 8.71 - samples/sec: 1805.81 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 11:06:30,233 epoch 6 - iter 54/95 - loss 0.07406039 - time (sec): 10.79 - samples/sec: 1781.25 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 11:06:31,868 epoch 6 - iter 63/95 - loss 0.07635195 - time (sec): 12.43 - samples/sec: 1778.68 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 11:06:34,770 epoch 6 - iter 72/95 - loss 0.07127343 - time (sec): 15.33 - samples/sec: 1739.55 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 11:06:36,712 epoch 6 - iter 81/95 - loss 0.06987351 - time (sec): 17.27 - samples/sec: 1743.49 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 11:06:38,441 epoch 6 - iter 90/95 - loss 0.07065652 - time (sec): 19.00 - samples/sec: 1737.38 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 11:06:39,053 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:06:39,053 EPOCH 6 done: loss 0.0710 - lr: 0.000014
+ 2024-03-26 11:06:40,007 DEV : loss 0.2133890837430954 - f1-score (micro avg) 0.9046
+ 2024-03-26 11:06:40,008 saving best model
+ 2024-03-26 11:06:40,490 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:06:41,856 epoch 7 - iter 9/95 - loss 0.08524458 - time (sec): 1.37 - samples/sec: 2165.50 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 11:06:43,572 epoch 7 - iter 18/95 - loss 0.07552488 - time (sec): 3.08 - samples/sec: 1906.20 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 11:06:45,409 epoch 7 - iter 27/95 - loss 0.07187309 - time (sec): 4.92 - samples/sec: 1858.85 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 11:06:47,334 epoch 7 - iter 36/95 - loss 0.06534470 - time (sec): 6.84 - samples/sec: 1828.25 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 11:06:49,747 epoch 7 - iter 45/95 - loss 0.06072521 - time (sec): 9.26 - samples/sec: 1770.58 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 11:06:50,760 epoch 7 - iter 54/95 - loss 0.06119397 - time (sec): 10.27 - samples/sec: 1844.86 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 11:06:52,656 epoch 7 - iter 63/95 - loss 0.05777551 - time (sec): 12.16 - samples/sec: 1849.25 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 11:06:54,617 epoch 7 - iter 72/95 - loss 0.05549539 - time (sec): 14.13 - samples/sec: 1813.17 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 11:06:56,639 epoch 7 - iter 81/95 - loss 0.05670037 - time (sec): 16.15 - samples/sec: 1808.04 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 11:06:58,649 epoch 7 - iter 90/95 - loss 0.05591996 - time (sec): 18.16 - samples/sec: 1810.66 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 11:06:59,508 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:06:59,508 EPOCH 7 done: loss 0.0559 - lr: 0.000010
+ 2024-03-26 11:07:00,471 DEV : loss 0.19283850491046906 - f1-score (micro avg) 0.9148
+ 2024-03-26 11:07:00,473 saving best model
+ 2024-03-26 11:07:00,949 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:07:02,622 epoch 8 - iter 9/95 - loss 0.04158058 - time (sec): 1.67 - samples/sec: 1788.85 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 11:07:04,677 epoch 8 - iter 18/95 - loss 0.03721012 - time (sec): 3.73 - samples/sec: 1631.56 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 11:07:06,284 epoch 8 - iter 27/95 - loss 0.03911622 - time (sec): 5.33 - samples/sec: 1725.71 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 11:07:08,047 epoch 8 - iter 36/95 - loss 0.04077943 - time (sec): 7.10 - samples/sec: 1773.99 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 11:07:10,485 epoch 8 - iter 45/95 - loss 0.03480845 - time (sec): 9.53 - samples/sec: 1743.66 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 11:07:12,891 epoch 8 - iter 54/95 - loss 0.03949966 - time (sec): 11.94 - samples/sec: 1743.95 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 11:07:14,890 epoch 8 - iter 63/95 - loss 0.04061254 - time (sec): 13.94 - samples/sec: 1751.54 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 11:07:15,995 epoch 8 - iter 72/95 - loss 0.04045703 - time (sec): 15.04 - samples/sec: 1784.82 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 11:07:17,702 epoch 8 - iter 81/95 - loss 0.04072728 - time (sec): 16.75 - samples/sec: 1771.85 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 11:07:19,135 epoch 8 - iter 90/95 - loss 0.04136720 - time (sec): 18.18 - samples/sec: 1784.94 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 11:07:20,399 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:07:20,399 EPOCH 8 done: loss 0.0431 - lr: 0.000007
+ 2024-03-26 11:07:21,357 DEV : loss 0.22110594809055328 - f1-score (micro avg) 0.9234
+ 2024-03-26 11:07:21,358 saving best model
+ 2024-03-26 11:07:21,819 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:07:23,659 epoch 9 - iter 9/95 - loss 0.02117655 - time (sec): 1.84 - samples/sec: 1888.79 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 11:07:25,630 epoch 9 - iter 18/95 - loss 0.01735997 - time (sec): 3.81 - samples/sec: 1773.36 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 11:07:27,518 epoch 9 - iter 27/95 - loss 0.02071184 - time (sec): 5.70 - samples/sec: 1724.52 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 11:07:29,467 epoch 9 - iter 36/95 - loss 0.03209492 - time (sec): 7.65 - samples/sec: 1760.60 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 11:07:31,408 epoch 9 - iter 45/95 - loss 0.03322818 - time (sec): 9.59 - samples/sec: 1739.09 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 11:07:33,300 epoch 9 - iter 54/95 - loss 0.03241033 - time (sec): 11.48 - samples/sec: 1771.92 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 11:07:35,232 epoch 9 - iter 63/95 - loss 0.03415390 - time (sec): 13.41 - samples/sec: 1770.43 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 11:07:36,896 epoch 9 - iter 72/95 - loss 0.03629789 - time (sec): 15.08 - samples/sec: 1775.74 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 11:07:38,662 epoch 9 - iter 81/95 - loss 0.03751881 - time (sec): 16.84 - samples/sec: 1765.57 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 11:07:40,478 epoch 9 - iter 90/95 - loss 0.03567593 - time (sec): 18.66 - samples/sec: 1781.63 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 11:07:40,996 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:07:40,996 EPOCH 9 done: loss 0.0366 - lr: 0.000004
+ 2024-03-26 11:07:41,962 DEV : loss 0.21862706542015076 - f1-score (micro avg) 0.9222
+ 2024-03-26 11:07:41,963 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:07:43,480 epoch 10 - iter 9/95 - loss 0.01060765 - time (sec): 1.52 - samples/sec: 1831.02 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 11:07:45,382 epoch 10 - iter 18/95 - loss 0.01790097 - time (sec): 3.42 - samples/sec: 1768.27 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 11:07:47,560 epoch 10 - iter 27/95 - loss 0.02783328 - time (sec): 5.60 - samples/sec: 1727.46 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 11:07:49,513 epoch 10 - iter 36/95 - loss 0.03132828 - time (sec): 7.55 - samples/sec: 1737.31 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 11:07:50,728 epoch 10 - iter 45/95 - loss 0.03012425 - time (sec): 8.76 - samples/sec: 1788.43 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 11:07:52,708 epoch 10 - iter 54/95 - loss 0.03276909 - time (sec): 10.74 - samples/sec: 1772.15 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 11:07:54,137 epoch 10 - iter 63/95 - loss 0.03503644 - time (sec): 12.17 - samples/sec: 1785.04 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 11:07:56,445 epoch 10 - iter 72/95 - loss 0.03111766 - time (sec): 14.48 - samples/sec: 1768.76 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 11:07:58,830 epoch 10 - iter 81/95 - loss 0.03395486 - time (sec): 16.87 - samples/sec: 1750.74 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 11:08:00,685 epoch 10 - iter 90/95 - loss 0.03194792 - time (sec): 18.72 - samples/sec: 1748.52 - lr: 0.000000 - momentum: 0.000000
+ 2024-03-26 11:08:01,734 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:08:01,735 EPOCH 10 done: loss 0.0309 - lr: 0.000000
+ 2024-03-26 11:08:02,703 DEV : loss 0.22073255479335785 - f1-score (micro avg) 0.9226
+ 2024-03-26 11:08:03,024 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 11:08:03,025 Loading model from best epoch ...
+ 2024-03-26 11:08:03,994 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
+ 2024-03-26 11:08:04,882
+ Results:
+ - F-score (micro) 0.912
+ - F-score (macro) 0.6941
+ - Accuracy 0.8406
+
+ By class:
+                precision    recall  f1-score   support
+
+   Unternehmen     0.9008    0.8872    0.8939       266
+   Auslagerung     0.8764    0.9116    0.8937       249
+           Ort     0.9852    0.9925    0.9888       134
+      Software     0.0000    0.0000    0.0000         0
+
+     micro avg     0.9058    0.9183    0.9120       649
+     macro avg     0.6906    0.6979    0.6941       649
+  weighted avg     0.9089    0.9183    0.9134       649
+
+ 2024-03-26 11:08:04,882 ----------------------------------------------------------------------------------------------------
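A quick arithmetic cross-check of the summary scores against the per-class table: the macro F1 is the unweighted mean over the four classes, so the zero-support Software row drags it far below the micro score, while the micro F1 is the harmonic mean of the pooled precision and recall row:

```python
# Verify the reported macro and micro F-scores from the table above
# (pure arithmetic, no ML libraries involved).
per_class_f1 = {
    "Unternehmen": 0.8939,
    "Auslagerung": 0.8937,
    "Ort": 0.9888,
    "Software": 0.0000,  # 0 support, yet still averaged into the macro score
}

# Macro F1: unweighted mean over classes.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Micro F1: harmonic mean of the pooled "micro avg" precision/recall.
micro_p, micro_r = 0.9058, 0.9183
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(micro_f1, 4))  # 0.6941 0.912
```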