Commit 806e269 by stefan-it
1 Parent(s): 6b81a63

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +239 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:20e3c2b25bb84837472994fd85ba32bc9652e3b9f32ecbb1f3756fba6f46d9f7
+ size 443311111
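The three added lines are a Git LFS pointer, not the model weights themselves: the pointer records the spec version, a SHA-256 object id, and the size in bytes of the real file. A minimal sketch of reading such a pointer follows; the function name `parse_lfs_pointer` and the returned dictionary keys are illustrative assumptions, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:20e3c2b25bb84837472994fd85ba32bc9652e3b9f32ecbb1f3756fba6f46d9f7
size 443311111
"""

info = parse_lfs_pointer(POINTER)
print(f"{info['size_bytes'] / 1e6:.1f} MB, {info['hash_algorithm']} digest {info['digest'][:12]}...")
```

The size field shows that the actual best-model.pt is roughly 443 MB, which is why only the pointer appears in the diff.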
dev.tsv ADDED
The diff for this file is too large to render.
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      11:09:42   0.0000         0.3598      0.1101    0.7682         0.6880      0.7259  0.5796
+ 2      11:10:46   0.0000         0.0950      0.1115    0.7850         0.4866      0.6008  0.4361
+ 3      11:11:50   0.0000         0.0615      0.0918    0.8679         0.7469      0.8029  0.6834
+ 4      11:12:55   0.0000         0.0422      0.0877    0.8111         0.8430      0.8267  0.7170
+ 5      11:13:58   0.0000         0.0316      0.1308    0.8631         0.7490      0.8020  0.6769
+ 6      11:15:03   0.0000         0.0222      0.1474    0.8643         0.7696      0.8142  0.6976
+ 7      11:16:07   0.0000         0.0170      0.1656    0.8493         0.7800      0.8131  0.6952
+ 8      11:17:10   0.0000         0.0122      0.1542    0.8493         0.8151      0.8318  0.7225
+ 9      11:18:14   0.0000         0.0077      0.1693    0.8689         0.7872      0.8260  0.7128
+ 10     11:19:18   0.0000         0.0052      0.1755    0.8730         0.7882      0.8284  0.7151
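The per-epoch rows in loss.tsv can be used to see which checkpoint ended up in best-model.pt. The sketch below picks the epoch with the highest DEV_F1 and, for comparison, the epoch with the lowest DEV_LOSS. It embeds the rows as shown in the diff and splits on whitespace, which is an assumption made for this rendered view; for the real tab-separated file, `csv.DictReader` with `delimiter="\t"` would be the more natural choice. The helper name `load_rows` is hypothetical.

```python
# Rows copied from the loss.tsv diff (whitespace-separated in this rendering).
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 11:09:42 0.0000 0.3598 0.1101 0.7682 0.6880 0.7259 0.5796
2 11:10:46 0.0000 0.0950 0.1115 0.7850 0.4866 0.6008 0.4361
3 11:11:50 0.0000 0.0615 0.0918 0.8679 0.7469 0.8029 0.6834
4 11:12:55 0.0000 0.0422 0.0877 0.8111 0.8430 0.8267 0.7170
5 11:13:58 0.0000 0.0316 0.1308 0.8631 0.7490 0.8020 0.6769
6 11:15:03 0.0000 0.0222 0.1474 0.8643 0.7696 0.8142 0.6976
7 11:16:07 0.0000 0.0170 0.1656 0.8493 0.7800 0.8131 0.6952
8 11:17:10 0.0000 0.0122 0.1542 0.8493 0.8151 0.8318 0.7225
9 11:18:14 0.0000 0.0077 0.1693 0.8689 0.7872 0.8260 0.7128
10 11:19:18 0.0000 0.0052 0.1755 0.8730 0.7882 0.8284 0.7151
"""


def load_rows(text: str) -> list:
    """Turn the header line plus data lines into a list of dicts (hypothetical helper)."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = load_rows(LOSS_TSV)

# Model selection in this run uses micro-averaged F1 on the dev set.
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
# Dev loss bottoms out earlier, while train loss keeps shrinking.
lowest_dev_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1 epoch:", best_f1["EPOCH"], best_f1["DEV_F1"])
print("lowest DEV_LOSS epoch:", lowest_dev_loss["EPOCH"], lowest_dev_loss["DEV_LOSS"])
```

On these rows the highest DEV_F1 is at epoch 8 (0.8318), which agrees with the last "saving best model" message in training.log, while the lowest DEV_LOSS is at epoch 4 (0.0877).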
test.tsv ADDED
The diff for this file is too large to render.
training.log ADDED
@@ -0,0 +1,239 @@
+ 2023-10-14 11:08:39,421 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,422 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-14 11:08:39,422 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,422 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+  - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+ 2023-10-14 11:08:39,422 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,422 Train: 5777 sentences
+ 2023-10-14 11:08:39,422 (train_with_dev=False, train_with_test=False)
+ 2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,423 Training Params:
+ 2023-10-14 11:08:39,423 - learning_rate: "5e-05"
+ 2023-10-14 11:08:39,423 - mini_batch_size: "8"
+ 2023-10-14 11:08:39,423 - max_epochs: "10"
+ 2023-10-14 11:08:39,423 - shuffle: "True"
+ 2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,423 Plugins:
+ 2023-10-14 11:08:39,423 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,423 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-14 11:08:39,423 - metric: "('micro avg', 'f1-score')"
+ 2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,423 Computation:
+ 2023-10-14 11:08:39,423 - compute on device: cuda:0
+ 2023-10-14 11:08:39,423 - embedding storage: none
+ 2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,423 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
+ 2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:08:45,872 epoch 1 - iter 72/723 - loss 1.95446486 - time (sec): 6.45 - samples/sec: 2891.94 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 11:08:51,688 epoch 1 - iter 144/723 - loss 1.15070028 - time (sec): 12.26 - samples/sec: 2945.36 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 11:08:57,784 epoch 1 - iter 216/723 - loss 0.84752208 - time (sec): 18.36 - samples/sec: 2902.56 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 11:09:03,660 epoch 1 - iter 288/723 - loss 0.68506420 - time (sec): 24.24 - samples/sec: 2896.34 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 11:09:09,759 epoch 1 - iter 360/723 - loss 0.57910742 - time (sec): 30.34 - samples/sec: 2910.32 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 11:09:15,522 epoch 1 - iter 432/723 - loss 0.50869393 - time (sec): 36.10 - samples/sec: 2939.24 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 11:09:21,253 epoch 1 - iter 504/723 - loss 0.45633381 - time (sec): 41.83 - samples/sec: 2955.22 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-14 11:09:27,720 epoch 1 - iter 576/723 - loss 0.41501574 - time (sec): 48.30 - samples/sec: 2953.32 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-14 11:09:33,809 epoch 1 - iter 648/723 - loss 0.38295739 - time (sec): 54.39 - samples/sec: 2940.77 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-14 11:09:38,985 epoch 1 - iter 720/723 - loss 0.36045102 - time (sec): 59.56 - samples/sec: 2949.21 - lr: 0.000050 - momentum: 0.000000
+ 2023-10-14 11:09:39,190 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:09:39,190 EPOCH 1 done: loss 0.3598 - lr: 0.000050
+ 2023-10-14 11:09:42,683 DEV : loss 0.11006532609462738 - f1-score (micro avg) 0.7259
+ 2023-10-14 11:09:42,700 saving best model
+ 2023-10-14 11:09:43,090 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:09:48,823 epoch 2 - iter 72/723 - loss 0.11097378 - time (sec): 5.73 - samples/sec: 2830.47 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-14 11:09:54,889 epoch 2 - iter 144/723 - loss 0.10276963 - time (sec): 11.80 - samples/sec: 2864.35 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-14 11:10:01,074 epoch 2 - iter 216/723 - loss 0.10943754 - time (sec): 17.98 - samples/sec: 2878.27 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-14 11:10:07,800 epoch 2 - iter 288/723 - loss 0.10336155 - time (sec): 24.71 - samples/sec: 2868.76 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-14 11:10:13,888 epoch 2 - iter 360/723 - loss 0.09910848 - time (sec): 30.80 - samples/sec: 2884.35 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-14 11:10:19,661 epoch 2 - iter 432/723 - loss 0.09836247 - time (sec): 36.57 - samples/sec: 2887.37 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-14 11:10:25,255 epoch 2 - iter 504/723 - loss 0.09901695 - time (sec): 42.16 - samples/sec: 2896.44 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-14 11:10:30,985 epoch 2 - iter 576/723 - loss 0.09643764 - time (sec): 47.89 - samples/sec: 2911.46 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-14 11:10:37,114 epoch 2 - iter 648/723 - loss 0.09508996 - time (sec): 54.02 - samples/sec: 2910.76 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-14 11:10:43,191 epoch 2 - iter 720/723 - loss 0.09515557 - time (sec): 60.10 - samples/sec: 2923.45 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-14 11:10:43,424 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:10:43,425 EPOCH 2 done: loss 0.0950 - lr: 0.000044
+ 2023-10-14 11:10:46,938 DEV : loss 0.11145603656768799 - f1-score (micro avg) 0.6008
+ 2023-10-14 11:10:46,954 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:10:53,078 epoch 3 - iter 72/723 - loss 0.05697663 - time (sec): 6.12 - samples/sec: 2960.64 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-14 11:10:59,115 epoch 3 - iter 144/723 - loss 0.05590903 - time (sec): 12.16 - samples/sec: 2928.48 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-14 11:11:04,869 epoch 3 - iter 216/723 - loss 0.06028807 - time (sec): 17.91 - samples/sec: 2898.16 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-14 11:11:10,528 epoch 3 - iter 288/723 - loss 0.06014774 - time (sec): 23.57 - samples/sec: 2937.38 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-14 11:11:16,547 epoch 3 - iter 360/723 - loss 0.05942668 - time (sec): 29.59 - samples/sec: 2963.42 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-14 11:11:22,397 epoch 3 - iter 432/723 - loss 0.06112432 - time (sec): 35.44 - samples/sec: 2966.69 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-14 11:11:28,720 epoch 3 - iter 504/723 - loss 0.06151410 - time (sec): 41.77 - samples/sec: 2965.99 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-14 11:11:34,166 epoch 3 - iter 576/723 - loss 0.06139185 - time (sec): 47.21 - samples/sec: 2972.35 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-14 11:11:40,094 epoch 3 - iter 648/723 - loss 0.06076136 - time (sec): 53.14 - samples/sec: 2963.14 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-14 11:11:46,678 epoch 3 - iter 720/723 - loss 0.06160248 - time (sec): 59.72 - samples/sec: 2936.77 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-14 11:11:47,005 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:11:47,005 EPOCH 3 done: loss 0.0615 - lr: 0.000039
+ 2023-10-14 11:11:50,496 DEV : loss 0.09180538356304169 - f1-score (micro avg) 0.8029
+ 2023-10-14 11:11:50,515 saving best model
+ 2023-10-14 11:11:50,997 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:11:57,031 epoch 4 - iter 72/723 - loss 0.03286789 - time (sec): 6.03 - samples/sec: 2911.68 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-14 11:12:03,391 epoch 4 - iter 144/723 - loss 0.04890866 - time (sec): 12.39 - samples/sec: 2893.01 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-14 11:12:09,237 epoch 4 - iter 216/723 - loss 0.04888331 - time (sec): 18.24 - samples/sec: 2894.47 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-14 11:12:15,546 epoch 4 - iter 288/723 - loss 0.04495173 - time (sec): 24.55 - samples/sec: 2878.44 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-14 11:12:21,075 epoch 4 - iter 360/723 - loss 0.04443698 - time (sec): 30.08 - samples/sec: 2897.91 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-14 11:12:27,144 epoch 4 - iter 432/723 - loss 0.04236909 - time (sec): 36.15 - samples/sec: 2924.18 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-14 11:12:33,112 epoch 4 - iter 504/723 - loss 0.04277641 - time (sec): 42.11 - samples/sec: 2914.76 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-14 11:12:39,122 epoch 4 - iter 576/723 - loss 0.04268064 - time (sec): 48.12 - samples/sec: 2919.51 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-14 11:12:45,234 epoch 4 - iter 648/723 - loss 0.04288486 - time (sec): 54.24 - samples/sec: 2923.74 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-14 11:12:51,242 epoch 4 - iter 720/723 - loss 0.04212052 - time (sec): 60.24 - samples/sec: 2914.87 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-14 11:12:51,437 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:12:51,437 EPOCH 4 done: loss 0.0422 - lr: 0.000033
+ 2023-10-14 11:12:55,395 DEV : loss 0.08765760809183121 - f1-score (micro avg) 0.8267
+ 2023-10-14 11:12:55,411 saving best model
+ 2023-10-14 11:12:55,905 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:13:02,279 epoch 5 - iter 72/723 - loss 0.03255457 - time (sec): 6.37 - samples/sec: 2890.41 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-14 11:13:07,749 epoch 5 - iter 144/723 - loss 0.02936104 - time (sec): 11.84 - samples/sec: 2992.68 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-14 11:13:14,058 epoch 5 - iter 216/723 - loss 0.02869964 - time (sec): 18.15 - samples/sec: 2982.64 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-14 11:13:19,868 epoch 5 - iter 288/723 - loss 0.03183997 - time (sec): 23.96 - samples/sec: 2956.14 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-14 11:13:25,531 epoch 5 - iter 360/723 - loss 0.03090078 - time (sec): 29.62 - samples/sec: 2958.85 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-14 11:13:31,005 epoch 5 - iter 432/723 - loss 0.03133124 - time (sec): 35.10 - samples/sec: 2956.71 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 11:13:37,056 epoch 5 - iter 504/723 - loss 0.03078959 - time (sec): 41.15 - samples/sec: 2963.95 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 11:13:43,185 epoch 5 - iter 576/723 - loss 0.03127458 - time (sec): 47.28 - samples/sec: 2955.69 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 11:13:49,383 epoch 5 - iter 648/723 - loss 0.03267352 - time (sec): 53.48 - samples/sec: 2959.58 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 11:13:55,255 epoch 5 - iter 720/723 - loss 0.03166451 - time (sec): 59.35 - samples/sec: 2960.54 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 11:13:55,425 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:13:55,425 EPOCH 5 done: loss 0.0316 - lr: 0.000028
+ 2023-10-14 11:13:58,953 DEV : loss 0.13083083927631378 - f1-score (micro avg) 0.802
+ 2023-10-14 11:13:58,970 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:14:05,067 epoch 6 - iter 72/723 - loss 0.01949715 - time (sec): 6.10 - samples/sec: 2846.45 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 11:14:11,288 epoch 6 - iter 144/723 - loss 0.02577060 - time (sec): 12.32 - samples/sec: 2847.72 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 11:14:17,115 epoch 6 - iter 216/723 - loss 0.02325824 - time (sec): 18.14 - samples/sec: 2893.21 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 11:14:23,395 epoch 6 - iter 288/723 - loss 0.02320719 - time (sec): 24.42 - samples/sec: 2887.98 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 11:14:29,178 epoch 6 - iter 360/723 - loss 0.02320219 - time (sec): 30.21 - samples/sec: 2894.14 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 11:14:35,636 epoch 6 - iter 432/723 - loss 0.02204302 - time (sec): 36.66 - samples/sec: 2864.92 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 11:14:41,853 epoch 6 - iter 504/723 - loss 0.02223350 - time (sec): 42.88 - samples/sec: 2865.35 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 11:14:48,295 epoch 6 - iter 576/723 - loss 0.02115934 - time (sec): 49.32 - samples/sec: 2883.97 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 11:14:54,075 epoch 6 - iter 648/723 - loss 0.02227967 - time (sec): 55.10 - samples/sec: 2887.62 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 11:14:59,656 epoch 6 - iter 720/723 - loss 0.02225091 - time (sec): 60.68 - samples/sec: 2896.19 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 11:14:59,822 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:14:59,822 EPOCH 6 done: loss 0.0222 - lr: 0.000022
+ 2023-10-14 11:15:03,409 DEV : loss 0.14740866422653198 - f1-score (micro avg) 0.8142
+ 2023-10-14 11:15:03,424 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:15:09,622 epoch 7 - iter 72/723 - loss 0.00957452 - time (sec): 6.20 - samples/sec: 2831.54 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 11:15:16,166 epoch 7 - iter 144/723 - loss 0.01188012 - time (sec): 12.74 - samples/sec: 2876.66 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 11:15:21,813 epoch 7 - iter 216/723 - loss 0.01261768 - time (sec): 18.39 - samples/sec: 2910.92 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 11:15:27,957 epoch 7 - iter 288/723 - loss 0.01369299 - time (sec): 24.53 - samples/sec: 2931.95 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 11:15:33,566 epoch 7 - iter 360/723 - loss 0.01532146 - time (sec): 30.14 - samples/sec: 2935.86 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 11:15:39,009 epoch 7 - iter 432/723 - loss 0.01513085 - time (sec): 35.58 - samples/sec: 2953.20 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 11:15:45,475 epoch 7 - iter 504/723 - loss 0.01624201 - time (sec): 42.05 - samples/sec: 2945.23 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 11:15:51,426 epoch 7 - iter 576/723 - loss 0.01671837 - time (sec): 48.00 - samples/sec: 2955.70 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 11:15:56,996 epoch 7 - iter 648/723 - loss 0.01672154 - time (sec): 53.57 - samples/sec: 2970.48 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 11:16:03,131 epoch 7 - iter 720/723 - loss 0.01649884 - time (sec): 59.71 - samples/sec: 2944.54 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 11:16:03,334 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:16:03,334 EPOCH 7 done: loss 0.0170 - lr: 0.000017
+ 2023-10-14 11:16:07,244 DEV : loss 0.16563206911087036 - f1-score (micro avg) 0.8131
+ 2023-10-14 11:16:07,261 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:16:13,097 epoch 8 - iter 72/723 - loss 0.01815901 - time (sec): 5.83 - samples/sec: 2864.79 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 11:16:19,181 epoch 8 - iter 144/723 - loss 0.01454238 - time (sec): 11.92 - samples/sec: 2902.43 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 11:16:25,206 epoch 8 - iter 216/723 - loss 0.01396639 - time (sec): 17.94 - samples/sec: 2916.65 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 11:16:31,319 epoch 8 - iter 288/723 - loss 0.01233712 - time (sec): 24.06 - samples/sec: 2896.51 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 11:16:37,272 epoch 8 - iter 360/723 - loss 0.01149011 - time (sec): 30.01 - samples/sec: 2930.63 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 11:16:42,721 epoch 8 - iter 432/723 - loss 0.01140242 - time (sec): 35.46 - samples/sec: 2948.80 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 11:16:49,014 epoch 8 - iter 504/723 - loss 0.01267287 - time (sec): 41.75 - samples/sec: 2939.73 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 11:16:54,938 epoch 8 - iter 576/723 - loss 0.01271844 - time (sec): 47.68 - samples/sec: 2944.93 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 11:17:00,516 epoch 8 - iter 648/723 - loss 0.01192778 - time (sec): 53.25 - samples/sec: 2958.87 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 11:17:06,796 epoch 8 - iter 720/723 - loss 0.01221300 - time (sec): 59.53 - samples/sec: 2946.63 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 11:17:07,022 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:17:07,022 EPOCH 8 done: loss 0.0122 - lr: 0.000011
+ 2023-10-14 11:17:10,588 DEV : loss 0.1542436182498932 - f1-score (micro avg) 0.8318
+ 2023-10-14 11:17:10,605 saving best model
+ 2023-10-14 11:17:11,147 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:17:17,372 epoch 9 - iter 72/723 - loss 0.00632595 - time (sec): 6.22 - samples/sec: 2919.02 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 11:17:23,181 epoch 9 - iter 144/723 - loss 0.00583846 - time (sec): 12.03 - samples/sec: 2936.80 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 11:17:29,380 epoch 9 - iter 216/723 - loss 0.00746569 - time (sec): 18.23 - samples/sec: 2858.87 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 11:17:36,031 epoch 9 - iter 288/723 - loss 0.00781052 - time (sec): 24.88 - samples/sec: 2868.30 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 11:17:41,837 epoch 9 - iter 360/723 - loss 0.00707142 - time (sec): 30.69 - samples/sec: 2892.67 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 11:17:47,948 epoch 9 - iter 432/723 - loss 0.00709441 - time (sec): 36.80 - samples/sec: 2898.81 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 11:17:53,401 epoch 9 - iter 504/723 - loss 0.00711054 - time (sec): 42.25 - samples/sec: 2923.32 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 11:17:59,535 epoch 9 - iter 576/723 - loss 0.00729666 - time (sec): 48.39 - samples/sec: 2923.90 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 11:18:05,332 epoch 9 - iter 648/723 - loss 0.00763742 - time (sec): 54.18 - samples/sec: 2924.87 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 11:18:11,175 epoch 9 - iter 720/723 - loss 0.00760018 - time (sec): 60.03 - samples/sec: 2927.43 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 11:18:11,415 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:18:11,415 EPOCH 9 done: loss 0.0077 - lr: 0.000006
+ 2023-10-14 11:18:14,900 DEV : loss 0.16931197047233582 - f1-score (micro avg) 0.826
+ 2023-10-14 11:18:14,916 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:18:20,563 epoch 10 - iter 72/723 - loss 0.00226973 - time (sec): 5.65 - samples/sec: 2956.98 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 11:18:26,764 epoch 10 - iter 144/723 - loss 0.00486341 - time (sec): 11.85 - samples/sec: 2931.85 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 11:18:33,052 epoch 10 - iter 216/723 - loss 0.00637801 - time (sec): 18.13 - samples/sec: 2889.19 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 11:18:39,200 epoch 10 - iter 288/723 - loss 0.00616245 - time (sec): 24.28 - samples/sec: 2929.82 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 11:18:45,320 epoch 10 - iter 360/723 - loss 0.00540335 - time (sec): 30.40 - samples/sec: 2947.65 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 11:18:51,198 epoch 10 - iter 432/723 - loss 0.00563038 - time (sec): 36.28 - samples/sec: 2945.76 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 11:18:56,544 epoch 10 - iter 504/723 - loss 0.00536321 - time (sec): 41.63 - samples/sec: 2944.91 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 11:19:02,296 epoch 10 - iter 576/723 - loss 0.00493682 - time (sec): 47.38 - samples/sec: 2940.31 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 11:19:08,584 epoch 10 - iter 648/723 - loss 0.00541293 - time (sec): 53.67 - samples/sec: 2939.73 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 11:19:14,511 epoch 10 - iter 720/723 - loss 0.00514294 - time (sec): 59.59 - samples/sec: 2944.81 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 11:19:14,745 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:19:14,745 EPOCH 10 done: loss 0.0052 - lr: 0.000000
+ 2023-10-14 11:19:18,797 DEV : loss 0.17554564774036407 - f1-score (micro avg) 0.8284
+ 2023-10-14 11:19:19,239 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 11:19:19,240 Loading model from best epoch ...
+ 2023-10-14 11:19:20,833 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+ 2023-10-14 11:19:24,050
+ Results:
+ - F-score (micro) 0.8061
+ - F-score (macro) 0.6941
+ - Accuracy 0.6877
+
+ By class:
+               precision    recall  f1-score   support
+
+          PER     0.7714    0.8610    0.8137       482
+          LOC     0.8868    0.8210    0.8526       458
+          ORG     0.4643    0.3768    0.4160        69
+
+    micro avg     0.8026    0.8097    0.8061      1009
+    macro avg     0.7075    0.6863    0.6941      1009
+ weighted avg     0.8028    0.8097    0.8042      1009
+
+ 2023-10-14 11:19:24,050 ----------------------------------------------------------------------------------------------------
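The averaged rows in the final test report can be cross-checked against the per-class rows. The sketch below is a plain-Python recomputation, not Flair's own evaluation code: the macro average is the unweighted mean over PER, LOC, and ORG, the weighted average weights each class by its support, and micro F1 is the harmonic mean of the logged micro precision and recall. All variable and function names are illustrative, and the inputs are the rounded four-decimal values from the log, so agreement is only expected to about four decimals.

```python
# (precision, recall, f1, support) per class, copied from the report above.
PER_CLASS = {
    "PER": (0.7714, 0.8610, 0.8137, 482),
    "LOC": (0.8868, 0.8210, 0.8526, 458),
    "ORG": (0.4643, 0.3768, 0.4160, 69),
}
F1_COLUMN = 2
SUPPORT_COLUMN = 3


def macro_avg(column: int) -> float:
    """Unweighted mean of one metric column over all classes."""
    return sum(row[column] for row in PER_CLASS.values()) / len(PER_CLASS)


def weighted_avg(column: int) -> float:
    """Mean of one metric column, weighted by each class's support."""
    total = sum(row[SUPPORT_COLUMN] for row in PER_CLASS.values())
    return sum(row[column] * row[SUPPORT_COLUMN] for row in PER_CLASS.values()) / total


macro_f1 = macro_avg(F1_COLUMN)
weighted_f1 = weighted_avg(F1_COLUMN)

# Micro F1 from the logged micro-averaged precision and recall.
micro_p, micro_r = 0.8026, 0.8097
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The recomputed values agree with the logged 0.6941, 0.8042, and 0.8061. The gap between micro and macro F1 comes from ORG, which has only 69 of the 1009 test entities and an F1 of 0.4160.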