[hops] 2024-09-23 16:13:21.055 | INFO | Initializing a parser from /workspace/configs/exp_camembertv2/camembertv2_base_p2_17k_last_layer.yaml
[hops] 2024-09-23 16:13:21.210 | INFO | Generating a FastText model from the treebank
[hops] 2024-09-23 16:13:21.214 | INFO | Training fasttext model
[hops] 2024-09-23 16:13:22.656 | WARNING | Some weights of RobertaModel were not initialized from the model checkpoint at /scratch/camembertv2/runs/models/camembertv2-base-bf16/post/ckpt-p2-17000/pt/ and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[hops] 2024-09-23 16:13:34.964 | INFO | Start training on cuda:3
[hops] 2024-09-23 16:13:35.203 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-23 16:13:59.790 | INFO | Epoch 0: train loss 3.0280 dev loss 2.5852 dev tag acc 18.31% dev head acc 16.94% dev deprel acc 32.20%
[hops] 2024-09-23 16:13:59.791 | INFO | New best model: head accuracy 16.94% > 0.00%
[hops] 2024-09-23 16:14:09.299 | INFO | Epoch 1: train loss 2.3355 dev loss 1.8455 dev tag acc 43.37% dev head acc 38.18% dev deprel acc 54.33%
[hops] 2024-09-23 16:14:09.300 | INFO | New best model: head accuracy 38.18% > 16.94%
[hops] 2024-09-23 16:14:19.117 | INFO | Epoch 2: train loss 1.8050 dev loss 1.4588 dev tag acc 48.38% dev head acc 54.01% dev deprel acc 63.87%
[hops] 2024-09-23 16:14:19.118 | INFO | New best model: head accuracy 54.01% > 38.18%
[hops] 2024-09-23 16:14:29.492 | INFO | Epoch 3: train loss 1.4915 dev loss 1.2201 dev tag acc 55.22% dev head acc 62.64% dev deprel acc 72.20%
[hops] 2024-09-23 16:14:29.492 | INFO | New best model: head accuracy 62.64% > 54.01%
[hops] 2024-09-23 16:14:39.867 | INFO | Epoch 4: train loss 1.2642 dev loss 1.0422 dev tag acc 64.74% dev head acc 68.98% dev deprel acc 78.59%
[hops] 2024-09-23 16:14:39.868 | INFO | New best model: head accuracy 68.98% > 62.64%
[hops] 2024-09-23 16:14:50.176 | INFO | Epoch 5: train loss 1.0661 dev loss 0.8861 dev tag acc 73.28% dev head acc 72.64% dev deprel acc 82.30%
[hops] 2024-09-23 16:14:50.176 | INFO | New best model: head accuracy 72.64% > 68.98%
[hops] 2024-09-23 16:15:00.035 | INFO | Epoch 6: train loss 0.9110 dev loss 0.8110 dev tag acc 79.97% dev head acc 75.34% dev deprel acc 83.54%
[hops] 2024-09-23 16:15:00.036 | INFO | New best model: head accuracy 75.34% > 72.64%
[hops] 2024-09-23 16:15:10.484 | INFO | Epoch 7: train loss 0.7924 dev loss 0.7090 dev tag acc 82.43% dev head acc 77.50% dev deprel acc 85.19%
[hops] 2024-09-23 16:15:10.485 | INFO | New best model: head accuracy 77.50% > 75.34%
[hops] 2024-09-23 16:15:20.718 | INFO | Epoch 8: train loss 0.6980 dev loss 0.6438 dev tag acc 84.23% dev head acc 78.11% dev deprel acc 86.49%
[hops] 2024-09-23 16:15:20.719 | INFO | New best model: head accuracy 78.11% > 77.50%
[hops] 2024-09-23 16:15:30.573 | INFO | Epoch 9: train loss 0.6270 dev loss 0.6051 dev tag acc 86.31% dev head acc 80.42% dev deprel acc 86.95%
[hops] 2024-09-23 16:15:30.574 | INFO | New best model: head accuracy 80.42% > 78.11%
[hops] 2024-09-23 16:15:40.486 | INFO | Epoch 10: train loss 0.5629 dev loss 0.5624 dev tag acc 89.32% dev head acc 80.31% dev deprel acc 87.99%
[hops] 2024-09-23 16:15:48.515 | INFO | Epoch 11: train loss 0.5067 dev loss 0.5517 dev tag acc 90.65% dev head acc 81.64% dev deprel acc 88.35%
[hops] 2024-09-23 16:15:48.515 | INFO | New best model: head accuracy 81.64% > 80.42%
[hops] 2024-09-23 16:15:58.296 | INFO | Epoch 12: train loss 0.4651 dev loss 0.5284 dev tag acc 92.42% dev head acc 82.31% dev deprel acc 88.75%
[hops] 2024-09-23 16:15:58.297 | INFO | New best model: head accuracy 82.31% > 81.64%
[hops] 2024-09-23 16:16:08.157 | INFO | Epoch 13: train loss 0.4250 dev loss 0.5025 dev tag acc 93.53% dev head acc 83.33% dev deprel acc 89.65%
[hops] 2024-09-23 16:16:08.158 | INFO | New best model: head accuracy 83.33% > 82.31%
[hops] 2024-09-23 16:16:17.917 | INFO | Epoch 14: train loss 0.3938 dev loss 0.4943 dev tag acc 93.93% dev head acc 84.46% dev deprel acc 89.86%
[hops] 2024-09-23 16:16:17.918 | INFO | New best model: head accuracy 84.46% > 83.33%
[hops] 2024-09-23 16:16:27.703 | INFO | Epoch 15: train loss 0.3636 dev loss 0.4950 dev tag acc 94.81% dev head acc 84.71% dev deprel acc 89.90%
[hops] 2024-09-23 16:16:27.704 | INFO | New best model: head accuracy 84.71% > 84.46%
[hops] 2024-09-23 16:16:37.566 | INFO | Epoch 16: train loss 0.3358 dev loss 0.4867 dev tag acc 95.23% dev head acc 85.02% dev deprel acc 90.12%
[hops] 2024-09-23 16:16:37.567 | INFO | New best model: head accuracy 85.02% > 84.71%
[hops] 2024-09-23 16:16:47.274 | INFO | Epoch 17: train loss 0.3155 dev loss 0.4794 dev tag acc 95.56% dev head acc 85.76% dev deprel acc 90.02%
[hops] 2024-09-23 16:16:47.275 | INFO | New best model: head accuracy 85.76% > 85.02%
[hops] 2024-09-23 16:16:56.984 | INFO | Epoch 18: train loss 0.2961 dev loss 0.4793 dev tag acc 95.86% dev head acc 85.64% dev deprel acc 90.34%
[hops] 2024-09-23 16:17:04.954 | INFO | Epoch 19: train loss 0.2805 dev loss 0.4874 dev tag acc 96.20% dev head acc 86.08% dev deprel acc 90.62%
[hops] 2024-09-23 16:17:04.955 | INFO | New best model: head accuracy 86.08% > 85.76%
[hops] 2024-09-23 16:17:15.143 | INFO | Epoch 20: train loss 0.2617 dev loss 0.4873 dev tag acc 96.33% dev head acc 86.28% dev deprel acc 90.91%
[hops] 2024-09-23 16:17:15.145 | INFO | New best model: head accuracy 86.28% > 86.08%
[hops] 2024-09-23 16:17:25.286 | INFO | Epoch 21: train loss 0.2492 dev loss 0.4855 dev tag acc 96.44% dev head acc 86.26% dev deprel acc 91.19%
[hops] 2024-09-23 16:17:33.055 | INFO | Epoch 22: train loss 0.2353 dev loss 0.5254 dev tag acc 96.52% dev head acc 86.71% dev deprel acc 91.00%
[hops] 2024-09-23 16:17:33.056 | INFO | New best model: head accuracy 86.71% > 86.28%
[hops] 2024-09-23 16:17:42.549 | INFO | Epoch 23: train loss 0.2193 dev loss 0.5271 dev tag acc 96.61% dev head acc 86.32% dev deprel acc 91.24%
[hops] 2024-09-23 16:17:50.233 | INFO | Epoch 24: train loss 0.2127 dev loss 0.5132 dev tag acc 96.79% dev head acc 86.76% dev deprel acc 91.51%
[hops] 2024-09-23 16:17:50.234 | INFO | New best model: head accuracy 86.76% > 86.71%
[hops] 2024-09-23 16:18:00.614 | INFO | Epoch 25: train loss 0.2009 dev loss 0.5119 dev tag acc 96.57% dev head acc 86.68% dev deprel acc 91.51%
[hops] 2024-09-23 16:18:08.646 | INFO | Epoch 26: train loss 0.1921 dev loss 0.5223 dev tag acc 96.88% dev head acc 87.05% dev deprel acc 91.42%
[hops] 2024-09-23 16:18:08.647 | INFO | New best model: head accuracy 87.05% > 86.76%
[hops] 2024-09-23 16:18:18.112 | INFO | Epoch 27: train loss 0.1791 dev loss 0.5547 dev tag acc 97.13% dev head acc 86.99% dev deprel acc 91.39%
[hops] 2024-09-23 16:18:25.772 | INFO | Epoch 28: train loss 0.1744 dev loss 0.5114 dev tag acc 97.37% dev head acc 87.30% dev deprel acc 91.60%
[hops] 2024-09-23 16:18:25.773 | INFO | New best model: head accuracy 87.30% > 87.05%
[hops] 2024-09-23 16:18:35.209 | INFO | Epoch 29: train loss 0.1659 dev loss 0.5170 dev tag acc 97.31% dev head acc 87.43% dev deprel acc 91.83%
[hops] 2024-09-23 16:18:35.210 | INFO | New best model: head accuracy 87.43% > 87.30%
[hops] 2024-09-23 16:18:45.102 | INFO | Epoch 30: train loss 0.1572 dev loss 0.5548 dev tag acc 97.38% dev head acc 87.60% dev deprel acc 91.77%
[hops] 2024-09-23 16:18:45.102 | INFO | New best model: head accuracy 87.60% > 87.43%
[hops] 2024-09-23 16:18:55.060 | INFO | Epoch 31: train loss 0.1517 dev loss 0.5943 dev tag acc 97.39% dev head acc 87.63% dev deprel acc 91.78%
[hops] 2024-09-23 16:18:55.061 | INFO | New best model: head accuracy 87.63% > 87.60%
[hops] 2024-09-23 16:19:05.039 | INFO | Epoch 32: train loss 0.1458 dev loss 0.5661 dev tag acc 97.42% dev head acc 87.91% dev deprel acc 91.85%
[hops] 2024-09-23 16:19:05.040 | INFO | New best model: head accuracy 87.91% > 87.63%
[hops] 2024-09-23 16:19:14.549 | INFO | Epoch 33: train loss 0.1395 dev loss 0.5671 dev tag acc 97.52% dev head acc 87.96% dev deprel acc 91.90%
[hops] 2024-09-23 16:19:14.550 | INFO | New best model: head accuracy 87.96% > 87.91%
[hops] 2024-09-23 16:19:24.043 | INFO | Epoch 34: train loss 0.1338 dev loss 0.5757 dev tag acc 97.52% dev head acc 88.20% dev deprel acc 92.09%
[hops] 2024-09-23 16:19:24.044 | INFO | New best model: head accuracy 88.20% > 87.96%
[hops] 2024-09-23 16:19:33.500 | INFO | Epoch 35: train loss 0.1298 dev loss 0.5739 dev tag acc 97.57% dev head acc 88.23% dev deprel acc 92.11%
[hops] 2024-09-23 16:19:33.501 | INFO | New best model: head accuracy 88.23% > 88.20%
[hops] 2024-09-23 16:19:42.967 | INFO | Epoch 36: train loss 0.1229 dev loss 0.5932 dev tag acc 97.61% dev head acc 87.88% dev deprel acc 92.20%
[hops] 2024-09-23 16:19:50.940 | INFO | Epoch 37: train loss 0.1197 dev loss 0.6220 dev tag acc 97.62% dev head acc 88.34% dev deprel acc 91.96%
[hops] 2024-09-23 16:19:50.941 | INFO | New best model: head accuracy 88.34% > 88.23%
[hops] 2024-09-23 16:20:00.495 | INFO | Epoch 38: train loss 0.1157 dev loss 0.6152 dev tag acc 97.80% dev head acc 88.19% dev deprel acc 92.25%
[hops] 2024-09-23 16:20:09.048 | INFO | Epoch 39: train loss 0.1110 dev loss 0.6413 dev tag acc 97.75% dev head acc 88.12% dev deprel acc 92.16%
[hops] 2024-09-23 16:20:16.738 | INFO | Epoch 40: train loss 0.1085 dev loss 0.6398 dev tag acc 97.85% dev head acc 88.15% dev deprel acc 92.10%
[hops] 2024-09-23 16:20:25.222 | INFO | Epoch 41: train loss 0.1037 dev loss 0.6470 dev tag acc 97.89% dev head acc 88.26% dev deprel acc 92.18%
[hops] 2024-09-23 16:20:33.433 | INFO | Epoch 42: train loss 0.1017 dev loss 0.6571 dev tag acc 97.86% dev head acc 88.01% dev deprel acc 92.11%
[hops] 2024-09-23 16:20:41.541 | INFO | Epoch 43: train loss 0.1002 dev loss 0.6578 dev tag acc 97.91% dev head acc 88.16% dev deprel acc 92.24%
[hops] 2024-09-23 16:20:49.568 | INFO | Epoch 44: train loss 0.0947 dev loss 0.6615 dev tag acc 97.95% dev head acc 88.47% dev deprel acc 92.31%
[hops] 2024-09-23 16:20:49.569 | INFO | New best model: head accuracy 88.47% > 88.34%
[hops] 2024-09-23 16:20:59.008 | INFO | Epoch 45: train loss 0.0922 dev loss 0.6607 dev tag acc 97.95% dev head acc 88.04% dev deprel acc 92.31%
[hops] 2024-09-23 16:21:06.657 | INFO | Epoch 46: train loss 0.0900 dev loss 0.6693 dev tag acc 98.02% dev head acc 88.31% dev deprel acc 92.22%
[hops] 2024-09-23 16:21:14.318 | INFO | Epoch 47: train loss 0.0843 dev loss 0.6898 dev tag acc 97.93% dev head acc 88.45% dev deprel acc 92.34%
[hops] 2024-09-23 16:21:21.994 | INFO | Epoch 48: train loss 0.0823 dev loss 0.6972 dev tag acc 98.08% dev head acc 88.53% dev deprel acc 92.38%
[hops] 2024-09-23 16:21:21.995 | INFO | New best model: head accuracy 88.53% > 88.47%
[hops] 2024-09-23 16:21:31.473 | INFO | Epoch 49: train loss 0.0809 dev loss 0.6935 dev tag acc 98.13% dev head acc 88.57% dev deprel acc 92.51%
[hops] 2024-09-23 16:21:31.474 | INFO | New best model: head accuracy 88.57% > 88.53%
[hops] 2024-09-23 16:21:41.678 | INFO | Epoch 50: train loss 0.0764 dev loss 0.7254 dev tag acc 98.09% dev head acc 88.46% dev deprel acc 92.61%
[hops] 2024-09-23 16:21:49.318 | INFO | Epoch 51: train loss 0.0743 dev loss 0.7382 dev tag acc 98.05% dev head acc 88.35% dev deprel acc 92.48%
[hops] 2024-09-23 16:21:56.978 | INFO | Epoch 52: train loss 0.0765 dev loss 0.7135 dev tag acc 98.13% dev head acc 88.49% dev deprel acc 92.55%
[hops] 2024-09-23 16:22:04.687 | INFO | Epoch 53: train loss 0.0736 dev loss 0.7191 dev tag acc 98.10% dev head acc 88.55% dev deprel acc 92.50%
[hops] 2024-09-23 16:22:12.515 | INFO | Epoch 54: train loss 0.0703 dev loss 0.7270 dev tag acc 98.12% dev head acc 88.46% dev deprel acc 92.49%
[hops] 2024-09-23 16:22:20.205 | INFO | Epoch 55: train loss 0.0693 dev loss 0.7236 dev tag acc 98.13% dev head acc 88.58% dev deprel acc 92.31%
[hops] 2024-09-23 16:22:20.205 | INFO | New best model: head accuracy 88.58% > 88.57%
[hops] 2024-09-23 16:22:29.691 | INFO | Epoch 56: train loss 0.0676 dev loss 0.7135 dev tag acc 98.12% dev head acc 88.74% dev deprel acc 92.52%
[hops] 2024-09-23 16:22:29.691 | INFO | New best model: head accuracy 88.74% > 88.58%
[hops] 2024-09-23 16:22:39.168 | INFO | Epoch 57: train loss 0.0656 dev loss 0.7150 dev tag acc 98.18% dev head acc 88.74% dev deprel acc 92.42%
[hops] 2024-09-23 16:22:46.828 | INFO | Epoch 58: train loss 0.0653 dev loss 0.7252 dev tag acc 98.13% dev head acc 88.66% dev deprel acc 92.40%
[hops] 2024-09-23 16:22:54.486 | INFO | Epoch 59: train loss 0.0623 dev loss 0.7325 dev tag acc 98.13% dev head acc 88.74% dev deprel acc 92.43%
[hops] 2024-09-23 16:23:02.154 | INFO | Epoch 60: train loss 0.0619 dev loss 0.7384 dev tag acc 98.13% dev head acc 88.79% dev deprel acc 92.41%
[hops] 2024-09-23 16:23:02.155 | INFO | New best model: head accuracy 88.79% > 88.74%
[hops] 2024-09-23 16:23:11.676 | INFO | Epoch 61: train loss 0.0638 dev loss 0.7352 dev tag acc 98.15% dev head acc 88.80% dev deprel acc 92.47%
[hops] 2024-09-23 16:23:11.677 | INFO | New best model: head accuracy 88.80% > 88.79%
[hops] 2024-09-23 16:23:21.684 | INFO | Epoch 62: train loss 0.0625 dev loss 0.7389 dev tag acc 98.15% dev head acc 88.80% dev deprel acc 92.50%
[hops] 2024-09-23 16:23:29.540 | INFO | Epoch 63: train loss 0.0602 dev loss 0.7405 dev tag acc 98.16% dev head acc 88.82% dev deprel acc 92.50%
[hops] 2024-09-23 16:23:29.541 | INFO | New best model: head accuracy 88.82% > 88.80%
[hops] 2024-09-23 16:23:35.588 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-23 16:23:41.549 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-23 16:23:43.669 | INFO | Metrics for Rhapsodie-camembertv2_base_p2_17k_last_layer+rand_seed=123
───────────────────────────────
 Split   UPOS    UAS     LAS
───────────────────────────────
 Dev     98.16   88.88   85.00
 Test    97.56   88.98   84.50
───────────────────────────────