wissamantoun committed
Commit 2428af1
1 Parent(s): c3a669e

Upload folder using huggingface_hub
README.md ADDED
---
language: fr
license: mit
tags:
- deberta-v2
- token-classification
base_model: almanach/camembertav2-base
datasets:
- GSD
metrics:
- las
- upos
model-index:
- name: almanach/camembertav2-base-gsd
  results:
  - task:
      type: token-classification
      name: Part-of-Speech Tagging
    dataset:
      type: GSD
      name: GSD
    metrics:
    - name: upos
      type: upos
      value: 0.98572
      verified: false
  - task:
      type: token-classification
      name: Dependency Parsing
    dataset:
      type: GSD
      name: GSD
    metrics:
    - name: las
      type: las
      value: 0.94517
      verified: false
---

# Model Card for almanach/camembertav2-base-gsd

almanach/camembertav2-base-gsd is a deberta-v2 model for token classification, finetuned on the GSD treebank for Part-of-Speech Tagging and Dependency Parsing.
It achieves a UPOS accuracy of 0.98572 and a labeled attachment score (LAS) of 0.94517 on the GSD test set.

The model is part of the almanach/camembertav2-base family of finetuned models.

## Model Details

### Model Description

- **Developed by:** Wissam Antoun (PhD student at ALMAnaCH, Inria Paris)
- **Model type:** deberta-v2
- **Language(s) (NLP):** French
- **License:** MIT
- **Finetuned from model:** almanach/camembertav2-base

### Model Sources

- **Repository:** https://github.com/WissamAntoun/camemberta
- **Paper:** https://arxiv.org/abs/2411.08868

## Uses

The model can be used for token classification tasks in French, namely Part-of-Speech Tagging and Dependency Parsing.

## Bias, Risks, and Limitations

The model may exhibit biases inherited from its training data, and it may not generalize well to datasets, domains, or tasks beyond the GSD treebank it was finetuned on.

## How to Get Started with the Model

You can use the model directly with the [hopsparser](https://github.com/hopsparser/hopsparser) library, for example in server mode: https://github.com/hopsparser/hopsparser/blob/main/docs/server.md

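A minimal local sketch of one way to run the parser, assuming the `hopsparser parse` command-line interface described in the hopsparser documentation (check the exact arguments against the docs linked above):

```python
# Sketch: download this repo and parse a CoNLL-U file with the hopsparser CLI.
# The `hopsparser parse` arguments below follow the hopsparser docs and are
# assumptions, not part of this model card.
import subprocess

from huggingface_hub import snapshot_download

# Fetch every file in this repository (the parser itself lives in `model/`).
local_dir = snapshot_download("almanach/camembertav2-base-gsd")

subprocess.run(
    ["hopsparser", "parse", f"{local_dir}/model", "input.conllu", "output.conllu"],
    check=True,
)
```
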
## Training Details

### Training Procedure

The model was trained with the [hopsparser](https://github.com/hopsparser/hopsparser) library on the GSD dataset.

#### Training Hyperparameters

```yml
# Layer dimensions
mlp_input: 1024
mlp_tag_hidden: 16
mlp_arc_hidden: 512
mlp_lab_hidden: 128
# Lexers
lexers:
  - name: word_embeddings
    type: words
    embedding_size: 256
    word_dropout: 0.5
  - name: char_level_embeddings
    type: chars_rnn
    embedding_size: 64
    lstm_output_size: 128
  - name: fasttext
    type: fasttext
  - name: camembertav2_base_p2_17k_last_layer
    type: bert
    model: /scratch/camembertv2/runs/models/camembertav2-base-bf16/post/ckpt-p2-17000/pt/discriminator/
    layers: [11]
    subwords_reduction: "mean"
# Training hyperparameters
encoder_dropout: 0.5
mlp_dropout: 0.5
batch_size: 8
epochs: 64
lr:
  base: 0.00003
  schedule:
    shape: linear
    warmup_steps: 100
```
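
The `lr` block reads as a linear warmup-then-decay schedule: ramp from 0 to the base rate of 3e-5 over the first 100 optimizer steps, then decay linearly for the rest of training. A minimal sketch of that shape (my reading of the config, not hopsparser's exact implementation):

```python
def linear_schedule_lr(
    step: int, total_steps: int, base_lr: float = 3e-5, warmup_steps: int = 100
) -> float:
    """Linear warmup to base_lr, then linear decay to 0 (sketch)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(total_steps - warmup_steps, 1)
    return base_lr * max(0.0, (total_steps - step) / remaining)
```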

#### Results

- **UPOS:** 0.98572
- **LAS:** 0.94517

Both scores are on the GSD test split (see `train.log` below).

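The repository also ships the parsed dev and test files (`fr_gsd-ud-*.parsed.conllu`), so these numbers can be re-derived by comparing them to the gold treebank. A self-contained sketch of the two metrics (simplified: it scores every basic-token row, and the gold file name is an assumption since the repo contains only the parsed outputs):

```python
# Sketch: recompute UPOS accuracy and LAS from gold vs. parsed CoNLL-U files.
def read_tokens(path: str):
    """Yield (UPOS, HEAD, DEPREL) for each token row of a CoNLL-U file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            if not cols[0].isdigit():  # skip multiword (1-2) and empty (1.1) nodes
                continue
            yield cols[3], cols[6], cols[7]


def score(gold_path: str, pred_path: str) -> tuple[float, float]:
    total = upos_ok = las_ok = 0
    for (g_pos, g_head, g_rel), (p_pos, p_head, p_rel) in zip(
        read_tokens(gold_path), read_tokens(pred_path)
    ):
        total += 1
        upos_ok += g_pos == p_pos
        las_ok += (g_head, g_rel) == (p_head, p_rel)
    return upos_ok / total, las_ok / total


# "fr_gsd-ud-test.conllu" is the standard UD file name, assumed here.
upos, las = score("fr_gsd-ud-test.conllu", "fr_gsd-ud-test.parsed.conllu")
print(f"UPOS {upos:.5f}  LAS {las:.5f}")
```
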
## Technical Specifications

### Model Architecture and Objective

A custom deberta-v2 based model for token classification: a hopsparser biaffine parsing and tagging head on top of the camembertav2-base encoder, combined with word, character, and fastText lexers.

## Citation

**BibTeX:**

```bibtex
@misc{antoun2024camembert20smarterfrench,
  title={CamemBERT 2.0: A Smarter French Language Model Aged to Perfection},
  author={Wissam Antoun and Francis Kulumba and Rian Touchent and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
  year={2024},
  eprint={2411.08868},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2411.08868},
}

@inproceedings{grobol:hal-03223424,
  title = {Analyse en dépendances du français avec des plongements contextualisés},
  author = {Grobol, Loïc and Crabbé, Benoît},
  url = {https://hal.archives-ouvertes.fr/hal-03223424},
  booktitle = {Actes de la 28ème Conférence sur le Traitement Automatique des Langues Naturelles},
  eventtitle = {TALN-RÉCITAL 2021},
  venue = {Lille, France},
  pdf = {https://hal.archives-ouvertes.fr/hal-03223424/file/HOPS_final.pdf},
  hal_id = {hal-03223424},
  hal_version = {v1},
}
```
camembertav2_base_p2_17k_last_layer.yaml ADDED
# Layer dimensions
mlp_input: 1024
mlp_tag_hidden: 16
mlp_arc_hidden: 512
mlp_lab_hidden: 128
# Lexers
lexers:
  - name: word_embeddings
    type: words
    embedding_size: 256
    word_dropout: 0.5
  - name: char_level_embeddings
    type: chars_rnn
    embedding_size: 64
    lstm_output_size: 128
  - name: fasttext
    type: fasttext
  - name: camembertav2_base_p2_17k_last_layer
    type: bert
    model: /scratch/camembertv2/runs/models/camembertav2-base-bf16/post/ckpt-p2-17000/pt/discriminator/
    layers: [11]
    subwords_reduction: "mean"
# Training hyperparameters
encoder_dropout: 0.5
mlp_dropout: 0.5
batch_size: 8
epochs: 64
lr:
  base: 0.00003
  schedule:
    shape: linear
    warmup_steps: 100
fr_gsd-ud-dev.parsed.conllu ADDED
The diff for this file is too large to render.
 
fr_gsd-ud-test.parsed.conllu ADDED
The diff for this file is too large to render.
 
model/config.json ADDED
{
  "mlp_input": 1024,
  "mlp_tag_hidden": 16,
  "mlp_arc_hidden": 512,
  "mlp_lab_hidden": 128,
  "biased_biaffine": true,
  "default_batch_size": 8,
  "encoder_dropout": 0.5,
  "extra_annotations": {},
  "labels": ["acl", "acl:relcl", "advcl", "advcl:cleft", "advmod", "amod", "appos", "aux:caus", "aux:pass", "aux:tense", "case", "cc", "ccomp", "compound", "conj", "cop", "csubj", "csubj:pass", "dep", "dep:comp", "det", "discourse", "dislocated", "expl", "expl:pass", "expl:pv", "expl:subj", "fixed", "flat", "flat:foreign", "flat:name", "goeswith", "iobj", "iobj:agent", "mark", "nmod", "nsubj", "nsubj:caus", "nsubj:pass", "nummod", "obj", "obj:agent", "obj:lvc", "obl", "obl:agent", "obl:arg", "obl:mod", "orphan", "parataxis", "punct", "reparandum", "root", "vocative", "xcomp"],
  "mlp_dropout": 0.5,
  "tagset": ["ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X"],
  "lexers": {"word_embeddings": "words", "char_level_embeddings": "chars_rnn", "fasttext": "fasttext", "camembertav2_base_p2_17k_last_layer": "bert"},
  "multitask_loss": "sum"
}
model/lexers/camembertav2_base_p2_17k_last_layer/config.json ADDED
{
  "layers": [11],
  "subwords_reduction": "mean",
  "weight_layers": false
}
model/lexers/camembertav2_base_p2_17k_last_layer/model/config.json ADDED
{
  "_name_or_path": "/scratch/camembertv2/runs/models/camembertav2-base-bf16/post/ckpt-p2-17000/pt/discriminator/",
  "architectures": [
    "DebertaV2Model"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 1,
  "conv_act": "gelu",
  "conv_kernel_size": 0,
  "embedding_size": 768,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-07,
  "max_position_embeddings": 1024,
  "max_relative_positions": -1,
  "model_name": "camembertav2-base-bf16",
  "model_type": "deberta-v2",
  "norm_rel_ebd": "layer_norm",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_dropout": 0,
  "pooler_hidden_act": "gelu",
  "pooler_hidden_size": 768,
  "pos_att_type": [
    "p2c",
    "c2p"
  ],
  "position_biased_input": false,
  "position_buckets": 256,
  "relative_attention": true,
  "share_att_key": true,
  "torch_dtype": "float32",
  "transformers_version": "4.44.2",
  "type_vocab_size": 0,
  "vocab_size": 32768
}
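This is a standard `transformers` deberta-v2 configuration. The public base checkpoint can be loaded directly from the Hub; a quick sketch using `almanach/camembertav2-base` instead of the local `/scratch/...` path recorded above:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("almanach/camembertav2-base")
encoder = AutoModel.from_pretrained("almanach/camembertav2-base")

enc = tokenizer("Le chat dort sur le canapé.", return_tensors="pt")
# (1, seq_len, 768): the last hidden layer, which `layers: [11]` selects
# in the lexer config above.
hidden = encoder(**enc).last_hidden_state
print(hidden.shape)
```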
model/lexers/camembertav2_base_p2_17k_last_layer/model/special_tokens_map.json ADDED
{
  "bos_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
model/lexers/camembertav2_base_p2_17k_last_layer/model/tokenizer.json ADDED
The diff for this file is too large to render.
 
model/lexers/camembertav2_base_p2_17k_last_layer/model/tokenizer_config.json ADDED
{
  "add_prefix_space": true,
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "4": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "[CLS]",
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "eos_token": "[SEP]",
  "errors": "replace",
  "mask_token": "[MASK]",
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "tokenizer_class": "RobertaTokenizer",
  "trim_offsets": true,
  "unk_token": "[UNK]"
}
model/lexers/char_level_embeddings/config.json ADDED
{
  "char_embeddings_dim": 64,
  "output_dim": 128,
  "special_tokens": ["<root>"],
  "charset": ["<pad>", "<special>", " ", "!", "\"", "#", "$", "%", "&", "'", "(", ")", "*", "+", ",", "-", ".", "/", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", ":", ";", "=", ">", "?", "@", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "[", "]", "^", "_", "`", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "{", "|", "}", "\u00a3", "\u00ab", "\u00b0", "\u00b1", "\u00b2", "\u00b3", "\u00b7", "\u00ba", "\u00bb", "\u00c0", "\u00c1", "\u00c2", "\u00c5", "\u00c6", "\u00c7", "\u00c8", "\u00c9", "\u00ca", "\u00cd", "\u00ce", "\u00d3", "\u00d4", "\u00d6", "\u00d7", "\u00d9", "\u00da", "\u00dc", "\u00df", "\u00e0", "\u00e1", "\u00e2", "\u00e3", "\u00e4", "\u00e5", "\u00e6", "\u00e7", "\u00e8", "\u00e9", "\u00ea", "\u00eb", "\u00ec", "\u00ed", "\u00ee", "\u00ef", "\u00f0", "\u00f1", "\u00f2", "\u00f3", "\u00f4", "\u00f6", "\u00f8", "\u00f9", "\u00fa", "\u00fb", "\u00fc", "\u00fd", "\u00ff", "\u0101", "\u0103", "\u0105", "\u0107", "\u010c", "\u010d", "\u0119", "\u011b", "\u011f", "\u0123", "\u012b", "\u012d", "\u0131", "\u013d", "\u013e", "\u0141", "\u0142", "\u0144", "\u0148", "\u014c", "\u014d", "\u0151", "\u0153", "\u0159", "\u015b", "\u015f", "\u0160", "\u0161", "\u0163", "\u0169", "\u016b", "\u017b", "\u017c", "\u017d", "\u017e", "\u01b0", "\u025f", "\u0268", "\u0274", "\u0282", "\u02bf", "\u0301", "\u0361", "\u03a9", "\u03b3", "\u03b5", "\u03c9", "\u0409", "\u040f", "\u0410", "\u0411", "\u0412", "\u0413", "\u0414", "\u0418", "\u041b", "\u041c", "\u041e", "\u041f", "\u0420", "\u0421", "\u0422", "\u0424", "\u0428", "\u0430", "\u0431", "\u0432", "\u0433", "\u0434", "\u0435", "\u0436", "\u0437", "\u0438", "\u0439", "\u043a", "\u043b", "\u043c", "\u043d", "\u043e", "\u043f", "\u0440", "\u0441", "\u0442", "\u0443", "\u0445", "\u0446", "\u0447", "\u0448", "\u0449", "\u044a", "\u044c", "\u044f", "\u0451", "\u0458", "\u0459", "\u045a", "\u045b", "\u0627", "\u062c", "\u062f", "\u0630", "\u0631", "\u0634", "\u0643", "\u0644", "\u0645", "\u0646", "\u1e0f", "\u1e25", "\u1e92", "\u1ea3", "\u1ead", "\u1ec5", "\u1edd", "\u1edf", "\u1ee7", "\u1ef1", "\u2013", "\u2014", "\u2020", "\u2032", "\u2082", "\u20ac", "\u2212", "\u25b6", "\u4e0a", "\u4e2d", "\u4e34", "\u4e49", "\u4e59", "\u4e95", "\u4ecb", "\u4f0e", "\u5247", "\u53f7", "\u56db", "\u5712", "\u5927", "\u5b89", "\u5bae", "\u5bbf", "\u5f81", "\u614e", "\u697d", "\u6d4e", "\u706b", "\u7384", "\u7530", "\u753a", "\u7bad", "\u7c89", "\u80e1", "\u82a6", "\u85e9", "\u898f", "\u90e8", "\u957f", "\uac15", "\uc131", "\ud638"]
}
model/lexers/fasttext/config.json ADDED
{"special_tokens": ["<root>"]}
model/lexers/fasttext/fasttext_model.bin ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:8d6f1eb15261a1adf98177911e327232fe0071037e3a425fae779dd864a9cb58
size 805269874
model/lexers/word_embeddings/config.json ADDED
The diff for this file is too large to render.
 
model/weights.pt ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:61fde4fa2810db2c82bf5742194cf186d99964bf36122a77a7de4baa2580119b
size 1811911596
train.log ADDED
[hops] 2024-09-23 23:33:51.375 | INFO | Initializing a parser from /workspace/configs/exp_camembertv2/camembertav2_base_p2_17k_last_layer.yaml
[hops] 2024-09-23 23:33:51.646 | INFO | Generating a FastText model from the treebank
[hops] 2024-09-23 23:33:51.732 | INFO | Training fasttext model
[hops] 2024-09-23 23:34:06.093 | INFO | Start training on cuda:1
[hops] 2024-09-23 23:34:06.099 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-23 23:35:55.963 | INFO | Epoch 0: train loss 1.0562 dev loss 0.2558 dev tag acc 96.04% dev head acc 90.39% dev deprel acc 93.38%
[hops] 2024-09-23 23:35:55.965 | INFO | New best model: head accuracy 90.39% > 0.00%
[hops] 2024-09-23 23:37:48.043 | INFO | Epoch 1: train loss 0.2514 dev loss 0.1541 dev tag acc 98.06% dev head acc 94.42% dev deprel acc 95.97%
[hops] 2024-09-23 23:37:48.044 | INFO | New best model: head accuracy 94.42% > 90.39%
[hops] 2024-09-23 23:39:43.313 | INFO | Epoch 2: train loss 0.1571 dev loss 0.1317 dev tag acc 98.31% dev head acc 95.61% dev deprel acc 96.65%
[hops] 2024-09-23 23:39:43.314 | INFO | New best model: head accuracy 95.61% > 94.42%
[hops] 2024-09-23 23:41:33.012 | INFO | Epoch 3: train loss 0.1159 dev loss 0.1340 dev tag acc 98.45% dev head acc 96.15% dev deprel acc 97.10%
[hops] 2024-09-23 23:41:33.012 | INFO | New best model: head accuracy 96.15% > 95.61%
[hops] 2024-09-23 23:43:24.869 | INFO | Epoch 4: train loss 0.0939 dev loss 0.1318 dev tag acc 98.52% dev head acc 96.17% dev deprel acc 97.31%
[hops] 2024-09-23 23:43:24.870 | INFO | New best model: head accuracy 96.17% > 96.15%
[hops] 2024-09-23 23:45:15.729 | INFO | Epoch 5: train loss 0.0764 dev loss 0.1402 dev tag acc 98.63% dev head acc 96.43% dev deprel acc 97.52%
[hops] 2024-09-23 23:45:15.730 | INFO | New best model: head accuracy 96.43% > 96.17%
[hops] 2024-09-23 23:47:08.331 | INFO | Epoch 6: train loss 0.0641 dev loss 0.1569 dev tag acc 98.56% dev head acc 96.39% dev deprel acc 97.55%
[hops] 2024-09-23 23:48:59.039 | INFO | Epoch 7: train loss 0.0546 dev loss 0.1513 dev tag acc 98.55% dev head acc 96.48% dev deprel acc 97.54%
[hops] 2024-09-23 23:48:59.041 | INFO | New best model: head accuracy 96.48% > 96.43%
[hops] 2024-09-23 23:50:49.507 | INFO | Epoch 8: train loss 0.0466 dev loss 0.1620 dev tag acc 98.58% dev head acc 96.69% dev deprel acc 97.63%
[hops] 2024-09-23 23:50:49.508 | INFO | New best model: head accuracy 96.69% > 96.48%
[hops] 2024-09-23 23:52:41.287 | INFO | Epoch 9: train loss 0.0402 dev loss 0.1985 dev tag acc 98.55% dev head acc 96.55% dev deprel acc 97.59%
[hops] 2024-09-23 23:54:29.356 | INFO | Epoch 10: train loss 0.0351 dev loss 0.2068 dev tag acc 98.56% dev head acc 96.61% dev deprel acc 97.64%
[hops] 2024-09-23 23:56:18.326 | INFO | Epoch 11: train loss 0.0313 dev loss 0.2216 dev tag acc 98.58% dev head acc 96.68% dev deprel acc 97.65%
[hops] 2024-09-23 23:58:08.684 | INFO | Epoch 12: train loss 0.0283 dev loss 0.2436 dev tag acc 98.57% dev head acc 96.64% dev deprel acc 97.61%
[hops] 2024-09-23 23:59:58.968 | INFO | Epoch 13: train loss 0.0264 dev loss 0.2162 dev tag acc 98.66% dev head acc 96.66% dev deprel acc 97.70%
[hops] 2024-09-24 00:01:54.046 | INFO | Epoch 14: train loss 0.0229 dev loss 0.2235 dev tag acc 98.62% dev head acc 96.84% dev deprel acc 97.77%
[hops] 2024-09-24 00:01:54.047 | INFO | New best model: head accuracy 96.84% > 96.69%
[hops] 2024-09-24 00:03:43.593 | INFO | Epoch 15: train loss 0.0205 dev loss 0.2751 dev tag acc 98.58% dev head acc 96.62% dev deprel acc 97.55%
[hops] 2024-09-24 00:05:39.698 | INFO | Epoch 16: train loss 0.0191 dev loss 0.2808 dev tag acc 98.61% dev head acc 96.76% dev deprel acc 97.69%
[hops] 2024-09-24 00:07:31.899 | INFO | Epoch 17: train loss 0.0176 dev loss 0.2993 dev tag acc 98.60% dev head acc 96.70% dev deprel acc 97.66%
[hops] 2024-09-24 00:09:19.834 | INFO | Epoch 18: train loss 0.0160 dev loss 0.2995 dev tag acc 98.61% dev head acc 96.80% dev deprel acc 97.68%
[hops] 2024-09-24 00:11:12.806 | INFO | Epoch 19: train loss 0.0148 dev loss 0.3006 dev tag acc 98.66% dev head acc 96.75% dev deprel acc 97.71%
[hops] 2024-09-24 00:13:04.349 | INFO | Epoch 20: train loss 0.0135 dev loss 0.3259 dev tag acc 98.60% dev head acc 96.73% dev deprel acc 97.68%
[hops] 2024-09-24 00:14:54.792 | INFO | Epoch 21: train loss 0.0134 dev loss 0.3173 dev tag acc 98.63% dev head acc 96.77% dev deprel acc 97.75%
[hops] 2024-09-24 00:16:44.895 | INFO | Epoch 22: train loss 0.0121 dev loss 0.3125 dev tag acc 98.57% dev head acc 96.77% dev deprel acc 97.68%
[hops] 2024-09-24 00:18:36.256 | INFO | Epoch 23: train loss 0.0116 dev loss 0.3590 dev tag acc 98.60% dev head acc 96.81% dev deprel acc 97.70%
[hops] 2024-09-24 00:20:24.902 | INFO | Epoch 24: train loss 0.0110 dev loss 0.3452 dev tag acc 98.62% dev head acc 96.73% dev deprel acc 97.79%
[hops] 2024-09-24 00:22:17.255 | INFO | Epoch 25: train loss 0.0106 dev loss 0.3329 dev tag acc 98.66% dev head acc 96.80% dev deprel acc 97.74%
[hops] 2024-09-24 00:24:08.189 | INFO | Epoch 26: train loss 0.0095 dev loss 0.3955 dev tag acc 98.62% dev head acc 96.79% dev deprel acc 97.75%
[hops] 2024-09-24 00:25:56.986 | INFO | Epoch 27: train loss 0.0089 dev loss 0.3992 dev tag acc 98.65% dev head acc 96.82% dev deprel acc 97.70%
[hops] 2024-09-24 00:27:47.176 | INFO | Epoch 28: train loss 0.0081 dev loss 0.3990 dev tag acc 98.62% dev head acc 96.77% dev deprel acc 97.70%
[hops] 2024-09-24 00:29:35.106 | INFO | Epoch 29: train loss 0.0080 dev loss 0.3668 dev tag acc 98.68% dev head acc 96.81% dev deprel acc 97.74%
[hops] 2024-09-24 00:31:26.250 | INFO | Epoch 30: train loss 0.0073 dev loss 0.4144 dev tag acc 98.68% dev head acc 96.69% dev deprel acc 97.74%
[hops] 2024-09-24 00:33:13.839 | INFO | Epoch 31: train loss 0.0071 dev loss 0.3842 dev tag acc 98.66% dev head acc 96.82% dev deprel acc 97.78%
[hops] 2024-09-24 00:35:08.530 | INFO | Epoch 32: train loss 0.0067 dev loss 0.4032 dev tag acc 98.73% dev head acc 96.79% dev deprel acc 97.73%
[hops] 2024-09-24 00:36:59.998 | INFO | Epoch 33: train loss 0.0063 dev loss 0.4348 dev tag acc 98.68% dev head acc 96.81% dev deprel acc 97.74%
[hops] 2024-09-24 00:38:50.141 | INFO | Epoch 34: train loss 0.0060 dev loss 0.4725 dev tag acc 98.69% dev head acc 96.79% dev deprel acc 97.72%
[hops] 2024-09-24 00:40:41.736 | INFO | Epoch 35: train loss 0.0056 dev loss 0.4214 dev tag acc 98.70% dev head acc 96.84% dev deprel acc 97.75%
[hops] 2024-09-24 00:40:41.737 | INFO | New best model: head accuracy 96.84% > 96.84%
[hops] 2024-09-24 00:42:33.044 | INFO | Epoch 36: train loss 0.0057 dev loss 0.4583 dev tag acc 98.68% dev head acc 96.81% dev deprel acc 97.78%
[hops] 2024-09-24 00:44:25.093 | INFO | Epoch 37: train loss 0.0054 dev loss 0.4637 dev tag acc 98.68% dev head acc 96.73% dev deprel acc 97.80%
[hops] 2024-09-24 00:46:17.082 | INFO | Epoch 38: train loss 0.0046 dev loss 0.4691 dev tag acc 98.68% dev head acc 96.92% dev deprel acc 97.78%
[hops] 2024-09-24 00:46:17.083 | INFO | New best model: head accuracy 96.92% > 96.84%
[hops] 2024-09-24 00:48:08.386 | INFO | Epoch 39: train loss 0.0045 dev loss 0.4648 dev tag acc 98.67% dev head acc 96.98% dev deprel acc 97.82%
[hops] 2024-09-24 00:48:08.387 | INFO | New best model: head accuracy 96.98% > 96.92%
[hops] 2024-09-24 00:49:58.962 | INFO | Epoch 40: train loss 0.0046 dev loss 0.4496 dev tag acc 98.65% dev head acc 96.93% dev deprel acc 97.75%
[hops] 2024-09-24 00:51:50.710 | INFO | Epoch 41: train loss 0.0042 dev loss 0.5003 dev tag acc 98.62% dev head acc 96.90% dev deprel acc 97.81%
[hops] 2024-09-24 00:53:39.287 | INFO | Epoch 42: train loss 0.0038 dev loss 0.5135 dev tag acc 98.66% dev head acc 96.89% dev deprel acc 97.82%
[hops] 2024-09-24 00:55:32.513 | INFO | Epoch 43: train loss 0.0039 dev loss 0.4876 dev tag acc 98.64% dev head acc 96.90% dev deprel acc 97.80%
[hops] 2024-09-24 00:57:22.732 | INFO | Epoch 44: train loss 0.0034 dev loss 0.5043 dev tag acc 98.66% dev head acc 96.89% dev deprel acc 97.78%
[hops] 2024-09-24 00:59:14.557 | INFO | Epoch 45: train loss 0.0032 dev loss 0.4942 dev tag acc 98.64% dev head acc 96.93% dev deprel acc 97.80%
[hops] 2024-09-24 01:01:07.692 | INFO | Epoch 46: train loss 0.0030 dev loss 0.5200 dev tag acc 98.70% dev head acc 96.93% dev deprel acc 97.80%
[hops] 2024-09-24 01:02:54.754 | INFO | Epoch 47: train loss 0.0029 dev loss 0.5822 dev tag acc 98.69% dev head acc 96.87% dev deprel acc 97.76%
[hops] 2024-09-24 01:04:41.248 | INFO | Epoch 48: train loss 0.0028 dev loss 0.5760 dev tag acc 98.67% dev head acc 96.84% dev deprel acc 97.76%
[hops] 2024-09-24 01:06:35.294 | INFO | Epoch 49: train loss 0.0025 dev loss 0.5652 dev tag acc 98.68% dev head acc 96.91% dev deprel acc 97.81%
[hops] 2024-09-24 01:08:28.546 | INFO | Epoch 50: train loss 0.0028 dev loss 0.5409 dev tag acc 98.69% dev head acc 96.86% dev deprel acc 97.82%
[hops] 2024-09-24 01:10:18.082 | INFO | Epoch 51: train loss 0.0023 dev loss 0.5947 dev tag acc 98.68% dev head acc 96.90% dev deprel acc 97.76%
[hops] 2024-09-24 01:12:08.464 | INFO | Epoch 52: train loss 0.0022 dev loss 0.5857 dev tag acc 98.67% dev head acc 96.92% dev deprel acc 97.81%
[hops] 2024-09-24 01:13:58.452 | INFO | Epoch 53: train loss 0.0021 dev loss 0.5704 dev tag acc 98.66% dev head acc 96.95% dev deprel acc 97.80%
[hops] 2024-09-24 01:15:51.753 | INFO | Epoch 54: train loss 0.0017 dev loss 0.6437 dev tag acc 98.68% dev head acc 96.88% dev deprel acc 97.81%
[hops] 2024-09-24 01:17:45.296 | INFO | Epoch 55: train loss 0.0018 dev loss 0.6292 dev tag acc 98.69% dev head acc 96.93% dev deprel acc 97.82%
[hops] 2024-09-24 01:19:33.503 | INFO | Epoch 56: train loss 0.0019 dev loss 0.6395 dev tag acc 98.68% dev head acc 96.87% dev deprel acc 97.82%
[hops] 2024-09-24 01:21:23.357 | INFO | Epoch 57: train loss 0.0017 dev loss 0.6440 dev tag acc 98.69% dev head acc 96.93% dev deprel acc 97.82%
[hops] 2024-09-24 01:23:14.665 | INFO | Epoch 58: train loss 0.0014 dev loss 0.6858 dev tag acc 98.69% dev head acc 96.95% dev deprel acc 97.79%
[hops] 2024-09-24 01:25:07.000 | INFO | Epoch 59: train loss 0.0014 dev loss 0.6657 dev tag acc 98.69% dev head acc 96.94% dev deprel acc 97.79%
[hops] 2024-09-24 01:26:59.729 | INFO | Epoch 60: train loss 0.0015 dev loss 0.6721 dev tag acc 98.69% dev head acc 96.96% dev deprel acc 97.81%
[hops] 2024-09-24 01:28:50.260 | INFO | Epoch 61: train loss 0.0014 dev loss 0.6910 dev tag acc 98.69% dev head acc 96.96% dev deprel acc 97.81%
[hops] 2024-09-24 01:30:41.731 | INFO | Epoch 62: train loss 0.0011 dev loss 0.6898 dev tag acc 98.70% dev head acc 96.94% dev deprel acc 97.81%
[hops] 2024-09-24 01:32:31.913 | INFO | Epoch 63: train loss 0.0017 dev loss 0.6923 dev tag acc 98.70% dev head acc 96.97% dev deprel acc 97.82%
[hops] 2024-09-24 01:32:36.833 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-24 01:32:45.457 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-24 01:32:47.135 | INFO | Metrics for GSD-camembertav2_base_p2_17k_last_layer+rand_seed=666
───────────────────────────────
Split   UPOS    UAS     LAS
───────────────────────────────
Dev     98.67   97.00   95.91
Test    98.57   96.11   94.52
───────────────────────────────
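
The epoch lines follow a fixed format, so the dev curves are easy to extract. A small sketch (it assumes only the log format shown above):

```python
import re

# Matches e.g. "Epoch 0: train loss 1.0562 dev loss 0.2558 dev tag acc 96.04% ..."
pattern = re.compile(
    r"Epoch (\d+): train loss ([\d.]+) dev loss ([\d.]+) "
    r"dev tag acc ([\d.]+)% dev head acc ([\d.]+)% dev deprel acc ([\d.]+)%"
)

with open("train.log", encoding="utf-8") as f:
    rows = [tuple(map(float, m.groups())) for m in map(pattern.search, f) if m]

# hopsparser keeps the checkpoint with the best dev head accuracy, as the
# "New best model" lines above show.
best = max(rows, key=lambda r: r[4])
print(f"best epoch {int(best[0])}: dev head acc {best[4]:.2f}%")
```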