stefan-it committed on
Commit
4029f7e
1 Parent(s): dec0f7c

readme: add initial version

Files changed (1)
  1. README.md +111 -0
README.md ADDED
@@ -0,0 +1,111 @@
+ ---
+ datasets:
+ - germeval_14
+ tags:
+ - flair
+ - token-classification
+ - sequence-tagger-model
+ language: de
+ inference: false
+ license: mit
+ ---
+
+ # Flair NER model trained on the GermEval14 dataset
+
+ This model was trained on the official [GermEval14](https://sites.google.com/site/germeval2014ner/data)
+ dataset using the [Flair](https://github.com/flairNLP/flair) framework.
+
+ It is based on the German DistilBERT model [distilbert-base-german-cased](https://huggingface.co/distilbert-base-german-cased), which was fine-tuned for this task.
+
+ # Results
+
+ | Dataset \ Run | Run 1 | Run 2 | Run 3†    | Run 4 | Run 5 | Avg.  |
+ | ------------- | ----- | ----- | --------- | ----- | ----- | ----- |
+ | Development   | 87.05 | 86.52 | **87.34** | 86.85 | 86.46 | 86.84 |
+ | Test          | 85.43 | 85.88 | 85.72     | 85.47 | 85.62 | 85.62 |
+
+ † marks the run whose model was selected for upload; it has the highest development score (shown in bold).
+
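Review note: the Avg. column and the choice of Run 3 can be re-derived from the per-run scores. The sketch below is not part of the commit; it only copies the numbers from the table above and assumes that "Avg." is the arithmetic mean rounded to two decimals.

```python
from statistics import mean

# Scores copied from the results table (Run 1 .. Run 5).
scores = {
    "Development": [87.05, 86.52, 87.34, 86.85, 86.46],
    "Test": [85.43, 85.88, 85.72, 85.47, 85.62],
}

# Assumption: "Avg." is the arithmetic mean, rounded to two decimals.
averages = {split: round(mean(runs), 2) for split, runs in scores.items()}

# 1-based index of the run with the highest development score.
dev = scores["Development"]
best_run = max(range(len(dev)), key=lambda i: dev[i]) + 1

print(averages)
print(best_run)
```

Both averages match the table, and the best development run is Run 3, the one marked with †.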
+ # Flair Fine-Tuning
+
+ We used the following script to fine-tune the model on the GermEval14 dataset:
+
+ ```python
+ from argparse import ArgumentParser
+ import torch, flair
+
+ # dataset, model and embedding imports
+ from flair.datasets import GERMEVAL_14
+ from flair.embeddings import TransformerWordEmbeddings
+ from flair.models import SequenceTagger
+ from flair.trainers import ModelTrainer
+
+ if __name__ == "__main__":
+
+     # All arguments that can be passed
+     parser = ArgumentParser()
+     parser.add_argument("-s", "--seeds", nargs='+', type=int, default=[42])  # pass list of seeds for experiments
+     parser.add_argument("-c", "--cuda", type=int, default=0, help="CUDA device")  # which cuda device to use
+     parser.add_argument("-m", "--model", type=str, required=True, help="Model name (such as Hugging Face model hub name)")
+
+     # Parse experimental arguments
+     args = parser.parse_args()
+
+     # use cuda device as passed
+     flair.device = torch.device(f"cuda:{args.cuda}")
+
+     # for each passed seed, do one experimental run
+     for seed in args.seeds:
+         flair.set_seed(seed)
+
+         # model
+         hf_model = args.model
+
+         # initialize embeddings
+         embeddings = TransformerWordEmbeddings(
+             model=hf_model,
+             layers="-1",
+             subtoken_pooling="first",
+             fine_tune=True,
+             use_context=False,
+             respect_document_boundaries=False,
+         )
+
+         # load the GermEval14 corpus
+         corpus = GERMEVAL_14()
+
+         # make the dictionary of tags to predict
+         tag_dictionary = corpus.make_tag_dictionary('ner')
+
+         # init bare-bones sequence tagger (no reprojection, LSTM or CRF)
+         tagger: SequenceTagger = SequenceTagger(
+             hidden_size=256,
+             embeddings=embeddings,
+             tag_dictionary=tag_dictionary,
+             tag_type='ner',
+             use_crf=False,
+             use_rnn=False,
+             reproject_embeddings=False,
+         )
+
+         # init the model trainer
+         trainer = ModelTrainer(tagger, corpus, optimizer=torch.optim.AdamW)
+
+         # make string for output folder
+         output_folder = f"flert-ner-{hf_model}-{seed}"
+
+         # fine-tune with AdamW, a small learning rate, 10 epochs and a one-cycle LR schedule
+         from torch.optim.lr_scheduler import OneCycleLR
+
+         trainer.train(
+             output_folder,
+             learning_rate=5.0e-5,
+             mini_batch_size=16,
+             mini_batch_chunk_size=1,
+             max_epochs=10,
+             scheduler=OneCycleLR,
+             embeddings_storage_mode='none',
+             weight_decay=0.,
+             train_with_dev=False,
+         )
+ ```
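Review note: the README shows how the model was trained but not how to apply it. The following is a minimal sketch of tagging text with a trained Flair `SequenceTagger`; it is not part of the commit. The helper names `tag_text` and `format_entities` are hypothetical, and the model location is left as a parameter because the README does not state a hub id or file path.

```python
from typing import List, Tuple

Entity = Tuple[str, str, float]  # (entity text, predicted tag, confidence)


def tag_text(model_path: str, text: str) -> List[Entity]:
    """Predict named entities in `text` with a trained Flair tagger."""
    # Imported lazily so this module can be imported without Flair installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    # `model_path` is an assumption: a local model file or a model hub id.
    tagger = SequenceTagger.load(model_path)
    sentence = Sentence(text)
    tagger.predict(sentence)
    return [(span.text, span.tag, span.score) for span in sentence.get_spans("ner")]


def format_entities(entities: List[Entity]) -> List[str]:
    """Render entities as 'text [TAG] (confidence)' strings."""
    return [f"{text} [{tag}] ({score:.2f})" for text, tag, score in entities]
```

A call might look like `format_entities(tag_text("<output_folder>/final-model.pt", "Angela Merkel besuchte Berlin ."))`, where `<output_folder>` is the folder written by the training script. The file name `final-model.pt` is an assumption about what the Flair trainer saves, so check the folder contents first.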