davda54 committed
Commit 0811758 · 1 Parent(s): f0a4fc8

Create README.md

Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ language:
+ - 'no'
+ - nb
+ - nn
+ inference: false
+ tags:
+ - BERT
+ - NorBERT
+ - Norwegian
+ - encoder
+ license: cc-by-4.0
+ ---
+
+ # NorBERT 3 large
+
+
+ ## Other sizes:
+ - [NorBERT 3 xs (15M)](https://huggingface.co/ltg/norbert3-xs)
+ - [NorBERT 3 small (40M)](https://huggingface.co/ltg/norbert3-small)
+ - [NorBERT 3 base (123M)](https://huggingface.co/ltg/norbert3-base)
+ - [NorBERT 3 large (323M)](https://huggingface.co/ltg/norbert3-large)
+
+
+ ## Example usage
+
+ This model currently requires the custom wrapper in `modeling_norbert.py`. With that file importable, you can use the model like this:
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer
+ from modeling_norbert import NorbertForMaskedLM
+
+ # Load the tokenizer and the masked-LM wrapper from a local copy of the model.
+ tokenizer = AutoTokenizer.from_pretrained("path/to/folder")
+ bert = NorbertForMaskedLM.from_pretrained("path/to/folder")
+
+ mask_id = tokenizer.convert_tokens_to_ids("[MASK]")
+ input_text = tokenizer("Nå ønsker de seg en[MASK] bolig.", return_tensors="pt")
+ output_p = bert(**input_text)
+ # Replace each [MASK] position with the most likely token; keep all other tokens.
+ output_text = torch.where(input_text.input_ids == mask_id, output_p.logits.argmax(-1), input_text.input_ids)
+
+ # should output: '[CLS] Nå ønsker de seg en ny bolig.[SEP]'
+ print(tokenizer.decode(output_text[0].tolist()))
+ ```
+
+ The following classes are currently implemented: `NorbertForMaskedLM`, `NorbertForSequenceClassification`, `NorbertForTokenClassification`, `NorbertForQuestionAnswering` and `NorbertForMultipleChoice`.
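
The mask-filling step in the usage example (the `torch.where` call) can be illustrated without loading the model. The sketch below is a toy version in plain Python: the six-entry vocabulary, the token ids, and the scores are all invented for illustration and are not taken from the NorBERT tokenizer or model, and `fill_masks` is a hypothetical helper, not part of `modeling_norbert.py`.

```python
# Toy vocabulary (hypothetical ids, not the real NorBERT vocabulary).
VOCAB = ["[CLS]", "[SEP]", "[MASK]", "en", "ny", "bolig"]
MASK_ID = VOCAB.index("[MASK]")


def fill_masks(input_ids, logits, mask_id):
    """Replace every mask position with the argmax of its score row.

    Non-mask positions keep their original id, regardless of their scores.
    """
    filled = []
    for token_id, scores in zip(input_ids, logits):
        if token_id == mask_id:
            # Index of the highest score, i.e. the most likely token id.
            best_id = max(range(len(scores)), key=scores.__getitem__)
            filled.append(best_id)
        else:
            filled.append(token_id)
    return filled


# "[CLS] en [MASK] bolig [SEP]" as toy ids.
input_ids = [0, 3, 2, 5, 1]

# One row of invented scores per position, one score per vocabulary entry.
logits = [
    [0.1, 0.2, 0.0, 0.9, 0.3, 0.1],  # ignored: position is not a mask
    [0.5, 0.1, 0.0, 0.2, 0.1, 0.3],  # ignored: position is not a mask
    [0.0, 0.1, 0.2, 0.4, 3.0, 0.5],  # mask position: highest score at id 4
    [0.3, 0.2, 0.1, 0.0, 0.6, 0.1],  # ignored: position is not a mask
    [0.9, 0.1, 0.0, 0.2, 0.1, 0.3],  # ignored: position is not a mask
]

filled_ids = fill_masks(input_ids, logits, MASK_ID)
print(" ".join(VOCAB[i] for i in filled_ids))
```

Only the masked position is overwritten; the scores at the other positions are computed but discarded, which is the behavior the usage example gets by passing the original `input_ids` as the fallback value.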