---
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
pipeline_tag: token-classification
tags:
- deberta-v3
datasets:
- DFKI-SLT/few-nerd
license: mit
---

## DeBERTa for Named Entity Recognition

I fine-tuned a pretrained DeBERTa-v3-base model on Few-NERD, an NER dataset that contains over 180k examples and over 4.6 million tokens.

The token labels are Person, Organisation, Location, Building, Event, Product, Art, and Misc.

## How to use the model

```python
from transformers import pipeline


def print_ner(sentences):
    """Merge adjacent tokens that share an entity label and print each entity."""
    for sentence in sentences:
        if not sentence:  # no entities were found in this sentence
            continue
        word = sentence[0]['word']
        last_entity_type = sentence[0]['entity']
        last_index = sentence[0]['index']
        for token in sentence[1:]:
            if token['entity'] == last_entity_type and token['index'] == last_index + 1:
                # Same label and adjacent position: extend the current entity.
                word += token['word']
            else:
                # '▁' is the SentencePiece word-boundary marker.
                print(f"{word.replace('▁', ' ').strip()} {last_entity_type}")
                word = token['word']
            last_entity_type = token['entity']
            last_index = token['index']
        # Print the final entity of the sentence.
        print(f"{word.replace('▁', ' ').strip()} {last_entity_type}")


pipe = pipeline('token-classification', model='RashidNLP/NER-Deberta')
results = pipe(["Elon Musk will be at SpaceX's Starbase facility in Boca Chica for the orbital launch of starship next month"])
print_ner(results)
```
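
If the entities are needed as data rather than printed text, the merging rule used by `print_ner` can be sketched as a function that returns `(text, label)` pairs. This is a minimal sketch: `group_entities` is a hypothetical helper, and the `sample` tokens and their label strings are hand-written for illustration, not real output from this model.

```python
def group_entities(tokens):
    """Merge adjacent tokens that share a label into (text, label) pairs."""
    entities = []
    for token in tokens:
        piece, label, index = token["word"], token["entity"], token["index"]
        same_entity = (
            entities
            and entities[-1]["label"] == label
            and entities[-1]["end"] + 1 == index
        )
        if same_entity:
            # Same label and adjacent position: extend the current entity.
            entities[-1]["text"] += piece
            entities[-1]["end"] = index
        else:
            entities.append({"text": piece, "label": label, "end": index})
    # '▁' is the SentencePiece word-boundary marker; turn it into spaces.
    return [(e["text"].replace("▁", " ").strip(), e["label"]) for e in entities]


# Hand-written tokens for illustration only (assumed label names).
sample = [
    {"word": "▁Elon", "entity": "person", "index": 1},
    {"word": "▁Musk", "entity": "person", "index": 2},
    {"word": "▁Boca", "entity": "location", "index": 11},
    {"word": "▁Chica", "entity": "location", "index": 12},
]
print(group_entities(sample))
```

Each inner list returned by the pipeline can be passed to `group_entities` directly. The `transformers` token-classification pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that groups tokens itself; how well that grouping fits this model's labels has not been checked here.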