---
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
pipeline_tag: token-classification
tags:
- deberta-v3
datasets:
- DFKI-SLT/few-nerd
---
## DeBERTa for Named Entity Recognition
I fine-tuned a pretrained DeBERTa-v3-base on Few-NERD, an NER dataset that contains over 180k examples and over 4.6 million tokens.
The token labels are Person, Organisation, Location, Building, Event, Product, Art, and Miscellaneous.
## How to use the model
```python
from transformers import pipeline

def print_ner(sentences):
    """Merge subword tokens and print cleaned NER results."""
    for sentence in sentences:
        last_entity_type = sentence[0]['entity']
        last_index = sentence[0]['index']
        word = sentence[0]['word']
        for i, token in enumerate(sentence):
            if i > 0:
                if (token['entity'] == last_entity_type) and (token['index'] == last_index + 1):
                    # Same entity continues: append the subword piece
                    word = word + token['word']
                else:
                    # Entity ended: turn SentencePiece markers into spaces and print
                    word = word.replace('▁', ' ')
                    print(f"{word[1:]} {last_entity_type}")
                    word = token['word']
                last_entity_type = token['entity']
                last_index = token['index']
            if i == len(sentence) - 1:
                # Flush the final entity
                word = word.replace('▁', ' ')
                print(f"{word[1:]} {last_entity_type}")

pipe = pipeline(model='RashidNLP/NER-Deberta')
sentence = pipe(["Elon Musk will be at SpaceX's Starbase facility in Boca Chica for the orbital launch of starship next month"])
print_ner(sentence)
```
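The subword-merging step can also be sketched as a small standalone helper that returns `(text, label)` pairs instead of printing them. This is a hypothetical variant for illustration, not part of the model card's API, and the token dicts below are made-up values shaped like the token-classification pipeline's per-token output:

```python
def merge_subwords(tokens):
    """Group consecutive same-entity subword tokens into (text, label) pairs."""
    entities = []
    current = tokens[0]['word']
    label = tokens[0]['entity']
    last_idx = tokens[0]['index']
    for tok in tokens[1:]:
        if tok['entity'] == label and tok['index'] == last_idx + 1:
            current += tok['word']  # same entity continues: append subword piece
        else:
            # entity ended: convert SentencePiece markers to spaces and store
            entities.append((current.replace('▁', ' ').strip(), label))
            current, label = tok['word'], tok['entity']
        last_idx = tok['index']
    entities.append((current.replace('▁', ' ').strip(), label))
    return entities

# Hypothetical token dicts shaped like the pipeline's output
tokens = [
    {'entity': 'PER', 'index': 1, 'word': '▁Elon'},
    {'entity': 'PER', 'index': 2, 'word': '▁Musk'},
    {'entity': 'ORG', 'index': 6, 'word': '▁Space'},
    {'entity': 'ORG', 'index': 7, 'word': 'X'},
]
print(merge_subwords(tokens))  # [('Elon Musk', 'PER'), ('SpaceX', 'ORG')]
```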