MassMin committed on
Commit
80768d5
1 Parent(s): af96897

Update README.md

Files changed (1)
  1. README.md +32 -2
README.md CHANGED
@@ -5,7 +5,7 @@
5
  ---
6
 
7
  # XLM-RoBERTa Token Classification for Named Entity Recognition (NER)
8
- This model is a fine-tuned version of XLM-RoBERTa (xlm-roberta-base) for Named Entity Recognition (NER) tasks. It has been trained on the PAN-X subset of the XTREME dataset for four languages: German (de), French (fr), Italian (it), and English (en). The model identifies the following entity types:
9
 
10
  PER: Person names
11
  ORG: Organization names
@@ -118,7 +118,37 @@ The model's performance is evaluated using the F1 score for NER. The predictions
118
  [More Information Needed]
119
 
120
  ## Evaluation
121
-
122
  <!-- This section describes the evaluation protocols and provides the results. -->
123
 
124
  ### Testing Data, Factors & Metrics
 
5
  ---
6
 
7
  # XLM-RoBERTa Token Classification for Named Entity Recognition (NER)
8
+ This model is a fine-tuned version of XLM-RoBERTa (xlm-roberta-base) for Named Entity Recognition (NER). It was trained on the German (de) portion of the PAN-X subset of the XTREME dataset. The model identifies the following entity types:
9
 
10
  PER: Person names
11
  ORG: Organization names
 
118
  [More Information Needed]
119
 
120
  ## Evaluation
121
+ ('''import torch
+ from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
+ import pandas as pd
+
+ # Load the fine-tuned XLM-RoBERTa model and tokenizer from Hugging Face
+ model_checkpoint = "MassMin/xlm-roberta-base-finetuned-panx-de"  # Replace with your Hugging Face model name
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ # Load the tokenizer and model, and move the model to the selected device
+ tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
+ model = AutoModelForTokenClassification.from_pretrained(model_checkpoint).to(device)
+
+ # Create the NER pipeline (device=0 selects the first GPU, -1 selects the CPU)
+ ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, framework="pt", device=0 if torch.cuda.is_available() else -1)
+
+ # Define the helper function to use the NER pipeline
+ def tag_text_with_pipeline(text, ner_pipeline):
+     # Use the NER pipeline to get one prediction per (sub)token
+     results = ner_pipeline(text)
+
+     # Convert results to a DataFrame for easy viewing
+     df = pd.DataFrame(results)
+     df = df[['word', 'entity', 'score']]
+     df.columns = ['Tokens', 'Tags', 'Score']  # Rename columns for clarity
+     return df
+
+ # Example usage
+ text = "Jeff Dean works at Google in California."
+ result = tag_text_with_pipeline(text, ner_pipeline)
+ print(result)
+ ''')
152
  <!-- This section describes the evaluation protocols and provides the results. -->
153
 
154
  ### Testing Data, Factors & Metrics
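
The hunk header above notes that the model is evaluated using the F1 score for NER. The following is a minimal sketch of how an entity-level F1 could be computed, not necessarily the procedure used for this model. It assumes IOB2-style tags such as `B-PER` and `I-PER`, counts an entity as correct only when both its type and its exact span match, and ignores stray `I-` tags that do not continue an open entity. The helper names `extract_spans` and `entity_f1` are hypothetical and are not part of the README or of any library.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one sentence of IOB2 tags (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel flushes an entity that ends the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if not continues:
            if etype is not None:
                spans.add((etype, start, i))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged entity-level F1 over a list of tagged sentences."""
    true_pos = n_gold = n_pred = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = extract_spans(gold_tags)
        pred_spans = extract_spans(pred_tags)
        true_pos += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Illustrative tags for "Jeff Dean works at Google ." (made-up data, not model output)
gold = [["B-PER", "I-PER", "O", "O", "B-ORG", "O"]]
pred = [["B-PER", "I-PER", "O", "O", "O", "O"]]
print(f"{entity_f1(gold, pred):.2f}")  # prints 0.67
```

In this toy example the prediction finds the person but misses the organization, so precision is 1.0 and recall is 0.5, giving an F1 of about 0.67. For real evaluations, an established implementation such as the `seqeval` package is usually a safer choice than a hand-rolled scorer.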