Prikshit7766 committed
Commit 6739e16 · verified · 1 Parent(s): 2778ac5

Upload README.md with huggingface_hub

  1. README.md +106 -0
README.md ADDED
# Fine-Tuned BERT Model for Named Entity Recognition (NER) with Accelerate Library

This repository contains a fine-tuned BERT model for Named Entity Recognition (NER) tasks, trained on the [CoNLL 2003 dataset](https://huggingface.co/datasets/eriktks/conll2003) using the Hugging Face Accelerate library.

The dataset includes the following labels:
- `O`, `B-PER`, `I-PER`, `B-ORG`, `I-ORG`, `B-LOC`, `I-LOC`, `B-MISC`, `I-MISC`

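As a minimal sketch, the label list above can be turned into the `id2label` / `label2id` mappings that token-classification models conventionally carry in their config. The index order below simply follows the order in which the labels are listed here; that ordering is an assumption and should be checked against the dataset's `ner_tags` feature. The helper `entity_type` is a hypothetical name used only for illustration.

```python
from typing import Optional

# Label names in the order listed above (assumed to match the dataset's tag ids).
LABEL_NAMES = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG",
    "B-LOC", "I-LOC", "B-MISC", "I-MISC",
]

# Integer id -> label string, and the inverse mapping.
id2label = {i: name for i, name in enumerate(LABEL_NAMES)}
label2id = {name: i for i, name in id2label.items()}


def entity_type(label: str) -> Optional[str]:
    """Strip the B-/I- prefix; return None for the outside tag `O`."""
    return None if label == "O" else label.split("-", 1)[1]


print(id2label[3], label2id["I-LOC"], entity_type("B-MISC"))
```

The `B-` prefix marks the first token of an entity and `I-` marks a continuation, which is why each of the four entity types appears twice in the list.
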
## Model Training Details

### Training Arguments
- **Library**: Hugging Face Accelerate
- **Model Architecture**: `bert-base-cased` for token classification
- **Learning Rate**: `2e-5`
- **Number of Epochs**: `20`
- **Weight Decay**: `0.01`
- **Batch Size**: `8`
- **Evaluation Strategy**: `epoch`
- **Save Strategy**: `epoch`

*All other parameters were left at the Accelerate and Transformers library defaults.*

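The hyperparameters above can be collected in one place and used to estimate the length of the run, which is the number a learning-rate schedule would typically need. This is a sketch only: `TrainConfig` and `total_training_steps` are hypothetical names, the calculation assumes a single process with no gradient accumulation, and the training-set size of 14,041 sentences is the commonly reported figure for CoNLL-2003 rather than something stated in this README.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters listed in the Training Arguments section."""
    model_checkpoint: str = "bert-base-cased"
    learning_rate: float = 2e-5
    num_epochs: int = 20
    weight_decay: float = 0.01
    batch_size: int = 8


def total_training_steps(num_train_examples: int, cfg: TrainConfig) -> int:
    """Optimizer steps for the whole run (one process, no gradient accumulation)."""
    steps_per_epoch = math.ceil(num_train_examples / cfg.batch_size)
    return steps_per_epoch * cfg.num_epochs


cfg = TrainConfig()
# Assumption: 14,041 training sentences in CoNLL-2003.
print(total_training_steps(14_041, cfg))
```

With multiple devices or gradient accumulation the effective batch size grows and the step count shrinks accordingly.
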
---

## Evaluation Results

### Validation Set Performance
- **Overall Metrics**:
  - Precision: 95.17%
  - Recall: 93.87%
  - F1 Score: 94.52%
  - Accuracy: 98.62%

#### Per-Label Performance
| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| LOC         | 96.46%    | 96.51% | 96.49%   |
| MISC        | 90.78%    | 89.14% | 89.95%   |
| ORG         | 92.61%    | 90.26% | 91.42%   |
| PER         | 97.94%    | 96.32% | 97.12%   |

### Test Set Performance
- **Overall Metrics**:
  - Precision: 91.82%
  - Recall: 89.68%
  - F1 Score: 90.74%
  - Accuracy: 97.23%

#### Per-Label Performance
| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| LOC         | 92.99%    | 92.10% | 92.54%   |
| MISC        | 82.05%    | 75.00% | 78.37%   |
| ORG         | 90.67%    | 88.28% | 89.46%   |
| PER         | 96.04%    | 95.57% | 95.81%   |

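The F1 scores above are consistent with the usual definition of F1 as the harmonic mean of precision and recall. The short check below recomputes the overall F1 for both splits from the reported precision and recall. It is an illustrative sketch, not the evaluation code used for this model (the README does not say which metric implementation was used), and `f1_score` is a hypothetical helper name.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from the overall metrics.
reported = {
    "validation": (95.17, 93.87, 94.52),
    "test": (91.82, 89.68, 90.74),
}

for split, (p, r, f1) in reported.items():
    print(f"{split}: recomputed F1 = {f1_score(p, r):.2f}%, reported = {f1:.2f}%")
```

Because the inputs are already rounded to two decimals, recomputing a per-label F1 this way can differ from the table in the last digit.
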
---

## How to Use the Model

You can load the model directly from the Hugging Face Model Hub:

```python
from transformers import pipeline

# Replace with your specific model checkpoint
model_checkpoint = "Prikshit7766/bert-finetuned-ner-accelerate"
token_classifier = pipeline(
    "token-classification",
    model=model_checkpoint,
    # "simple" merges adjacent tokens of the same entity into one span
    aggregation_strategy="simple"
)

# Example usage
result = token_classifier("My name is Sylvain and I work at Hugging Face in Brooklyn.")
print(result)
```

### Example Output
```python
[
    {
        "entity_group": "PER",
        "score": 0.9999658,
        "word": "Sylvain",
        "start": 11,
        "end": 18
    },
    {
        "entity_group": "ORG",
        "score": 0.99996203,
        "word": "Hugging Face",
        "start": 33,
        "end": 45
    },
    {
        "entity_group": "LOC",
        "score": 0.9999542,
        "word": "Brooklyn",
        "start": 49,
        "end": 57
    }
]
```

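The `start` and `end` fields in the output above are character offsets into the input string, so entities can be recovered by slicing the original text. The sketch below hard-codes the example output so it runs without downloading the model. `group_by_type` and its `min_score` parameter are hypothetical names introduced for illustration.

```python
from collections import defaultdict

text = "My name is Sylvain and I work at Hugging Face in Brooklyn."

# Entities copied from the example output above.
entities = [
    {"entity_group": "PER", "score": 0.9999658, "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "score": 0.99996203, "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "score": 0.9999542, "word": "Brooklyn", "start": 49, "end": 57},
]


def group_by_type(text, entities, min_score=0.0):
    """Map each entity type to the surface strings sliced from `text`.

    Entities scoring below `min_score` are skipped.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(text[ent["start"]:ent["end"]])
    return dict(grouped)


print(group_by_type(text, entities))
# {'PER': ['Sylvain'], 'ORG': ['Hugging Face'], 'LOC': ['Brooklyn']}
```

Slicing by offset rather than reading `word` keeps the original casing and spacing of the input, which can matter when the tokenizer normalizes text.
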
---