---
language:
- es
license: apache-2.0
datasets:
- eriktks/conll2002
metrics:
- precision
- recall
- f1
- accuracy
pipeline_tag: token-classification
---

# Model Name: NER-finetuned-BETO

This is a BERT model fine-tuned for Named Entity Recognition (NER).

# Model Description

This is a BERT model fine-tuned for the Named Entity Recognition (NER) task on the CoNLL-2002 dataset.

First, the dataset is pre-processed so it can be fed to the model, using the 🤗 Transformers library and the BERT tokenizer. Fine-tuning then starts from *[bert-base-cased](https://huggingface.co/google-bert/bert-base-cased)* using the 🤗 *AutoModelForTokenClassification* class.

Finally, the model is trained and the metrics necessary to evaluate its performance (precision, recall, F1, and accuracy) are computed.

A summary of the executed tests can be found at: https://docs.google.com/spreadsheets/d/1lI7skNIvRurwq3LA5ps7JFK5TxToEx4s7Kaah3ezyQc/edit?usp=sharing

The model can be found at: https://huggingface.co/paulrojasg/bert-finetuned-ner-1

GitHub repository: https://github.com/paulrojasg/nlp_4th_workshop

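One subtlety of the pre-processing step above is worth showing: BERT's subword tokenizer splits words apart, so the word-level NER labels must be realigned to subword tokens before training. A minimal sketch of that alignment (a hypothetical helper, not the exact code used in this project):

```python
# Align word-level NER labels to subword tokens. `word_ids` is the
# per-token word index list that 🤗 fast tokenizers return
# (None for special tokens such as [CLS]/[SEP]); -100 marks positions
# the loss function should ignore.
def align_labels(word_ids, word_labels, label_all_subwords=False):
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:                  # special token or padding
            aligned.append(-100)
        elif wid != previous:            # first subword of a word
            aligned.append(word_labels[wid])
        else:                            # continuation subword
            aligned.append(word_labels[wid] if label_all_subwords else -100)
        previous = wid
    return aligned


# Example: word 1 is split into two subwords; only its first subword
# keeps the label by default.
print(align_labels([None, 0, 1, 1, 2, None], [3, 0, 7]))
# [-100, 3, 0, -100, 7, -100]
```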
# Training

## Training Details

- Epochs: 5
- Learning Rate: 2e-05
- Weight Decay: 0.01
- Batch Size (Train): 16
- Batch Size (Eval): 8

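These hyperparameters map directly onto the 🤗 `TrainingArguments` class; a hypothetical reconstruction of the configuration (the `output_dir` and the per-epoch evaluation strategy are assumptions, the rest comes from the list above):

```python
from transformers import TrainingArguments

# Config sketch: the hyperparameters from the Training Details list,
# expressed as 🤗 Trainer arguments.
training_args = TrainingArguments(
    output_dir="bert-finetuned-ner-1",   # assumed output directory
    evaluation_strategy="epoch",         # assumed: evaluate once per epoch
    num_train_epochs=5,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
)
```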
## Training Metrics

| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 Score | Accuracy |
|:-----:|:-------------:|:---------------:|:---------:|:------:|:--------:|:--------:|
| 1     | 0.0507        | 0.1354          | 0.8310    | 0.8518 | 0.8413   | 0.9700   |
| 2     | 0.0292        | 0.1598          | 0.8331    | 0.8433 | 0.8382   | 0.9684   |
| 3     | 0.0172        | 0.1565          | 0.8392    | 0.8550 | 0.8470   | 0.9705   |
| 4     | 0.0136        | 0.1812          | 0.8456    | 0.8534 | 0.8495   | 0.9698   |
| 5     | 0.0088        | 0.1861          | 0.8395    | 0.8543 | 0.8468   | 0.9699   |

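Precision, recall, and F1 in the table are entity-level: a predicted entity counts as correct only if both its span and its type match the gold annotation exactly (the convention used by `seqeval`). A self-contained sketch of that computation for a single IOB2-tagged sequence (hypothetical helpers, not the evaluation code actually run here):

```python
def entity_spans(tags):
    """Collect (label, start, end) entity spans from an IOB2 tag list.

    Stray I- tags without a preceding B- are ignored (strict IOB2).
    """
    spans = []
    start = label = None
    for i, tag in enumerate(tags):
        # Close the open entity when it ends: on "O", on a new B-,
        # or on an I- of a different type.
        if label is not None and (
                tag == "O" or not tag.startswith("I-") or tag[2:] != label):
            spans.append((label, start, i))
            start = label = None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


def prf1(true_tags, pred_tags):
    """Entity-level precision, recall, and F1 for one tagged sequence."""
    gold = set(entity_spans(true_tags))
    pred = set(entity_spans(pred_tags))
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# One gold entity matched, one mistyped: P = R = F1 = 0.5.
print(prf1(["B-PER", "I-PER", "O", "B-LOC"],
           ["B-PER", "I-PER", "O", "B-ORG"]))
```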
# Authors

Made by:

- Paul Rodrigo Rojas Guerrero
- Jose Luis Hincapie Bucheli
- Sebastián Idrobo Avirama

With help from:

- [Raúl Ernesto Gutiérrez](https://huggingface.co/raulgdp)