---
language:
- es
license: apache-2.0
datasets:
- eriktks/conll2002
metrics:
- precision
- recall
- f1
- accuracy
pipeline_tag: token-classification
---
# NER-finetuned-BETO

This is a BERT model fine-tuned for Named Entity Recognition (NER).
## Model Description

This is a BERT model fine-tuned for the Named Entity Recognition (NER) task on the CoNLL-2002 dataset.

First, the dataset is pre-processed so it can be fed to the model; this is done with the 🤗 Transformers library and the BERT tokenizer. The model is then fine-tuned from BETO using the 🤗 `AutoModelForTokenClassification` class.

Finally, the model is trained and the metrics needed to evaluate its performance (Precision, Recall, F1, and Accuracy) are computed.
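A key part of the pre-processing step described above is aligning the word-level NER labels with BERT's wordpiece tokens. The sketch below illustrates the usual approach in pure Python; the function name is illustrative, and the `word_ids` list stands in for the output of `tokenizer(..., is_split_into_words=True).word_ids()`:

```python
def align_labels_with_tokens(labels, word_ids):
    """Map word-level NER labels onto wordpiece tokens.

    Special tokens ([CLS], [SEP]) get -100 so the loss ignores them;
    continuation pieces of a split word also get -100.
    """
    aligned = []
    previous_word = None
    for word_id in word_ids:
        if word_id is None:              # special token
            aligned.append(-100)
        elif word_id != previous_word:   # first piece of a word
            aligned.append(labels[word_id])
        else:                            # continuation piece
            aligned.append(-100)
        previous_word = word_id
    return aligned

# Example: "Gabriel García Márquez" labeled B-PER (3), I-PER (4), I-PER (4),
# with "Márquez" split into two wordpieces by the tokenizer:
labels = [3, 4, 4]
word_ids = [None, 0, 1, 2, 2, None]      # [CLS] Gabriel García Már ##quez [SEP]
print(align_labels_with_tokens(labels, word_ids))
# -> [-100, 3, 4, 4, -100, -100]
```

Ignored positions are set to -100 because that is the index PyTorch's cross-entropy loss skips by default.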
A summary of the executed tests can be found at: https://docs.google.com/spreadsheets/d/1lI7skNIvRurwq3LA5ps7JFK5TxToEx4s7Kaah3ezyQc/edit?usp=sharing

The model can be found at: https://huggingface.co/Seb00927/NER-finetuned-BETO

GitHub repository: https://github.com/paulrojasg/nlp_4th_workshop
## Training

### Training Details
- Epochs: 10
- Learning Rate: 2e-05
- Weight Decay: 0.01
- Batch Size (Train): 16
- Batch Size (Eval): 8
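The hyperparameters above map directly onto 🤗 `TrainingArguments`. A minimal sketch, assuming the standard `Trainer` workflow; `output_dir` and the per-epoch evaluation strategy are assumptions, not taken from the actual training script:

```python
from transformers import TrainingArguments

# Hyperparameters from the Training Details list; output_dir and
# evaluation_strategy are illustrative assumptions.
training_args = TrainingArguments(
    output_dir="ner-finetuned-beto",
    num_train_epochs=10,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    evaluation_strategy="epoch",
)
```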
### Training Metrics

| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 Score | Accuracy |
|---|---|---|---|---|---|---|
| 1 | 0.0104 | 0.1915 | 0.8359 | 0.8568 | 0.8462 | 0.9701 |
| 2 | 0.0101 | 0.2187 | 0.8226 | 0.8387 | 0.8306 | 0.9676 |
| 3 | 0.0066 | 0.2085 | 0.8551 | 0.8637 | 0.8594 | 0.9699 |
| 4 | 0.0069 | 0.2139 | 0.8342 | 0.8431 | 0.8386 | 0.9698 |
| 5 | 0.0070 | 0.2110 | 0.8480 | 0.8536 | 0.8508 | 0.9708 |
| 6 | 0.0060 | 0.2214 | 0.8378 | 0.8497 | 0.8437 | 0.9703 |
| 7 | 0.0042 | 0.2284 | 0.8437 | 0.8596 | 0.8516 | 0.9704 |
| 8 | 0.0034 | 0.2344 | 0.8417 | 0.8566 | 0.8491 | 0.9702 |
| 9 | 0.0026 | 0.2385 | 0.8400 | 0.8580 | 0.8489 | 0.9698 |
| 10 | 0.0023 | 0.2412 | 0.8460 | 0.8610 | 0.8534 | 0.9704 |
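Precision, Recall, and F1 in the table above are entity-level metrics: a predicted entity counts as correct only if its type and span both match the gold annotation. NER evaluations of this kind typically use the `seqeval` library; the self-contained sketch below shows the idea in pure Python (function names are illustrative):

```python
def extract_entities(tags):
    """Return a set of (type, start, end) spans from BIO tags."""
    entities, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):      # sentinel closes any open span
        if tag.startswith("B-") or tag == "O" or (
            tag.startswith("I-") and etype != tag[2:]
        ):
            if start is not None:
                entities.add((etype, start, i))
            start, etype = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return entities

def precision_recall_f1(gold_tags, pred_tags):
    """Entity-level precision, recall, and F1 over exact span matches."""
    gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
    tp = len(gold & pred)                        # exact (type, span) matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]         # one entity mistyped
print(precision_recall_f1(gold, pred))
# -> (0.5, 0.5, 0.5)
```

Accuracy, by contrast, is computed per token, which is why it sits much higher than the span-based scores.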
## Authors

Made by:
- Paul Rodrigo Rojas Guerrero
- Jose Luis Hincapie Bucheli
- Sebastián Idrobo Avirama
With help from: