RobBERTBestModelOct13

This model is a fine-tuned version of pdelobelle/robbert-v2-dutch-base (https://huggingface.co/pdelobelle/robbert-v2-dutch-base) on the annotated part of the Moroccorp. It achieves the following results on the evaluation set:

eval_loss: 0.3695
eval_precisions: 0.8647
eval_recall: 0.8151
eval_f-measure: 0.8341
eval_accuracy: 0.9448
eval_runtime: 9.7585
eval_samples_per_second: 82.698
eval_steps_per_second: 5.226
step: 0
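
The reported f-measure (0.8341) is slightly lower than the harmonic mean of the reported precision and recall (roughly 0.839). This would be consistent with macro-averaging, where the scores are computed per label and then averaged, although this card does not state which averaging was used. The sketch below illustrates that effect on invented toy data; the function name `macro_prf` and the example labels are assumptions for illustration, not the evaluation code behind the numbers above.

```python
from collections import Counter


def macro_prf(gold, pred):
    """Return macro-averaged (precision, recall, f1) over all labels seen."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed the gold label g

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Invented toy data, one label per word (not taken from the Moroccorp).
gold = ["NL", "NL", "NL", "MOR", "ENG", "ENG"]
pred = ["NL", "NL", "MOR", "MOR", "ENG", "NL"]

precision, recall, f1 = macro_prf(gold, pred)

# Harmonic mean of the already-averaged precision and recall.
harmonic = 2 * precision * recall / (precision + recall)

print(f"macro precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
print(f"harmonic mean of macro precision and recall={harmonic:.4f}")
```

On this toy data the macro-averaged f1 is lower than the harmonic mean of the macro-averaged precision and recall, the same direction as the gap in the reported results.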

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

Both training and evaluation data are sampled from the Moroccorp, a dataset of chat conversations from maroc.nl, an internet forum for Moroccan-Dutch people. The dataset is labeled at the word level with labels for its three most common languages: Dutch (NL), English (ENG), and Moroccan languages (MOR). Additional labels mark named entities (NAME), language-independent utterances (NON), and words from other languages (OTH).
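
Because the labels are word-level while the model operates on subword tokens, the labels have to be mapped onto tokens before training. A minimal sketch of one common convention follows: the first subword of each word carries the label and the remaining subwords and special tokens are ignored with -100. The label order, the example sentence, its labels, and the `word_ids` list are all assumptions for illustration; in practice `word_ids` would come from a fast tokenizer's `word_ids()` method, and the convention actually used for this model is not stated in this card.

```python
# The six labels from the dataset description; the id order is an assumption.
LABELS = ["NL", "ENG", "MOR", "NAME", "NON", "OTH"]
label2id = {label: i for i, label in enumerate(LABELS)}
id2label = {i: label for label, i in label2id.items()}


def align_labels(word_labels, word_ids, ignore_index=-100):
    """Map word-level labels onto subword tokens.

    word_ids[i] is the index of the word that token i belongs to,
    or None for special tokens. Only the first subword of each word
    receives the label id; everything else receives ignore_index.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            aligned.append(ignore_index)  # special token
        elif wid != previous:
            aligned.append(label2id[word_labels[wid]])  # first subword
        else:
            aligned.append(ignore_index)  # continuation subword
        previous = wid
    return aligned


# Hypothetical example: five words, the first split into two subwords.
words = ["salam", "hoe", "gaat", "het", ":)"]
word_labels = ["MOR", "NL", "NL", "NL", "NON"]
word_ids = [None, 0, 0, 1, 2, 3, 4, None]

token_labels = align_labels(word_labels, word_ids)
print(token_labels)
```

At inference time the same mapping can be read in reverse: the prediction for the first subword of a word is taken as the label for the whole word, using `id2label` to recover the label name.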

Training procedure

Here is the code to run this model: https://colab.research.google.com/drive/1h_HiQkoo_yALTvHtiWleF9MMvCmPqmXk?usp=sharing

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 7.5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 14
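
As an illustration of what the linear scheduler does with the learning rate above, the following sketch decays it linearly from 7.5e-05 to zero over the course of training. The warmup length (0) and the number of steps per epoch (100) are assumptions, since neither is listed in this card; `linear_lr` is a hypothetical helper, not part of the Transformers API.

```python
# Hyperparameters as listed above.
hyperparameters = {
    "learning_rate": 7.5e-05,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_epochs": 14,
}


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Learning rate at a given step under linear warmup and linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 100 optimizer steps per epoch (not stated in this card).
steps_per_epoch = 100
total_steps = steps_per_epoch * hyperparameters["num_epochs"]
base_lr = hyperparameters["learning_rate"]

# Learning rate at the start of each epoch and at the very end of training.
for epoch in range(hyperparameters["num_epochs"] + 1):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch:2d}  lr = {linear_lr(step, total_steps, base_lr):.3e}")
```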

Framework versions

Transformers 4.34.0
Pytorch 2.0.1+cu118
Datasets 2.14.5
Tokenizers 0.14.1

Tommert25/RobBERTBestModelOct13