Model Card for CardioNER_MedRoBERTa.nl_multilabel

This is a MedRoBERTa.nl base model fine-tuned for span classification. This specific model is the average of the best checkpoints per fold over a ten-fold cross-validation. For this model we used the IOB tagging scheme, which facilitates the aggregation of predictions over sequences.
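To illustrate why IOB tags make aggregation straightforward, here is a minimal sketch that merges per-token tags into spans. It assumes exactly one tag per token and uses token indices rather than character offsets; the function name `iob_to_spans` is hypothetical and is not part of the CardioNER code.

```python
from typing import List, Tuple

def iob_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Merge per-token IOB tags into (label, start, end) spans; end is exclusive."""
    spans = []
    label, start = None, None
    for i, tag in enumerate(tags):
        prefix, _, name = tag.partition("-")
        # Close the open span on "O", on a new "B-", or when the "I-" label changes.
        if label is not None and (prefix != "I" or name != label):
            spans.append((label, start, i))
            label, start = None, None
        # Open a span on "B-"; a stray "I-" with no open span also opens one (lenient).
        if prefix in ("B", "I") and label is None:
            label, start = name, i
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans

tags = ["O", "B-DISEASE", "I-DISEASE", "O", "B-MEDICATION", "B-SYMPTOM", "I-SYMPTOM"]
print(iob_to_spans(tags))
# [('DISEASE', 1, 3), ('MEDICATION', 4, 5), ('SYMPTOM', 5, 7)]
```

Treating a stray "I-" tag as the start of a span is a lenient design choice; a stricter variant could discard such tokens instead.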

For chunking we used paragraph-based chunking, and we assumed the maximum context length of the base model, i.e. 512 tokens.
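One possible reading of paragraph-based chunking is sketched below: consecutive paragraphs are packed greedily into chunks that stay within the token budget. This is an assumption about the procedure, not the project's actual preprocessing code. The function `chunk_paragraphs` and the whitespace token counter are hypothetical; a real setup would count tokens with the model's tokenizer and reserve room for special tokens.

```python
from typing import Callable, List

def chunk_paragraphs(text: str,
                     count_tokens: Callable[[str], int],
                     max_tokens: int = 512) -> List[str]:
    """Greedily pack consecutive paragraphs into chunks of at most max_tokens tokens."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, current_len = [], [], 0
    for para in paragraphs:
        n = count_tokens(para)
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and current_len + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks

def whitespace_count(s: str) -> int:
    # Stand-in token counter (assumption): splits on whitespace only.
    return len(s.split())

print(chunk_paragraphs("a b c\n\nd e\n\nf g h i", whitespace_count, max_tokens=5))
# ['a b c\n\nd e', 'f g h i']
```

Note that a single paragraph longer than the budget is kept as its own oversized chunk in this sketch and would still need truncation or further splitting.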

Expected input and output

The input should be a string of Dutch clinical cardiology text.

CardioNER_MedRoBERTa.nl_multilabel is a multiclass span classification model. The classes that can be predicted are disease, medication, procedure and symptom.
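Combined with IOB tagging, the four classes give nine token-level labels, which matches the label names in the metrics table. The sketch below builds such a label set. The id order is an assumption for illustration only; the authoritative mapping is the one stored in the model's configuration.

```python
classes = ["DISEASE", "MEDICATION", "PROCEDURE", "SYMPTOM"]

# "O" for tokens outside any span, plus a B- and an I- label per class.
labels = ["O"] + [f"{prefix}-{name}" for name in classes for prefix in ("B", "I")]

# Hypothetical id assignment; check the model config for the real mapping.
label2id = {label: i for i, label in enumerate(labels)}
id2label = {i: label for label, i in label2id.items()}

print(len(labels))  # 9
print(labels[:3])   # ['O', 'B-DISEASE', 'I-DISEASE']
```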

Extracting span classifications from CardioNER_MedRoBERTa.nl_multilabel

The following script converts a string of <512 tokens to a list of span predictions.

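from transformers import pipeline

# Hub id or local path of the model; the id below is an assumption, adjust as needed.
model = "UMCU/CardioNER_MedRoBERTa.nl_multilabel"
# Placeholder input: any Dutch clinical text of fewer than 512 tokens.
SOME_TEXT = "Patiënt met atriumfibrilleren, start metoprolol."

le_pipe = pipeline('ner',
                   model=model,
                   tokenizer=model,
                   aggregation_strategy="simple",
                   device=-1)  # -1 runs on CPU

named_ents = le_pipe(SOME_TEXT)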

To process a string of arbitrary length, you can split it into sentences or paragraphs using e.g. pysbd or spaCy (sentencizer) and iterate over the resulting list with the span-classification pipeline.
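A minimal sketch of that loop is given below. As a stand-in for pysbd or spaCy it simply splits on newlines, and it shifts the character offsets of each prediction back to positions in the full text. It assumes the pipeline returns dictionaries with integer `start` and `end` keys, as the aggregated `ner` pipeline does with a fast tokenizer. The names `predict_long_text` and `fake_ner` are hypothetical; `fake_ner` only stands in for the real pipeline so that the example runs without downloading a model.

```python
import re
from typing import Callable, Dict, List

def predict_long_text(text: str, ner_fn: Callable[[str], List[Dict]]) -> List[Dict]:
    """Run ner_fn on each non-empty line and shift offsets back to the full text."""
    entities = []
    for match in re.finditer(r"[^\n]+", text):
        offset = match.start()  # position of this segment in the full text
        for ent in ner_fn(match.group()):
            shifted = dict(ent)
            shifted["start"] = ent["start"] + offset
            shifted["end"] = ent["end"] + offset
            entities.append(shifted)
    return entities

def fake_ner(segment: str) -> List[Dict]:
    # Stub predictor for illustration: tags the literal word "metoprolol".
    i = segment.find("metoprolol")
    if i < 0:
        return []
    return [{"entity_group": "MEDICATION", "start": i, "end": i + len("metoprolol")}]

text = "Geen klachten.\nStart metoprolol."
for ent in predict_long_text(text, fake_ner):
    print(ent["entity_group"], text[ent["start"]:ent["end"]])
# MEDICATION metoprolol
```

With the pipeline defined above, the stub would be replaced by the pipeline object itself, e.g. `predict_long_text(some_long_text, le_pipe)`, where `some_long_text` is your own input.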

Data description

CardioCCC: a manually annotated parallel-language corpus for the clinical cardiology domain.

Over a 10-fold cross-validation, the multilabel metrics are:

Metric	Mean	Median	Stdev
---	---	---	---
eval_f1_B-DISEASE	0.782	0.777	0.024
eval_f1_B-MEDICATION	0.898	0.905	0.036
eval_f1_B-PROCEDURE	0.788	0.79	0.027
eval_f1_B-SYMPTOM	0.73	0.73	0.018
eval_f1_I-DISEASE	0.776	0.781	0.022
eval_f1_I-MEDICATION	0.8	0.803	0.086
eval_f1_I-PROCEDURE	0.759	0.757	0.018
eval_f1_I-SYMPTOM	0.725	0.723	0.017
eval_f1_O	0.935	0.936	0.005
eval_f1_macro	0.799	0.799	0.018
eval_f1_micro	0.884	0.886	0.008
eval_loss	0.095	0.092	0.01
eval_precision_B-DISEASE	0.784	0.774	0.029
eval_precision_B-MEDICATION	0.907	0.917	0.035
eval_precision_B-PROCEDURE	0.791	0.795	0.031
eval_precision_B-SYMPTOM	0.721	0.72	0.017
eval_precision_I-DISEASE	0.79	0.79	0.025
eval_precision_I-MEDICATION	0.835	0.863	0.075
eval_precision_I-PROCEDURE	0.784	0.779	0.023
eval_precision_I-SYMPTOM	0.727	0.72	0.021
eval_precision_O	0.935	0.938	0.009
eval_precision_macro	0.808	0.81	0.015
eval_precision_micro	0.888	0.889	0.008
eval_rauc_macro	0.883	0.885	0.012
eval_rauc_micro	0.933	0.934	0.005
eval_recall_B-DISEASE	0.781	0.785	0.025
eval_recall_B-MEDICATION	0.889	0.893	0.039
eval_recall_B-PROCEDURE	0.785	0.783	0.025
eval_recall_B-SYMPTOM	0.739	0.74	0.023
eval_recall_I-DISEASE	0.763	0.774	0.028
eval_recall_I-MEDICATION	0.77	0.767	0.103
eval_recall_I-PROCEDURE	0.735	0.744	0.025
eval_recall_I-SYMPTOM	0.724	0.724	0.032
eval_recall_O	0.934	0.934	0.004
eval_recall_macro	0.791	0.795	0.022
eval_recall_micro	0.88	0.883	0.009
eval_roc_auc_B-DISEASE	0.888	0.889	0.013
eval_roc_auc_B-MEDICATION	0.944	0.946	0.019
eval_roc_auc_B-PROCEDURE	0.89	0.889	0.013
eval_roc_auc_B-SYMPTOM	0.866	0.867	0.011
eval_roc_auc_I-DISEASE	0.873	0.879	0.014
eval_roc_auc_I-MEDICATION	0.884	0.883	0.052
eval_roc_auc_I-PROCEDURE	0.862	0.866	0.013
eval_roc_auc_I-SYMPTOM	0.85	0.851	0.016
eval_roc_auc_O	0.887	0.889	0.008
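As a reading aid for the table, the macro-averaged scores appear to be the unweighted mean over all nine labels, including O. The short check below recomputes the mean macro F1 from the per-label mean F1 values listed above. This is an observation on the reported numbers, not a description of the evaluation code.

```python
from statistics import mean

# Mean F1 per label, copied from the "Mean" column of the table.
per_label_f1 = {
    "B-DISEASE": 0.782, "B-MEDICATION": 0.898, "B-PROCEDURE": 0.788, "B-SYMPTOM": 0.73,
    "I-DISEASE": 0.776, "I-MEDICATION": 0.8, "I-PROCEDURE": 0.759, "I-SYMPTOM": 0.725,
    "O": 0.935,
}

# Macro average: every label counts equally, regardless of how often it occurs.
macro_f1 = mean(per_label_f1.values())
print(round(macro_f1, 3))  # 0.799, matching eval_f1_macro
```

The micro-averaged scores are higher because they are computed over all tokens, so the frequent and well-predicted O label dominates.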

Acknowledgement

This is part of the DT4H project.

DOI and reference

For more details about training, evaluation and other scripts, see the CardioNER GitHub repository. For more background, see the DataTools4Heart Hugging Face page or website.

UMCU/cardioner_medroberta.nl_multilabel