---
license: mit
inference:
  parameters:
    aggregation_strategy: "average"
language:
- pt
pipeline_tag: token-classification
tags:
- medialbertina-ptpt
- deberta
- portuguese
- european portuguese
- medical
- clinical
- healthcare
- NER
- Named Entity Recognition
- IE
- Information Extraction
widget:
- text: Durante a cirurgia ortopédica para corrigir a fratura no tornozelo, os sinais vitais do utente, incluindo a pressão arterial, com leitura de 120/87 mmHg, a frequência cardíaca, de 80 batimentos por minuto, e SpO2 a 98%, foram monitorizados. Após a cirurgia o utente apresentava dor intensa no local e inchaço no tornozelo, mas os resultados dos exames de radiografia revelaram uma recuperação satisfatória.
  example_title: Example 1
- text: Durante o procedimento endoscópico, foram encontrados pólipos no cólon do paciente.
  example_title: Example 2
- text: Foi recomendada aspirina de 500mg a cada 4 horas, durante 3 dias.
  example_title: Example 3
- text: Após as sessões de fisioterapia o paciente apresenta recuperação de mobilidade.
  example_title: Example 4
- text: O paciente está em Quimioterapia com uma dosagem específica de Cisplatina para o tratamento do cancro do pulmão.
  example_title: Example 5
- text: Monitorização da Freq. cardíaca com 90 bpm. P Arterial de 120-80 mmHg
  example_title: Example 6
- text: A ressonância magnética da utente revelou uma ruptura no menisco lateral do joelho.
  example_title: Example 7
- text: A paciente foi diagnosticada com esclerose múltipla e iniciou terapia com imunomoduladores.
---

# MediAlbertina

The first publicly available medical language models trained with real European Portuguese data.

MediAlbertina is a family of DeBERTaV2-based encoders from the BERT family, obtained by continuing the pre-training of [PORTULAN's Albertina](https://huggingface.co/PORTULAN) models on Electronic Medical Records (EMRs) shared by Portugal's largest public hospital.

Like their predecessors, MediAlbertina models are distributed under the [MIT license](https://huggingface.co/portugueseNLP/medialbertina_pt-pt_900m/blob/main/LICENSE).

# Model Description

MediAlbertina PT-PT 900M NER was created by fine-tuning [MediAlbertina PT-PT 900M](https://huggingface.co/portugueseNLP/medialbertina_pt-pt_900m) on real European Portuguese EMRs that have been hand-annotated for the following entities:

- Diagnostico
- Sintoma
- Medicamento
- Dosagem
- ProcedimentoMedico
- SinalVital
- Resultado
- Progresso

MediAlbertina PT-PT 900M NER outperformed the same fine-tuning applied to a non-medical Portuguese language model, demonstrating the effectiveness of this domain adaptation and its potential for medical AI in Portugal.

| Model | NER single-model (F1-score) | NER multi-models (F1-score) | Assertion Status (F1-score) |
|-------------------------|:----------------:|:----------------:|:----------------:|
| albertina-900m-portuguese-ptpt-encoder | 0.813 | 0.811 | 0.687 |
| **medialbertina_pt-pt_900m** | **0.832** | **0.848** | **0.755** |
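
The card does not say how the F1-scores above were computed. As a rough illustration of the metric only, the sketch below assumes exact-match, entity-level scoring, where a prediction counts as correct when its start offset, end offset and label all match a gold annotation. The function name `entity_f1` and the toy annotations are hypothetical and are not taken from the MediAlbertina evaluation.

```python
def entity_f1(gold, predicted):
    """Exact-match, entity-level precision, recall and F1.

    Each entity is a (start, end, label) tuple. A prediction is a true
    positive only if an identical tuple exists in the gold annotations.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy annotations: offsets and labels are made up for illustration.
gold = [(0, 5, "Sintoma"), (10, 18, "Medicamento"), (20, 25, "Dosagem")]
predicted = [(0, 5, "Sintoma"), (10, 18, "Diagnostico")]

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Only one of the two predictions matches a gold entity exactly, so the second prediction (right span, wrong label) counts against precision, and the two unmatched gold entities count against recall.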

## Data

MediAlbertina PT-PT 900M NER was fine-tuned on more than 10k hand-annotated entities from more than a thousand fully anonymized medical sentences from Portugal's largest public hospital. This data was acquired under the framework of the [FCT project DSAIPA/AI/0122/2020 AIMHealth-Mobile Applications Based on Artificial Intelligence](https://ciencia.iscte-iul.pt/projects/aplicacoes-moveis-baseadas-em-inteligencia-artificial-para-resposta-de-saude-publica/1567).

## How to use

```python
from transformers import pipeline

# Load the fine-tuned NER model; aggregation groups sub-word tokens into entity spans.
ner_pipeline = pipeline('ner', model='portugueseNLP/medialbertina_pt-pt_900m_NER', aggregation_strategy='average')

sentence = 'Durante o procedimento endoscópico, foram encontrados pólipos no cólon do paciente.'
entities = ner_pipeline(sentence)

# Print each predicted entity label with the text span it covers.
for entity in entities:
    print(f"{entity['entity_group']} - {sentence[entity['start']:entity['end']]}")
```
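
With an aggregation strategy set, the pipeline returns a list of dictionaries containing `entity_group`, `score`, `start` and `end` keys (among others). The snippet below is a minimal sketch of one way to post-process that list: it groups the predicted spans by label and drops low-confidence predictions. The helper `group_entities`, its `min_score` parameter, and the sample entities are hypothetical; the sample is a hand-written placeholder in the pipeline's output format, not real model output, so that the example runs without downloading the model.

```python
from collections import defaultdict


def group_entities(entities, sentence, min_score=0.0):
    """Map each entity label to the list of text spans predicted for it."""
    grouped = defaultdict(list)
    for entity in entities:
        # Skip predictions below the confidence threshold.
        if entity["score"] >= min_score:
            span = sentence[entity["start"]:entity["end"]]
            grouped[entity["entity_group"]].append(span)
    return dict(grouped)


sentence = "Durante o procedimento endoscópico, foram encontrados pólipos no cólon do paciente."

# Hand-written placeholder entities (labels and scores are made up, NOT model output).
sample_entities = [
    {"entity_group": "ProcedimentoMedico", "score": 0.99, "start": 10, "end": 34},
    {"entity_group": "Resultado", "score": 0.40, "start": 54, "end": 61},
]

print(group_entities(sample_entities, sentence, min_score=0.5))
```

In practice, `sample_entities` would be replaced by the output of `ner_pipeline(sentence)`, and `min_score` would be tuned on held-out data rather than fixed in advance.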

## Citation

MediAlbertina is developed by a joint team from [ISCTE-IUL](https://www.iscte-iul.pt/), Portugal, and [Select Data](https://selectdata.com/), CA, USA. For a fully detailed description, check the respective publication:

```latex
Publication in progress. The reference will be added soon.
```

Please use the above canonical reference when using or citing this model.