---
license: mit
datasets:
- brwac
- carolina-c4ai/corpus-carolina
language:
- pt
---

# DeBERTinha XSmall (aka "debertinha-ptbr-xsmall")

## NOTE

We have received feedback from users who got poor results on unbalanced datasets. A more robust training setup, such as scaling the loss and adding weight decay (1e-5 to 1e-3), seems to fix this.

Please refer to [this notebook](https://colab.research.google.com/drive/1mYsAk6RgzWsSGmRzcE4mV-UqM9V7_Jes?usp=sharing) to see how performance on unbalanced datasets can be improved.

If you have any problems using the model, please contact us.

Thanks!

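As a rough illustration of the advice above, the sketch below reads "scaling the loss" as weighting each class by its inverse frequency and pairs that with AdamW weight decay. This reading is an assumption and is not necessarily what the linked notebook does. The helper names `balanced_class_weights` and `build_loss_and_optimizer`, the learning rate, and the default weight decay of `1e-4` are illustrative choices, not values taken from the authors.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Uses w_c = N / (K * n_c), where N is the number of examples, K the number
    of classes and n_c the number of examples of class c. Labels are assumed
    to be integers in the range 0..K-1.
    """
    counts = Counter(labels)
    num_classes = max(counts) + 1
    total = len(labels)
    # Classes that never occur in `labels` get weight 0.0.
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def build_loss_and_optimizer(model, labels, lr=5e-5, weight_decay=1e-4):
    """Build a class-weighted loss and an AdamW optimizer (hypothetical helper).

    `lr` is an assumed value; `weight_decay` is picked from the 1e-5 to 1e-3
    range mentioned in the note.
    """
    import torch  # imported here so the helper above works without torch

    weights = torch.tensor(balanced_class_weights(labels), dtype=torch.float)
    # Move `weights` to the same device as the logits before training.
    loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
    optimizer = torch.optim.AdamW(
        model.parameters(), lr=lr, weight_decay=weight_decay
    )
    return loss_fn, optimizer
```

With three negative examples and one positive, `balanced_class_weights([0, 0, 0, 1])` gives the minority class three times the weight of the majority class, so errors on rare labels contribute more to the loss.
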
## Introduction

DeBERTinha is a pretrained DeBERTa model for Brazilian Portuguese.

## Available models

| Model | Arch. | #Params |
| ---------------------------------------- | ---------- | ------- |
| `sagui-nlp/debertinha-ptbr-xsmall` | DeBERTa-V3-Xsmall | 40M |

## Usage

```python
from transformers import AutoModel, AutoModelForPreTraining, AutoTokenizer

# Load the model with its pretraining head, plus the matching tokenizer
model = AutoModelForPreTraining.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
```

### For embeddings

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the bare encoder (no pretraining head) and its tokenizer
tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
model = AutoModel.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
input_ids = tokenizer.encode('Tinha uma pedra no meio do caminho.', return_tensors='pt')

with torch.no_grad():
    outs = model(input_ids)
    encoded = outs.last_hidden_state[0, 0]  # Take [CLS] special token representation
```

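The snippet above embeds a single sentence using the `[CLS]` position. When embedding several sentences at once, a common alternative is to average the token vectors while ignoring padding. The sketch below shows that mean pooling approach. It is not something the authors prescribe, and `mean_pool` and `embed` are hypothetical helper names.

```python
def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors over the sequence axis, skipping padded positions."""
    # (batch, seq) -> (batch, seq, 1), in the same dtype as the hidden states
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Guard against division by zero for rows with no real tokens
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


def embed(texts, model_name='sagui-nlp/debertinha-ptbr-xsmall'):
    """Return one mean-pooled vector per input text (hypothetical helper)."""
    # Heavy imports are kept local so `mean_pool` can be reused on its own
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    # Pad to the longest text in the batch; the attention mask marks real tokens
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
    with torch.no_grad():
        outs = model(**batch)
    return mean_pool(outs.last_hidden_state, batch['attention_mask'])
```

Calling `embed(['Tinha uma pedra no meio do caminho.', 'No meio do caminho tinha uma pedra.'])` returns a tensor with one row per sentence. Whether mean pooling or the `[CLS]` vector works better depends on the downstream task, so it is worth trying both.
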
## Citation

If you use our work, please cite:

```
@misc{campiotti2023debertinha,
      title={DeBERTinha: A Multistep Approach to Adapt DebertaV3 XSmall for Brazilian Portuguese Natural Language Processing Task},
      author={Israel Campiotti and Matheus Rodrigues and Yuri Albuquerque and Rafael Azevedo and Alyson Andrade},
      year={2023},
      eprint={2309.16844},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```