|
--- |
|
license: apache-2.0 |
|
language: |
|
- nl |
|
metrics: |
|
- f1 |
|
- exact_match |
|
library_name: transformers |
|
tags: |
|
- dutch |
|
- restaurant |
|
- mt5 |
|
--- |
|
|
|
# Dutch-Restaurant-mT5-Small |
|
|
|
|
|
The Dutch-Restaurant-mT5-Small model was introduced in [A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis (SIGIR'23)](https://doi.org/10.1145/3539618.3591940) by Zengzhi Wang, Qiming Xie, and Rui Xia. |
|
|
|
More details are available in the [GitHub repository: FS-ABSA](https://github.com/nustm/fs-absa) and the [SIGIR'23 paper](https://doi.org/10.1145/3539618.3591940).
|
|
|
|
|
# Model Description |
|
|
|
To bridge the domain (and language) gap between general pre-training and the task of interest in a specific domain (i.e., `restaurant` in this repo), we conducted *domain-adaptive pre-training*,
i.e., we continued pre-training the language model (mT5-small) on an unlabeled corpus from the domain (and language) of interest with the *text-infilling objective*
(corruption rate of 15% and average span length of 1). We collected 100k relevant unlabeled restaurant reviews from Yelp and then translated them into Dutch with the DeepL translator.
For pre-training, we employed the [Adafactor](https://arxiv.org/abs/1804.04235) optimizer with a batch size of 16 and a constant learning rate of 1e-4.
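
As a rough illustration of this setup, the sketch below continues pre-training `google/mt5-small` with a simplified text-infilling objective (single tokens masked with sentinel tokens at a ~15% corruption rate) and an Adafactor optimizer at a constant learning rate of 1e-4. The review list, the corruption routine, and the padding logic are illustrative assumptions, not the original training code.

```python
# A minimal sketch of the domain-adaptive pre-training recipe described above.
# Assumptions (not from the original code): reviews live in a plain Python list,
# and span corruption is simplified to masking single tokens (average span
# length 1) with mT5 sentinel tokens.
import random

from transformers import Adafactor, AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

# Constant learning rate of 1e-4, as stated above.
optimizer = Adafactor(
    model.parameters(), lr=1e-4, relative_step=False, scale_parameter=False
)


def corrupt(text, corruption_rate=0.15):
    """Text infilling: replace ~15% of the tokens with sentinel tokens."""
    tokens = tokenizer.encode(text, add_special_tokens=False)
    input_ids, labels, sentinel = [], [], 0
    for tok in tokens:
        if random.random() < corruption_rate and sentinel < 100:
            sentinel_id = tokenizer.convert_tokens_to_ids(f"<extra_id_{sentinel}>")
            input_ids.append(sentinel_id)      # masked position in the input
            labels.extend([sentinel_id, tok])  # sentinel + original token in the target
            sentinel += 1
        else:
            input_ids.append(tok)
    return input_ids, labels + [tokenizer.eos_token_id]


# One illustrative update on a (hypothetical) batch of 16 Dutch reviews.
reviews = ["De pizza's hier zijn heerlijk!!!"] * 16
pairs = [corrupt(r) for r in reviews]
inputs = tokenizer.pad({"input_ids": [src for src, _ in pairs]}, return_tensors="pt")
labels = tokenizer.pad({"input_ids": [tgt for _, tgt in pairs]}, return_tensors="pt")["input_ids"]
labels[labels == tokenizer.pad_token_id] = -100  # don't compute loss on padding

loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```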
|
|
|
Our model can be seen as an enhanced mT5 model for the (Dutch) restaurant domain, which can be used for various NLP tasks in this domain,
including but not limited to fine-grained sentiment analysis (ABSA), product-relevant question answering (PrQA), and text style transfer.
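
The model can be loaded with the Hugging Face Transformers library and used like any other sequence-to-sequence model: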
|
|
|
|
|
|
|
```python
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

>>> tokenizer = AutoTokenizer.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("NUSTM/dutch-restaurant-mt5-small")

>>> input_ids = tokenizer(
...     "De pizza's hier zijn heerlijk!!!", return_tensors="pt"
... ).input_ids  # Batch size 1

>>> # The bare forward pass of a seq2seq model requires decoder inputs or labels,
>>> # so use generate() for inference.
>>> outputs = model.generate(input_ids)
>>> print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
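
As one example of downstream use, the sketch below fine-tunes the model on a toy ABSA task cast as text-to-text generation. The `aspect | sentiment` target format, the two Dutch examples, and the hyperparameters are illustrative assumptions, not the task formulation from the paper.

```python
# A hypothetical fine-tuning sketch: ABSA cast as text-to-text generation.
# The "aspect | sentiment" target format and the two toy examples are
# illustrative assumptions, not the formulation used in the FS-ABSA paper.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

train_pairs = [
    ("De pizza's hier zijn heerlijk!!!", "pizza's | positief"),
    ("De bediening was traag.", "bediening | negatief"),
]

model.train()
for review, target in train_pairs:
    inputs = tokenizer(review, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```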
|
|
|
|
|
# Citation |
|
|
|
If you find this work helpful, please cite our paper as follows: |
|
|
|
```bibtex |
|
@inproceedings{wang2023fs-absa,
  author = {Wang, Zengzhi and Xie, Qiming and Xia, Rui},
  title = {A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis},
  year = {2023},
  isbn = {9781450394086},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3539618.3591940},
  doi = {10.1145/3539618.3591940},
  booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  numpages = {6},
  location = {Taipei, Taiwan},
  series = {SIGIR '23}
}
|
``` |
|
|
|
Note that the citation information will be updated once our paper is formally published in the SIGIR 2023 conference proceedings.
|
|
|
|
|
|