---
license: apache-2.0
language:
- nl
metrics:
- f1
- exact_match
library_name: transformers
tags:
- dutch
- restaurant
- mt5
---
# Dutch-Restaurant-mT5-Small
The Dutch-Restaurant-mT5-Small model was introduced in [A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis (SIGIR'23)](https://doi.org/10.1145/3539618.3591940) by Zengzhi Wang, Qiming Xie, and Rui Xia.
Details are available in the [GitHub repository FS-ABSA](https://github.com/nustm/fs-absa) and the [SIGIR'23 paper](https://doi.org/10.1145/3539618.3591940).
# Model Description
To bridge the domain (and language) gap between general pre-training and the task of interest in a specific domain (here, `restaurant`), we conducted *domain-adaptive pre-training*:
we continued pre-training the language model (mT5-small) on an unlabeled corpus in the domain (and language) of interest with the *text-infilling objective*
(corruption rate of 15% and average span length of 1). We collected 100k relevant unlabeled restaurant reviews from Yelp and translated them into Dutch with the DeepL translator.
For pre-training, we employ the [Adafactor](https://arxiv.org/abs/1804.04235) optimizer with a batch size of 16 and a constant learning rate of 1e-4.
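
The text-infilling objective described above can be illustrated with a minimal sketch. This is not the authors' pre-processing code: the function name `corrupt_spans`, the whitespace tokenization, and the independent sampling of masked positions (which only approximates an average span length of 1) are assumptions made for illustration, and the closing sentinel that T5-style targets usually end with is omitted.

```python
import random


def corrupt_spans(tokens, corruption_rate=0.15, seed=0):
    """Mask roughly `corruption_rate` of the tokens, T5-style (illustrative).

    Returns (inputs, targets). Each run of masked tokens is replaced by one
    sentinel in `inputs`; `targets` lists each sentinel followed by the
    tokens it replaced.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * corruption_rate))
    masked = set(rng.sample(range(len(tokens)), n_mask))

    inputs, targets = [], []
    sentinel_id = 0
    prev_masked = False
    for i, tok in enumerate(tokens):
        if i in masked:
            if not prev_masked:
                # Start of a new masked span: emit a fresh sentinel on both sides.
                sentinel = f"<extra_id_{sentinel_id}>"
                sentinel_id += 1
                inputs.append(sentinel)
                targets.append(sentinel)
            targets.append(tok)
            prev_masked = True
        else:
            inputs.append(tok)
            prev_masked = False
    return inputs, targets


# Hypothetical example sentence, split on whitespace for simplicity.
tokens = "De pizza's hier zijn heerlijk en de bediening was erg vriendelijk".split()
inputs, targets = corrupt_spans(tokens)
print(" ".join(inputs))
print(" ".join(targets))
```

In actual pre-training the corruption is applied to subword token IDs rather than whitespace-separated words, and the model learns to generate the target sequence from the corrupted input.
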
The resulting model can be seen as an mT5 model enhanced for the restaurant domain. It can be used for various restaurant-related NLP tasks,
including but not limited to fine-grained (aspect-based) sentiment analysis (ABSA), product-relevant question answering (PrQA), and text style transfer.
```python
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> tokenizer = AutoTokenizer.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
>>> input_ids = tokenizer(
... "De pizza's hier zijn heerlijk!!!", return_tensors="pt"
... ).input_ids # Batch size 1
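>>> # A bare forward pass on a seq2seq model also needs decoder inputs or labels, so generate instead
>>> outputs = model.generate(input_ids, max_new_tokens=20)
>>> print(tokenizer.decode(outputs[0], skip_special_tokens=True))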
```
# Citation
If you find this work helpful, please cite our paper as follows:
```bibtex
@inproceedings{wang2023fs-absa,
author = {Wang, Zengzhi and Xie, Qiming and Xia, Rui},
title = {A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis},
year = {2023},
isbn = {9781450394086},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3539618.3591940},
doi = {10.1145/3539618.3591940},
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
numpages = {6},
location = {Taipei, Taiwan},
series = {SIGIR '23}
}
```
Note that the complete citation format will be announced once our paper is published in the SIGIR 2023 conference proceedings.