|
--- |
|
license: apache-2.0 |
|
language: |
|
- nl |
|
metrics: |
|
- f1 |
|
- exact_match |
|
library_name: transformers |
|
tags: |
|
- dutch |
|
- restaurant |
|
- mt5 |
|
--- |
|
|
|
# Dutch-Restaurant-mT5-Small |
|
|
|
|
|
The Dutch-Restaurant-mT5-Small model was introduced in [A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis (SIGIR'23)](https://doi.org/10.1145/3539618.3591940) by Zengzhi Wang, Qiming Xie, and Rui Xia. |
|
|
|
More details are available in the [GitHub repository: FS-ABSA](https://github.com/nustm/fs-absa) and the [SIGIR'23 paper](https://doi.org/10.1145/3539618.3591940).
|
|
|
|
|
# Model Description |
|
|
|
To bridge the domain (and language) gap between general pre-training and the task of interest in a specific domain (i.e., `restaurant` in this repo), we conducted *domain-adaptive pre-training*,
i.e., we continued pre-training the language model (mT5-small) on an unlabeled corpus from the domain (and language) of interest with the *text-infilling objective*
(corruption rate of 15% and average span length of 1). We collected 100k relevant unlabeled restaurant reviews from Yelp and then translated them into Dutch with the DeepL translator.
For pre-training, we employed the [Adafactor](https://arxiv.org/abs/1804.04235) optimizer with a batch size of 16 and a constant learning rate of 1e-4.
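
As a rough illustration of this setup, the sketch below continues pre-training `google/mt5-small` with a simplified text-infilling objective (single tokens masked with sentinel tokens at a ~15% corruption rate) and an Adafactor optimizer at a constant learning rate of 1e-4. The review list, the corruption routine, and the padding logic are illustrative assumptions, not the original training code.

```python
# A minimal sketch of the domain-adaptive pre-training recipe described above.
# Assumptions (not from the original code): reviews live in a plain Python list,
# and span corruption is simplified to masking single tokens (average span
# length 1) with mT5 sentinel tokens.
import random

from transformers import Adafactor, AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

# Constant learning rate of 1e-4, as stated above.
optimizer = Adafactor(
    model.parameters(), lr=1e-4, relative_step=False, scale_parameter=False
)


def corrupt(text, corruption_rate=0.15):
    """Text infilling: replace ~15% of the tokens with sentinel tokens."""
    tokens = tokenizer.encode(text, add_special_tokens=False)
    input_ids, labels, sentinel = [], [], 0
    for tok in tokens:
        if random.random() < corruption_rate and sentinel < 100:
            sentinel_id = tokenizer.convert_tokens_to_ids(f"<extra_id_{sentinel}>")
            input_ids.append(sentinel_id)      # masked position in the input
            labels.extend([sentinel_id, tok])  # sentinel + original token in the target
            sentinel += 1
        else:
            input_ids.append(tok)
    return input_ids, labels + [tokenizer.eos_token_id]


# One illustrative update on a (hypothetical) batch of 16 Dutch reviews.
reviews = ["De pizza's hier zijn heerlijk!!!"] * 16
pairs = [corrupt(r) for r in reviews]
inputs = tokenizer.pad({"input_ids": [src for src, _ in pairs]}, return_tensors="pt")
labels = tokenizer.pad({"input_ids": [tgt for _, tgt in pairs]}, return_tensors="pt")["input_ids"]
labels[labels == tokenizer.pad_token_id] = -100  # don't compute loss on padding

loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```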
|
|
|
Our model can be seen as an enhanced mT5 model for the (Dutch) restaurant domain, which can be used for various NLP tasks in this domain,
including but not limited to fine-grained sentiment analysis (ABSA), product-relevant question answering (PrQA), and text style transfer.
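
The model can be loaded with the Hugging Face Transformers library and used like any other sequence-to-sequence model: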
|
|
|
|
|
|
|
```python
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

>>> tokenizer = AutoTokenizer.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("NUSTM/dutch-restaurant-mt5-small")

>>> input_ids = tokenizer(
...     "De pizza's hier zijn heerlijk!!!", return_tensors="pt"
... ).input_ids  # Batch size 1

>>> # The bare forward pass of a seq2seq model requires decoder inputs or labels,
>>> # so use generate() for inference.
>>> outputs = model.generate(input_ids)
>>> print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
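
As one example of downstream use, the sketch below fine-tunes the model on a toy ABSA task cast as text-to-text generation. The `aspect | sentiment` target format, the two Dutch examples, and the hyperparameters are illustrative assumptions, not the task formulation from the paper.

```python
# A hypothetical fine-tuning sketch: ABSA cast as text-to-text generation.
# The "aspect | sentiment" target format and the two toy examples are
# illustrative assumptions, not the formulation used in the FS-ABSA paper.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

train_pairs = [
    ("De pizza's hier zijn heerlijk!!!", "pizza's | positief"),
    ("De bediening was traag.", "bediening | negatief"),
]

model.train()
for review, target in train_pairs:
    inputs = tokenizer(review, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```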
|
|
|
|
|
# Citation |
|
|
|
If you find this work helpful, please cite our paper as follows: |
|
|
|
```bibtex |
|
@inproceedings{wang2023fs-absa,
  author = {Wang, Zengzhi and Xie, Qiming and Xia, Rui},
  title = {A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis},
  year = {2023},
  isbn = {9781450394086},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3539618.3591940},
  doi = {10.1145/3539618.3591940},
  booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  numpages = {6},
  location = {Taipei, Taiwan},
  series = {SIGIR '23}
}
|
``` |
|
|
|
Note that the citation information will be updated once our paper is formally published in the SIGIR 2023 conference proceedings.
|
|
|
|
|
|