	---
	license: apache-2.0
	language:
	- bs
	- hr
	- sr
	- sl
	- sk
	- cs
	- en
	tags:
	- sentiment-analysis
	- text-regression
	- text-classification
	- sentiment-regression
	- sentiment-classification
	- parliament
	inference: false
	---


	# Multilingual parliament sentiment regression model XLM-R-Parla-Sent

	This model is based on [xlm-r-parla](https://huggingface.co/classla/xlm-r-parla), an XLM-R-large model additionally pre-trained on parliamentary proceedings. The model was fine-tuned on the [ParlaSent dataset](http://hdl.handle.net/11356/1868), a manually annotated selection of sentences of parliamentary proceedings from Bosnia and Herzegovina, Croatia, Czechia, Serbia, Slovakia, Slovenia, and the United Kingdom.

	Both the additionally pre-trained model and the training dataset are results of the [ParlaMint project](https://www.clarin.eu/parlamint). The models and the dataset are described in detail in the following publication (to be published soon):

	Michal Mochtak, Peter Rupnik, Nikola Ljubešić: The ParlaSent Multilingual Training Dataset for Sentiment Identification in Parliamentary Proceedings.

	## Annotation schema

	The discrete labels present in the original dataset were mapped to numeric values as follows:

	```
	"Negative": 0.0,
	"M_Negative": 1.0,
	"N_Neutral": 2.0,
	"P_Neutral": 3.0,
	"M_Positive": 4.0,
	"Positive": 5.0,
	```
	The model was then fine-tuned on numeric labels and set up as a regressor.
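
Because the model is a regressor, its output is a continuous score rather than one of the six labels. The following is a minimal sketch of one way to turn a score back into a label, assuming a simple nearest-value lookup. The helper name `score_to_label` and the lookup rule are illustrative assumptions, not part of the model or of the authors' procedure.

```python
# Label-to-score mapping copied from the annotation schema above.
LABEL_TO_SCORE = {
    "Negative": 0.0,
    "M_Negative": 1.0,
    "N_Neutral": 2.0,
    "P_Neutral": 3.0,
    "M_Positive": 4.0,
    "Positive": 5.0,
}


def score_to_label(score: float) -> str:
    """Return the label whose numeric value is closest to `score`.

    Nearest-value lookup is an assumption made for illustration.
    Scores outside the 0-5 range map to the nearest end of the scale.
    """
    return min(LABEL_TO_SCORE, key=lambda label: abs(LABEL_TO_SCORE[label] - score))


# Scores slightly outside the scale are possible with a regressor.
for score in (0.12, 3.64, 5.31):
    print(score, score_to_label(score))
```

Whether to discretize at all depends on the application; keeping the continuous score preserves more information.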

	## Fine-tuning procedure

	The fine-tuning procedure is described in the pending paper. The hyperparameters used, presumed to be optimal, are:
	```
	num_train_epochs=4,
	train_batch_size=32,
	learning_rate=8e-6,
	regression=True
	```
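
As a rough sketch, the values above can be collected into a plain dictionary, which `simpletransformers` accepts as the `args` argument of `ClassificationModel`. The commented-out training call is hypothetical: the choice of `classla/xlm-r-parla` as the starting checkpoint and the `train_df` DataFrame are assumptions for illustration, and the actual training script is not part of this card.

```python
# Hyperparameters as listed above, collected into a plain dictionary.
train_args = {
    "num_train_epochs": 4,
    "train_batch_size": 32,
    "learning_rate": 8e-6,
    "regression": True,
}

# Hypothetical use (not executed here; checkpoint and data are assumptions):
# from simpletransformers.classification import ClassificationModel
# model = ClassificationModel(
#     "xlmroberta", "classla/xlm-r-parla", num_labels=1, args=train_args
# )
# model.train_model(train_df)  # train_df: DataFrame with "text" and "labels" columns

print(train_args)
```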

	## Results

	Results reported were obtained from 5 fine-tuning runs.

	| test dataset | R^2 | MAE |
	| --- | --- | --- |
	| BCS | 0.6146 ± 0.0104 | 0.7050 ± 0.0089 |
	| EN | 0.6722 ± 0.0100 | 0.6755 ± 0.0076 |
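
For readers unfamiliar with the two metrics, the sketch below computes them from their standard definitions: MAE is the mean absolute difference between gold and predicted scores, and R^2 is one minus the ratio of the residual sum of squares to the total sum of squares. The card does not state how the reported figures were computed, so this is an assumption, and the toy values are made up purely for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of absolute differences between gold and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values invented for illustration; they are not taken from ParlaSent.
gold = [0.0, 2.0, 3.0, 5.0]
pred = [0.5, 2.5, 3.0, 4.0]

print("MAE:", mean_absolute_error(gold, pred))
print("R^2:", r2_score(gold, pred))
```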

	## Usage Example

	The example below was run with `simpletransformers==0.64.3`:
	```python
	from simpletransformers.classification import ClassificationModel, ClassificationArgs
	import torch
	# Configure the model as a regressor with a single continuous output.
	model_args = ClassificationArgs(regression=True)
	model = ClassificationModel(
	    model_type="xlmroberta",
	    model_name="classla/xlm-r-parlasent",
	    use_cuda=torch.cuda.is_available(),  # use the GPU when one is available
	    num_labels=1,
	    args=model_args,
	)
	# predict() returns a tuple: (predictions, raw model outputs).
	model.predict(["I fully disagree with this argument.", "The ministers are entering the chamber.", "Things can always be improved in the future.", "These are great news."])
	```

	Output:
	```python
	(
	array([0.11633301, 3.63671875, 4.203125 , 5.30859375]),
	array([0.11633301, 3.63671875, 4.203125 , 5.30859375])
	)
	```