NTCAL/norbert2_sentiment_norec

Text Classification

Norwegian Bokmål

Norwegian Nynorsk


	---
	datasets:
	- marcuskd/reviews_binary_not4_concat
	language:
	- 'no'
	- nb
	- nn
	metrics:
	- accuracy
	- recall
	- precision
	- f1
	---
	# Model Card for norbert2_sentiment_norec

	Sentiment analysis for Norwegian reviews.

	# Model Description

	This model was trained on a dataset built by concatenating the Norwegian Review Corpus (NoReC, https://github.com/ltgoslo/norec) with a Norwegian sentiment dataset from Hugging Face (https://huggingface.co/datasets/sepidmnorozy/Norwegian_sentiment).
	It is intended for testing purposes only.


	- Developed by: Simen Aabol and Marcus Dragsten
	- Finetuned from model: norbert2

	# Direct Use

	Pass in a Norwegian sentence to classify its sentiment as negative or positive.
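
A minimal usage sketch, assuming the model loads through the standard `transformers` text-classification pipeline under the repository id `NTCAL/norbert2_sentiment_norec`. The helper names (`load_classifier`, `classify`, `main`) and the example sentences are illustrative. The label strings returned depend on the model's config and are not documented in this card, so they are printed as-is.

```python
def load_classifier(model_id="NTCAL/norbert2_sentiment_norec"):
    """Load the fine-tuned model as a text-classification pipeline."""
    # Imported lazily so `classify` can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(classifier, sentences):
    """Return (sentence, label, score) tuples for each input sentence."""
    results = classifier(sentences)
    return [(s, r["label"], r["score"]) for s, r in zip(sentences, results)]


def main():
    clf = load_classifier()  # downloads the model on first use
    examples = [
        "Denne filmen var helt fantastisk!",  # "This film was absolutely fantastic!"
        "Boka var kjedelig og altfor lang.",  # "The book was boring and far too long."
    ]
    for sentence, label, score in classify(clf, examples):
        print(f"{label} ({score:.3f}): {sentence}")
```

Calling `main()` prints one line per sentence with the raw label and its score.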

	# Training Details

	## Training and Testing Data


	https://huggingface.co/datasets/marcuskd/reviews_binary_not4_concat

	### Preprocessing

	Tokenized using:

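	```python
	from transformers import AutoTokenizer

	# Tokenizer of the base model the classifier was fine-tuned from
	tokenizer = AutoTokenizer.from_pretrained("ltgoslo/norbert2")
	```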
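
How the tokenizer was applied to the dataset is not stated in this card. The sketch below shows one common way to do it; the column name `text`, the `max_length` of 512, and the truncation and padding settings are all assumptions rather than the authors' configuration.

```python
def tokenize_batch(batch, tokenizer, text_column="text", max_length=512):
    """Tokenize one batch of examples for a BERT-style classifier.

    `text_column` and `max_length` are assumptions, not values from the card.
    """
    return tokenizer(
        batch[text_column],
        truncation=True,       # cut reviews longer than max_length tokens
        padding="max_length",  # pad shorter reviews to a fixed length
        max_length=max_length,
    )


# Hypothetical usage with the `datasets` library:
#   from functools import partial
#   from datasets import load_dataset
#   dataset = load_dataset("marcuskd/reviews_binary_not4_concat")
#   tokenized = dataset.map(partial(tokenize_batch, tokenizer=tokenizer), batched=True)
```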
	Training arguments for this model:
	```python
	from transformers import TrainingArguments

	training_args = TrainingArguments(
	    output_dir='./results',          # output directory
	    num_train_epochs=10,             # total number of training epochs
	    per_device_train_batch_size=16,  # batch size per device during training
	    per_device_eval_batch_size=64,   # batch size for evaluation
	    warmup_steps=500,                # number of warmup steps for learning rate scheduler
	    weight_decay=0.01,               # strength of weight decay
	    logging_dir='./logs',            # directory for storing logs
	    logging_steps=10,                # log every 10 steps
	)
	```
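
The card lists accuracy, recall, precision and F1 as metrics but does not show how they were computed. A minimal sketch of a `compute_metrics` callback that could be passed to `Trainer` alongside these arguments follows. It is not the authors' code, and treating label `1` as the positive class is an assumption.

```python
def compute_metrics(eval_pred, positive_label=1):
    """Compute accuracy, precision, recall and F1 for binary predictions.

    `eval_pred` is a (logits, labels) pair, as passed by `Trainer`.
    Treating label 1 as the positive class is an assumption.
    """
    logits, labels = eval_pred
    # Predicted class = index of the largest logit in each row.
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    pairs = list(zip(preds, labels))

    tp = sum(1 for p, y in pairs if p == positive_label and y == positive_label)
    fp = sum(1 for p, y in pairs if p == positive_label and y != positive_label)
    fn = sum(1 for p, y in pairs if p != positive_label and y == positive_label)
    correct = sum(1 for p, y in pairs if p == y)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

It would be wired in as `Trainer(model=..., args=training_args, compute_metrics=compute_metrics, ...)`.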

	# Evaluation

	Evaluated on the test split of the dataset:
	```python
	{
	'accuracy': 0.8357214261912695,
	'recall': 0.886873508353222,
	'precision': 0.8789025543992431,
	'f1': 0.8828700403896412,
	'total_time_in_seconds': 94.33071640000003,
	'samples_per_second': 31.81360340013276,
	'latency_in_seconds': 0.03143309443518828
	}
	```
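
The reported figures can be cross-checked against each other. The sketch below uses only the values above: F1 should equal the harmonic mean of precision and recall, total time multiplied by throughput gives the approximate number of evaluated samples, and latency should be the inverse of throughput.

```python
results = {
    "accuracy": 0.8357214261912695,
    "recall": 0.886873508353222,
    "precision": 0.8789025543992431,
    "f1": 0.8828700403896412,
    "total_time_in_seconds": 94.33071640000003,
    "samples_per_second": 31.81360340013276,
    "latency_in_seconds": 0.03143309443518828,
}

# F1 is the harmonic mean of precision and recall.
p, r = results["precision"], results["recall"]
f1_check = 2 * p * r / (p + r)

# Throughput times wall-clock time gives the number of evaluated samples.
n_samples = results["total_time_in_seconds"] * results["samples_per_second"]

# Latency is the inverse of throughput.
latency_check = 1 / results["samples_per_second"]

print(f"F1 from precision/recall: {f1_check:.4f}")
print(f"Approx. test samples:     {n_samples:.0f}")
print(f"Latency from throughput:  {latency_check:.4f} s")
```

The derived values agree with the reported ones and imply a test split of roughly 3001 samples.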