---
datasets:
- Kushtrim/sst2-norwegian-bokmaal
language:
- 'no'
widget:
- text: >-
    Dette var en vakker film
pipeline_tag: text-classification
---

# Model Card for Kushtrim/norbert3-large-norsk-sentiment-sst2

## Model Description

This model is a sentiment classifier for Norwegian text. It is based on the BERT architecture, which represents each word in the context of the surrounding sentence. The model is fine-tuned from [LTG NorBERT 3 Large](https://huggingface.co/ltg/norbert3-large) and classifies a text as either positive or negative.

## Intended Use

- Primary Use: Sentiment analysis for Norwegian text.
- Target Audience: Data scientists, NLP practitioners, researchers, and businesses interested in understanding sentiment in Norwegian-language texts.
- Application Examples: Analyzing customer feedback, social media monitoring, market research.

## Training Data

The model is trained on the SST2 (Stanford Sentiment Treebank 2) dataset, machine-translated into Norwegian. SST2 is originally in English and consists of sentences from movie reviews, each annotated as positive or negative. The sentences cover both colloquial and formal language. The machine translation aimed to retain the sentiment and linguistic nuances of the original while adapting it to Norwegian. However, translation inaccuracies may affect how the model classifies sentiment in some cases.

## Limitations

- The model might not perform well on dialects or slang.
- Context understanding might be limited in complex sentences.
- Performance might degrade on texts from domains not represented in the training set, which consists of movie reviews.

## Ethical Considerations

Care should be taken not to use the model in ways that amplify biases present in the training data. The model should not be used for manipulative or harmful purposes, such as influencing political elections.

## Instructions on how to implement and use the model

```python
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

model_id = "Kushtrim/norbert3-large-norsk-sentiment-sst2"

# trust_remote_code=True allows transformers to run the custom model code
# shipped with the repository; review that code before enabling it.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(model_id, trust_remote_code=True)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

text = "Dette var en vakker film"  # "This was a beautiful film"
output = classifier(text)
print(output)
```
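
Given the limitations noted in this card (translated training data, possible weakness outside movie reviews), it can help to flag low-confidence predictions for manual review. The following is a minimal sketch, not part of the model's own tooling: `flag_uncertain` and the `min_score` threshold of 0.75 are hypothetical choices, and the label names and scores in the sample predictions are illustrative placeholders, not actual model output. It only relies on the fact that a `text-classification` pipeline returns one dict with `label` and `score` keys per input.

```python
from typing import Dict, List


def flag_uncertain(
    texts: List[str],
    predictions: List[Dict],
    min_score: float = 0.75,  # hypothetical threshold; tune on your own data
) -> List[Dict]:
    """Pair each text with its prediction and mark low-confidence results."""
    if len(texts) != len(predictions):
        raise ValueError("texts and predictions must have the same length")
    rows = []
    for text, pred in zip(texts, predictions):
        rows.append(
            {
                "text": text,
                "label": pred["label"],
                "score": pred["score"],
                # True when the classifier's confidence is below the threshold
                "needs_review": pred["score"] < min_score,
            }
        )
    return rows


texts = ["Dette var en vakker film", "Filmen var helt grei"]

# Placeholder predictions in the shape the pipeline returns for a list of
# inputs. Label names and scores are made up for illustration only.
predictions = [
    {"label": "positive", "score": 0.98},
    {"label": "negative", "score": 0.55},
]

rows = flag_uncertain(texts, predictions)
for row in rows:
    print(row)
```

With the real model, `predictions` would come from `classifier(texts)`, since the pipeline accepts a list of strings and returns one result per input. The label strings it produces depend on the model's configuration, so check them before building logic on top of specific names.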