---
license: apache-2.0
---

This language model is designed to assess the attitude expressed in texts about **climate change**. It classifies the attitude into three categories: risk, neutral, and opportunity. These correspond to the negative, neutral, and positive classes commonly used in sentiment analysis.

Similar existing models, such as "climatebert/distilroberta-base-climate-sentiment" and "XerOpred/twitter-climate-sentiment-model", typically achieve accuracies ranging from 10% to 30% and F1 scores around 15%. When evaluated on the test dataset from "climatebert/climate_sentiment", our model achieves an accuracy of 89% and an F1 score of 89%.

**Note** that you should paste or type a text concerning **climate change** in the API input bar or pass it to the testing code. The model does not work as well on texts about other topics.

An example input could be: "Major oil companies have misled Americans for decades about the threat of human-caused climate change, according to a new report released Tuesday by Democrats in Congress. The 65-page report was the result of a three-year investigation and was made public hours before a Senate Budget Committee hearing about the role that oil and gas companies have played in global warming."

Please **cite** "Sun, K., and Wang, R. 2024. The fine-tuned language model for detecting human attitudes to climate changes" if you use this model.

The project on GitHub (including training code) is available at: https://github.com/fivehills/climate_attitude_LM/

The following code shows how to test the model:
```
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model_path = "KevSun/climate-attitude-LM"  # Ensure this path points to the correct directory
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Define the path to your text file
file_path = 'yourtext.txt'

# Read the content of the file
with open(file_path, 'r', encoding='utf-8') as file:
    new_text = file.read()

# Encode the text using the tokenizer used during training
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)

# Move the model and the input tensors to the correct device (GPU if available, else CPU)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
encoded_input = {k: v.to(device) for k, v in encoded_input.items()}

model.eval()  # Set the model to evaluation mode

# Perform the prediction
with torch.no_grad():
    outputs = model(**encoded_input)

# Get the predictions (assumes classification with labels)
predictions = outputs.logits.squeeze()

# Apply softmax to interpret the logits as probabilities
probabilities = torch.softmax(predictions, dim=0)

# Define labels for each class index based on your classification categories
labels = ["risk", "neutral", "opportunity"]
predicted_index = torch.argmax(probabilities).item()  # Index of the max probability
predicted_label = labels[predicted_index]
predicted_probability = probabilities[predicted_index].item()

# Print the predicted label and its probability
print(f"Predicted Label: {predicted_label}, Probability: {predicted_probability:.4f}")

# Example output: Predicted Label: neutral, Probability: 0.8377
```
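
The last step of the testing code turns the three raw logits into a label and a probability. The sketch below reproduces that step in plain Python, without loading the model, to make the mapping explicit. The label order is taken from the testing code above; the helper names `softmax` and `decode` and the logit values are hypothetical and chosen only for illustration.

```python
import math

# Label order taken from the testing code above.
LABELS = ["risk", "neutral", "opportunity"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits for a single text (not produced by the real model).
example_logits = [-1.2, 2.1, 0.3]
label, prob = decode(example_logits)
print(f"Predicted Label: {label}, Probability: {prob:.4f}")
```

With real outputs, `outputs.logits.squeeze().tolist()` from the testing code could be passed to `decode` in place of `example_logits`.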