---
license: apache-2.0
tags:
- generated_from_trainer
- financial
- stocks
- sentiment
- sentiment-analysis
- financial-news
widget:
- text: The company's quarterly earnings surpassed all estimates, indicating strong growth.
datasets:
- financial_phrasebank
metrics:
- accuracy
model-index:
- name: AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: financial_phrasebank
      type: financial_phrasebank
      args: sentences_allagree
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.96688
language:
- en
base_model:
- distilbert/distilbert-base-uncased-finetuned-sst-2-english
pipeline_tag: text-classification
library_name: transformers
---
# DistilBERT Fine-Tuned for Financial Sentiment Analysis
## Model Description
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) specifically tailored for sentiment analysis in the financial domain. It has been trained on the [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank) dataset to classify financial texts into three sentiment categories:
- Negative (label `0`)
- Neutral (label `1`)
- Positive (label `2`)
## Model Performance
The model was trained for 5 epochs and evaluated on a held-out test set comprising 20% of the dataset.
### Evaluation Metrics
| Epoch | Eval Loss | Eval Accuracy |
|-----------|---------------|-------------------|
| 1 | 0.2210 | 94.26% |
| 2 | 0.1997 | 95.81% |
| 3 | 0.1719 | 96.69% |
| 4 | 0.2073 | 96.03% |
| 5 | 0.1941 | **96.69%** |

**Final Evaluation Accuracy**: **96.69%**
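
As a rough sanity check, the reported accuracies can be converted back into counts of correctly classified test sentences. The sketch below assumes the 453-sample test split listed under Data; the helper name `correct_count` is illustrative and not part of the original training code.

```python
# Convert reported accuracies back to whole-sentence counts.
# Assumption: the test split has 453 sentences (see the Data section).
TEST_SIZE = 453

# Per-epoch evaluation accuracy, copied from the table above.
reported_accuracy = {1: 0.9426, 2: 0.9581, 3: 0.9669, 4: 0.9603, 5: 0.9669}


def correct_count(accuracy: float, n_samples: int = TEST_SIZE) -> int:
    """Return the number of correct predictions closest to the given accuracy."""
    return round(accuracy * n_samples)


for epoch, acc in reported_accuracy.items():
    n_correct = correct_count(acc)
    print(f"epoch {epoch}: {n_correct}/{TEST_SIZE} correct = {n_correct / TEST_SIZE:.4%}")
```

Under that assumption, the final accuracy of 96.69% corresponds to 438 of 453 sentences classified correctly, which leaves 15 misclassified.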
### Training Metrics
- **Final Training Loss**: 0.0797
- **Total Training Time**: Approximately 3869 seconds (~1.07 hours)
- **Training Samples per Second**: 2.34
- **Training Steps per Second**: 0.147
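
The throughput figures above can be cross-checked against the dataset size and batch size reported elsewhere in this card. This is a minimal sketch that assumes one optimizer step per batch (no gradient accumulation, single device), which the card does not state explicitly.

```python
import math

# Values taken from the Data, Hyperparameters, and Training Metrics sections.
TRAIN_SAMPLES = 1811
EPOCHS = 5
TRAIN_BATCH_SIZE = 16
TRAIN_SECONDS = 3869

# Assumption: one optimizer step per batch, with a smaller final batch each epoch.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

samples_per_second = TRAIN_SAMPLES * EPOCHS / TRAIN_SECONDS
steps_per_second = total_steps / TRAIN_SECONDS
hours = TRAIN_SECONDS / 3600

print(f"{total_steps} steps, {samples_per_second:.2f} samples/s, "
      f"{steps_per_second:.3f} steps/s, {hours:.2f} h")
```

With these inputs the run comes to 570 steps, and the derived rates match the reported 2.34 samples and 0.147 steps per second.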
## Training Procedure
### Data
- **Dataset**: [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank)
- **Configuration**: `sentences_allagree` (sentences where all annotators agreed on the sentiment)
- **Dataset Size**: 2264 sentences
- **Data Split**: 80% training (1811 samples), 20% testing (453 samples)
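
The card does not say how the split was produced. The sketch below shows one way an 80/20 split of 2264 sentences yields the stated sizes, assuming the test share is rounded up and that the seed from the Hyperparameters section (42) is reused for shuffling. `split_indices` is a hypothetical helper, not the author's code.

```python
import math
import random


def split_indices(n_samples: int, test_fraction: float = 0.2, seed: int = 42):
    """Shuffle indices deterministically and split them into train/test lists."""
    n_test = math.ceil(n_samples * test_fraction)  # round the test share up
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(2264)
print(len(train_idx), len(test_idx))
```

Rounding the test share up gives 453 test and 1811 training samples, matching the figures above. The specific sentences that land in each split depend on the shuffling method, so this will not necessarily reproduce the original partition.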
### Model Configuration
- **Base Model**: [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
- **Number of Labels**: 3 (negative, neutral, positive)
- **Tokenizer**: Same as the base model's tokenizer
### Hyperparameters
- **Number of Epochs**: 5
- **Batch Size**: 16 (training), 64 (evaluation)
- **Learning Rate**: 5e-5
- **Optimizer**: AdamW
- **Evaluation Metric**: Accuracy
- **Seed**: 42 (for reproducibility)
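
For readers who want to reproduce a similar run, the hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch under the assumption that training used the `Trainer` API (suggested by the `generated_from_trainer` tag); the `output_dir` value and the helper name are placeholders.

```python
# Keyword arguments mirroring the hyperparameters listed above.
# Assumption: training used the transformers Trainer API.
training_kwargs = {
    "output_dir": "./results",  # hypothetical path, not from the original card
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 64,
    "learning_rate": 5e-5,
    "seed": 42,
}


def build_training_arguments():
    """Create TrainingArguments from the dict above.

    The import is deferred so the dict can be inspected without
    transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs)
```

The optimizer is not set explicitly here because `Trainer` uses an AdamW variant by default.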
## Usage
You can load and use the model with the Hugging Face `transformers` library as follows:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_name = 'AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "The company's quarterly earnings surpassed all estimates, indicating strong growth."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Inference only, so gradient tracking is not needed
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)

# Label ids as listed in the Model Description
label_mapping = {0: 'Negative', 1: 'Neutral', 2: 'Positive'}
print(f"Sentiment: {label_mapping[predictions.item()]}")
```
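
If a confidence score is wanted alongside the predicted label, the raw logits can be passed through a softmax. The sketch below does this in plain Python so it runs without the model; `logits_to_sentiment` is an illustrative helper and the example logits are made up, not real model output.

```python
import math

# Label ids as listed in the Model Description.
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def logits_to_sentiment(logits):
    """Apply a numerically stable softmax and return (label, probability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Illustrative logits, not real model output.
label, prob = logits_to_sentiment([-1.2, 0.3, 2.5])
print(f"{label} ({prob:.1%})")
```

With the real model, the input would be something like `outputs.logits[0].tolist()` from the snippet above.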
## License
This model is licensed under the **Apache 2.0 License**. You are free to use, modify, and distribute this model in your applications, subject to the terms of that license.
## Citation
If you use this model in your research or applications, please cite it as:
```
@misc{AnkitAI_2024_financial_sentiment_model,
title={DistilBERT Fine-Tuned for Financial Sentiment Analysis},
author={Ankit Aglawe},
year={2024},
howpublished={\url{https://huggingface.co/AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis}},
}
```
## Acknowledgments
- **Hugging Face**: For providing the Transformers library and model hosting.
- **Data Providers**: Thanks to the creators of the Financial PhraseBank dataset.
- **Community**: Appreciation to the open-source community for continual support and contributions.
## Contact Information
For questions, feedback, or collaboration opportunities, please contact:
- **Name**: Ankit Aglawe
- **Email**: [aglawe.ankit@gmail.com](mailto:aglawe.ankit@gmail.com)
- **GitHub**: [GitHub Profile](https://github.com/ankit-aglawe)
- **LinkedIn**: [LinkedIn Profile](https://www.linkedin.com/in/ankit-aglawe)