metadata
license: apache-2.0
tags:
  - generated_from_trainer
  - financial
  - stocks
  - sentiment
  - sentiment-analysis
  - financial-news
widget:
  - text: >-
      The company's quarterly earnings surpassed all estimates, indicating
      strong growth.
datasets:
  - financial_phrasebank
metrics:
  - accuracy
model-index:
  - name: AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: financial_phrasebank
          type: financial_phrasebank
          args: sentences_allagree
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.96688
language:
  - en
base_model:
  - distilbert/distilbert-base-uncased-finetuned-sst-2-english
pipeline_tag: text-classification
library_name: transformers

DistilBERT Fine-Tuned for Financial Sentiment Analysis

Model Description

This model is a fine-tuned version of distilbert-base-uncased specifically tailored for sentiment analysis in the financial domain. It has been trained on the Financial PhraseBank dataset to classify financial texts into three sentiment categories:

  • Negative (label 0)
  • Neutral (label 1)
  • Positive (label 2)

Model Performance

The model was trained for 5 epochs and evaluated on a held-out test set comprising 20% of the dataset.

Evaluation Metrics

Epoch   Eval Loss   Eval Accuracy
1       0.2210      94.26%
2       0.1997      95.81%
3       0.1719      96.69%
4       0.2073      96.03%
5       0.1941      96.69%

Final Evaluation Accuracy: 96.69%
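
As a quick sanity check, the reported percentages can be converted back into counts of correctly classified test sentences. The sketch below assumes the 453-sample test split listed under Data and that accuracy is simply correct / total; the names `eval_accuracy`, `TEST_SIZE`, and `correct_count` are illustrative and not taken from the training code.

```python
# Eval accuracy per epoch, copied from the table above (in percent).
eval_accuracy = {1: 94.26, 2: 95.81, 3: 96.69, 4: 96.03, 5: 96.69}
TEST_SIZE = 453  # assumed: the 20% test split listed under Data


def correct_count(accuracy_pct, total=TEST_SIZE):
    """Return the whole number of correct predictions closest to the given accuracy."""
    return round(accuracy_pct / 100 * total)


for epoch, acc in eval_accuracy.items():
    n = correct_count(acc)
    print(f"epoch {epoch}: {n}/{TEST_SIZE} correct ({n / TEST_SIZE:.2%})")
```

Under these assumptions, 96.69% corresponds to 438 of 453 test sentences, which leaves 15 misclassified.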

Training Metrics

  • Final Training Loss: 0.0797
  • Total Training Time: Approximately 3869 seconds (~1.07 hours)
  • Training Samples per Second: 2.34
  • Training Steps per Second: 0.147
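
The throughput figures above are consistent with the split size, batch size, and epoch count reported elsewhere in this card. The following sketch reproduces them under the assumption of a single device, no gradient accumulation, and a final partial batch in each epoch; none of these details is stated in the card.

```python
import math

train_samples = 1811  # training split size from the Data section
batch_size = 16       # training batch size from Hyperparameters
epochs = 5
runtime_s = 3869      # reported total training time in seconds

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

samples_per_second = train_samples * epochs / runtime_s
steps_per_second = total_steps / runtime_s

print(f"total steps: {total_steps}")
print(f"samples/s: {samples_per_second:.2f}, steps/s: {steps_per_second:.3f}")
```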

Training Procedure

Data

  • Dataset: Financial PhraseBank
  • Configuration: sentences_allagree (sentences where all annotators agreed on the sentiment)
  • Dataset Size: 2264 sentences
  • Data Split: 80% training (1811 samples), 20% testing (453 samples)
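
The card does not say how the 80/20 split was produced. As an illustration only, the sketch below shuffles sample indices with the standard library and rounds the test size up, which happens to reproduce the 1811/453 counts. The helper `split_indices` is hypothetical, and reusing seed 42 for the split is an assumption.

```python
import math
import random


def split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle indices deterministically and return (train, test) index lists."""
    # Assumption: the test size is rounded up, so 2264 * 0.2 = 452.8 becomes 453.
    n_test = math.ceil(n_samples * test_fraction)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(2264)
print(len(train_idx), len(test_idx))
```

The two index lists are disjoint and together cover all 2264 sentences, so no sentence appears in both training and test data.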

Model Configuration

  • Base Model: distilbert-base-uncased
  • Number of Labels: 3 (negative, neutral, positive)
  • Tokenizer: Same as the base model's tokenizer
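
A configuration like this one could be set up roughly as follows. This is a sketch rather than the original training code: it assumes the base checkpoint named in this section, and the helper `load_base_for_finetuning` is a hypothetical name. The `transformers` import is placed inside the function so the label maps can be inspected without downloading anything.

```python
# Label ids as listed in the Model Description.
id2label = {0: "negative", 1: "neutral", 2: "positive"}
label2id = {name: idx for idx, name in id2label.items()}


def load_base_for_finetuning(base_model="distilbert-base-uncased"):
    """Load the tokenizer and a 3-label classification model (downloads weights)."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    # The classification head is newly initialized and must be trained.
    model = AutoModelForSequenceClassification.from_pretrained(
        base_model,
        num_labels=len(id2label),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

Storing `id2label` in the model config means downstream users get readable label names instead of bare integer ids.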

Hyperparameters

  • Number of Epochs: 5
  • Batch Size: 16 (training), 64 (evaluation)
  • Learning Rate: 5e-5
  • Optimizer: AdamW
  • Evaluation Metric: Accuracy
  • Seed: 42 (for reproducibility)
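
If training was run with the `transformers` Trainer API (the `generated_from_trainer` tag suggests so, but the card does not confirm it), the hyperparameters above would map onto `TrainingArguments` roughly as sketched here. The batch sizes are assumed to be per device on a single device, and `output_dir="./results"` and `build_training_args` are placeholders.

```python
# Keyword arguments mirroring the hyperparameters listed above.
training_kwargs = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,  # assumed single device
    "per_device_eval_batch_size": 64,
    "learning_rate": 5e-5,
    "seed": 42,
}


def build_training_args(output_dir="./results"):
    """Build TrainingArguments from the dict above (requires transformers)."""
    # Imported lazily so the dict can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **training_kwargs)
```

No optimizer is set explicitly because the Trainer uses an AdamW variant by default, which matches the optimizer listed above.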

Usage

You can load and use the model with the Hugging Face transformers library as follows:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = 'AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "The company's quarterly earnings surpassed all estimates, indicating strong growth."
# Truncate inputs that exceed the model's maximum sequence length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Gradients are not needed for inference
with torch.no_grad():
    outputs = model(**inputs)
# Pick the class with the highest logit
predictions = outputs.logits.argmax(dim=-1)

# Label ids as listed in the Model Description
label_mapping = {0: 'Negative', 1: 'Neutral', 2: 'Positive'}
print(f"Sentiment: {label_mapping[predictions.item()]}")
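
The predicted label alone hides how confident the model is. A softmax over the three logits turns them into class probabilities, as in the minimal sketch below. The logit values are made up for illustration and are not real model output; with tensors, `torch.softmax(outputs.logits, dim=-1)` performs the same computation.

```python
import math

label_mapping = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [-2.1, -0.4, 3.3]  # hypothetical values, not real model output
probs = softmax(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)

for idx, p in enumerate(probs):
    print(f"{label_mapping[idx]}: {p:.3f}")
print(f"Sentiment: {label_mapping[best]}")
```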

License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute this model in your applications.

Citation

If you use this model in your research or applications, please cite it as:

@misc{AnkitAI_2024_financial_sentiment_model,
  title={DistilBERT Fine-Tuned for Financial Sentiment Analysis},
  author={Ankit Aglawe},
  year={2024},
  howpublished={\url{https://huggingface.co/AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis}},
}

Acknowledgments

  • Hugging Face: For providing the Transformers library and model hosting.
  • Data Providers: Thanks to the creators of the Financial PhraseBank dataset.
  • Community: Appreciation to the open-source community for continual support and contributions.

Contact Information

For questions, feedback, or collaboration opportunities, please contact: