metadata

tags:
  - generated_from_trainer
  - finance
base_model: cardiffnlp/twitter-roberta-base-sentiment
metrics:
  - accuracy
model-index:
  - name: fine-tuned-cardiffnlp-twitter-roberta-base-sentiment-finance-dataset
    results: []
datasets:
  - CJCJ3030/twitter-financial-news-sentiment
language:
  - en
library_name: transformers
pipeline_tag: text-classification
widget:
  - text: UK house sales up 12% in April
  - text: Singapore oil trader convicted of abetting forgery and cheating HSBC
  - text: >-
      ‘There’s money everywhere’: Milken conference-goers look for a dealmaking
      revival
  - text: ETF buying nearly halves in April as US rate cut hopes recede
  - text: >-
      Todd Boehly’s investment house in advanced talks to buy private credit
      firm
  - text: Berkshire Hathaway’s cash pile hits new record as Buffett dumps stocks
  - text: Harvest partnership to bring HK-listed crypto ETFs to Singapore
  - text: Kazakh oligarch Timur Kulibayev sells Mayfair mansion for £35mn
  - text: Deutsche Bank’s DWS inflated client asset inflows by billions of euro
  - text: UBS reports stronger than expected profit in first quarter

fine-tuned-cardiffnlp-twitter-roberta-base-sentiment-finance-dataset

This model is a fine-tuned version of cardiffnlp/twitter-roberta-base-sentiment on a Twitter financial news sentiment dataset (CJCJ3030/twitter-financial-news-sentiment). It achieves the following results on the evaluation set:

Loss: 0.3123
Accuracy: 0.8559
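
For reference, accuracy here is the fraction of evaluation examples whose highest-scoring class matches the gold label. The sketch below shows that calculation in plain Python. It assumes three-class logits, which this card does not state explicitly, and the function name and toy values are illustrative rather than taken from the training notebook.

```python
def accuracy_from_logits(logits, labels):
    """Return the fraction of rows whose argmax equals the gold label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = 0
    for row, gold in zip(logits, labels):
        # argmax over the class scores of one example
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == gold)
    return correct / len(labels)


# Toy data, made up for illustration: 4 examples, 3 classes (assumed).
toy_logits = [
    [2.1, 0.3, -1.0],
    [0.2, 1.7, 0.1],
    [-0.5, 0.4, 0.9],
    [1.2, 1.5, 0.3],
]
toy_labels = [0, 1, 2, 0]
toy_accuracy = accuracy_from_logits(toy_logits, toy_labels)
print(toy_accuracy)
```

Three of the four toy predictions match their labels, so the script prints 0.75.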

The 10 widget examples shown in the Inference API were gathered from https://twitter.com/ftfinancenews in early May 2024.

Colab notebook for fine-tuning: https://colab.research.google.com/drive/1gvpFbazlxg3AdSldH3w6TYjGUByxqCrh?usp=sharing

Training Data

https://huggingface.co/datasets/CJCJ3030/twitter-financial-news-sentiment/viewer/default/train

Evaluation Data

https://huggingface.co/datasets/CJCJ3030/twitter-financial-news-sentiment/viewer/default/validation

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 120
eval_batch_size: 120
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 5
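
As a rough cross-check of these settings, the sketch below derives the total number of optimizer steps and the learning rate at each epoch boundary. It makes several assumptions that the card does not state: a single device, no gradient accumulation, a final partial batch that is kept, and zero warmup steps. The 80 steps per epoch come from the training results table, and the helper names are illustrative.

```python
LEARNING_RATE = 5e-05    # from the hyperparameter list
TRAIN_BATCH_SIZE = 120   # from the hyperparameter list
NUM_EPOCHS = 5           # from the hyperparameter list
STEPS_PER_EPOCH = 80     # from the training results table (step 80 at epoch 1.0)

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def train_size_bounds(steps_per_epoch, batch_size):
    """Range of dataset sizes N with ceil(N / batch_size) == steps_per_epoch."""
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (assumed absent here) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


size_range = train_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
lr_at_epoch_ends = [linear_lr(epoch * STEPS_PER_EPOCH) for epoch in range(NUM_EPOCHS + 1)]

print("total optimizer steps:", TOTAL_STEPS)
print("implied training set size range:", size_range)
print("learning rate at epoch boundaries:", lr_at_epoch_ends)
```

Under those assumptions, training runs for 400 steps on a training split of between 9,481 and 9,600 examples, and the learning rate falls linearly from 5e-05 to 0.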

Training results

Epoch	Step	Validation Loss	Accuracy
1.0	80	0.3123	0.8559
2.0	160	0.3200	0.8576
3.0	240	0.3538	0.8819
4.0	320	0.3695	0.8882
5.0	400	0.4108	0.8869
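
The headline metrics reported earlier (loss 0.3123, accuracy 0.8559) match the epoch 1 row. The short sketch below compares selecting a checkpoint by lowest validation loss with selecting by highest accuracy. The values are copied from the table above; the variable names are illustrative.

```python
# (epoch, step, validation_loss, accuracy), copied from the table above.
RESULTS = [
    (1, 80, 0.3123, 0.8559),
    (2, 160, 0.3200, 0.8576),
    (3, 240, 0.3538, 0.8819),
    (4, 320, 0.3695, 0.8882),
    (5, 400, 0.4108, 0.8869),
]

best_by_loss = min(RESULTS, key=lambda row: row[2])
best_by_accuracy = max(RESULTS, key=lambda row: row[3])

print(f"lowest validation loss: epoch {best_by_loss[0]} "
      f"(loss={best_by_loss[2]:.4f}, accuracy={best_by_loss[3]:.4f})")
print(f"highest accuracy: epoch {best_by_accuracy[0]} "
      f"(loss={best_by_accuracy[2]:.4f}, accuracy={best_by_accuracy[3]:.4f})")
```

Validation loss is lowest at epoch 1 and rises every epoch after that, while accuracy peaks at epoch 4 (0.8882). This suggests, though the card does not say so, that the checkpoint with the lowest validation loss was the one kept.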

Framework versions

Transformers 4.40.2
Pytorch 2.2.1+cu121
Datasets 2.19.1
Tokenizers 0.19.1

Citation

@inproceedings{barbieri-etal-2020-tweeteval,
    title = "{T}weet{E}val: Unified Benchmark and Comparative Evaluation for Tweet Classification",
    author = "Barbieri, Francesco  and
      Camacho-Collados, Jose  and
      Espinosa Anke, Luis  and
      Neves, Leonardo",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.148",
    doi = "10.18653/v1/2020.findings-emnlp.148",
    pages = "1644--1650"
}