Tags: Text Classification · Transformers · PyTorch · English · roberta · fill-mask · finance · Inference Endpoints

We collect financial domain terms from Investopedia's Financial Terms Dictionary, NYSSCPA's accounting terminology guide, and Harvey's Hypertextual Finance Glossary to expand RoBERTa's vocabulary.
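The snippet below is a minimal sketch of this vocabulary-expansion step, assuming the standard Hugging Face `transformers` API; the example terms are illustrative placeholders, not the actual glossary contents.

```python
from transformers import RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Hypothetical sample of collected glossary terms.
financial_terms = ["amortization", "EBITDA", "collateralized"]

# add_tokens only registers terms the tokenizer doesn't already
# cover as single tokens; it returns how many were actually added.
num_added = tokenizer.add_tokens(financial_terms)

# Newly added tokens need embedding rows, so the model's embedding
# matrix must be resized to the new vocabulary size.
model.resize_token_embeddings(len(tokenizer))
```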

Building on this vocabulary-expanded RoBERTa, we continually pretrained the model on multiple financial corpora.

In the continual pretraining step, we apply the following experimental settings to achieve better fine-tuned results on four financial datasets (see the training sketch after this list):

  1. Masking Probability: 0.4 (instead of the default 0.15)
  2. Warmup Steps: 0 (yields better results than training with warmup)
  3. Epochs: 1 (a single epoch is enough and guards against overfitting)
  4. Weight Decay: 0.01
  5. Train Batch Size: 64
  6. FP16 mixed-precision training
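
The following is a minimal sketch of how these settings map onto the Hugging Face `Trainer` API, reusing `tokenizer` and `model` from the vocabulary-expansion sketch above; the two-sentence corpus is a hypothetical placeholder, not the authors' actual data pipeline.

```python
from transformers import (
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Tiny illustrative corpus standing in for the financial pretraining data.
texts = [
    "The bond's yield to maturity rose after the rate hike.",
    "Quarterly EBITDA beat analyst expectations.",
]
encodings = tokenizer(texts, truncation=True, padding=True)
train_dataset = [
    {"input_ids": ids, "attention_mask": mask}
    for ids, mask in zip(encodings["input_ids"], encodings["attention_mask"])
]

# Setting 1: raise the MLM masking probability from 0.15 to 0.4.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.4,
)

training_args = TrainingArguments(
    output_dir="fin-roberta",
    warmup_steps=0,                  # setting 2: no warmup
    num_train_epochs=1,              # setting 3: single epoch
    weight_decay=0.01,               # setting 4
    per_device_train_batch_size=64,  # setting 5
    fp16=True,                       # setting 6: mixed precision
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)
trainer.train()
```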

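As a quick usage example, the released checkpoint (SUFEHeisenberg/Fin-RoBERTa) can be queried through the fill-mask pipeline; the input sentence is illustrative.

```python
from transformers import pipeline

# Load the pretrained checkpoint as a masked-language-modeling pipeline.
fill_mask = pipeline("fill-mask", model="SUFEHeisenberg/Fin-RoBERTa")

# RoBERTa-style models use <mask> as the mask token.
print(fill_mask("The company reported strong quarterly <mask>."))
```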