metadata
license: mit
tags:
- generated_from_trainer
model-index:
- name: deberta_finetune
  results: []
deberta_finetune
This model is a fine-tuned version of microsoft/deberta-v3-base on an unknown dataset. It achieves the following results on the evaluation set:
- eval_loss: 0.3943
- eval_accuracy: 0.8673
- eval_runtime: 164.2323
- eval_samples_per_second: 29.178
- eval_steps_per_second: 1.827
- epoch: 2.0
- step: 4164
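The throughput figures above can be cross-checked against each other. The sketch below is a sanity check, not part of the original training script. It multiplies each rate by the runtime to estimate the evaluation set size and the number of evaluation batches. It assumes a single device, so that the listed eval_batch_size of 16 is the total batch size. The variable names are illustrative.

```python
import math

# Values copied from the evaluation results listed above.
eval_runtime_s = 164.2323
eval_samples_per_second = 29.178
eval_steps_per_second = 1.827
eval_batch_size = 16  # taken from the training hyperparameters in this card

# Rate x runtime gives approximate totals (the card reports rounded rates).
approx_eval_samples = round(eval_samples_per_second * eval_runtime_s)
approx_eval_steps = round(eval_steps_per_second * eval_runtime_s)

# Cross-check: number of batches needed for that many samples.
# Assumption: one device, so per-device batch size equals total batch size.
expected_steps = math.ceil(approx_eval_samples / eval_batch_size)

print("approx. eval samples:", approx_eval_samples)
print("approx. eval steps:  ", approx_eval_steps)
print("steps implied by batch size:", expected_steps)
```

Under these assumptions the two step counts agree, which suggests an evaluation set of roughly 4,800 examples. The card itself does not state the dataset size.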
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
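To illustrate what the linear scheduler implies for the reported checkpoint, the following sketch computes the learning rate at a given step. It is a plain-Python approximation of a linear schedule, not the training code itself. Two assumptions are made that the card does not state: there are no warmup steps, and the total step count is derived from the evaluation results (step 4164 at epoch 2.0, so 2082 steps per epoch). The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr, total_steps, warmup_steps=0):
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


base_lr = 2e-05   # learning_rate from the list above
num_epochs = 3    # num_epochs from the list above

# Assumption: step 4164 corresponds to epoch 2.0, as in the evaluation results.
steps_per_epoch = 4164 // 2
total_steps = steps_per_epoch * num_epochs

# Learning rate at the evaluated checkpoint, assuming zero warmup.
lr_at_eval = linear_lr(4164, base_lr, total_steps)

print("total steps:", total_steps)
print("learning rate at step 4164:", lr_at_eval)
```

With zero warmup, the rate at the end of epoch 2 is one third of the initial value, since one third of the planned steps remain. A nonzero warmup would shift this slightly.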
Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2
Model Recycling
Evaluation on 36 datasets using nc33/deberta_finetune as a base model yields an average score of 79.51, compared with 79.04 for microsoft/deberta-v3-base.
The model is ranked 3rd among all tested models for the microsoft/deberta-v3-base architecture as of 06/02/2023.
Results:
20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
86.1922 | 90.3667 | 67.48 | 58.5625 | 84.3425 | 73.2143 | 86.5772 | 68 | 79.6667 | 91.5717 | 88.6 | 94.472 | 72.2295 | 89.6359 | 90.1961 | 63.5314 | 87.5 | 93.5567 | 91.672 | 90.2439 | 83.0325 | 95.1835 | 58.371 | 90.4054 | 97.2 | 90.8 | 47.122 | 85.0809 | 59.3939 | 79.0816 | 83.7209 | 70.197 | 70.6897 | 67.6056 | 64.4231 | 72.3333 |
For more information, see: Model Recycling
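The reported average can be reproduced from the per-dataset scores in the results table. The sketch below copies the 36 values from the table and takes their unweighted mean. That the average is unweighted is an assumption, though it matches the reported figure. The dictionary name `scores` is illustrative.

```python
# Per-dataset scores copied from the results table in this card.
scores = {
    "20_newsgroup": 86.1922, "ag_news": 90.3667, "amazon_reviews_multi": 67.48,
    "anli": 58.5625, "boolq": 84.3425, "cb": 73.2143, "cola": 86.5772,
    "copa": 68, "dbpedia": 79.6667, "esnli": 91.5717,
    "financial_phrasebank": 88.6, "imdb": 94.472, "isear": 72.2295,
    "mnli": 89.6359, "mrpc": 90.1961, "multirc": 63.5314,
    "poem_sentiment": 87.5, "qnli": 93.5567, "qqp": 91.672,
    "rotten_tomatoes": 90.2439, "rte": 83.0325, "sst2": 95.1835,
    "sst_5bins": 58.371, "stsb": 90.4054, "trec_coarse": 97.2,
    "trec_fine": 90.8, "tweet_ev_emoji": 47.122, "tweet_ev_emotion": 85.0809,
    "tweet_ev_hate": 59.3939, "tweet_ev_irony": 79.0816,
    "tweet_ev_offensive": 83.7209, "tweet_ev_sentiment": 70.197,
    "wic": 70.6897, "wnli": 67.6056, "wsc": 64.4231, "yahoo_answers": 72.3333,
}

# Unweighted mean across all datasets (assumed to be how the card averages).
average = sum(scores.values()) / len(scores)

# Weakest and strongest datasets, for a quick look at the spread.
weakest = min(scores, key=scores.get)
strongest = max(scores, key=scores.get)

print(f"{len(scores)} datasets, average {average:.2f}")
print(f"lowest: {weakest} ({scores[weakest]}), highest: {strongest} ({scores[strongest]})")
```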