finetuned-vit-base-patch16-224-upside-down-detector

This model is a fine-tuned version of vit-base-patch16-224-in21k on a custom image-orientation dataset adapted from the beans dataset. It achieves the following result on the evaluation set (a short usage sketch follows):

  • Accuracy: 0.8947
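
As a minimal usage sketch, the checkpoint can be loaded through the transformers image-classification pipeline. The repository id below is inferred from the model name and the label names are assumptions; adjust both to the actual hub entry:

```python
from transformers import pipeline

# Repository id inferred from the model name above; replace with the actual hub path.
classifier = pipeline(
    "image-classification",
    model="finetuned-vit-base-patch16-224-upside-down-detector",
)

# Returns a list of label/score dicts, e.g.
# [{"label": "upside_down", "score": 0.97}, {"label": "original", "score": 0.03}]
# (label names are assumptions; check the model's config for the real ones).
print(classifier("path/to/image.jpg"))
```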

Training and evaluation data

The custom image-orientation dataset, adapted from the beans dataset, contains a total of 2,590 image samples: 1,295 in their original orientation and 1,295 flipped upside down. The model was fine-tuned on the train subset and evaluated on the validation and test subsets. The dataset splits are listed below, followed by a sketch of one way such a dataset can be built:

Split        # examples
train        2,068
validation   133
test         128
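
A minimal sketch of how such a dataset could be constructed, assuming each beans image is paired with a 180-degree rotation of itself (the actual construction script is not part of this card, and the label ids are assumptions):

```python
from datasets import load_dataset

# The beans dataset is assumed to be the source of the orientation data.
beans = load_dataset("beans", split="train")

def add_flipped_copies(batch):
    # For every image, keep the original (label 0, "original") and add a
    # 180-degree rotation (label 1, "upside_down"). Label ids are assumptions.
    images, labels = [], []
    for img in batch["image"]:
        images.append(img)
        images.append(img.rotate(180))
        labels.extend([0, 1])
    return {"image": images, "label": labels}

orientation_train = beans.map(
    add_flipped_copies,
    batched=True,
    remove_columns=beans.column_names,
)
```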

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the TrainingArguments sketch after this list):

  • learning_rate: 2e-04
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 32
  • num_epochs: 5
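
A minimal sketch of these settings as transformers.TrainingArguments; the output directory is a placeholder, and everything else mirrors the list above (the Adam betas and epsilon listed there are the Trainer defaults, so they need not be set explicitly):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finetuned-vit-base-patch16-224-upside-down-detector",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=32,
    num_train_epochs=5,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
)
```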

Training results

Epoch  Accuracy
0      0.8609
1      0.8835
2      0.8571
3      0.8941
4      0.8941

Framework versions

  • Transformers 4.17.0
  • Pytorch 1.9.0+cu111
  • Pytorch/XLA 1.9
  • Datasets 2.0.0
  • Tokenizers 0.12.0