metadata
license: apache-2.0
base_model: google/vit-base-patch16-224-in21k
tags:
  - generated_from_trainer
datasets:
  - FastJobs/Visual_Emotional_Analysis
metrics:
  - accuracy
  - precision
  - f1
model-index:
  - name: emotion_classification
    results:
      - task:
          name: Image Classification
          type: image-classification
        dataset:
          name: FastJobs/Visual_Emotional_Analysis
          type: FastJobs/Visual_Emotional_Analysis
          config: FastJobs--Visual_Emotional_Analysis
          split: train
          args: FastJobs--Visual_Emotional_Analysis
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.675
          - name: Precision
            type: precision
            value: 0.6854354001733034
          - name: F1
            type: f1
            value: 0.6750572520063745

Emotion Classification

This model is a fine-tuned version of google/vit-base-patch16-224-in21k on the FastJobs/Visual_Emotional_Analysis dataset.

In theory, random guessing on this dataset would reach an accuracy of 0.1429 (that is, 1/7, which assumes seven balanced classes).
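The baseline above is simple arithmetic, and a short sketch makes it easy to check against the training log. The number of classes is not stated in this card, so both values of `k` below are assumptions: 7 reproduces the quoted 0.1429, and 8 is included only because the first logged training loss (2.0804) sits close to ln 8, the cross-entropy of a uniform prediction over eight classes. It may be worth confirming the label count on the dataset page.

```python
import math


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


def uniform_cross_entropy(num_classes: int) -> float:
    """Cross-entropy (natural log) of a uniform prediction over num_classes."""
    return math.log(num_classes)


# Both class counts are assumptions, not figures taken from the dataset card.
for k in (7, 8):
    print(k, round(chance_accuracy(k), 4), round(uniform_cross_entropy(k), 4))
```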

It achieves the following results on the evaluation set (these figures match the epoch 42 row of the training results table):

  • Loss: 1.0683
  • Accuracy: 0.675
  • Precision: 0.6854
  • F1: 0.6751

Model description

The base model is the base-size Vision Transformer (ViT) with 16x16 patches and a 224x224 input resolution, pre-trained on ImageNet-21k and released by Google. Further details can be found in their repository.

Training and evaluation data

Data Split

The data was split into training and development sets at a 4:1 ratio with a random seed of 42. A seed of 42 was also used for batching the data; the two seeds are set independently and just happen to share a value.
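A minimal sketch of a seeded 4:1 split, using only the standard library. The function name `split_indices` and the dataset size of 800 are placeholders for illustration, not details stated in this card. With the `datasets` library, `Dataset.train_test_split(test_size=0.2, seed=42)` would be the analogous call, though the card does not say which method was actually used.

```python
import random


def split_indices(num_examples: int, dev_fraction: float = 0.2, seed: int = 42):
    """Shuffle indices with a fixed seed and split them into train and dev lists."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)  # same seed gives the same split
    num_dev = round(num_examples * dev_fraction)
    return indices[num_dev:], indices[:num_dev]


# 800 is a placeholder dataset size, not a figure stated in this card.
train_idx, dev_idx = split_indices(800)
print(len(train_idx), len(dev_idx))
```

Fixing the seed is what makes the development set reproducible across runs, so metrics from different experiments stay comparable.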

Pre-processing and Augmentation

The main pre-processing phase for both training and evaluation includes:

  • Resizing each image to (224, 224, 3) with bilinear interpolation, since the original model was pre-trained on ImageNet images at that resolution
  • Normalizing each image with a per-channel mean and standard deviation of [0.5, 0.5, 0.5], just like the original model
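The normalization step can be sketched per channel value as follows. This assumes pixel values are first rescaled from 0..255 to 0..1 before the mean and standard deviation are applied, which is an assumption about the pipeline rather than something the card states; the resize step is not shown, and `normalize_pixel` is a hypothetical helper name.

```python
def normalize_pixel(value: int, mean: float = 0.5, std: float = 0.5) -> float:
    """Rescale an 8-bit channel value to [0, 1], then standardize it."""
    return (value / 255.0 - mean) / std


# With mean = std = 0.5 the 0..255 range is mapped onto [-1, 1].
print(normalize_pixel(0), normalize_pixel(255))
```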

In addition to the pre-processing above, the training set was augmented using:

  • Random horizontal and vertical flips
  • Color jitter
  • Random resized crop
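As an illustration of the first augmentation only, here is a toy random flip on a row-major 2D image stored as nested lists. The flip probability of 0.5 and the helper name `random_flip` are assumptions; color jitter and random resized crop are not shown. In torchvision, the corresponding transforms are `RandomHorizontalFlip`, `RandomVerticalFlip`, `ColorJitter`, and `RandomResizedCrop`, though the card does not state which library or parameters were used.

```python
import random


def random_flip(image, rng, p=0.5):
    """Flip a 2D image horizontally and/or vertically, each with probability p."""
    if rng.random() < p:
        image = [row[::-1] for row in image]  # horizontal flip: mirror columns
    if rng.random() < p:
        image = image[::-1]  # vertical flip: mirror rows
    return image


rng = random.Random(42)
print(random_flip([[1, 2], [3, 4]], rng))
```

Augmentations like this are applied to the training set only, so the development set is always evaluated on unmodified images.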

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_steps: 150
  • num_epochs: 300
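To show what these scheduler settings imply, here is a standard-library sketch of linear warmup followed by cosine decay with hard restarts. It is intended to follow the shape of the `cosine_with_restarts` schedule in Transformers, not to reproduce the library's code. Two values are assumptions: `total_steps=3000` comes from 300 epochs at the 10 steps per epoch seen in the training log, and `num_cycles=1` is a guess, since the card does not list it.

```python
import math


def lr_at_step(step, base_lr=5e-5, warmup_steps=150, total_steps=3000, num_cycles=1):
    """Learning rate at a given optimizer step (warmup, then cosine with restarts)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that jumps back to base_lr at the start of each cycle.
    cycle_position = (num_cycles * progress) % 1.0
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


for step in (0, 75, 150, 670, 3000):
    print(step, lr_at_step(step))
```

With a single cycle the schedule reduces to plain cosine decay, and at step 670, where the training log ends, the learning rate is still close to its peak.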

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | F1 |
|---|---|---|---|---|---|---|
| 2.0804 | 1.0 | 10 | 2.0881 | 0.1437 | 0.2313 | 0.1165 |
| 2.0839 | 2.0 | 20 | 2.0846 | 0.1562 | 0.1772 | 0.1250 |
| 2.072 | 3.0 | 30 | 2.0786 | 0.1562 | 0.1835 | 0.1251 |
| 2.0676 | 4.0 | 40 | 2.0702 | 0.1562 | 0.2213 | 0.1265 |
| 2.053 | 5.0 | 50 | 2.0586 | 0.1625 | 0.2289 | 0.1330 |
| 2.0346 | 6.0 | 60 | 2.0390 | 0.1938 | 0.3508 | 0.1830 |
| 2.0072 | 7.0 | 70 | 2.0080 | 0.2437 | 0.3131 | 0.2285 |
| 1.9672 | 8.0 | 80 | 1.9506 | 0.325 | 0.3516 | 0.3209 |
| 1.8907 | 9.0 | 90 | 1.8587 | 0.3438 | 0.4010 | 0.3361 |
| 1.7841 | 10.0 | 100 | 1.7300 | 0.3937 | 0.4617 | 0.3860 |
| 1.6688 | 11.0 | 110 | 1.6084 | 0.4625 | 0.4958 | 0.4402 |
| 1.5803 | 12.0 | 120 | 1.5305 | 0.4875 | 0.5327 | 0.4661 |
| 1.5069 | 13.0 | 130 | 1.4577 | 0.5437 | 0.5171 | 0.5126 |
| 1.4353 | 14.0 | 140 | 1.3955 | 0.55 | 0.6004 | 0.5380 |
| 1.3913 | 15.0 | 150 | 1.3353 | 0.5437 | 0.6508 | 0.4995 |
| 1.3551 | 16.0 | 160 | 1.2874 | 0.5563 | 0.5251 | 0.5201 |
| 1.2889 | 17.0 | 170 | 1.2618 | 0.5687 | 0.5829 | 0.5475 |
| 1.2387 | 18.0 | 180 | 1.2455 | 0.5687 | 0.5723 | 0.5587 |
| 1.1977 | 19.0 | 190 | 1.2210 | 0.5875 | 0.6221 | 0.5858 |
| 1.1447 | 20.0 | 200 | 1.1909 | 0.6 | 0.6153 | 0.5840 |
| 1.0959 | 21.0 | 210 | 1.1918 | 0.5813 | 0.5896 | 0.5609 |
| 1.0657 | 22.0 | 220 | 1.1343 | 0.625 | 0.6352 | 0.6184 |
| 0.9869 | 23.0 | 230 | 1.1309 | 0.625 | 0.6549 | 0.6258 |
| 0.9576 | 24.0 | 240 | 1.1071 | 0.6312 | 0.6373 | 0.6280 |
| 0.9234 | 25.0 | 250 | 1.1407 | 0.6312 | 0.6469 | 0.6279 |
| 0.876 | 26.0 | 260 | 1.2006 | 0.5625 | 0.6040 | 0.5514 |
| 0.8969 | 27.0 | 270 | 1.1007 | 0.6125 | 0.6290 | 0.6121 |
| 0.8066 | 28.0 | 280 | 1.1208 | 0.6 | 0.6650 | 0.5971 |
| 0.7579 | 29.0 | 290 | 1.1328 | 0.6125 | 0.6625 | 0.6035 |
| 0.7581 | 30.0 | 300 | 1.1039 | 0.6125 | 0.6401 | 0.6121 |
| 0.7164 | 31.0 | 310 | 1.0862 | 0.65 | 0.6723 | 0.6494 |
| 0.7075 | 32.0 | 320 | 1.0575 | 0.65 | 0.6683 | 0.6485 |
| 0.6655 | 33.0 | 330 | 1.1186 | 0.6125 | 0.6483 | 0.6134 |
| 0.5947 | 34.0 | 340 | 1.1133 | 0.625 | 0.6439 | 0.6272 |
| 0.5813 | 35.0 | 350 | 1.1071 | 0.6312 | 0.6735 | 0.6337 |
| 0.6322 | 36.0 | 360 | 1.0839 | 0.6312 | 0.6591 | 0.6324 |
| 0.561 | 37.0 | 370 | 1.1040 | 0.625 | 0.6425 | 0.6220 |
| 0.558 | 38.0 | 380 | 1.0727 | 0.6125 | 0.6255 | 0.6112 |
| 0.5372 | 39.0 | 390 | 1.1417 | 0.6312 | 0.6545 | 0.6292 |
| 0.5146 | 40.0 | 400 | 1.0967 | 0.6312 | 0.6645 | 0.6285 |
| 0.4968 | 41.0 | 410 | 1.1187 | 0.6312 | 0.6543 | 0.6316 |
| 0.4593 | 42.0 | 420 | 1.0683 | 0.675 | 0.6854 | 0.6751 |
| 0.4392 | 43.0 | 430 | 1.0937 | 0.6375 | 0.6481 | 0.6374 |
| 0.4503 | 44.0 | 440 | 1.1320 | 0.625 | 0.6536 | 0.6255 |
| 0.3918 | 45.0 | 450 | 1.1218 | 0.6312 | 0.6464 | 0.6312 |
| 0.4236 | 46.0 | 460 | 1.2074 | 0.5938 | 0.6188 | 0.5911 |
| 0.3858 | 47.0 | 470 | 1.1769 | 0.5813 | 0.6106 | 0.5809 |
| 0.392 | 48.0 | 480 | 1.1572 | 0.625 | 0.6381 | 0.6216 |
| 0.3708 | 49.0 | 490 | 1.2293 | 0.6 | 0.6388 | 0.5953 |
| 0.3346 | 50.0 | 500 | 1.2205 | 0.5938 | 0.6188 | 0.5943 |
| 0.3831 | 51.0 | 510 | 1.2875 | 0.5875 | 0.5982 | 0.5845 |
| 0.4161 | 52.0 | 520 | 1.2355 | 0.5938 | 0.6421 | 0.5799 |
| 0.3736 | 53.0 | 530 | 1.2361 | 0.6062 | 0.6301 | 0.6006 |
| 0.3278 | 54.0 | 540 | 1.1670 | 0.6312 | 0.6520 | 0.6286 |
| 0.3295 | 55.0 | 550 | 1.1807 | 0.6438 | 0.6712 | 0.6457 |
| 0.3357 | 56.0 | 560 | 1.2007 | 0.625 | 0.6279 | 0.6239 |
| 0.3169 | 57.0 | 570 | 1.2314 | 0.5938 | 0.6257 | 0.5942 |
| 0.3193 | 58.0 | 580 | 1.2068 | 0.6188 | 0.6397 | 0.6208 |
| 0.3128 | 59.0 | 590 | 1.2753 | 0.5875 | 0.5919 | 0.5760 |
| 0.3077 | 60.0 | 600 | 1.2154 | 0.625 | 0.6432 | 0.6238 |
| 0.2751 | 61.0 | 610 | 1.2596 | 0.6125 | 0.6216 | 0.6099 |
| 0.2921 | 62.0 | 620 | 1.2716 | 0.6188 | 0.6467 | 0.6189 |
| 0.2939 | 63.0 | 630 | 1.2213 | 0.625 | 0.6350 | 0.6264 |
| 0.2732 | 64.0 | 640 | 1.3456 | 0.5938 | 0.6189 | 0.5897 |
| 0.2806 | 65.0 | 650 | 1.2491 | 0.6188 | 0.6393 | 0.6162 |
| 0.2453 | 66.0 | 660 | 1.2312 | 0.6188 | 0.6465 | 0.6195 |
| 0.3077 | 67.0 | 670 | 1.2356 | 0.6375 | 0.6564 | 0.6373 |
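
The reported evaluation figures correspond to one row of this log, and a small sketch shows how a best checkpoint could be picked from it. The rows below are a hand-copied subset of the table, and the card does not state which selection criterion was actually used; note that the two common criteria disagree here.

```python
# (epoch, validation_loss, accuracy), copied from a subset of the training log
rows = [
    (32, 1.0575, 0.65),
    (38, 1.0727, 0.6125),
    (42, 1.0683, 0.675),
    (55, 1.1807, 0.6438),
    (67, 1.2356, 0.6375),
]

best_by_accuracy = max(rows, key=lambda r: r[2])  # highest accuracy
best_by_loss = min(rows, key=lambda r: r[1])      # lowest validation loss

print("best accuracy at epoch", best_by_accuracy[0])
print("lowest validation loss at epoch", best_by_loss[0])
```

Selecting by accuracy gives epoch 42, which matches the headline results, while selecting by validation loss would give epoch 32.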

Framework versions

  • Transformers 4.33.0
  • PyTorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.13.3