---
library_name: transformers
tags:
- vit
- cifar10
- image classification
license: apache-2.0
datasets:
- uoft-cs/cifar10
language:
- en
metrics:
- accuracy
- perplexity
pipeline_tag: image-classification
widget:
- src: ./deer_224x224.png
  example_title: deer 224x224 image example
---

## Model Details

### Model Description

A LoRA adapter for the [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224) Vision Transformer (ViT), trained on the CIFAR-10 image classification task.

## Loading guide

```py
from transformers import AutoModelForImageClassification

# CIFAR-10 class names, indexed by class id
labels2title = ['plane', 'car', 'bird', 'cat',
                'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# Load the base ViT with a 10-class classification head
model = AutoModelForImageClassification.from_pretrained(
    'google/vit-base-patch16-224-in21k',
    num_labels=len(labels2title),
    id2label={i: c for i, c in enumerate(labels2title)},
    label2id={c: i for i, c in enumerate(labels2title)}
)
# Attach the adapter weights (requires the `peft` package)
model.load_adapter("yturkunov/cifar10_vit16_lora")
```
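
Once the model is loaded, its output logits still have to be mapped back to class names. The sketch below shows one way to do that with NumPy alone, assuming the logits have already been obtained (for example from `model(pixel_values=...).logits`) and converted to a `(batch, 10)` array. The helper name `decode_logits` and the `fake_logits` values are hypothetical and only illustrate the mapping.

```python
import numpy as np

# Same class order as in the loading guide
labels2title = ['plane', 'car', 'bird', 'cat',
                'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


def decode_logits(logits: np.ndarray) -> list:
    """Return a (label, probability) pair for each row of a (batch, 10) logits array."""
    # Subtract the row maximum before exponentiating for numerical stability
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=-1, keepdims=True)  # softmax over classes
    top = probs.argmax(axis=-1)                    # most likely class id per row
    return [(labels2title[i], float(probs[row, i])) for row, i in enumerate(top)]


# Hypothetical logits standing in for real model output
fake_logits = np.zeros((1, 10), dtype=np.float32)
fake_logits[0, 4] = 5.0  # class id 4 is 'deer'
predictions = decode_logits(fake_logits)
print(predictions)
```

Because `id2label` is passed to `from_pretrained` above, the same mapping should also be available as `model.config.id2label` on the loaded model.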

## Learning curves

![image/png](https://cdn-uploads.huggingface.co/production/uploads/655221be7bd4634260e032ca/Ji1ewA_8T1rJuQkdNCIXQ.png)

### Input recommendations

The model expects an image that has gone through the following preprocessing stages:

* Scaling range: <img src="https://latex.codecogs.com/gif.latex?[0, 255]\rightarrow[0, 1]" />
* Normalization parameters: <img src="https://latex.codecogs.com/gif.latex?\mu=(.5,.5,.5),\sigma=(.5,.5,.5)" />
* Dimensions: 224x224
* Number of channels: 3
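
The preprocessing stages above can be sketched in plain NumPy. This is a minimal illustration, not the author's pipeline: it assumes the image has already been resized to 224x224 and is stored as an `(H, W, 3)` `uint8` array. The function name `preprocess` is hypothetical, and the channels-first output layout is an assumption based on what the PyTorch ViT implementation takes as `pixel_values`.

```python
import numpy as np

# Normalization parameters from the list above
MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)
STD = np.array([0.5, 0.5, 0.5], dtype=np.float32)


def preprocess(image: np.ndarray) -> np.ndarray:
    """Turn a 224x224 RGB uint8 image (HWC) into a normalized float32 CHW array."""
    if image.shape != (224, 224, 3):
        raise ValueError(f"expected shape (224, 224, 3), got {image.shape}")
    scaled = image.astype(np.float32) / 255.0  # [0, 255] -> [0, 1]
    normalized = (scaled - MEAN) / STD         # per-channel normalization -> [-1, 1]
    return normalized.transpose(2, 0, 1)       # HWC -> CHW (assumed model layout)


# Random stand-in for a real image that was resized elsewhere
dummy_image = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
pixel_values = preprocess(dummy_image)
print(pixel_values.shape, pixel_values.dtype)
```

With a mean and standard deviation of 0.5 per channel, the scaled range [0, 1] maps to [-1, 1]. A leading batch dimension still has to be added before the array is passed to the model.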

### Inference on 3x4 random sample

![image/png](https://cdn-uploads.huggingface.co/production/uploads/655221be7bd4634260e032ca/zxj9ID37gJJnkmc8Sl97A.png)