
Fine-Tuned Vision Transformer (ViT) on Traffic Sign Recognition

This is a Vision Transformer (ViT) model pre-trained on ImageNet-21k (14 million images, 21,843 classes) at a resolution of 224x224 and then fine-tuned on the German Traffic Sign Recognition Benchmark (GTSRB) dataset. The base model was introduced in the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Dosovitskiy et al. and first released in this repository.
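
As a rough illustration of what the 224x224 resolution implies for the transformer, the arithmetic below counts the patches the encoder sees. It assumes a patch size of 16x16 (taken from "patch16" in the checkpoint name) and RGB input; the variable names are only illustrative.

```python
# Assumed values: 224x224 input (stated in this card) and 16x16 patches
# (inferred from "patch16" in the checkpoint name), with 3 colour channels.
image_size = 224
patch_size = 16
num_channels = 3

patches_per_side = image_size // patch_size                   # patches along one edge
num_patches = patches_per_side ** 2                           # patches per image
sequence_length = num_patches + 1                             # plus one [CLS] token
values_per_patch = patch_size * patch_size * num_channels     # flattened patch length

print(patches_per_side, num_patches, sequence_length, values_per_patch)
# prints: 14 196 197 768
```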

Model description

  • Model Architecture: Vision Transformer (ViT) - google/vit-base-patch16-224-in21k.
  • Fine-tuning Objective: Classify traffic signs into 43 different categories, including various speed limits, warning signs, and prohibitory or regulatory signs.
  • Developer: Aleksandra Cvetanovska

Example Use

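from PIL import Image
import requests
import torch
from transformers import ViTForImageClassification, ViTImageProcessor

# Download an example image and make sure it has three colour channels
url = 'https://images.unsplash.com/photo-1572670014853-1d3a3f22b40f?q=80&w=2942&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D'
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Load the fine-tuned model and its matching image processor
model_name = "cvetanovskaa/vit-base-patch16-224-in21k-gtsrb-tuned"
model = ViTForImageClassification.from_pretrained(model_name)
processor = ViTImageProcessor.from_pretrained(model_name)

# Resize and normalize the image into a PyTorch tensor batch
inputs = processor(images=image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# The classification head returns one logit per class; take the highest-scoring one
predicted_class_idx = outputs.logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])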
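
An image classification model of this kind produces one raw score (logit) per class. The following is a minimal sketch, not part of the original card, of turning such scores into ranked probabilities with a softmax. The helper name `top_k_predictions`, the example logits, and the placeholder label names are all hypothetical.

```python
import math

def top_k_predictions(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs from raw logits."""
    # Subtract the largest logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]

# Illustrative values only: made-up logits and placeholder label names.
example_logits = [2.0, 0.5, -1.0]
example_labels = {0: "class_0", 1: "class_1", 2: "class_2"}
for label, prob in top_k_predictions(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the real model, one would presumably pass `outputs.logits[0].tolist()` and `model.config.id2label`; whether this checkpoint's label mapping contains human-readable sign names is not stated in the card.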

Limitations and Bias

  • The model is trained exclusively on data from German traffic signs, which may not generalize well to signs in other regions due to differences in design and context.
  • Performance may vary under different lighting conditions or when signs are partially occluded.
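
One rough way to probe the lighting and occlusion limitations above is to perturb test images and compare predictions before and after. The sketch below is an assumption-laden illustration, not the author's evaluation method: it represents an image as nested lists of RGB tuples so that it needs no extra libraries, and the helper names `scale_brightness` and `occlude` are hypothetical. A real pipeline would more likely apply the same idea to PIL images or tensors.

```python
def scale_brightness(image, factor):
    """Return a copy with every channel multiplied by `factor`, clamped to 0-255."""
    return [
        [tuple(min(255, max(0, round(c * factor))) for c in pixel) for pixel in row]
        for row in image
    ]

def occlude(image, top, left, height, width, fill=(0, 0, 0)):
    """Return a copy with a filled rectangle covering part of the image."""
    out = [list(row) for row in image]  # copy rows so the input is not modified
    for y in range(top, min(top + height, len(out))):
        for x in range(left, min(left + width, len(out[y]))):
            out[y][x] = fill
    return out

# Illustrative 4x4 image with a single made-up colour.
image = [[(100, 150, 200)] * 4 for _ in range(4)]
darker = scale_brightness(image, 0.5)      # simulates a dimmer scene
covered = occlude(image, 0, 0, 2, 2)       # hides the top-left quarter
print(darker[0][0], covered[0][0], covered[3][3])
```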

Intended uses & limitations

You can use the fine-tuned model to classify images of traffic signs into the 43 categories it was trained on.

Model size: 85.8M params