---
library_name: transformers
tags: []
---

# Model Card for Model ID

A fine-tune of Google's ViT-Base model (patch size 16, 384×384 input resolution) for multi-label image classification of tongue images.


## Model Details

### Model Description

The model predicts the presence or absence of three features: Cracks, Red Dots, and Toothmarks.

- **Model type:** Vision Transformer
- **Finetuned from model:** https://huggingface.co/google/vit-base-patch16-384
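
Because the task is multi-label, each feature is decided independently rather than by picking a single top class. The sketch below illustrates one common way to turn per-label logits into presence/absence decisions: apply a sigmoid to each logit and compare it with a threshold. It is not taken from this model's training code. The label order in `LABELS`, the 0.5 threshold, and the helper name `decode_multilabel` are assumptions for illustration; the actual label order should be read from the checkpoint's configuration.

```python
import math

# Assumed label order (hypothetical); check the checkpoint's config for the real one.
LABELS = ["Cracks", "Red Dots", "Toothmarks"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, threshold: float = 0.5) -> dict:
    """Map one raw logit per label to an independent present/absent decision.

    A sigmoid is applied per label (not a softmax across labels), so any
    number of features can be flagged as present at the same time.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    return {label: sigmoid(logit) >= threshold for label, logit in zip(LABELS, logits)}


# Made-up logits for a single image, purely for illustration.
example = decode_multilabel([2.0, -1.0, 0.3])
print(example)
```

With the `transformers` library, the same logic would typically be applied to the `logits` field of the model output for each image. The threshold can be tuned per label if false positives and false negatives carry different costs.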