---
library_name: transformers
tags: []
---

# Model Card for Model ID

A fine-tune of Google's ViT-384 model for multi-label image classification on tongue images.

## Model Details

### Model Description

The model predicts the presence or absence of three features: Cracks, Red Dots, and Toothmarks. Because the task is multi-label, each feature is predicted independently, so an image may show any combination of the three.

- **Model type:** Vision Transformer
- **Finetuned from model [optional]:** https://huggingface.co/google/vit-base-patch16-384
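
The following is a minimal sketch of how the three predictions might be read out, not the author's documented usage. It assumes the fine-tune emits one logit per feature and was trained with a sigmoid-based multi-label objective, which this card does not state. The label order in `ASSUMED_LABELS`, the 0.5 threshold, and the helper names `decode_multilabel` and `predict` are illustrative assumptions, and `model_id` is a placeholder for this model's repository name.

```python
import math

# Assumed label order; the card does not specify it.
# Prefer model.config.id2label when it is available (see predict below).
ASSUMED_LABELS = ["Cracks", "Red Dots", "Toothmarks"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, labels=ASSUMED_LABELS, threshold=0.5):
    """Turn one logit per label into independent presence/absence decisions."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    result = {}
    for label, logit in zip(labels, logits):
        p = sigmoid(logit)
        # Each label is thresholded on its own; no softmax across labels.
        result[label] = {"probability": p, "present": p >= threshold}
    return result


def predict(image_path, model_id):
    """Hypothetical end-to-end helper; model_id is a placeholder repo name."""
    # Heavy imports are kept local so decode_multilabel works without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForImageClassification.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Use the label names stored in the model config rather than the assumed order.
    labels = [model.config.id2label[i] for i in range(len(logits))]
    return decode_multilabel(logits, labels)
```

Calling `decode_multilabel([2.0, -2.0, 0.0])` marks Cracks as present and Red Dots as absent under the assumed label order. The threshold is a tunable choice, and a per-feature threshold selected on validation data may suit an imbalanced dataset better than a shared 0.5.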