metadata

language:
  - en
license: apache-2.0
tags:
  - image-to-text

ViTSTR small v1.0

ViTSTR model pre-trained on various real scene text recognition (STR) datasets, using 224x224 input images split into 16x16 patches.
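
To make the image and patch sizes concrete, the short sketch below works out how many patches a 224x224 input yields with 16x16 patches. The helper name `patch_grid` is hypothetical and is not part of any ViTSTR code. The sketch assumes square, non-overlapping patches, as in a standard vision transformer.

```python
# Sketch: patch-grid arithmetic for a square input image.
# Assumption: patches are square and non-overlapping.
IMAGE_SIZE = 224  # input height and width in pixels (from this card)
PATCH_SIZE = 16   # patch height and width in pixels (from this card)


def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches_per_side, total_patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by patch size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


per_side, total = patch_grid(IMAGE_SIZE, PATCH_SIZE)
print(per_side, total)  # 14 196
```

Under these assumptions, each image is divided into a 14x14 grid, giving 196 patches.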

Disclaimer: this model card was not written by the original author.

Model description

TODO

Intended uses & limitations

You can use the model for STR on images containing Latin characters. Its character set has 94 symbols: 62 case-sensitive alphanumeric characters and 32 punctuation marks.
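
As a rough illustration of that character set, the sketch below builds a 94-symbol alphabet from Python's `string` module and flags characters that fall outside it. It assumes the 32 punctuation marks are the ASCII punctuation characters in `string.punctuation`, which this card does not confirm, and the order of symbols here is not necessarily the order the model uses. The names `CHARSET` and `unsupported_chars` are hypothetical.

```python
import string

# 10 digits + 52 upper- and lower-case letters = 62 alphanumeric characters.
ALPHANUMERIC = string.digits + string.ascii_letters

# Assumption: the 32 punctuation marks are the ASCII punctuation set.
PUNCTUATION = string.punctuation

CHARSET = ALPHANUMERIC + PUNCTUATION


def unsupported_chars(text: str) -> set[str]:
    """Return the characters in `text` that are outside CHARSET."""
    return set(text) - set(CHARSET)


print(len(ALPHANUMERIC), len(PUNCTUATION), len(CHARSET))  # 62 32 94
print(unsupported_chars("Caf\u00e9"))  # {'é'}
```

A check like this can be used to filter ground-truth labels before evaluation. Note that the space character is not in this set, so multi-word labels would be flagged as well.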

How to use

TODO

BibTeX entry and citation info

@InProceedings{atienza2021vision,
  title={Vision transformer for fast and efficient scene text recognition},
  author={Atienza, Rowel},
  booktitle={International Conference on Document Analysis and Recognition},
  pages={319--334},
  year={2021},
  organization={Springer}
}