---
library_name: transformers
metrics:
- cer
widget:
- src: "https://i.ibb.co/QXZFSNx/test7.png"
  output:
    text: รมว.ธรรมนัส ลงพื้นที่
language:
- th
pipeline_tag: image-to-text
---

# thai_trocr_thaigov_v2

A Vision Encoder Decoder model for Thai OCR.

- Uses microsoft/trocr-base-handwritten as the encoder.
- Uses airesearch/wangchanberta-base-att-spm-uncased as the decoder.
- Fine-tuned on a dataset of 250k synthetic text images generated using the [ThaiGov V2 Corpus](https://github.com/PyThaiNLP/thaigov-v2-corpus).
- The synthetic text images were generated with [SynthTIGER](https://github.com/clovaai/synthtiger).
- It is a useful starting point for fine-tuning on any Thai OCR task.

# Usage

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Load the processor (image preprocessing + tokenizer) and the fine-tuned model.
processor = TrOCRProcessor.from_pretrained("kkatiz/thai-trocr-thaigov-v2")
model = VisionEncoderDecoderModel.from_pretrained("kkatiz/thai-trocr-thaigov-v2")

# Placeholder: replace with the path to your own text image.
img_path = "path/to/image.png"
image = Image.open(img_path).convert("RGB")

# Preprocess the image, generate token ids, and decode them to text.
pixel_values = processor(image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
```
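
The metadata lists `cer` (character error rate) as the evaluation metric. As a minimal sketch, assuming the common definition of CER as the Levenshtein edit distance between prediction and reference divided by the reference length, it could be computed as below. The helper names `edit_distance` and `cer` are hypothetical and this is not the author's evaluation script.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, prediction: str) -> float:
    """Character error rate relative to the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference must be a non-empty string")
    return edit_distance(reference, prediction) / len(reference)


# "kitten" -> "sitting" needs 3 edits over 6 reference characters.
print(cer("kitten", "sitting"))  # 0.5
```

Python strings are indexed by Unicode code point, so Thai vowel signs and tone marks each count as a separate character in this sketch. Other CER implementations may normalize or segment text differently, so scores are only comparable when the same procedure is used.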