metadata
library_name: transformers
metrics:
  - cer
widget:
  - src: https://i.ibb.co/QXZFSNx/test7.png
    output:
      text: รมว.ธรรมนัส ลงพื้นที่
language:
  - th
pipeline_tag: image-to-text

thai_trocr_thaigov_v2

Vision Encoder Decoder Models

  • Uses microsoft/trocr-base-handwritten as the encoder.
  • Uses airesearch/wangchanberta-base-att-spm-uncased as the decoder.
  • Fine-tuned on a dataset of 250k synthetic text images built from the ThaiGov V2 Corpus.
  • The synthetic text images were generated with SynthTIGER.
  • Useful as a starting point for fine-tuning on any Thai OCR task.

Usage

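from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Load the image processor/tokenizer and the fine-tuned encoder-decoder model.
processor = TrOCRProcessor.from_pretrained("kkatiz/thai-trocr-thaigov-v2")
model = VisionEncoderDecoderModel.from_pretrained("kkatiz/thai-trocr-thaigov-v2")

# Open the input image and convert it to RGB, as the processor expects 3 channels.
image = Image.open("... your image path").convert("RGB")
# Resize and normalize the image into a batch of pixel tensors.
pixel_values = processor(image, return_tensors="pt").pixel_values
# Generate output token IDs from the image.
generated_ids = model.generate(pixel_values)

# Decode the token IDs into text and take the first (only) item in the batch.
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)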
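
The metadata lists CER (character error rate) as the metric for this model. To compare a prediction against a ground-truth transcription, a minimal sketch of CER in plain Python is shown below. The helper names `levenshtein` and `cer` are hypothetical and not part of this repository or of transformers. The sketch counts Unicode code points, so each Thai vowel or tone mark counts as its own character; this is an assumption and may differ from how the reported scores were computed.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]  # distance from i reference characters to an empty hypothesis
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(
                min(
                    prev[j] + 1,                      # delete ref_char
                    curr[j - 1] + 1,                  # insert hyp_char
                    prev[j - 1] + substitution_cost,  # match or substitute
                )
            )
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

For example, `cer(ground_truth, generated_text)` returns 0.0 when the prediction matches the ground truth exactly, and it can exceed 1.0 when the prediction is much longer than the reference.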