metadata
library_name: transformers
metrics:
  - cer

thai_trocr_thaigov_v2

Vision Encoder Decoder Models

  • Uses microsoft/trocr-base-handwritten as the encoder.
  • Uses airesearch/wangchanberta-base-att-spm-uncased as the decoder.
  • Fine-tuned on a dataset of 250k synthetic text images built from the ThaiGov V2 Corpus.
  • The synthetic text images were generated with SynthTIGER.
  • Useful as a starting point for fine-tuning on any Thai OCR task.

Usage

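from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Load the processor (image preprocessing and tokenizer) and the fine-tuned model.
processor = TrOCRProcessor.from_pretrained("kkatiz/ocr-nithan")
model = VisionEncoderDecoderModel.from_pretrained("kkatiz/ocr-nithan")

# img_path is a placeholder: point it at your own input image.
img_path = "path/to/image.png"
image = Image.open(img_path).convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)

# Decode the generated token IDs back into text.
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)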
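
The metadata lists cer (character error rate) as the metric. The following is a minimal sketch of how CER could be computed on decoded predictions, assuming the usual definition: total character-level edit distance divided by the total number of reference characters. The helper names edit_distance and cer are illustrative and are not part of this model or of transformers. Characters are counted as Unicode code points, so Thai vowel and tone marks each count as one character. The evaluation used for this model may differ.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: list[str], predictions: list[str]) -> float:
    """Corpus-level CER: total edits / total reference characters."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, p) for r, p in zip(references, predictions))
    return total_edits / total_chars
```

For example, cer([ground_truth], [generated_text]) compares a single decoded string with its ground-truth transcription, where ground_truth is a string you supply.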