Fine-Tuning Script for the Surya OCR Layout Model
This repository contains the notebook layout-fine-tune.ipynb. Use it to fine-tune the Surya layout model, which is built on a modified SegFormer architecture.
Setup Instructions
Clone the Surya OCR GitHub Repository
git clone https://github.com/vikp/surya.git
cd surya
Switch to v0.4.14
git checkout f7c6c04
Install Dependencies
You can install the required dependencies using the following command:
pip install -r requirements.txt
Image Pre-processing
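For image pre-processing, we can import the preparation function and the image processor directly from the Surya OCR GitHub repository.
import torch
from PIL import Image
from surya.input.processing import prepare_image_detection
from surya.model.detection.segformer import load_processor

# Load the image processor once and reuse it for every image.
processor = load_processor()
image = Image.open("path/to/image")
images = [prepare_image_detection(img=image, processor=processor)]
# Stack the prepared images into one batch and match the model's dtype and device.
# `model` must already be loaded (see the Loading Model section below).
images = torch.stack(images, dim=0).to(model.dtype).to(model.device)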
Loading Model
from surya.model.detection.segformer import load_model
model = load_model("vikp/surya_layout2")
output = model(pixel_values=images)
Note: Loss function
The Surya layout model does not come with a predefined loss function. You have to define one that fits your dataset and requirements.
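As one possible starting point, the sketch below defines a per-pixel binary cross-entropy loss between predicted and target masks. It is only an illustration and not the loss used to train the released model. The function name layout_loss, the from_logits flag, and the tensor shapes are assumptions made for this example. Whether the model returns raw logits or values already passed through a sigmoid should be checked in the checked-out Surya code before choosing from_logits.

```python
import torch
import torch.nn.functional as F


def layout_loss(pred: torch.Tensor, target: torch.Tensor, from_logits: bool = True) -> torch.Tensor:
    """Hypothetical per-pixel binary cross-entropy loss for layout masks.

    pred:   (batch, num_classes, h, w) float tensor produced by the model
    target: (batch, num_classes, H, W) float tensor with values in [0, 1]
    """
    # Resize the predictions to the target resolution if the two differ.
    if pred.shape[-2:] != target.shape[-2:]:
        pred = F.interpolate(pred, size=target.shape[-2:], mode="bilinear", align_corners=False)
    if from_logits:
        # Assumes `pred` holds raw, unnormalized scores.
        return F.binary_cross_entropy_with_logits(pred, target)
    # Assumes `pred` already holds probabilities; clamp guards against
    # values slightly outside [0, 1] caused by numerical error.
    return F.binary_cross_entropy(pred.clamp(0.0, 1.0), target)
```

Assuming the model output exposes a logits attribute, as Hugging Face segmentation outputs usually do, the loss could be computed as layout_loss(output.logits, target_masks), where target_masks is a hypothetical tensor built from your own annotations. Other losses, such as Dice or focal loss, may suit datasets with strong class imbalance better.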