What is the methodology used?

#1
by MedisettyMadhuri - opened

We've been working on a project to classify documents. It would be a great help if you could share the methodology you used.

Hi, I'm using a pretrained DiT model from Microsoft that was trained on the RVL-CDIP dataset (which consists of 16 classes). You can customise or fine-tune this model just by creating a dataset of images and their respective labels.
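For anyone wanting to try the "images and their respective labels" step, here is a minimal sketch, not the exact training setup used for this model. It assumes a hypothetical folder layout of `root/<class_name>/<image file>` and assumes the `microsoft/dit-base-finetuned-rvlcdip` checkpoint on the Hugging Face Hub; neither is stated in this thread. The helper names are made up for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your own files.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def build_label_maps(class_names):
    """Map class names (sorted, de-duplicated) to integer ids and back."""
    names = sorted(set(class_names))
    label2id = {name: i for i, name in enumerate(names)}
    id2label = {i: name for name, i in label2id.items()}
    return label2id, id2label


def list_labeled_images(root):
    """Return (image_path, class_name) pairs from root/<class_name>/<image>."""
    pairs = []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_EXTENSIONS:
                pairs.append((str(path), class_dir.name))
    return pairs


def load_model_for_finetuning(label2id, id2label,
                              checkpoint="microsoft/dit-base-finetuned-rvlcdip"):
    """Load the processor and model with a head sized for the custom labels.

    The checkpoint name is an assumption; swap in whichever DiT checkpoint
    you actually use.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = AutoModelForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(label2id),
        label2id=label2id,
        id2label=id2label,
        # Lets a fresh classification head be initialised when the number
        # of custom classes differs from the original 16.
        ignore_mismatched_sizes=True,
    )
    return processor, model
```

The `(path, label)` pairs and the two label maps can then be fed into whatever training loop you prefer, for example the `Trainer` class from `transformers`.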

Can you share a link to the training file?

@MedisettyMadhuri I'm using Microsoft's pretrained DiT (Document Image Transformer) model, https://github.com/microsoft/unilm/tree/master/dit, which is specifically trained on the RVL-CDIP dataset, https://huggingface.co/datasets/rvl_cdip, for the document classification downstream task.
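As a rough illustration of using such a model for classification, the sketch below runs one image through a DiT checkpoint and returns the most likely classes. It assumes the `microsoft/dit-base-finetuned-rvlcdip` checkpoint and that `torch`, `Pillow` and `transformers` are installed; the function names are hypothetical and this is not the training file asked about above.

```python
import math


def top_predictions(logits, id2label, k=3):
    """Apply a softmax to raw logits and return the k most likely (label, prob) pairs."""
    highest = max(logits)
    # Subtract the max before exponentiating for numerical stability.
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify_document(image_path,
                      checkpoint="microsoft/dit-base-finetuned-rvlcdip", k=3):
    """Classify one document image; the checkpoint name is an assumption."""
    # Imported lazily so top_predictions can be used without these packages.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = AutoModelForImageClassification.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_predictions(logits, model.config.id2label, k)
```

Calling `classify_document("scan.png")` on a local file (the file name is a placeholder) would return a list of label and probability pairs.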
