Matching the Table Transformer in Hugging Face
Hi,
I am unable to replicate the results from the Table Transformer shown on the Hugging Face webpage. When I use the straightforward code from the documentation, I often get noticeably worse results.
Is there any additional pre-processing applied to the image in Hugging Face? If so, what are the steps?
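For context, this is roughly the documented usage I am following (the checkpoint and image path here are just illustrative):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, TableTransformerForObjectDetection

# Illustrative input image
image = Image.open("table.png").convert("RGB")

processor = AutoImageProcessor.from_pretrained("microsoft/table-transformer-detection")
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-detection")

# The processor resizes and normalizes the image before inference
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw outputs to (score, label, box) in original image coordinates
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]
print(results["scores"], results["labels"], results["boxes"])
```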
Thanks,
Hi,
Refer to this notebook to get one-to-one matching results with the Microsoft implementation (this was manually verified).
Hi @nielsr,
two follow-up questions:
- You provide two notebooks that use the Table Transformer models, here and here. Both notebooks are quite similar; however, in one you define the image preprocessor yourself, while in the other you use the DetrFeatureExtractor. Which one is preferable?
- Are there any image preprocessing steps, prior to applying both the detection and the structure recognition models, that improve the quality of the results? I am thinking of adding some padding (if yes, how much?), deskewing, greyscale conversion, binarization, etc.
Any advice would be highly appreciated!
Hi,
The preprocessing functions used by the original implementation are defined here, so I would recommend following those image processing steps.
Ideally there should be a TableTransformerImageProcessor class in the Transformers library which replicates this; I opened an issue for that: https://github.com/huggingface/transformers/issues/30718.
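For reference, here is a minimal sketch of those transforms, assuming the values from the original Table Transformer repository (an 800 px longest side for detection, 1000 px for structure recognition, and ImageNet normalization); the image path is illustrative:

```python
from PIL import Image
from torchvision import transforms

class MaxResize:
    """Resize so the longest side equals max_size, preserving aspect ratio."""
    def __init__(self, max_size=800):
        self.max_size = max_size

    def __call__(self, image):
        width, height = image.size
        scale = self.max_size / max(width, height)
        return image.resize((int(round(scale * width)), int(round(scale * height))))

# Assumed values: 800 px longest side for detection, 1000 px for structure
# recognition, plus ImageNet mean/std normalization.
detection_transform = transforms.Compose([
    MaxResize(800),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
structure_transform = transforms.Compose([
    MaxResize(1000),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

image = Image.open("table.png").convert("RGB")  # illustrative path
pixel_values = detection_transform(image).unsqueeze(0)  # add batch dimension
```

Note that this resize scheme (scaling the longest side) differs from the default DETR image processor behavior, which may explain the discrepancies mentioned above.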
Thank you very much for your quick response, it was very helpful!