Spaces: Joker1212/TableDetAndRec

Extracting Tables from PDFs without Extra Content: Using Table Transformer for Accurate Recognition

by ashanq - opened 13 days ago

Discussion

ashanq

13 days ago

edited 13 days ago

I uploaded an image that contains other content in addition to the table, and the app extracted that surrounding text along with the table. I think you should add a Table Transformer step that first detects the table region, so that only the table is then extracted into HTML or any other format.
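To make the suggestion concrete, here is a minimal sketch of the "detect first, then crop" step. It assumes a detector has already returned a list of detections; the dictionary layout (`label`, `score`, `box`), the function name `table_crop_boxes`, and the `min_score` and `padding` defaults are all hypothetical choices for illustration, not this Space's or Table Transformer's actual API.

```python
from typing import List, Tuple


def table_crop_boxes(
    detections: List[dict],
    image_size: Tuple[int, int],
    min_score: float = 0.7,
    padding: int = 10,
) -> List[Tuple[int, int, int, int]]:
    """Return padded crop boxes for confident "table" detections.

    Each detection is assumed (hypothetically) to look like
    {"label": str, "score": float, "box": (x_min, y_min, x_max, y_max)}
    with pixel coordinates. Boxes are clamped to the image bounds.
    """
    width, height = image_size
    crops = []
    for det in detections:
        # Skip non-table regions (e.g. surrounding text) and weak detections.
        if det["label"] != "table" or det["score"] < min_score:
            continue
        x0, y0, x1, y1 = det["box"]
        crops.append((
            max(0, int(x0) - padding),
            max(0, int(y0) - padding),
            min(width, int(x1) + padding),
            min(height, int(y1) + padding),
        ))
    return crops


# Made-up detections for a 600x800 page: one table, one text block,
# and one low-confidence table candidate.
detections = [
    {"label": "table", "score": 0.95, "box": (40.0, 120.0, 560.0, 400.0)},
    {"label": "text", "score": 0.99, "box": (40.0, 20.0, 560.0, 100.0)},
    {"label": "table", "score": 0.30, "box": (0.0, 0.0, 50.0, 50.0)},
]
crops = table_crop_boxes(detections, image_size=(600, 800))
print(crops)  # [(30, 110, 570, 410)]
```

Each returned box could then be passed to something like Pillow's `Image.crop(box)` so that only the table region reaches the recognition step, which would keep the surrounding text out of the HTML output.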

ashanq changed discussion title from Extracting Tables from PDFs with Extra Content: Using Table Transformer for Accurate Recognition to Extracting Tables from PDFs without Extra Content: Using Table Transformer for Accurate Recognition 13 days ago

Joker1212

Owner 13 days ago

Actually, there is a repo for table extraction: https://huggingface.co/spaces/Joker1212/RapidTableDetection

Joker1212

Owner 13 days ago

For PDFs, we also have a layout repo, https://github.com/RapidAI/RapidLayout, with many ONNX models. It is easy to use.
