to-be/invoice_document_headers_extraction_with_donut

Feb 19, 2023

Hi Toon,

Thanks for uploading your fine-tuned DONUT model! The results look very promising. Have you performed any benchmarking or comparisons? I would be very interested in hearing your impressions.

Best,
Louis

to-be

Owner Mar 2, 2023

Hi Louis,

Yes, I have some stats on the validation set.
There are about 200 docs in the validation set, and for 60% of them all indexes were captured correctly.

Some observations:

When trying with a non-invoice document, it's quite reliably identified as Doctype: 'Other'.
The validation set contained mostly the same invoice layouts as the train set. If it were validated against invoices from completely different sources, the results would be different.
The document date field is able to recognize different notations; however, it's often wrong because the data set was not diverse enough.
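
For anyone who wants to run a similar check, here is a minimal sketch of how a document-level figure like the 60% above could be computed, together with a per-field breakdown that would show which index (for example the document date) is dragging the score down. This is not the exact script used for the numbers in this thread. The field names (`doctype`, `document_date`), the `normalize` rule, and the assumption that predictions and ground truth are flat dicts of strings are all hypothetical.

```python
from typing import Dict, List

Fields = Dict[str, str]


def normalize(value: str) -> str:
    """Collapse whitespace and lowercase so trivial differences don't count as errors."""
    return " ".join(value.split()).lower()


def document_exact_match(pred: Fields, gold: Fields) -> bool:
    """True only if every ground-truth field is matched by the prediction."""
    return all(normalize(pred.get(k, "")) == normalize(v) for k, v in gold.items())


def evaluate(preds: List[Fields], golds: List[Fields]) -> Dict[str, object]:
    """Return document-level exact-match accuracy and per-field accuracy."""
    if len(preds) != len(golds):
        raise ValueError("preds and golds must have the same length")

    doc_hits = 0
    field_hits: Dict[str, int] = {}
    field_totals: Dict[str, int] = {}

    for pred, gold in zip(preds, golds):
        if document_exact_match(pred, gold):
            doc_hits += 1
        for key, value in gold.items():
            field_totals[key] = field_totals.get(key, 0) + 1
            if normalize(pred.get(key, "")) == normalize(value):
                field_hits[key] = field_hits.get(key, 0) + 1

    n = len(golds)
    return {
        "document_accuracy": doc_hits / n if n else 0.0,
        "field_accuracy": {
            key: field_hits.get(key, 0) / total for key, total in field_totals.items()
        },
    }


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the real validation set.
    golds = [
        {"doctype": "Invoice", "document_date": "2023-02-19"},
        {"doctype": "Other", "document_date": ""},
    ]
    preds = [
        {"doctype": "invoice", "document_date": "2023-02-18"},
        {"doctype": "Other"},
    ]
    print(evaluate(preds, golds))
```

A strict all-fields-correct score like this is harsh, since a single wrong date fails the whole document, so the per-field numbers are usually the more useful ones when deciding what extra training data to collect.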

Regards

Toon

to-be changed discussion status to closed Mar 9, 2023
