Sharka committed on
Commit
b0f6e89
1 Parent(s): f25bc04

Update README.md

Files changed (1)
  1. README.md +26 -1
README.md CHANGED
@@ -4,4 +4,29 @@ datasets:
  - fimu-docproc-research/CIVQA-TesseractOCR
  language:
  - cs
- ---
+ tags:
+ - document question answering
+ ---
+
+ # LayoutXLM Model Fine-tuned on the CIVQA (Tesseract) Dataset
+
+ This is a version of the [LayoutXLM model](https://huggingface.co/microsoft/layoutxlm-base) fine-tuned on the Czech Invoice Visual Question Answering (CIVQA) dataset, which contains invoices in the Czech language.
+
+ This model enables Document Visual Question Answering on Czech invoices.
+
+ All invoices used in this dataset were obtained from public sources. Across these invoices, we focused on 15 entities that are crucial for invoice processing:
+ - Invoice number
+ - Variable symbol
+ - Specific symbol
+ - Constant symbol
+ - Bank code
+ - Account number
+ - ICO
+ - Total amount
+ - Invoice date
+ - Due date
+ - Name of supplier
+ - IBAN
+ - DIC
+ - QR code
+ - Supplier's address
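
For readers who want to query the model for each entity in turn, the list added by this commit can be turned into a set of questions. The following is a minimal sketch, not part of the commit: the entity names are copied from the README text above, while `build_questions` and the English question template are hypothetical placeholders. The actual question wording used in CIVQA may differ, and is likely in Czech.

```python
# Entity names copied verbatim from the README list in this commit.
INVOICE_ENTITIES = [
    "Invoice number",
    "Variable symbol",
    "Specific symbol",
    "Constant symbol",
    "Bank code",
    "Account number",
    "ICO",
    "Total amount",
    "Invoice date",
    "Due date",
    "Name of supplier",
    "IBAN",
    "DIC",
    "QR code",
    "Supplier's address",
]


def build_questions(entities, template="What is the value of '{entity}'?"):
    """Return one (entity, question) pair per entity.

    The default template is a hypothetical placeholder; replace it with
    the question phrasing the dataset actually uses.
    """
    return [(entity, template.format(entity=entity)) for entity in entities]


if __name__ == "__main__":
    # Print each entity next to its generated question.
    for entity, question in build_questions(INVOICE_ENTITIES):
        print(f"{entity}: {question}")
```

Keeping the entities in a single list makes it easy to check that all 15 are covered before each question is passed, together with the invoice image, to a document question answering pipeline.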