A LayoutXLM model fine-tuned on a Czech dataset for Visual Question Answering
The original model can be found here
CIVQA is the Czech Invoice Visual Question Answering dataset
Achieved results:
eval_answer_text_recall = 0.7065
eval_answer_text_f1 = 0.6998
eval_answer_text_precision = 0.7319
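
This card does not say how the answer-text metrics are computed. As a rough illustration only, the sketch below assumes SQuAD-style token overlap: each predicted answer is compared with its reference answer token by token, and precision, recall, and F1 are averaged over all examples. The function names `token_prf` and `average_prf`, the whitespace tokenization, and the sample answers are hypothetical and are not taken from the evaluation code behind the numbers above.

```python
from collections import Counter


def token_prf(prediction: str, reference: str) -> tuple[float, float, float]:
    """Token-level precision, recall and F1 for one answer pair.

    Assumption: lowercased whitespace tokenization; the real evaluation
    script may normalize text differently.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match, otherwise as a miss.
        match = float(pred_tokens == ref_tokens)
        return match, match, match
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def average_prf(pairs: list[tuple[str, str]]) -> tuple[float, float, float]:
    """Average per-example precision, recall and F1 over (prediction, reference) pairs."""
    scores = [token_prf(pred, ref) for pred, ref in pairs]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Hypothetical invoice answers, for illustration only.
pairs = [
    ("1 250 Kč", "1 250 Kč"),   # exact match
    ("Faktura 2023", "2023"),   # extra token lowers precision
    ("Praha", "Brno"),          # no overlap
]
precision, recall, f1 = average_prf(pairs)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Under this kind of per-example averaging, the reported F1 need not equal the harmonic mean of the reported precision and recall. That may be why 0.6998 is lower than the roughly 0.719 that the harmonic mean of 0.7319 and 0.7065 would give.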