The LayoutXLM model fine-tuned on a Czech dataset for Visual Question Answering

The original model can be found here

The CIVQA dataset is the Czech Invoice dataset for Visual Question Answering

Achieved results:

  eval_answer_text_recall = 0.7065

  eval_answer_text_f1 = 0.6998

  eval_answer_text_precision = 0.7319
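
This card does not say how the answer-text metrics were computed. The sketch below shows one common way to score extractive answers, assuming token-level overlap between the predicted and reference answer strings (in the style of SQuAD evaluation), averaged over examples. The function names `answer_text_scores` and `average_scores` are hypothetical and are not taken from the training code. Under this assumption, F1 is averaged per example, so the reported F1 (0.6998) need not equal the harmonic mean of the reported precision and recall (about 0.719).

```python
from collections import Counter


def answer_text_scores(prediction: str, reference: str) -> tuple[float, float, float]:
    """Token-level (precision, recall, F1) for one predicted answer.

    Assumption: answers are compared as lowercased, whitespace-split tokens.
    The actual evaluation behind the reported numbers may differ.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def average_scores(pairs: list[tuple[str, str]]) -> tuple[float, float, float]:
    """Average per-example (precision, recall, F1) over (prediction, reference) pairs."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    scores = [answer_text_scores(pred, ref) for pred, ref in pairs]
    n = len(scores)
    return tuple(sum(score[i] for score in scores) / n for i in range(3))
```

For example, `answer_text_scores("faktura 2023", "2023")` gives a precision of 0.5 and a recall of 1.0, because the prediction contains the reference token plus one extra token.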