---
license: cc-by-nc-sa-4.0
tags:
- generated_from_trainer
datasets:
- funsd-layoutlmv3
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: OCR-LayoutLMv3
results:
- task:
name: Token Classification
type: token-classification
dataset:
name: funsd-layoutlmv3
type: funsd-layoutlmv3
config: funsd
split: train
args: funsd
metrics:
- name: Precision
type: precision
value: 0.8988653182042428
- name: Recall
type: recall
value: 0.905116741182315
- name: F1
type: f1
value: 0.9019801980198019
- name: Accuracy
type: accuracy
value: 0.8403661000832046
---

# OCR-LayoutLMv3
This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) on the funsd-layoutlmv3 dataset. It achieves the following results on the evaluation set (an inference sketch follows the list):
- Loss: 0.9788
- Precision: 0.8989
- Recall: 0.9051
- F1: 0.9020
- Accuracy: 0.8404
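Loading the checkpoint for inference follows the standard `transformers` token-classification pattern. Below is a minimal sketch; the Hub id `your-username/OCR-LayoutLMv3` is a placeholder for wherever this checkpoint is hosted, and the processor's built-in OCR requires Tesseract (`pytesseract`) to be installed.

```python
import torch
from PIL import Image
from transformers import AutoModelForTokenClassification, AutoProcessor

# Placeholder Hub id; point this at the actual location of the checkpoint.
model = AutoModelForTokenClassification.from_pretrained("your-username/OCR-LayoutLMv3")
# apply_ocr=True makes the processor run Tesseract to extract words and boxes.
processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-base", apply_ocr=True)

image = Image.open("form.png").convert("RGB")   # any scanned form
encoding = processor(image, return_tensors="pt")  # input_ids, bbox, pixel_values, ...

with torch.no_grad():
    logits = model(**encoding).logits  # shape: (1, seq_len, num_labels)

predicted_ids = logits.argmax(-1).squeeze().tolist()
print([model.config.id2label[i] for i in predicted_ids])
```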
## Model description
LayoutLMv3 is a pre-trained multimodal Transformer for Document AI with unified text and image masking. The simple unified architecture and training objectives make LayoutLMv3 a general-purpose pre-trained model. For example, LayoutLMv3 can be fine-tuned for both text-centric tasks, including form understanding, receipt understanding, and document visual question answering, and image-centric tasks such as document image classification and document layout analysis.
Reference: Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei. *LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking.* Preprint, 2022 ([arXiv:2204.08387](https://arxiv.org/abs/2204.08387)).
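When OCR has already been run (as in the funsd-layoutlmv3 dataset, which ships words and boxes alongside the images), the processor takes the three modalities explicitly instead of calling Tesseract. A small sketch with made-up words and boxes, assuming coordinates already normalized to the 0-1000 range LayoutLMv3 expects:

```python
from PIL import Image
from transformers import AutoProcessor

# apply_ocr=False: we supply words and boxes ourselves instead of running Tesseract.
processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-base", apply_ocr=False)

image = Image.open("form.png").convert("RGB")
words = ["Invoice", "Date:", "2022-10-01"]  # placeholder tokens, not FUNSD data
boxes = [[90, 50, 230, 80], [90, 100, 170, 130], [180, 100, 330, 130]]  # 0-1000 scale

encoding = processor(image, words, boxes=boxes, return_tensors="pt")
print(encoding.keys())  # input_ids, attention_mask, bbox, pixel_values
```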
## Training hyperparameters
The following hyperparameters were used during training (a `Trainer` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 2000
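A hedged sketch of the `Trainer` setup these hyperparameters imply. The Adam betas and epsilon listed above are the `transformers` defaults, so they need no explicit arguments; `train_dataset` and `eval_dataset` are assumed to be encoded funsd-layoutlmv3 splits, and `label_list` and `compute_metrics` come from the metrics sketch after the results table.

```python
from transformers import AutoModelForTokenClassification, Trainer, TrainingArguments

# label_list is the assumed FUNSD BIO tag set; see the metrics sketch below.
model = AutoModelForTokenClassification.from_pretrained(
    "microsoft/layoutlmv3-base", num_labels=len(label_list)
)

training_args = TrainingArguments(
    output_dir="OCR-LayoutLMv3",
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    max_steps=2000,
    lr_scheduler_type="linear",
    evaluation_strategy="steps",
    eval_steps=100,  # matches the 100-step evaluation cadence in the results table
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,      # assumed: encoded funsd-layoutlmv3 train split
    eval_dataset=eval_dataset,        # assumed: encoded funsd-layoutlmv3 eval split
    compute_metrics=compute_metrics,  # assumed: seqeval-based, sketched below
)
trainer.train()
```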
## Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.33  | 100  | 0.6966          | 0.7418    | 0.8063 | 0.7727 | 0.7801   |
| No log        | 2.67  | 200  | 0.5767          | 0.8104    | 0.8644 | 0.8365 | 0.8117   |
| No log        | 4.0   | 300  | 0.5355          | 0.8246    | 0.8852 | 0.8539 | 0.8295   |
| No log        | 5.33  | 400  | 0.5240          | 0.8706    | 0.8922 | 0.8813 | 0.8427   |
| 0.5326        | 6.67  | 500  | 0.6337          | 0.8528    | 0.8778 | 0.8651 | 0.8260   |
| 0.5326        | 8.0   | 600  | 0.6870          | 0.8698    | 0.8828 | 0.8762 | 0.8240   |
| 0.5326        | 9.33  | 700  | 0.6584          | 0.8723    | 0.9061 | 0.8889 | 0.8342   |
| 0.5326        | 10.67 | 800  | 0.7186          | 0.8868    | 0.9031 | 0.8949 | 0.8335   |
| 0.5326        | 12.0  | 900  | 0.6822          | 0.9040    | 0.9076 | 0.9058 | 0.8526   |
| 0.1248        | 13.33 | 1000 | 0.7042          | 0.8872    | 0.9021 | 0.8946 | 0.8511   |
| 0.1248        | 14.67 | 1100 | 0.7920          | 0.9027    | 0.9036 | 0.9032 | 0.8480   |
| 0.1248        | 16.0  | 1200 | 0.8052          | 0.8964    | 0.9151 | 0.9056 | 0.8389   |
| 0.1248        | 17.33 | 1300 | 0.8932          | 0.8995    | 0.9066 | 0.9030 | 0.8329   |
| 0.1248        | 18.67 | 1400 | 0.8728          | 0.8950    | 0.9061 | 0.9005 | 0.8398   |
| 0.0442        | 20.0  | 1500 | 0.9051          | 0.8960    | 0.9116 | 0.9037 | 0.8347   |
| 0.0442        | 21.33 | 1600 | 0.9587          | 0.8947    | 0.9031 | 0.8989 | 0.8401   |
| 0.0442        | 22.67 | 1700 | 0.9822          | 0.9042    | 0.9046 | 0.9044 | 0.8389   |
| 0.0442        | 24.0  | 1800 | 0.9734          | 0.9043    | 0.9061 | 0.9052 | 0.8391   |
| 0.0442        | 25.33 | 1900 | 0.9842          | 0.9042    | 0.9091 | 0.9066 | 0.8410   |
| 0.0225        | 26.67 | 2000 | 0.9788          | 0.8989    | 0.9051 | 0.9020 | 0.8404   |
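The per-step metrics above are the standard seqeval entity-level scores for token classification. A plausible `compute_metrics`, assuming the FUNSD BIO label set used by funsd-layoutlmv3 and requiring the `seqeval` package:

```python
import numpy as np
from datasets import load_metric  # Datasets 2.6.1; newer code would use evaluate.load("seqeval")

# FUNSD BIO tags as used by the funsd-layoutlmv3 dataset (assumed here).
label_list = ["O", "B-HEADER", "I-HEADER", "B-QUESTION", "I-QUESTION", "B-ANSWER", "I-ANSWER"]
metric = load_metric("seqeval")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=2)
    # Drop special tokens and padding, which carry the ignore index -100.
    true_predictions = [
        [label_list[p] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    results = metric.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
```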
## Framework versions
- Transformers 4.25.0.dev0
- PyTorch 1.12.1
- Datasets 2.6.1
- Tokenizers 0.13.1