base_model:
  - Ultralytics/YOLOv8
pipeline_tag: image-segmentation
license: agpl-3.0

Text region detection from Finnish 19th century Court Records

The model is trained to segment digitized 19th century court record documents based on four predefined text region categories. The model has been trained using yolov8x-seg by Ultralytics as the base model.

Intended uses & limitations

The model has been trained to detect four types of text regions from digitized Finnish 19th century court record documents:

text paragraphs
marginalia
page numbers
case starts

Case starts refer to symbols used for indicating the start of a new court case in the data.

Most of the training data consist of handwritten documents, but the model also appears to generalize well to typeset documents.

Training data

The training dataset consisted of 815 digitized and annotated 19th century court record documents, while the validation and test datasets each contained 102 annotated document images.
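The source does not describe how the documents were divided, but the reported sizes (815 / 102 / 102 out of 1019 images) correspond to roughly an 80/10/10 split. Below is a minimal sketch of one way to produce such a split, assuming a simple shuffled random split. The function name `split_dataset`, the seed and the file names are hypothetical.

```python
import random


def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle items and split them into train, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    n_test = round(len(shuffled) * test_fraction)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]  # everything that is left
    return train, val, test


# Hypothetical file names standing in for the 1019 annotated images
images = [f"page_{i:04d}.jpg" for i in range(1019)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))  # 815 102 102
```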

Training procedure

This model was trained on two NVIDIA RTX A6000 GPUs with the following hyperparameters:

image size: 640
learning rate (lr0): 0.01
train batch size: 32
epochs: 100
patience: 20 epochs
optimizer: SGD
scheduler: cosine learning rate scheduler (cos_lr=True)
workers: 4

Default settings were used for the other training hyperparameters (see the Ultralytics training configuration documentation for more information).
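To illustrate what the cosine scheduler does to the initial learning rate, here is a minimal sketch of cosine annealing from lr0 down to lr0 * lrf over the training epochs. The final factor lrf = 0.01 is assumed to be the Ultralytics default, warmup is ignored, and `cosine_lr` is a hypothetical helper rather than the library's own code.

```python
import math


def cosine_lr(epoch, epochs=100, lr0=0.01, lrf=0.01):
    """Learning rate at `epoch`, annealed from lr0 to lr0 * lrf along a half cosine."""
    progress = (1 - math.cos(math.pi * epoch / epochs)) / 2  # rises from 0 to 1
    return lr0 * (1 + progress * (lrf - 1))


for epoch in (0, 25, 50, 75, 100):
    print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.6f}")
```

Under these assumptions the rate starts at 0.01, is about 0.00505 halfway through, and ends at 0.0001 if training runs for all 100 epochs without early stopping.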

Model training was performed using the following code:

from ultralytics import YOLO

# Use the pretrained YOLOv8 segmentation model as the starting point
model = YOLO('yolov8x-seg.pt')

# Path to .yaml file where data location and object classes are defined
yaml_path = 'text_regions.yaml'

# Start model training with the defined parameters
model.train(
    data=yaml_path, name='model_name', epochs=100, imgsz=640, batch=32,
    optimizer='SGD', lr0=0.01, cos_lr=True, patience=10,
    workers=4, seed=551, val=True, device=[0, 1],
)
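The contents of text_regions.yaml are not shown in the source. As a rough sketch, an Ultralytics dataset file names the dataset root, the image folders for each split and the class names. The directory layout, the class-name strings and the order of the class indices below are assumptions for illustration only, and `build_dataset_yaml` is a hypothetical helper.

```python
# Assumed class order; the actual index-to-name mapping of the model may differ
CLASS_NAMES = ["paragraph", "marginalia", "page_number", "case_start"]


def build_dataset_yaml(root, class_names):
    """Return the text of a dataset YAML file in the layout Ultralytics reads."""
    lines = [
        f"path: {root}",        # dataset root directory (hypothetical)
        "train: images/train",  # training images, relative to path
        "val: images/val",      # validation images, relative to path
        "test: images/test",    # test images, relative to path
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml("datasets/court_records", CLASS_NAMES)
print(yaml_text)
```

Saving the printed text as text_regions.yaml would give a file of the kind that `yaml_path` points to in the training code.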

Evaluation results

Evaluation results using the validation dataset are listed below:

| Class | Images | Class instances | Box precision | Box recall | Box mAP50 | Box mAP50-95 | Mask precision | Mask recall | Mask mAP50 | Mask mAP50-95 |
|---|---|---|---|---|---|---|---|---|---|---|
| All | 102 | 563 | 0.909 | 0.918 | 0.94 | 0.648 | 0.889 | 0.892 | 0.896 | 0.567 |
| Paragraph | 102 | 197 | 0.957 | 0.99 | 0.985 | 0.966 | 0.957 | 0.99 | 0.985 | 0.952 |
| Marginalia | 102 | 48 | 0.887 | 0.917 | 0.922 | 0.664 | 0.888 | 0.917 | 0.927 | 0.669 |
| Page number | 102 | 98 | 0.884 | 0.796 | 0.87 | 0.421 | 0.851 | 0.758 | 0.808 | 0.32 |
| Case start | 102 | 220 | 0.91 | 0.968 | 0.984 | 0.541 | 0.858 | 0.905 | 0.864 | 0.328 |

More information on the performance metrics can be found in the Ultralytics documentation on YOLO performance metrics.
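The "All" row appears to be the unweighted mean of the four per-class values, which can be checked directly from the table. A small sketch of that check, using the mAP columns copied from the table above (the variable names are illustrative):

```python
# Per-class values copied from the validation table:
# (box mAP50, box mAP50-95, mask mAP50, mask mAP50-95)
per_class = {
    "Paragraph":   (0.985, 0.966, 0.985, 0.952),
    "Marginalia":  (0.922, 0.664, 0.927, 0.669),
    "Page number": (0.870, 0.421, 0.808, 0.320),
    "Case start":  (0.984, 0.541, 0.864, 0.328),
}
reported_all = (0.940, 0.648, 0.896, 0.567)

# Average each column over the classes without weighting by instance count
columns = list(zip(*per_class.values()))
macro_mean = [sum(column) / len(column) for column in columns]

for mean, reported in zip(macro_mean, reported_all):
    print(f"mean = {mean:.4f}, reported = {reported:.3f}")
```

The means agree with the reported values to within rounding, so a rare class such as marginalia (48 instances) counts as much towards the overall score as a frequent one such as case starts (220 instances).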

Inference

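If the model file tuomiokirja_regions_04122023.pt has been downloaded to \models\tuomiokirja_regions_04122023.pt and the input image is located at \data\image.jpg, inference can be performed using the following code:

from ultralytics import YOLO

# Initialize the model from the downloaded weights.
# Raw strings keep the Windows-style backslashes literal ('\t' would otherwise be read as a tab).
model = YOLO(r'\models\tuomiokirja_regions_04122023.pt')

# Run inference; save=True also writes the annotated image to disk
prediction_results = model.predict(source=r'\data\image.jpg', save=True)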

More information on the available inference arguments can be found in the Ultralytics prediction documentation.
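`predict` returns a list with one Results object per input image. As a sketch of how the detected regions might be read out, the hypothetical helper below collects the class name, confidence and bounding box of each detection. It relies on the `names`, `boxes.cls`, `boxes.conf` and `boxes.xyxy` attributes of the Ultralytics Results object; the class-name strings stored in this model are not given in the source.

```python
def summarize_regions(result):
    """Return one dict per detected region in a single Ultralytics Results object."""
    regions = []
    boxes = result.boxes
    for cls_id, conf, xyxy in zip(
        boxes.cls.tolist(), boxes.conf.tolist(), boxes.xyxy.tolist()
    ):
        regions.append({
            "label": result.names[int(cls_id)],       # class index -> class name
            "confidence": round(conf, 3),
            "box_xyxy": [round(v, 1) for v in xyxy],  # left, top, right, bottom in pixels
        })
    # Highest-confidence regions first
    return sorted(regions, key=lambda region: region["confidence"], reverse=True)
```

With `prediction_results` from the code above, `summarize_regions(prediction_results[0])` would list the regions found in the first image. The segmentation polygons are available separately through the `masks.xy` attribute of the same object.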