Kansallisarkisto
/

court-records-region-detection


MikkoLipsanen

Add AGPL-3.0 License

3034da2 verified 5 months ago


	---
	base_model:
	- Ultralytics/YOLOv8
	pipeline_tag: image-segmentation
	license: agpl-3.0
	---

	## Text region detection from Finnish 19th century Court Records

	The model segments digitized 19th-century court record documents into four predefined text region categories.
	It was trained using yolov8x-seg by Ultralytics as the base model.


	## Intended uses & limitations

	The model has been trained to detect four types of text regions from digitized Finnish 19th century court record documents:

	- text paragraphs
	- marginalia
	- page numbers
	- case starts

	A case start is a symbol that marks the beginning of a new court case in the records.

	<img src='segmentation_example.jpg' width='700'>

	Most of the training data consists of handwritten documents, but the model also appears to generalize reasonably well to typeset documents.

	## Training data

	The training dataset consisted of 815 digitized and annotated 19th-century court record documents, while the validation and test
	datasets each contained 102 annotated document images.

	## Training procedure

	This model was trained using 2 NVIDIA RTX A6000 GPUs with the following hyperparameters:

	- image size: 640
	- learning rate (lr0): 0.01
	- train batch size: 32
	- epochs: 100
	- patience: 20 epochs
	- optimizer: SGD
	- scheduler: cosine learning rate scheduler (cos_lr=True)
	- workers: 4

	Default settings were used for other training hyperparameters (find more information [here](https://docs.ultralytics.com/modes/train/#train-settings)).

	Model training was performed using the following code:

	```python
	from ultralytics import YOLO

	# Use pretrained Yolo segmentation model
	model = YOLO('yolov8x-seg.pt')

	# Path to .yaml file where data location and object classes are defined
	yaml_path = 'text_regions.yaml'

	# Start model training with the defined parameters
	model.train(
	    data=yaml_path,
	    name='model_name',
	    epochs=100,
	    imgsz=640,
	    workers=4,
	    optimizer='SGD',
	    lr0=0.01,
	    seed=551,
	    val=True,
	    cos_lr=True,
	    patience=10,
	    batch=32,
	    device=[0,1],
	)
	```
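
The training call reads the dataset location and class names from `text_regions.yaml`, which is not reproduced in this card. The following is a minimal sketch of generating such a file in the Ultralytics dataset format. The directory layout (`datasets/court_records`, `images/train`, and so on), the class identifiers, and their numeric order are assumptions made for illustration, not the authors' actual configuration.

```python
from pathlib import Path

# Hypothetical class order: the card lists the four categories
# but does not state which numeric id each one was given.
CLASS_NAMES = ["paragraph", "marginalia", "page_number", "case_start"]


def build_dataset_yaml(root, class_names):
    """Return the text of an Ultralytics-style dataset YAML file."""
    lines = [
        f"path: {root}",        # dataset root directory (assumed)
        "train: images/train",  # training images, relative to path (assumed)
        "val: images/val",      # validation images (assumed)
        "test: images/test",    # test images (assumed)
        "names:",
    ]
    # One "id: name" entry per class, indented under "names:"
    lines += [f"  {idx}: {name}" for idx, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml("datasets/court_records", CLASS_NAMES)
Path("text_regions.yaml").write_text(yaml_text, encoding="utf-8")
print(yaml_text)
```

The class ids in such a file must match the ids used in the annotation label files, so the order shown here would need to be adapted to the real annotations.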

	## Evaluation results

	Evaluation results using the validation dataset are listed below:

	|Class|Images|Class instances|Box precision|Box recall|Box mAP50|Box mAP50-95|Mask precision|Mask recall|Mask mAP50|Mask mAP50-95|
	|:----|:----|:----|:----|:----|:----|:----|:----|:----|:----|:----|
	|All|102|563|0.909|0.918|0.94|0.648|0.889|0.892|0.896|0.567|
	|Paragraph|102|197|0.957|0.99|0.985|0.966|0.957|0.99|0.985|0.952|
	|Marginalia|102|48|0.887|0.917|0.922|0.664|0.888|0.917|0.927|0.669|
	|Page number|102|98|0.884|0.796|0.87|0.421|0.851|0.758|0.808|0.32|
	|Case start|102|220|0.91|0.968|0.984|0.541|0.858|0.905|0.864|0.328|

	More information on the performance metrics can be found [here](https://docs.ultralytics.com/guides/yolo-performance-metrics/).
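
When reading the table, it helps to know that the "All" row appears to be the unweighted mean of the four per-class rows, so the 220 case starts and the 48 marginalia count equally. The short check below recomputes the mAP columns from the per-class values in the table. The dictionary keys are illustrative names, and small differences are expected because the table values are rounded.

```python
# Per-class validation metrics copied from the table above.
per_class = {
    "Paragraph":   {"box_map50": 0.985, "box_map50_95": 0.966,
                    "mask_map50": 0.985, "mask_map50_95": 0.952},
    "Marginalia":  {"box_map50": 0.922, "box_map50_95": 0.664,
                    "mask_map50": 0.927, "mask_map50_95": 0.669},
    "Page number": {"box_map50": 0.87, "box_map50_95": 0.421,
                    "mask_map50": 0.808, "mask_map50_95": 0.32},
    "Case start":  {"box_map50": 0.984, "box_map50_95": 0.541,
                    "mask_map50": 0.864, "mask_map50_95": 0.328},
}

# Values reported in the "All" row of the table.
reported_all = {"box_map50": 0.94, "box_map50_95": 0.648,
                "mask_map50": 0.896, "mask_map50_95": 0.567}


def macro_average(rows, key):
    """Unweighted mean of one metric over all classes."""
    return sum(row[key] for row in rows.values()) / len(rows)


for key, reported in reported_all.items():
    computed = macro_average(per_class, key)
    print(f"{key}: computed {computed:.3f}, reported {reported}")
```

One consequence is that the low mask mAP50-95 scores for page numbers and case starts pull the overall figure well below the paragraph score, even though paragraphs make up a large share of the instances.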

	## Inference

	If the model file `tuomiokirja_regions_04122023.pt` has been downloaded to `\models\tuomiokirja_regions_04122023.pt`
	and the input image path is `\data\image.jpg`, inference can be performed using the following code:

	```python
	from ultralytics import YOLO

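	# Initialize model. Raw strings (r'...') keep the backslashes literal;
	# in a normal string, '\t' would be interpreted as a tab character.
	model = YOLO(r'\models\tuomiokirja_regions_04122023.pt')

	# Run inference; save=True also writes the annotated image to disk
	prediction_results = model.predict(source=r'\data\image.jpg', save=True)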
	```

	More information on the available inference arguments can be found [here](https://docs.ultralytics.com/modes/predict/#inference-arguments).
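
The `predict` call returns a list with one result object per input image. As a rough sketch of how the detected regions might be read from such an object, the hypothetical helper below collects the class name, confidence, and mask polygon of each region. It assumes the Ultralytics `Results` attributes `names`, `boxes.cls`, `boxes.conf`, and `masks.xy`, and that `masks` is `None` when nothing is detected; `summarize_regions` itself is not part of the library or of this model's code.

```python
def summarize_regions(result):
    """Return one dict per detected region in a single prediction result.

    `result` is assumed to expose `names` (class id -> class name),
    `boxes.cls`, `boxes.conf` and `masks.xy` (one polygon per region).
    """
    if result.masks is None:  # assumed to mean: no regions detected
        return []

    regions = []
    detections = zip(
        result.boxes.cls.tolist(),   # class ids, returned as floats
        result.boxes.conf.tolist(),  # confidence scores
        result.masks.xy,             # polygons in pixel coordinates
    )
    for class_id, confidence, polygon in detections:
        regions.append({
            "label": result.names[int(class_id)],
            "confidence": float(confidence),
            "polygon": polygon.tolist(),  # list of [x, y] points
        })
    return regions
```

With the inference code above, `summarize_regions(prediction_results[0])` would give the regions found in the first (and only) input image.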