riikkamarttila committed on
Commit bc20f5a · verified · 1 Parent(s): e42a587

Update README.md

Files changed (1)
  1. README.md +19 -17
README.md CHANGED
@@ -8,61 +8,63 @@
8
 
9
  This model is a fine-tuned version of the microsoft/trocr-large-handwritten model, specialized for recognizing handwritten text. It has been trained on various datasets from the 17th to the 20th centuries and can be used for applications such as document digitization, form recognition, or any task involving handwritten text extraction.
10
 
11
- #Model Architecture
 
12
  The model is based on a Transformer architecture (TrOCR) with an encoder-decoder setup:
13
 
14
  - The encoder processes images of handwritten text.
15
  - The decoder generates corresponding text output.
16
 
17
- #Intended Use
 
18
  This model is designed for handwritten text recognition and is intended for use in:
19
 
20
  - Document digitization (e.g., archival work, historical manuscripts)
21
  - Handwritten notes transcription
22
 
23
- #Training data
24
  The training dataset includes more than 760 000 samples of handwritten text rows, covering a wide variety of handwriting styles and text samples.
25
 
26
- #Evaluation
27
  The model was evaluated on a test dataset. Below are key metrics:
28
 
29
  **Character Error Rate (CER):** 3.2
30
  **Test Dataset Description:** size ~94 900 text rows
31
 
32
- #How to Use the Model
33
  You can use the model directly with Hugging Face’s pipeline function or by manually loading the processor and model.
34
 
35
  ```from transformers import TrOCRProcessor, VisionEncoderDecoderModel
36
- ```from PIL import Image
37
 
38
  # Load the model and processor
39
- ```processor = TrOCRProcessor.from_pretrained("")
40
- ```model = VisionEncoderDecoderModel.from_pretrained("")
41
 
42
  # Open an image of handwritten text
43
- ```image = Image.open("path_to_image.png")
44
 
45
  # Preprocess and predict
46
- ```pixel_values = processor(image, return_tensors="pt").pixel_values
47
- ```generated_ids = model.generate(pixel_values)
48
- ```generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
49
 
50
- ```print(generated_text)
51
 
52
- #Limitations and Biases
53
  The model was trained primarily on handwritten text that uses basic Latin characters (A-Z, a-z) and includes Nordic special characters (å, ä, ö). It has not been trained on non-Latin writing systems such as Chinese characters, Cyrillic, Arabic, or Hebrew.
54
  The model may not generalize well to languages other than Finnish, Swedish, or English.
55
  Ethical Considerations
56
  Data Privacy: Be aware of any privacy concerns when applying this model to personal handwritten documents.
57
  Bias: Since the model is trained on specific handwriting styles and datasets, it may exhibit biases towards certain scripts or writing conventions.
58
 
59
- #Future Work
60
  Potential improvements for this model include:
61
 
62
  - Expanding training data: Incorporating more diverse handwriting styles and languages.
63
  - Optimizing for specific domains: Fine-tuning the model on domain-specific handwriting.
64
 
65
- #Citation
66
  If you use this model in your work, please cite it as:
67
 
68
  @misc{multicentury_htr_model_2024,
@@ -73,7 +75,7 @@ If you use this model in your work, please cite it as:
73
  howpublished = {\url{https://huggingface.co/Kansallisarkisto/multicentury-htr-model/}},
74
  }
75
 
76
- ##Model Card Authors
77
  Author: Kansallisarkisto
78
  Contact Information: riikka.marttila@kansallisarkisto.fi, ilkka.jokipii@kansallisarkisto.fi
79
 
 
8
 
9
  This model is a fine-tuned version of the microsoft/trocr-large-handwritten model, specialized for recognizing handwritten text. It has been trained on various datasets from the 17th to the 20th centuries and can be used for applications such as document digitization, form recognition, or any task involving handwritten text extraction.
10
 
11
+ # Model Architecture
12
+
13
  The model is based on a Transformer architecture (TrOCR) with an encoder-decoder setup:
14
 
15
  - The encoder processes images of handwritten text.
16
  - The decoder generates corresponding text output.
17
 
18
+ # Intended Use
19
+
20
  This model is designed for handwritten text recognition and is intended for use in:
21
 
22
  - Document digitization (e.g., archival work, historical manuscripts)
23
  - Handwritten notes transcription
24
 
25
+ # Training data
26
  The training dataset includes more than 760 000 samples of handwritten text rows, covering a wide variety of handwriting styles and text samples.
27
 
28
+ # Evaluation
29
  The model was evaluated on a test dataset. Below are key metrics:
30
 
31
  **Character Error Rate (CER):** 3.2
32
  **Test Dataset Description:** size ~94 900 text rows
33
 
34
+ # How to Use the Model
35
  You can use the model directly with Hugging Face’s pipeline function or by manually loading the processor and model.
36
 
37
  ```python
  from transformers import TrOCRProcessor, VisionEncoderDecoderModel
38
+ from PIL import Image
39
 
40
  # Load the model and processor
41
+ processor = TrOCRProcessor.from_pretrained("")
42
+ model = VisionEncoderDecoderModel.from_pretrained("")
43
 
44
  # Open an image of handwritten text
45
+ image = Image.open("path_to_image.png")
46
 
47
  # Preprocess and predict
48
+ pixel_values = processor(image, return_tensors="pt").pixel_values
49
+ generated_ids = model.generate(pixel_values)
50
+ generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
51
 
52
+ print(generated_text)
+ ```
53
 
54
+ # Limitations and Biases
55
  The model was trained primarily on handwritten text that uses basic Latin characters (A-Z, a-z) and includes Nordic special characters (å, ä, ö). It has not been trained on non-Latin writing systems such as Chinese characters, Cyrillic, Arabic, or Hebrew.
56
  The model may not generalize well to languages other than Finnish, Swedish, or English.
57
  Ethical Considerations
58
  Data Privacy: Be aware of any privacy concerns when applying this model to personal handwritten documents.
59
  Bias: Since the model is trained on specific handwriting styles and datasets, it may exhibit biases towards certain scripts or writing conventions.
60
 
61
+ # Future Work
62
  Potential improvements for this model include:
63
 
64
  - Expanding training data: Incorporating more diverse handwriting styles and languages.
65
  - Optimizing for specific domains: Fine-tuning the model on domain-specific handwriting.
66
 
67
+ # Citation
68
  If you use this model in your work, please cite it as:
69
 
70
  @misc{multicentury_htr_model_2024,
 
75
  howpublished = {\url{https://huggingface.co/Kansallisarkisto/multicentury-htr-model/}},
76
  }
77
 
78
+ ## Model Card Authors
79
  Author: Kansallisarkisto
80
  Contact Information: riikka.marttila@kansallisarkisto.fi, ilkka.jokipii@kansallisarkisto.fi
81
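
Note on the Evaluation section: the README reports a Character Error Rate (CER) of 3.2 but does not say how it was computed or whether the figure is a percentage. The sketch below shows one common definition, assuming a corpus-level CER expressed in percent (total character edits divided by total reference characters). The function names `levenshtein` and `character_error_rate` are illustrative and do not come from the repository, and the authors' actual evaluation script may differ, for example in text normalization or per-line averaging.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `ref` into `hyp`."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER in percent (assumed convention, not confirmed by the README)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars
```

To evaluate on your own data, pass the ground-truth text rows as `references` and the decoded model outputs as `hypotheses`, in the same order.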