Update README.md
Browse files
README.md
CHANGED
@@ -14,7 +14,9 @@ TB-OCR-preview (Text Block OCR), created by [Yifei Hu](https://x.com/hu_yifei),
|
|
14 |
|
15 |
**Running the model in 4-bit only requires ~2.8GB VRAM to load and exhibits little to none degradation.**
|
16 |
|
17 |
-
|
|
|
|
|
18 |
|
19 |
![image/png](https://huggingface.co/yifeihu/TB-OCR-preview-0.1/resolve/main/tb-ocr-cover.png)
|
20 |
|
@@ -82,7 +84,7 @@ print(response)
|
|
82 |
|
83 |
## About this preview checkpoint
|
84 |
|
85 |
-
This is a preview model to verify the quality of a dataset from a synthetic data pipeline. The preview checkpoint only used
|
86 |
|
87 |
The current model is based on Phi-3.5-vision. Smaller models with even stronger performance are currently being trained or tested.
|
88 |
|
|
|
14 |
|
15 |
**Running the model in 4-bit only requires ~2.8GB VRAM to load and exhibits little to none degradation.**
|
16 |
|
17 |
+
## Use Case (Important!)
|
18 |
+
|
19 |
+
**This model is NOT designed to perform OCR on full pages.** Please consider using **TFT-ID-1.0**[[HF]](https://huggingface.co/yifeihu/TFT-ID-1.0), a text/tale/figure detection model, for full page OCR. It's also faster to split the larger text blocks into smaller ones and perform OCR in parallel (batch inference).
|
20 |
|
21 |
![image/png](https://huggingface.co/yifeihu/TB-OCR-preview-0.1/resolve/main/tb-ocr-cover.png)
|
22 |
|
|
|
84 |
|
85 |
## About this preview checkpoint
|
86 |
|
87 |
+
This is a preview model to verify the quality of a dataset from a synthetic data pipeline. The preview checkpoint only used \~250k image-text pairs (\~50M tokens).
|
88 |
|
89 |
The current model is based on Phi-3.5-vision. Smaller models with even stronger performance are currently being trained or tested.
|
90 |
|