yifeihu/TB-OCR-preview-0.1

Image-Text-to-Text

text-generation


yifeihu committed on Aug 30, 2024

Commit 0edbcd9 · verified · 1 Parent(s): 9af6bc5

Update README.md

Files changed (1)

README.md +4 -2


@@ -14,7 +14,9 @@ TB-OCR-preview (Text Block OCR), created by [Yifei Hu](https://x.com/hu_yifei),
 **Running the model in 4-bit only requires ~2.8GB VRAM to load and exhibits little to no degradation.**
-This model is recommended to work with **TFT-ID-1.0**[[HF]](https://huggingface.co/yifeihu/TFT-ID-1.0), a text/table/figure detection model, for full-page document parsing.
 ![image/png](https://huggingface.co/yifeihu/TB-OCR-preview-0.1/resolve/main/tb-ocr-cover.png)
@@ -82,7 +84,7 @@ print(response)
 ## About this preview checkpoint
-This is a preview model to verify the quality of a dataset from a synthetic data pipeline. The preview checkpoint only used ~250k image-text pairs (~50M tokens).
 The current model is based on Phi-3.5-vision. Smaller models with even stronger performance are currently being trained or tested.

 **Running the model in 4-bit only requires ~2.8GB VRAM to load and exhibits little to no degradation.**
+## Use Case (Important!)
+**This model is NOT designed to perform OCR on full pages.** For full-page OCR, please consider pairing it with **TFT-ID-1.0**[[HF]](https://huggingface.co/yifeihu/TFT-ID-1.0), a text/table/figure detection model, to locate the text blocks first. It's also faster to split larger text blocks into smaller ones and perform OCR on them in parallel (batch inference).
 ![image/png](https://huggingface.co/yifeihu/TB-OCR-preview-0.1/resolve/main/tb-ocr-cover.png)
 ## About this preview checkpoint
+This is a preview model to verify the quality of a dataset from a synthetic data pipeline. The preview checkpoint only used \~250k image-text pairs (\~50M tokens).
 The current model is based on Phi-3.5-vision. Smaller models with even stronger performance are currently being trained or tested.
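
The new "Use Case" note suggests splitting larger text blocks into smaller ones before batch inference. A minimal sketch of that splitting step follows. It is not part of the README or the commit, and `split_block`, `max_height`, and `overlap` are hypothetical names chosen for illustration. It only computes crop boxes in the `(left, upper, right, lower)` layout that Pillow's `Image.crop` accepts; running the model on the resulting crops is left out.

```python
from typing import List, Tuple

# (left, upper, right, lower): the box layout accepted by PIL's Image.crop
Box = Tuple[int, int, int, int]


def split_block(width: int, height: int, max_height: int, overlap: int = 0) -> List[Box]:
    """Split a text block into horizontal strips of at most max_height pixels.

    Hypothetical helper: consecutive strips share `overlap` pixels so that a
    text line cut at one strip boundary may still appear whole in the next strip.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if max_height <= 0 or not 0 <= overlap < max_height:
        raise ValueError("need max_height > 0 and 0 <= overlap < max_height")

    boxes: List[Box] = []
    top = 0
    while True:
        bottom = min(top + max_height, height)
        boxes.append((0, top, width, bottom))
        if bottom >= height:
            break
        # Step back by `overlap` so adjacent strips share some rows.
        top = bottom - overlap
    return boxes


if __name__ == "__main__":
    for box in split_block(width=800, height=250, max_height=100, overlap=20):
        print(box)
```

Each box could then be passed to `Image.crop` and the crops sent to the model as one batch. Fixed-height cuts can slice through a line of text, so the strip height and overlap are assumptions that would need tuning against real blocks.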