Daemontatox
/

PixelParse_AI

Image-Text-to-Text

text-generation-inference

vision-language

document-understanding

data-extraction

Inference Endpoints

Model card Files Files and versions Community

Daemontatox commited on 25 days ago

Commit

5377bb4

·

verified ·

1 Parent(s): df06cb6

Update README.md

Files changed (1) hide show

README.md +13 -6

README.md CHANGED Viewed

@@ -10,12 +10,19 @@ language:
 - en
 ---
-# Uploaded finetuned  model
-- **Developed by:** Daemontatox
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/Llama-3.2-11B-Vision-Instruct
-This mllama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 - en
 ---
+# Vision-Language Model for Document Data Extraction
+- **Developed by:** Daemontatox
+- **License:** apache-2.0
+- **Finetuned from model:** unsloth/Llama-3.2-11B-Vision-Instruct
+This Vision-Language Model (VLM) is fine-tuned for extracting structured data from diverse document types such as invoices, timesheets, and forms. Leveraging the capabilities of [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library, the model achieves fast and efficient training with superior accuracy for document understanding tasks.
+### Features:
+- Extracts structured JSON data from images of documents.
+- Handles diverse formats, including invoices, timesheets, and forms.
+- Optimized for semantic accuracy in key fields such as dates, amounts, and itemized details.
+This fine-tuned model was trained twice as fast using Unsloth’s advanced optimization techniques, ensuring high performance with reduced computational overhead.
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)