Fine-Tuned LLaVA Model
This repository hosts the fine-tuned LLaVA model files, adapted for parsing receipt images and extracting their contents as structured JSON. The model was fine-tuned on the CORD-v2 dataset.
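CORD-v2 pairs each receipt image with a nested JSON annotation. Purely for illustration (the values below are made up, though the field names follow CORD conventions such as `nm` for item name and `cnt` for quantity), the model is trained to emit structures like:

```python
# Illustrative only: a CORD-style nested structure the model is trained to emit.
example_output = {
    "menu": [
        {"nm": "ICED LATTE", "cnt": "1", "price": "4.50"},
        {"nm": "BAGEL", "cnt": "2", "price": "3.00"},
    ],
    "total": {"total_price": "7.50"},
}
```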
Model Details
Model Versions
- LLaVA 1.6 Mistral 7B, fine-tuned on the CORD-v2 dataset
How to Use
You can load and use this model directly from the Hugging Face Hub with the `transformers` library. Below is an example of how to load the model and run inference:
```python
import io

import torch
from PIL import Image
from transformers import AutoProcessor, BitsAndBytesConfig, LlavaNextForConditionalGeneration

MODEL_ID = "llava-hf/llava-v1.6-mistral-7b-hf"
REPO_ID = "abhisheksinghrathore/Finetuned-Llava"

# The processor (tokenizer + image preprocessor) comes from the base model.
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Load the fine-tuned weights with 4-bit NF4 quantization to reduce memory use.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.float16
)
model = LlavaNextForConditionalGeneration.from_pretrained(
    REPO_ID,
    torch_dtype=torch.float16,
    quantization_config=quantization_config,
)

# image_bytes holds the raw bytes of a receipt image, e.g. read from disk or a request.
image = Image.open(io.BytesIO(image_bytes))

# Prepare the input in the Mistral instruction format expected by LLaVA 1.6.
prompt = "[INST] <image>\nExtract JSON [/INST]"
max_output_tokens = 256
inputs = processor(text=prompt, images=image, return_tensors="pt").to("cuda:0")

output = model.generate(**inputs, max_new_tokens=max_output_tokens)
response = processor.decode(output[0], skip_special_tokens=True)
# The decoded string echoes the prompt, so keep only the text after [/INST].
response = response.split("[/INST]")[-1].strip()

# Convert the generated token sequence back to JSON.
generated_json = token2json(response)
```
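Note that `token2json` is a helper, not part of `transformers`: the fine-tuned model presumably emits Donut-style tag sequences such as `<s_total_price>7.50</s_total_price>`, and `token2json` converts them back into a Python dict. The actual helper is defined in the linked GitHub repository; below is a minimal, simplified sketch (it ignores edge cases such as repeated groups inside nested fields):

```python
import re


def token2json(tokens: str) -> dict:
    """Minimal sketch: parse <s_key>...</s_key> spans into a nested dict."""
    output = {}
    while True:
        start = re.search(r"<s_(.*?)>", tokens)
        if start is None:
            break
        key = start.group(1)
        end = re.search(re.escape(f"</s_{key}>"), tokens)
        if end is None:
            break
        content = tokens[start.end():end.start()]
        if "<s_" in content:
            # Nested tags: recurse into the enclosed span.
            output[key] = token2json(content)
        else:
            # Leaf value; repeated values may be joined with <sep/>.
            parts = [p.strip() for p in content.split("<sep/>")]
            output[key] = parts if len(parts) > 1 else parts[0]
        tokens = tokens[end.end():]
    return output


print(token2json("<s_total><s_total_price>7.50</s_total_price></s_total>"))
# -> {'total': {'total_price': '7.50'}}
```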
To see the fine-tuning process and the training configuration, please visit the GitHub repository linked below.
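As a rough, hedged illustration of what QLoRA-style fine-tuning with the `peft` library generally looks like (the rank, target modules, and other values below are assumptions for the sketch, not the repository's actual configuration):

```python
# Hypothetical sketch only -- see the GitHub repository for the real setup.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                                  # assumed LoRA rank
    lora_alpha=16,                        # assumed scaling factor
    target_modules=["q_proj", "v_proj"],  # assumed: attention projections only
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # wraps the 4-bit base model from above
model.print_trainable_parameters()
```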
Additional Resources
- GitHub Repository for Fine-Tuning LLaVA
- A link to a YouTube video will be added here soon to provide further insights and demonstrations.