Fine-Tuned LLAVA Model

This repository hosts a fine-tuned LLAVA model adapted to parse receipt images and extract their contents as JSON. The model was fine-tuned on the CORD-v2 dataset.

Model Details

Model Versions

  • LLAVA 1.6 Mistral 7B
    Fine-tuned on the CORD-v2 dataset.

How to Use

You can load and use this model directly from the Hugging Face Hub with the transformers library. Below is an example of how to load the model in 4-bit precision and run it on a receipt image:

import io

import torch
from PIL import Image
from transformers import AutoProcessor, BitsAndBytesConfig, LlavaNextForConditionalGeneration

MODEL_ID = "llava-hf/llava-v1.6-mistral-7b-hf"
REPO_ID = "abhisheksinghrathore/Finetuned-Llava"

processor = AutoProcessor.from_pretrained(MODEL_ID)

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.float16
)
model = LlavaNextForConditionalGeneration.from_pretrained(
    REPO_ID,
    torch_dtype=torch.float16,
    quantization_config=quantization_config,
)

# image_bytes must hold the raw bytes of the receipt image (e.g. read from a file or an upload)
image = Image.open(io.BytesIO(image_bytes))

# Prepare input
prompt = "[INST] <image>\nExtract JSON [/INST]"
max_output_token = 256
# Keyword arguments avoid depending on the positional order of text and images
inputs = processor(text=prompt, images=image, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=max_output_token)
response = processor.decode(output[0], skip_special_tokens=True)

# Convert response to JSON
# Note: token2json is a helper that is not imported in this snippet; define it before running
generated_json = token2json(response)
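
The snippet above calls token2json without defining it. As a rough illustration of that last step, the sketch below assumes the model emits Donut-style tag sequences such as <s_key>value</s_key>, and that the decoded text still contains the prompt up to [/INST]. Both are assumptions rather than facts confirmed by this repository. The names extract_answer and simple_token2json are hypothetical, and the sample receipt string is made up for illustration.

```python
import re


def extract_answer(decoded: str) -> str:
    """Return the text after the last [/INST] marker (or the whole string)."""
    return decoded.split("[/INST]")[-1].strip()


def simple_token2json(tokens: str) -> dict:
    """Parse <s_key>value</s_key> tags into a nested dict.

    Simplified sketch: repeated keys are collected into a list, unclosed
    tags are skipped, and separator tokens such as <sep/> are not handled.
    """
    output = {}
    while tokens:
        start = re.search(r"<s_(.*?)>", tokens)
        if start is None:
            break
        key = start.group(1)
        end_tag = f"</s_{key}>"
        end = tokens.find(end_tag, start.end())
        if end == -1:
            # Unclosed tag: drop the opening tag and keep scanning
            tokens = tokens[start.end():]
            continue
        content = tokens[start.end():end].strip()
        # Recurse when the content itself contains nested tags
        value = simple_token2json(content) if "<s_" in content else content
        if key in output:
            if not isinstance(output[key], list):
                output[key] = [output[key]]
            output[key].append(value)
        else:
            output[key] = value
        tokens = tokens[end + len(end_tag):]
    return output


# Made-up example of a decoded model response
decoded = (
    "[INST] \nExtract JSON [/INST] "
    "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price></s_menu>"
    "<s_total><s_total_price>4.50</s_total_price></s_total>"
)
parsed = simple_token2json(extract_answer(decoded))
print(parsed)
```

If the helper used during fine-tuning differs (for example, it handles <sep/> list separators), prefer that implementation so the parsing matches the training targets.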

To see the fine-tuning process and training configuration, please visit this GitHub repository.

Additional Resources

