Plant_Disease_SWIN_BLIP2_Llama3.2_3B

This is a fine-tuned BLIP-2 model that integrates a Swin Transformer for vision and Llama 3.2 (3B) for language generation. It is optimized for plant disease-related visual question answering (VQA) tasks.

Model Overview

Inference

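You can run inference with the Hugging Face Transformers library. Because the vision encoder is a Swin Transformer rather than the default BLIP-2 vision model, the example defines a small subclass of Blip2ForConditionalGeneration that swaps in SwinModel before loading the weights.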

Example Inference Code

from transformers import Blip2Processor, Blip2ForConditionalGeneration, SwinModel
from PIL import Image


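class CustomBlip2ForConditionalGeneration(Blip2ForConditionalGeneration):
    """BLIP-2 with the default vision encoder replaced by a Swin Transformer."""

    def __init__(self, config):
        super().__init__(config)
        # Replace the stock BLIP-2 vision model with Swin, built from the same vision config
        self.vision_model = SwinModel(config.vision_config)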

# Load the processor and model
processor = Blip2Processor.from_pretrained("raghavendrad60/Plant_Disease_SWIN_BLIP2_Llama3.2_3B")
model = CustomBlip2ForConditionalGeneration.from_pretrained("raghavendrad60/Plant_Disease_SWIN_BLIP2_Llama3.2_3B")

# Prepare an image and text input (e.g., a plant image and a relevant question)
image = Image.open("path_to_your_image.jpg").convert("RGB")  # force 3 channels (handles grayscale/RGBA files)
text = "Q) Name plant and disease."

# Process the inputs
inputs = processor(image, text, return_tensors="pt", padding="max_length", max_length=512, truncation=True)

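# Generate output token IDs. A plain forward pass (model(**inputs)) returns logits,
# not token IDs, so use generate() before decoding.
# max_new_tokens=64 is an illustrative limit; adjust it to the answer length you expect.
outputs = model.generate(**inputs, max_new_tokens=64)
answer = processor.tokenizer.decode(outputs[0], skip_special_tokens=True)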
print("Answer:", answer)
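
Asking Several Questions About One Image

The inference steps above can be wrapped in a small helper when you want to ask more than one question about the same image. This is only a sketch: the function name ask_questions and the default max_new_tokens=64 are illustrative assumptions, not part of the original model card. It expects the processor and model objects loaded as in the example above and processes one question at a time, so no padding settings are needed.

```python
def ask_questions(model, processor, image, questions, max_new_tokens=64):
    """Return a dict mapping each question to the model's decoded answer.

    Hypothetical helper: `model` and `processor` are the objects loaded in
    the inference example; `image` is a PIL image.
    """
    answers = {}
    for question in questions:
        # Encode the image together with a single question
        inputs = processor(images=image, text=question, return_tensors="pt")
        # generate() returns token IDs that can be decoded into text
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        # batch_decode returns one string per sequence; take the first
        text = processor.batch_decode(output_ids, skip_special_tokens=True)[0]
        answers[question] = text.strip()
    return answers
```

For example, ask_questions(model, processor, image, ["Q) Name plant and disease."]) would return a dictionary with that question as its only key. Depending on the Transformers version, the decoded text may or may not include the prompt, so check the output before parsing it further.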

Additional Information

  • Training Code:
    The model was trained using the code available in the Custom-BLIP-2 GitHub repository.

  • Usage:
    This model is intended for research use on plant disease detection and related VQA tasks. It pairs a Swin Transformer vision encoder with the Llama 3.2 (3B) language model to generate answers about a plant image.

Links


Feel free to experiment with the model and share your feedback!
