Model
Baseline LLaVA model for MEGL-Action
Usage
The model is fine-tuned with LoRA on the Action dataset, so it must be loaded with PEFT on top of the base LLaVA checkpoint.
from transformers import LlavaForConditionalGeneration
from peft import PeftModel

# Load the base LLaVA-1.5-7B checkpoint
base_model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",
    device_map="auto"
)

# Attach the LoRA adapter fine-tuned on the Action dataset
model = PeftModel.from_pretrained(
    base_model,
    "TnTerry/MEGL-LLaVA-Baseline-Action",
    device_map="auto"
)
To run inference with this model, follow the official Hugging Face guidance for LLaVA inference.
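The snippet below assumes that `image` and `prompt` are already defined. As a rough sketch, they could be prepared as follows; the image path and question are placeholders, and the prompt uses the standard LLaVA-1.5 chat format.

from PIL import Image

# Placeholder image and question -- replace with your own data
image = Image.open("example.jpg")
question = "What action is being performed in the image?"

# LLaVA-1.5 expects the "USER: <image>\n... ASSISTANT:" prompt format
prompt = f"USER: <image>\n{question} ASSISTANT:"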
from transformers import AutoProcessor

# The base model's processor prepares both the image and the text prompt
processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40)
decoded_output = processor.batch_decode(
    output, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
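Since generate returns the prompt tokens together with the newly generated ones, the decoded string still contains the prompt text. One simple (assumed) way to extract just the answer is to split on the "ASSISTANT:" marker:

# The decoded string includes the prompt; the answer follows "ASSISTANT:"
answer = decoded_output.split("ASSISTANT:")[-1].strip()
print(answer)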