ZhangYuanhan committed
Commit 9ac63f9 (1 parent: 5a09702)

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -190,7 +190,7 @@ pretrained = "lmms-lab/LLaVA-NeXT-Video-7B-Qwen2"
  model_name = "llava_qwen"
  device = "cuda"
  device_map = "auto"
- tokenizer, model, image_processor, max_length = load_pretrained_model(pretrained, None, model_name, device_map=device_map) # Add any other thing you want to pass in llava_model_args
+ tokenizer, model, image_processor, max_length = load_pretrained_model(pretrained, None, model_name, torch_dtype="bfloat16", device_map=device_map) # Add any other thing you want to pass in llava_model_args
  model.eval()
  video_path = "XXXX"
  max_frames_num = "64"
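
For context, here is a minimal sketch of how the updated loading step might be used end to end. It assumes the `load_pretrained_model` import path from the LLaVA-NeXT codebase (`llava.model.builder`); only the `torch_dtype="bfloat16"` argument is what this commit actually introduces.

```python
# Sketch of the updated loading step, assuming the LLaVA-NeXT package layout
# (llava.model.builder). Requires the llava package and a CUDA-capable GPU.
from llava.model.builder import load_pretrained_model

pretrained = "lmms-lab/LLaVA-NeXT-Video-7B-Qwen2"
model_name = "llava_qwen"
device_map = "auto"

# The commit passes torch_dtype="bfloat16" so the checkpoint weights are
# loaded in bfloat16 instead of the loader's default precision.
tokenizer, model, image_processor, max_length = load_pretrained_model(
    pretrained, None, model_name, torch_dtype="bfloat16", device_map=device_map
)
model.eval()
```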