---
license: mit
---
Modality | LoRA tuning | Fine-tuning |
---|---|---|
Video | LanguageBind_Video | LanguageBind_Video_FT |
Audio | LanguageBind_Audio | LanguageBind_Audio_FT |
Depth | LanguageBind_Depth | - |
Thermal | LanguageBind_Thermal | - |

Version | Tuning | Model size | Num_frames | HF Link | MSR-VTT | DiDeMo | ActivityNet | MSVD |
---|---|---|---|---|---|---|---|---|
LanguageBind_Video | LoRA | Large | 8 | Link | 42.6 | 37.8 | 35.1 | 52.2 |
LanguageBind_Video_FT | Full-tuning | Large | 8 | Link | 42.7 | 38.1 | 36.9 | 53.5 |
LanguageBind_Video_V1.5_FT | Full-tuning | Large | 8 | Link | 42.8 | 39.7 | 38.4 | 54.1 |
LanguageBind_Video_V1.5_FT | Full-tuning | Large | 12 | Coming soon | - | - | - | - |
LanguageBind_Video_Huge_V1.5_FT | Full-tuning | Huge | 8 | Link | 44.8 | 39.9 | 41.0 | 53.7 |
LanguageBind_Video_Huge_V1.5_FT | Full-tuning | Huge | 12 | Coming soon | - | - | - | - |