---
license: cc-by-nc-sa-4.0
---
This repository contains the weights of NExT-GPT covering the text-image-video-audio (tiva) modalities. The model is built upon the following components:
- 1) [Vicuna-7B](https://huggingface.co/lmsys/vicuna-7b-delta-v0), version `v0` (delta weights).
- 2) [ImageBind](https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth), `imagebind_huge` checkpoint.
- 3) [Stable Diffusion](https://huggingface.co/runwayml/stable-diffusion-v1-5), version `v1-5`.
- 4) [AudioLDM](https://github.com/haoheliu/AudioLDM), version `l-full`.
- 5) [ZeroScope](https://huggingface.co/cerspense/zeroscope_v2_576w), version `v2_576w`.

For more details on using the model, please refer to our [code repository](https://github.com/NExT-GPT/NExT-GPT).
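
The components listed above come from different hosts, so it can help to keep them in a small manifest and separate the ones that live on the Hugging Face Hub from the ones that must be fetched another way. The sketch below is only an illustration: the dictionary keys and the helper names `hub_repo_id` and `hub_components` are hypothetical and are not part of the NExT-GPT code base; the URLs are the ones given in this card.

```python
from urllib.parse import urlparse

# URLs copied from the component list in this card.
# The keys are illustrative labels, not names used by NExT-GPT.
COMPONENTS = {
    "vicuna": "https://huggingface.co/lmsys/vicuna-7b-delta-v0",
    "imagebind": "https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth",
    "stable_diffusion": "https://huggingface.co/runwayml/stable-diffusion-v1-5",
    "audioldm": "https://github.com/haoheliu/AudioLDM",
    "zeroscope": "https://huggingface.co/cerspense/zeroscope_v2_576w",
}


def hub_repo_id(url):
    """Return 'owner/name' for a huggingface.co URL, or None for other hosts."""
    parsed = urlparse(url)
    if parsed.netloc != "huggingface.co":
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None
    return "/".join(parts[:2])


def hub_components(components=COMPONENTS):
    """Map each label to its Hub repo id, skipping components hosted elsewhere."""
    result = {}
    for name, url in components.items():
        repo_id = hub_repo_id(url)
        if repo_id is not None:
            result[name] = repo_id
    return result
```

Each repo id returned by `hub_components()` could then be passed to `huggingface_hub.snapshot_download(repo_id=...)`, assuming the repository is still available on the Hub. The ImageBind checkpoint and AudioLDM are hosted elsewhere and need to be obtained as described in the code repository.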