---
license: cc-by-nc-sa-4.0
---

This repository contains the NExT-GPT weights covering the text, image, video, and audio modalities (`tiva`). NExT-GPT is built upon:

1. [Vicuna-7B](https://huggingface.co/lmsys/vicuna-7b-delta-v0), delta weights, version `v0`.
2. [ImageBind](https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth), the `imagebind_huge` checkpoint.
3. [Stable Diffusion](https://huggingface.co/runwayml/stable-diffusion-v1-5), version `v1-5`.
4. [AudioLDM](https://github.com/haoheliu/AudioLDM), version `l-full`.
5. [ZeroScope](https://huggingface.co/cerspense/zeroscope_v2_576w), version `v2_576w`.
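
As a rough illustration of how the base components listed above might be gathered, the following sketch separates the entries hosted on the Hugging Face Hub from those that need a manual download. The helper names (`hub_repo_id`, `fetch_hub_components`) and the `ckpt/<name>` directory layout are assumptions made for this example, not part of NExT-GPT; the code repository defines the layout the model actually expects.

```python
from urllib.parse import urlparse

# Component sources copied from the list above.
COMPONENTS = {
    "vicuna": "https://huggingface.co/lmsys/vicuna-7b-delta-v0",
    "imagebind": "https://dl.fbaipublicfiles.com/imagebind/imagebind_huge.pth",
    "stable_diffusion": "https://huggingface.co/runwayml/stable-diffusion-v1-5",
    "audioldm": "https://github.com/haoheliu/AudioLDM",
    "zeroscope": "https://huggingface.co/cerspense/zeroscope_v2_576w",
}


def hub_repo_id(url):
    """Return 'owner/name' for a huggingface.co model URL, else None."""
    parsed = urlparse(url)
    if parsed.netloc != "huggingface.co":
        return None
    parts = [p for p in parsed.path.split("/") if p]
    return "/".join(parts[:2]) if len(parts) >= 2 else None


def fetch_hub_components(target_dir="ckpt"):
    """Download Hub-hosted components; target_dir layout is hypothetical."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    for name, url in COMPONENTS.items():
        repo_id = hub_repo_id(url)
        if repo_id is None:
            # ImageBind and AudioLDM are not Hub repos in the list above.
            print(f"skip {name}: fetch manually from {url}")
            continue
        snapshot_download(repo_id=repo_id, local_dir=f"{target_dir}/{name}")
```

Calling `fetch_hub_components()` would download the Vicuna, Stable Diffusion, and ZeroScope repositories, provided they are still available on the Hub, and print a reminder for the other two. Note that the Vicuna entry is a set of delta weights, so it still has to be combined with the original LLaMA weights before use.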

For details on using the model, please refer to our [code repository](https://github.com/NExT-GPT/NExT-GPT).