metadata

tags:
  - text-to-image
  - stable-diffusion
  - audio-to-video
license: apache-2.0
language:
  - en
library_name: diffusers

V-Express Model Card

Project Page | Paper | Code

Introduction

Models

Audio Encoder

model_ckpts/wav2vec2-base-960h. (Also available from the original model card, facebook/wav2vec2-base-960h.)

Face Analysis

model_ckpts/insightface_models/models/buffalo_l. (Also available from the original repository, insightface/buffalo_l.)

V-Express

model_ckpts/sd-vae-ft-mse. VAE encoder. (original model card stabilityai/sd-vae-ft-mse)
model_ckpts/stable-diffusion-v1-5. Only the UNet model configuration file is needed here. (original model card stabilityai/stable-diffusion-v1-5 is not the source; see runwayml/stable-diffusion-v1-5)
model_ckpts/v-express. The video generation model itself, conditioned on audio and V-kps, which we call V-Express.
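
The directory layout listed above can be checked before running inference. The following is a minimal sketch, not part of the V-Express code base: the helper name `missing_checkpoints` and the constant `EXPECTED_DIRS` are hypothetical, and the check only confirms that each directory exists, not that the files inside it are complete.

```python
from pathlib import Path

# Sub-directories named in this model card, relative to the checkpoint root.
# EXPECTED_DIRS and missing_checkpoints are illustrative names (assumptions),
# not identifiers from the V-Express repository.
EXPECTED_DIRS = (
    "wav2vec2-base-960h",                   # audio encoder
    "insightface_models/models/buffalo_l",  # face analysis
    "sd-vae-ft-mse",                        # VAE
    "stable-diffusion-v1-5",                # UNet configuration only
    "v-express",                            # V-Express weights
)


def missing_checkpoints(root="model_ckpts"):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoint directories:")
        for name in missing:
            print(f"  model_ckpts/{name}")
    else:
        print("All checkpoint directories are present.")
```

Run from the directory that contains model_ckpts, the script lists any directory that still needs to be downloaded. One way to fetch the files, assuming the huggingface_hub package is installed and that this repository is published as tk93/V-Express, would be `snapshot_download(repo_id="tk93/V-Express", local_dir=".")`.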