fudan-generative-ai/hallo · Generation time

Jun 16


You mentioned "every 10 seconds of generation takes ~1 minute". Does that mean 10 seconds of audio takes 1 minute to generate?
Or does it mean it runs 6x slower on an L4?

When Hallo was first released I created my own private Space and tried it on several different types of hardware, and the time it took was insane, so I figured you had perhaps fixed something. But I'm running a 512x512 PNG with a 0:09 WAV on your Space duplicated to an L4, and inference finished after around 20 minutes, so some clarity on the message would be appreciated.

Inferencer changed discussion status to closed Jun 16

Inferencer changed discussion status to open Jun 16

johnblues

Jun 17

Same here. I tried 1xL4 and 4xL4. 8-10 second audio with different headshots. I stopped after 10-15 minutes of waiting for it to generate. Much too slow/costly.

SmartMatterAI

Jun 17

Same here. It took over 25 minutes to render 9 seconds with the recommended 1xL4 instance. The instructions had me expecting results much faster, so I'd given up until I saw these comments.
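
To put the numbers reported so far side by side, here is a rough sketch of the arithmetic. It only uses figures quoted in this thread (about 20 minutes and over 25 minutes for roughly 9 seconds of audio, against the advertised ~1 minute per 10 seconds); the function name `slowdown` is just for illustration.

```python
def slowdown(audio_seconds: float, wall_minutes: float) -> float:
    """Seconds of compute spent per second of audio."""
    return wall_minutes * 60 / audio_seconds


# "every 10 seconds of generation takes ~1 minute"
advertised = slowdown(10, 1)

# Figures quoted by posters in this thread (approximate).
reports = {
    "512x512, 0:09 audio, ~20 min on an L4": slowdown(9, 20),
    "9 s audio, 25+ min on 1xL4": slowdown(9, 25),
}

for label, factor in reports.items():
    ratio = factor / advertised
    print(f"{label}: {factor:.0f}x real time, {ratio:.1f}x the advertised rate")
```

By this count the advertised rate is 6x real time, while the reported runs come out at roughly 133x and 167x real time, i.e. more than 20 times slower than the message suggests.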

Inferencer

Jun 17

Well, tbh, I did try a 256x256 image and changed the width and height in ./configs/inference/default.yaml to 256.
This cut the time to about a third of what 512 took,
so I can do a 0:23 audio clip in 9 minutes, which isn't too bad. However, all my outputs are messed up, even though I have tried different source images that should be optimal. I can't share them here, but I can share my 512x512 one, which was a sub-optimal source due to head rotation; it gives you the general idea of how all my vids are ending up. I believe there is an optimization failing to load at the start.
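
For anyone wanting to repeat the 256 change without hand-editing, a minimal sketch follows. It assumes the config contains plain `width: 512` and `height: 512` lines; the exact key layout of default.yaml is an assumption here, the excerpt in the code is hypothetical, and `set_resolution` is a made-up helper, not part of Hallo.

```python
import re


def set_resolution(config_text: str, size: int) -> str:
    """Replace the integer value of every `width:` and `height:` key with `size`."""
    pattern = re.compile(r"^([ \t]*(?:width|height)[ \t]*:[ \t]*)\d+", flags=re.MULTILINE)
    # A function replacement avoids ambiguity between the group reference and the digits.
    return pattern.sub(lambda m: f"{m.group(1)}{size}", config_text)


# Hypothetical excerpt; the real file may be laid out differently.
example = """\
source_image:
  width: 512
  height: 512
"""

print(set_resolution(example, 256))
```

To apply it to the real file, read ./configs/inference/default.yaml as text, pass it through `set_resolution`, and write the result back. Note that this rewrites every `width` and `height` key it finds, so check the file first.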

Inferencer

Jun 18

The UI now seems to clarify the 10 secs thing, and since I'm using 256px to increase speed I can close this now, although something in the app is still broken and causing weird defects.

Inferencer changed discussion status to closed Jun 18