Why do videos get much bigger after processing?

#13
by KurtWoloch - opened

I tried to add sound to some AI-generated videos using this space. These are 5-second videos generated with SkyReels. As downloaded from SkyReels they are 1.5 MB each, but after adding sound and downloading them again they are about 5 MB. Where does this increase in size come from? 5 seconds of sound surely cannot take up 3.5 MB... or is the video decoded and re-encoded in a less efficient way?

Great question!

The short answer is yes, we re-encode.
The problem is that I cannot find a way to combine the input video and the output audio without re-encoding that works reliably for all codecs and still gets both the video and audio durations right. I have hardcoded an output bitrate of 10 Mbps (https://huggingface.co/spaces/hkchengrex/MMAudio/blob/main/mmaudio/data/av_utils.py), so the resulting file can be larger than the input. I might also provide the raw generated audio as a separate output, so that users can combine it with the video themselves, but I don't have the bandwidth for that at the moment.
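As a rough sanity check on the numbers in this thread, here is a minimal sketch in Python. It is not the Space's actual code. The 128 kbps audio bitrate, the helper names, and the file names are assumptions for illustration only. The second half shows one common way to attach an audio track with ffmpeg while copying the video stream unchanged; as noted above, stream copy may not give correct durations for every codec, so treat it as something to try rather than a guaranteed fix.

```python
def size_mb(bitrate_bps: float, seconds: float) -> float:
    """Approximate stream size in megabytes (1 MB = 1,000,000 bytes)."""
    return bitrate_bps * seconds / 8 / 1_000_000


def average_bitrate_mbps(size_megabytes: float, seconds: float) -> float:
    """Average bitrate in Mbps implied by a file size and a duration."""
    return size_megabytes * 8 / seconds


def build_mux_command(video_path: str, audio_path: str, output_path: str) -> list[str]:
    """Build an ffmpeg command that copies the video stream and adds audio.

    Hypothetical helper: it only builds the argument list and does not run ffmpeg.
    """
    return [
        "ffmpeg", "-y",
        "-i", video_path,   # input 0: the original video
        "-i", audio_path,   # input 1: the generated audio
        "-map", "0:v:0",    # take the first video stream from input 0
        "-map", "1:a:0",    # take the first audio stream from input 1
        "-c:v", "copy",     # do not re-encode the video
        "-c:a", "aac",      # encode the audio as AAC
        "-shortest",        # stop at the end of the shorter stream
        output_path,
    ]


if __name__ == "__main__":
    # Upper bound for 5 s of video at the hardcoded 10 Mbps target.
    print(size_mb(10_000_000, 5))
    # Average bitrate of the original 1.5 MB, 5 s download.
    print(average_bitrate_mbps(1.5, 5))
    # 5 s of audio at an assumed 128 kbps.
    print(size_mb(128_000, 5))
    # Placeholder file names; pass the list to subprocess.run to execute it.
    print(" ".join(build_mux_command("input.mp4", "audio.wav", "output.mp4")))
```

Under these assumptions, the audio track accounts for well under 0.1 MB, so almost all of the growth comes from re-encoding the video at a target bitrate several times higher than the original's average.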
