
Add Diffusers weights

#18
by a-r-r-o-w - opened

Thanks for the awesome work! This PR adds the Diffusers-format weights to complete the integration.

Diffusers PR: https://github.com/huggingface/diffusers/pull/10136

Here's the minimal inference code for testing:

import torch
from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel
from diffusers.utils import export_to_video

model_id = "tencent/HunyuanVideo"

# Load the transformer in bfloat16; the rest of the pipeline runs in float16
transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16
)
pipe = HunyuanVideoPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=torch.float16)

# Tiled VAE decoding keeps memory usage manageable when decoding video frames
pipe.vae.enable_tiling()
pipe.to("cuda")

output = pipe(
    prompt="A cat walks on the grass, realistic",
    height=320,
    width=512,
    num_frames=61,
    num_inference_steps=30,
).frames[0]
export_to_video(output, "output.mp4", fps=15)

Once the weights are merged, I will test everything on my end and merge the PR in diffusers if all is good. Please do test on your end as well and let me know if any changes are required. Happy to help with anything, and thank you so much again for empowering the community with the best open-source video model!

a-r-r-o-w changed pull request title from "Upload folder using huggingface_hub" to "Add Diffusers weights"

I tried this branch with the above code, and it runs out of CUDA memory on an RTX 6000 Ada GPU with 48GB of memory.

Is there a way to use Accelerate with this to spread the model across four 48GB GPUs? Thanks.

Thanks @ghunkins, this works (see the sketch below)!

Is there any way to use Accelerate with this to spread the model across four 48GB GPUs? Thanks.
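@ghunkins' suggestion isn't quoted in this thread, but judging from the quantization_config that appears in the snippets below, it was presumably 4-bit bitsandbytes quantization, roughly along these lines:

import torch
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig

# Hypothetical reconstruction: 4-bit NF4 quantization config for the transformer
quantization_config = DiffusersBitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)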

@softwareweaver Yes, you can set device_map="balanced" on the pipeline (to shard all models across multiple GPUs), or device_map="auto" on the transformer (to shard just the transformer). It's also possible to pass a fine-grained dictionary specifying which layer resides on which GPU.
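A minimal sketch of the two options (assuming the same model_id and imports as above; with a device map, skip pipe.to("cuda") and let Accelerate place the modules):

# Option 1: shard every model in the pipeline across the available GPUs
pipe = HunyuanVideoPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="balanced"
)

# Option 2: shard only the transformer, then pass it to the pipeline
transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16, device_map="auto"
)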

Here's the relevant documentation:

Thanks @a-r-r-o-w

Adding a device map to the transformer gave me this error:

transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16,
    revision="refs/pr/18", quantization_config=quantization_config, device_map="auto"
)

NotImplementedError: Currently, device_map is automatically inferred for quantized bitsandbytes models. Support for providing device_map as an input will be added in the future.
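As the error message says, the device map for a bitsandbytes-quantized model is inferred automatically, so a sketch of the same load simply drops the argument (same names as above):

transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16,
    revision="refs/pr/18", quantization_config=quantization_config,
)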

Adding device_map="balanced" to the pipeline loaded fine, but executing the pipeline gave the following error:

pipe = HunyuanVideoPipeline.from_pretrained(
    model_id, transformer=transformer, torch_dtype=torch.float16,
    revision="refs/pr/18", device_map="balanced"
)

File "/home/ash/miniconda3/envs/diffusers/lib/python3.11/site-packages/bitsandbytes/functional.py", line 1989, in gemv_4bit
is_on_gpu([B, A, out, absmax, state.code])
File "/home/ash/miniconda3/envs/diffusers/lib/python3.11/site-packages/bitsandbytes/functional.py", line 469, in is_on_gpu
raise RuntimeError(
RuntimeError: Input tensors need to be on the same GPU, but found the following tensor and device combinations:
[(torch.Size([393216, 1]), device(type='cuda', index=0)), (torch.Size([1, 256]), device(type='cuda', index=3)), (torch.Size([1, 3072]), device(type='cuda', index=3)), (torch.Size([12288]), device(type='cuda', index=0)), (torch.Size([16]), device(type='cuda', index=0))]
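Not a fix confirmed in this thread, but a possible single-GPU alternative to multi-GPU sharding is Diffusers' CPU offloading (a sketch assuming enough CPU RAM and an unquantized transformer; the interaction with bitsandbytes-quantized weights is untested here):

pipe = HunyuanVideoPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16, revision="refs/pr/18"
)
pipe.vae.enable_tiling()
# Each submodule is moved to the GPU only while it is in use
pipe.enable_model_cpu_offload()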

RuntimeError: Failed to import diffusers.pipelines.hunyuan_video.pipeline_hunyuan_video because of the following error (look up to see its traceback):
Failed to import diffusers.models.autoencoders.autoencoder_kl_hunyuan_video because of the following error (look up to see its traceback):
'NoneType' object has no attribute 'start'

@ahuang1900 You need to upgrade diffusers to version 0.32.0 or install from the main branch.
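For example:

pip install -U diffusers
# or, for the latest fixes, install from source:
pip install git+https://github.com/huggingface/diffusers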
