Prince Ciel Phantomhive HunyuanVideo LoRA

This repository contains the setup instructions and scripts for generating videos with the HunyuanVideo model and a LoRA (Low-Rank Adaptation) fine-tuned on Ciel Phantomhive. The sections below cover installing dependencies, downloading the models, and running the demo.


Installation

Step 1: Install System Dependencies

Run the following command to install required system packages:

sudo apt-get update && sudo apt-get install git-lfs ffmpeg cbm

Step 2: Clone the Repository

Clone the repository and navigate to the project directory:

git clone https://huggingface.co/svjack/Prince_Ciel_Phantomhive_HunyuanVideo_lora
cd Prince_Ciel_Phantomhive_HunyuanVideo_lora

Step 3: Install Python Dependencies

Create a Python 3.10 conda environment, register it as a Jupyter kernel, and install the required Python packages:

conda create -n py310 python=3.10
conda activate py310
pip install ipykernel
python -m ipykernel install --user --name py310 --display-name "py310"

pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub
pip install moviepy==1.0.3
pip install sageattention==1.0.6

pip install torch==2.5.0 torchvision
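
Because several packages above are pinned to exact versions, it can help to confirm what actually ended up in the environment. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `check_pins` are hypothetical, and only the three pins that appear in the commands above are checked.

```python
from importlib import metadata

# Pins copied from the pip commands above.
PINNED = {
    "torch": "2.5.0",
    "moviepy": "1.0.3",
    "sageattention": "1.0.6",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that does not match."""
    problems = {}
    for package, expected in pins.items():
        found = lookup(package)
        # Ignore a local build suffix such as "+cu121" when comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            problems[package] = (expected, found)
    return problems


if __name__ == "__main__":
    for package, (expected, found) in check_pins(PINNED).items():
        print(f"{package}: expected {expected}, found {found}")
```

Run it inside the py310 environment. No output means all three pins match.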

Download Models

Step 1: Download HunyuanVideo Model

Download the HunyuanVideo model and place it in the ckpts directory:

huggingface-cli download tencent/HunyuanVideo --local-dir ./ckpts

Step 2: Download LLaVA Model

Download the LLaVA model into ckpts, then run the preprocessing script to write the text encoder to ckpts/text_encoder:

cd ckpts
huggingface-cli download xtuner/llava-llama-3-8b-v1_1-transformers --local-dir ./llava-llama-3-8b-v1_1-transformers
wget https://raw.githubusercontent.com/Tencent/HunyuanVideo/refs/heads/main/hyvideo/utils/preprocess_text_encoder_tokenizer_utils.py
python preprocess_text_encoder_tokenizer_utils.py --input_dir llava-llama-3-8b-v1_1-transformers --output_dir text_encoder

Step 3: Download CLIP Model

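Download the CLIP model, which serves as the second text encoder. Run this from inside the ckpts directory (where the previous step left you) so that the model lands in ckpts/text_encoder_2, the path the demo commands expect. Return to the project root before running the demo: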

huggingface-cli download openai/clip-vit-large-patch14 --local-dir ./text_encoder_2
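
Since the downloads are large and split across several commands, a quick check that everything landed where the demo expects can save a failed run. This is a minimal sketch under stated assumptions: the helper `missing_paths` is hypothetical, and the path list is simply copied from the demo commands later in this README, relative to the project root.

```python
from pathlib import Path

# Paths copied from the demo commands in this README, relative to the project root.
EXPECTED = [
    "ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt",
    "ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt",
    "ckpts/text_encoder",
    "ckpts/text_encoder_2",
    "Ciel_im_lora_dir/Ciel_single_im_lora-000030.safetensors",
]


def missing_paths(root=".", expected=EXPECTED):
    """Return the expected files or directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths():
        print(f"missing: {rel}")
```

Run it from the project root. It only checks that the paths exist, not that the downloads are complete.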

Demo

Generate Video 1: Ciel Phantomhive

Run the following command to generate a video of Ciel Phantomhive:

python hv_generate_video.py \
    --fp8 \
    --video_size 544 960 \
    --video_length 60 \
    --infer_steps 30 \
    --prompt "Ciel Phantomhive, depicted in a semi-realistic art style. Ciel has short, silver hair with bangs, and an eyepatch over his right eye. He wears a black military-style uniform with white accents, including a high-collared shirt and a belt with a buckle. His expression is stern and focused. The background is a soft, pastel purple, contrasting with the darker tones of his outfit. The image has a clean, polished look with smooth shading and attention to detail in the uniform's textures and folds." \
    --save_path . \
    --output_type both \
    --dit ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt \
    --attn_mode sdpa \
    --vae ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt \
    --vae_chunk_size 32 \
    --vae_spatial_tile_sample_min_size 128 \
    --text_encoder1 ckpts/text_encoder \
    --text_encoder2 ckpts/text_encoder_2 \
    --seed 1234 \
    --lora_multiplier 1.0 \
    --lora_weight Ciel_im_lora_dir/Ciel_single_im_lora-000030.safetensors

Generate Video 2: Ciel Phantomhive Rain

Run the following command to generate a video of Ciel Phantomhive in the rain:

python hv_generate_video.py \
    --fp8 \
    --video_size 544 960 \
    --video_length 60 \
    --infer_steps 30 \
    --prompt "Ciel Phantomhive, depicted in a semi-realistic art style, stands amidst the bustling, rain-soaked streets of a city. Ciel has short, silver hair with bangs, and an eyepatch over his right eye. He wears a black military-style uniform with white accents, including a high-collared shirt and a belt with a buckle, the fabric slightly damp from the drizzle. His expression is stern and focused, as if undeterred by the chaotic surroundings. The background is a moody blend of gray skies and shimmering reflections from the wet pavement, with streaks of rain adding a dynamic texture. Neon lights from nearby buildings cast a faint glow, contrasting with the darker tones of his outfit. The image has a clean, polished look, with smooth shading and meticulous attention to detail in the uniform's textures and folds, emphasizing Ciel's commanding presence in the midst of the urban downpour." \
    --save_path . \
    --output_type both \
    --dit ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt \
    --attn_mode sdpa \
    --vae ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt \
    --vae_chunk_size 32 \
    --vae_spatial_tile_sample_min_size 128 \
    --text_encoder1 ckpts/text_encoder \
    --text_encoder2 ckpts/text_encoder_2 \
    --seed 1234 \
    --lora_multiplier 1.0 \
    --lora_weight Ciel_im_lora_dir/Ciel_single_im_lora-000030.safetensors
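
The two commands above differ only in their prompt, so the shared flags can be kept in one place when scripting several generations. The following is a sketch rather than part of this repository: `build_command` and `generate` are hypothetical helpers, the flag values are copied from the commands above, and it assumes the script accepts its flags in any order.

```python
import subprocess
import sys

# Flags shared by both demo commands in this README.
COMMON_ARGS = [
    "--fp8",
    "--video_size", "544", "960",
    "--video_length", "60",
    "--infer_steps", "30",
    "--save_path", ".",
    "--output_type", "both",
    "--dit", "ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt",
    "--attn_mode", "sdpa",
    "--vae", "ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt",
    "--vae_chunk_size", "32",
    "--vae_spatial_tile_sample_min_size", "128",
    "--text_encoder1", "ckpts/text_encoder",
    "--text_encoder2", "ckpts/text_encoder_2",
    "--lora_multiplier", "1.0",
    "--lora_weight", "Ciel_im_lora_dir/Ciel_single_im_lora-000030.safetensors",
]


def build_command(prompt, seed=1234):
    """Return the argument list for one hv_generate_video.py run."""
    return [sys.executable, "hv_generate_video.py", *COMMON_ARGS,
            "--seed", str(seed), "--prompt", prompt]


def generate(prompt, seed=1234):
    """Run one generation from the project root; raises if the script fails."""
    subprocess.run(build_command(prompt, seed), check=True)
```

Calling `generate(prompt)` with either prompt above should be equivalent to the corresponding shell command. Passing a different `seed` varies the result while keeping every other setting fixed.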


Notes

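  • Video generation needs a GPU with enough memory for the chosen settings. The demo commands pass --fp8, which is intended to reduce memory use.
  • Adjust --video_size (resolution), --video_length (number of frames), and --infer_steps (number of sampling steps) to trade output quality and length against generation time and memory.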
  • The --prompt parameter can be modified to generate videos with different scenes or actions.
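
To see what GPU memory is available before starting a run, the driver's nvidia-smi tool can be queried from Python. This README does not state a memory requirement, so the sketch below only reports what is present. The helpers `parse_gpu_csv` and `query_gpus` are hypothetical, and the sketch assumes an NVIDIA GPU whose driver reports numeric memory values.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV row per GPU, with memory in MiB and no unit suffix.
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total,memory.free",
         "--format=csv,noheader,nounits"]


def parse_gpu_csv(text):
    """Parse nvidia-smi CSV rows into dicts; memory values are in MiB."""
    gpus = []
    for line in text.strip().splitlines():
        # Split from the right so a comma inside the GPU name cannot break parsing.
        name, total, free = [field.strip() for field in line.rsplit(",", 2)]
        gpus.append({"name": name, "total_mib": int(total), "free_mib": int(free)})
    return gpus


def query_gpus():
    """Return one dict per visible NVIDIA GPU, or [] if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(f"{gpu['name']}: {gpu['free_mib']} MiB free of {gpu['total_mib']} MiB")
```

If a run fails with an out-of-memory error, comparing the free memory reported here against smaller --video_size or --video_length values is a reasonable first step.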
