---
library_name: hunyuan-dit
license: other
license_name: tencent-hunyuan-community
license_link: https://huggingface.co/Tencent-Hunyuan/HunyuanDiT/blob/main/LICENSE.txt
language:
- en
- zh
---
# HunyuanDiT Distillation Acceleration
Language: **English** | [**δΈ­ζ–‡**](https://huggingface.co/Tencent-Hunyuan/Distillation/blob/main/README_zh.md)
We provide a distilled version of HunyuanDiT to accelerate inference.
Using a progressive distillation method, we speed up HunyuanDiT by a factor of two without any performance drop. The distilled model halves the inference time under every inference mode.
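The idea behind progressive distillation can be illustrated with a toy sketch (plain Python, not the actual HunyuanDiT training code): a one-step student is fitted to reproduce the output of two teacher steps, which is how the step count is halved. The dynamics and parameterization below are invented for illustration only.

```python
# Toy sketch of progressive distillation (illustrative only).
# The "teacher" applies two small denoising steps; a one-parameter "student"
# is fitted to match both steps with a single update.

def teacher_step(x: float) -> float:
    """One teacher step: shrink the residual by 10% (toy dynamics)."""
    return 0.9 * x

def teacher_two_steps(x: float) -> float:
    return teacher_step(teacher_step(x))  # overall factor 0.81

# Student: a single multiplicative step x -> s * x, fitted by gradient
# descent on the squared error against the teacher's two-step output.
s = 1.0                       # student parameter, initialised at identity
lr = 0.05
data = [0.5, 1.0, 2.0, -1.5]  # toy training inputs
for _ in range(2000):
    for x in data:
        err = s * x - teacher_two_steps(x)
        s -= lr * 2 * err * x  # d/ds of (s*x - target)^2

print(round(s, 3))  # converges to ~0.81: one student step == two teacher steps
```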
The following table shows the requirements for running the distilled model and its acceleration performance (batch size = 1). We evaluated the acceleration on various GPUs (H800, A100, 3090, and 4090) and under different inference modes.
| GPU | CUDA version | Model | Inference mode | Inference steps | GPU peak memory | Inference time |
| --- | --- | --- | --- | --- | --- | --- |
| H800 | 12.1 | HunyuanDiT | PyTorch | 100 | 13GB | 28s |
| H800 | 12.1 | HunyuanDiT | TensorRT | 100 | 12GB | 10s |
| H800 | 12.1 | HunyuanDiT | Distill+PyTorch | 50 | 13GB | 14s |
| H800 | 12.1 | HunyuanDiT | Distill+TensorRT | 50 | 12GB | 5s |
| A100 | 11.7 | HunyuanDiT | PyTorch | 100 | 13GB | 54s |
| A100 | 11.7 | HunyuanDiT | TensorRT | 100 | 11GB | 20s |
| A100 | 11.7 | HunyuanDiT | Distill+PyTorch | 50 | 13GB | 25s |
| A100 | 11.7 | HunyuanDiT | Distill+TensorRT | 50 | 11GB | 10s |
| 3090 | 11.8 | HunyuanDiT | PyTorch | 100 | 14GB | 98s |
| 3090 | 11.8 | HunyuanDiT | TensorRT | 100 | 14GB | 40s |
| 3090 | 11.8 | HunyuanDiT | Distill+PyTorch | 50 | 14GB | 49s |
| 3090 | 11.8 | HunyuanDiT | Distill+TensorRT | 50 | 14GB | 20s |
| 4090 | 11.8 | HunyuanDiT | PyTorch | 100 | 14GB | 54s |
| 4090 | 11.8 | HunyuanDiT | TensorRT | 100 | 14GB | 20s |
| 4090 | 11.8 | HunyuanDiT | Distill+PyTorch | 50 | 14GB | 27s |
| 4090 | 11.8 | HunyuanDiT | Distill+TensorRT | 50 | 14GB | 10s |
Basically, the requirements for running the distilled model are the same as for the original model.
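As a quick sanity check, the ~2x speedup can be read directly off the table; for example, the A100 rows (timings copied from the table above):

```python
# A100 timings from the table above, in seconds (batch size = 1).
timings = {
    ("PyTorch", 100): 54,           # baseline
    ("Distill+PyTorch", 50): 25,    # distilled, half the steps
    ("TensorRT", 100): 20,
    ("Distill+TensorRT", 50): 10,
}

torch_speedup = timings[("PyTorch", 100)] / timings[("Distill+PyTorch", 50)]
trt_speedup = timings[("TensorRT", 100)] / timings[("Distill+TensorRT", 50)]
print(f"{torch_speedup:.2f}x, {trt_speedup:.2f}x")  # 2.16x, 2.00x
```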
## Instructions
The dependencies and installation are basically the same as the [**original model**](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT).
Then download the model using the following commands:
```bash
cd HunyuanDiT
# Use the huggingface-cli tool to download the model.
huggingface-cli download Tencent-Hunyuan/Distillation ./pytorch_model_distill.pt --local-dir ./ckpts/t2i/model
```
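If you prefer to script the download, the same file can be fetched from Python with the `huggingface_hub` library (`pip install huggingface_hub`). This is a sketch mirroring the CLI command above; the helper name is our own, and the local directory is the path the repo's shell command targets.

```python
# Programmatic alternative to the huggingface-cli command above.
from pathlib import Path

REPO_ID = "Tencent-Hunyuan/Distillation"
FILENAME = "pytorch_model_distill.pt"
LOCAL_DIR = Path("./ckpts/t2i/model")  # where HunyuanDiT expects the checkpoint

def fetch_distill_checkpoint() -> Path:
    """Download the distilled checkpoint and return its local path."""
    from huggingface_hub import hf_hub_download  # imported lazily
    hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=str(LOCAL_DIR))
    return LOCAL_DIR / FILENAME
```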
## Inference
### Using Gradio
Make sure you have activated the conda environment before running the following command.
```shell
# By default, we start a Chinese UI.
python app/hydit_app.py --load-key distill
# Using Flash Attention for acceleration.
python app/hydit_app.py --infer-mode fa --load-key distill
# You can disable the enhancement model if the GPU memory is insufficient.
# The enhancement will be unavailable until you restart the app without the `--no-enhance` flag.
python app/hydit_app.py --no-enhance --load-key distill
# Start with English UI
python app/hydit_app.py --lang en --load-key distill
```
### Using Command Line
We provide several commands for a quick start:
```shell
# Prompt Enhancement + Text-to-Image. Torch mode
python sample_t2i.py --prompt "ζΈ”θˆŸε”±ζ™š" --load-key distill --infer-steps 50
# Only Text-to-Image. Torch mode
python sample_t2i.py --prompt "ζΈ”θˆŸε”±ζ™š" --no-enhance --load-key distill --infer-steps 50
# Only Text-to-Image. Flash Attention mode
python sample_t2i.py --infer-mode fa --prompt "ζΈ”θˆŸε”±ζ™š" --load-key distill --infer-steps 50
# Generate an image with other image sizes.
python sample_t2i.py --prompt "ζΈ”θˆŸε”±ζ™š" --image-size 1280 768 --load-key distill --infer-steps 50
```
More example prompts can be found in [example_prompts.txt](example_prompts.txt).
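To render every prompt in that file, the commands above can be composed programmatically. This is a hypothetical batch helper (the `build_cmd`/`run_batch` names are ours); the flag names match the commands shown in this README, and it should be run from the HunyuanDiT repo root.

```python
# Hypothetical batch helper: build and run the sample_t2i.py command line
# shown above for each prompt in a file.
import shlex
import subprocess

def build_cmd(prompt, steps=50, infer_mode=None, image_size=None, enhance=True):
    """Compose the argv for one sample_t2i.py invocation."""
    cmd = ["python", "sample_t2i.py", "--prompt", prompt,
           "--load-key", "distill", "--infer-steps", str(steps)]
    if not enhance:
        cmd.append("--no-enhance")
    if infer_mode:
        cmd += ["--infer-mode", infer_mode]
    if image_size:
        cmd += ["--image-size", str(image_size[0]), str(image_size[1])]
    return cmd

def run_batch(prompt_file="example_prompts.txt"):
    """Run one generation per non-empty line of the prompt file."""
    with open(prompt_file, encoding="utf-8") as f:
        for line in f:
            prompt = line.strip()
            if prompt:
                subprocess.run(build_cmd(prompt, enhance=False), check=True)

print(shlex.join(build_cmd("a misty mountain lake", image_size=(1280, 768))))
```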