Tencent-Hunyuan/HunyuanCaptioner

Zhiminli committed on Jun 30

Commit 0b58d50 • 1 Parent(s): 0dae0c9

Update README.md

Files changed (1)

README.md +10 -11

@@ -33,14 +33,14 @@ huggingface-cli download Tencent-Hunyuan/HunyuanCaptioner --local-dir ./ckpts/ca
 ### Inference
-Current supported prompts:
-| Target | Prompt |
-| --- | --- |
-| Caption in Chinese | 描述这张图片  |
-| Caption in Chinese with tags | 根据提示词“{}”,描述这张图片 |
-| Caption in English | Please describe the content of this image |
-|   |   |
 a. Single picture inference in Chinese
@@ -49,7 +49,7 @@ a. Single picture inference in Chinese
 python mllm/caption_demo.py --mode "caption_zh" --image_file "mllm/images/demo1.png" --model_path "./ckpts/captioner"
 ```
-b. Single picture inference with tag in Chinese
 ```bash
 python mllm/caption_demo.py --mode "insert_content" --content "宫保鸡丁" --image_file "mllm/images/demo2.png" --model_path "./ckpts/captioner"
@@ -65,14 +65,13 @@ d. Multiple pictures inference in Chinese
 ```bash
 ### Convert multiple pictures to csv file.
-python mllm/make_csv.py --img_dir "mllm/images" --input_file "mllm/images/demo.csv"
 ### Multiple pictures inference
 python mllm/caption_demo.py --mode "caption_zh" --input_file "mllm/images/demo.csv" --output_file "mllm/images/demo_res.csv" --model_path "./ckpts/captioner"
 ```
-(Optional) To convert the output csv file to Arrow format, please refer to
-[Data Preparation #3](https://github.com/Tencent/HunyuanDiT?tab=readme-ov-file#data-preparation) for detailed instructions.
 ### Gradio

 ### Inference
+Current supported prompt templates:
+|Mode           | Prompt template                           |Description                           |
+| ---           | ---                                       | ---                                  |
+|caption_zh     | 描述这张图片                               |Caption in Chinese                    |
+|insert_content | 根据提示词“{}”,描述这张图片                 |Insert specific knowledge into caption|
+|caption_en     | Please describe the content of this image |Caption in English                    |
+|               |                                           |                                      |
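
As a rough illustration of how the templates in the table above could map to a final prompt, here is a minimal sketch. The template strings are copied from the table; the `PROMPT_TEMPLATES` dictionary and the `build_prompt` helper are hypothetical names introduced only for this example and are not taken from `mllm/caption_demo.py`, whose actual implementation may differ.

```python
# Prompt templates copied from the table above, keyed by the --mode value.
PROMPT_TEMPLATES = {
    "caption_zh": "描述这张图片",
    "insert_content": "根据提示词“{}”,描述这张图片",
    "caption_en": "Please describe the content of this image",
}


def build_prompt(mode: str, content: str = "") -> str:
    """Hypothetical helper: return the prompt text for a given mode.

    Templates containing "{}" are filled with `content`; the others are
    returned unchanged.
    """
    template = PROMPT_TEMPLATES[mode]
    if "{}" in template:
        if not content:
            raise ValueError(f"mode {mode!r} needs a non-empty content string")
        return template.format(content)
    return template


if __name__ == "__main__":
    print(build_prompt("caption_zh"))
    print(build_prompt("insert_content", "宫保鸡丁"))
```

Under this assumption, only the `insert_content` mode consumes the `--content` value; the two plain caption modes ignore it.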
 a. Single picture inference in Chinese
 python mllm/caption_demo.py --mode "caption_zh" --image_file "mllm/images/demo1.png" --model_path "./ckpts/captioner"
 ```
+b. Insert specific knowledge into caption
 ```bash
 python mllm/caption_demo.py --mode "insert_content" --content "宫保鸡丁" --image_file "mllm/images/demo2.png" --model_path "./ckpts/captioner"
 ```bash
 ### Convert multiple pictures to csv file.
+python mllm/make_csv.py --files "mllm/images/demo1.png,mllm/images/demo2.png" --input_file "mllm/images/demo.csv"
 ### Multiple pictures inference
 python mllm/caption_demo.py --mode "caption_zh" --input_file "mllm/images/demo.csv" --output_file "mllm/images/demo_res.csv" --model_path "./ckpts/captioner"
 ```
+(Optional) To convert the output csv file to Arrow format, please refer to [Data Preparation #3](https://github.com/Tencent/HunyuanDiT?tab=readme-ov-file#data-preparation) for detailed instructions.
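
The new `--files` argument takes a comma-separated list of image paths. The sketch below shows one way such a list might be turned into a CSV file. It is not the code of `mllm/make_csv.py`: the `write_image_csv` helper and the single `img_path` column name are assumptions made for illustration, and the real script may use a different layout.

```python
import csv
import io


def write_image_csv(files_arg: str, out) -> int:
    """Hypothetical helper: write one CSV row per image path.

    `files_arg` is a comma-separated string of paths, as passed to --files.
    The "img_path" header is an assumed column name, not a confirmed one.
    Returns the number of image rows written.
    """
    # Split on commas and drop empty entries and surrounding whitespace.
    paths = [p.strip() for p in files_arg.split(",") if p.strip()]
    writer = csv.writer(out)
    writer.writerow(["img_path"])
    for path in paths:
        writer.writerow([path])
    return len(paths)


# Write to an in-memory buffer so the example has no side effects on disk.
buf = io.StringIO()
count = write_image_csv("mllm/images/demo1.png,mllm/images/demo2.png", buf)

if __name__ == "__main__":
    print(count)
    print(buf.getvalue())
```

Listing files explicitly, rather than scanning a directory as the old `--img_dir` argument suggests, lets the caller pick exactly which images are captioned.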
 ### Gradio