Zhiminli committed on
Commit
0b58d50
1 Parent(s): 0dae0c9

Update README.md

  1. README.md +10 -11
README.md CHANGED
@@ -33,14 +33,14 @@ huggingface-cli download Tencent-Hunyuan/HunyuanCaptioner --local-dir ./ckpts/ca
33
 
34
  ### Inference
35
 
36
- Current supported prompts:
37
 
38
- | Target | Prompt |
39
- | --- | --- |
40
- | Caption in Chinese | 描述这张图片 |
41
- | Caption in Chinese with tags | 根据提示词“{}”,描述这张图片 |
42
- | Caption in English | Please describe the content of this image |
43
- | | |
44
 
45
 
46
  a. Single picture inference in Chinese
@@ -49,7 +49,7 @@ a. Single picture inference in Chinese
49
  python mllm/caption_demo.py --mode "caption_zh" --image_file "mllm/images/demo1.png" --model_path "./ckpts/captioner"
50
  ```
51
 
52
- b. Single picture inference with tag in Chinese
53
 
54
  ```bash
55
  python mllm/caption_demo.py --mode "insert_content" --content "宫保鸡丁" --image_file "mllm/images/demo2.png" --model_path "./ckpts/captioner"
@@ -65,14 +65,13 @@ d. Multiple pictures inference in Chinese
65
 
66
  ```bash
67
  ### Convert multiple pictures to csv file.
68
- python mllm/make_csv.py --img_dir "mllm/images" --input_file "mllm/images/demo.csv"
69
 
70
  ### Multiple pictures inference
71
  python mllm/caption_demo.py --mode "caption_zh" --input_file "mllm/images/demo.csv" --output_file "mllm/images/demo_res.csv" --model_path "./ckpts/captioner"
72
  ```
73
 
74
- (Optional) To convert the output csv file to Arrow format, please refer to
75
- [Data Preparation #3](https://github.com/Tencent/HunyuanDiT?tab=readme-ov-file#data-preparation) for detailed instructions.
76
 
77
 
78
  ### Gradio
 
33
 
34
  ### Inference
35
 
36
+ Current supported prompt templates:
37
 
38
+ |Mode | Prompt template |Description |
39
+ | --- | --- | --- |
40
+ |caption_zh | 描述这张图片 |Caption in Chinese |
41
+ |insert_content | 根据提示词“{}”,描述这张图片 |Insert specific knowledge into caption|
42
+ |caption_en | Please describe the content of this image |Caption in English |
43
+ | | | |
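
The updated table maps each `--mode` value to a prompt template, and the `insert_content` template carries a `{}` placeholder for the text passed via `--content`. A minimal sketch of how such a lookup could work is shown here. The `PROMPT_TEMPLATES` dictionary and the `build_prompt` helper are hypothetical names chosen for illustration; this diff does not show how `mllm/caption_demo.py` actually builds its prompts.

```python
# Prompt templates copied from the table in this diff, keyed by --mode value.
PROMPT_TEMPLATES = {
    "caption_zh": "描述这张图片",
    "insert_content": "根据提示词“{}”,描述这张图片",
    "caption_en": "Please describe the content of this image",
}


def build_prompt(mode: str, content: str = "") -> str:
    """Return the prompt text for a mode (hypothetical helper, not from the repo)."""
    template = PROMPT_TEMPLATES[mode]
    if "{}" in template:
        # Only templates with a placeholder need the extra content string.
        if not content:
            raise ValueError(f"mode {mode!r} needs a non-empty content string")
        return template.format(content)
    return template


if __name__ == "__main__":
    # Mirrors example b: --mode "insert_content" --content "宫保鸡丁"
    print(build_prompt("insert_content", "宫保鸡丁"))
```

Keying the templates by mode keeps the table and the lookup in one place, so adding a mode would mean adding a single entry.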
44
 
45
 
46
  a. Single picture inference in Chinese
 
49
  python mllm/caption_demo.py --mode "caption_zh" --image_file "mllm/images/demo1.png" --model_path "./ckpts/captioner"
50
  ```
51
 
52
+ b. Insert specific knowledge into caption
53
 
54
  ```bash
55
  python mllm/caption_demo.py --mode "insert_content" --content "宫保鸡丁" --image_file "mllm/images/demo2.png" --model_path "./ckpts/captioner"
 
65
 
66
  ```bash
67
  ### Convert multiple pictures to csv file.
68
+ python mllm/make_csv.py --files "mllm/images/demo1.png,mllm/images/demo2.png" --input_file "mllm/images/demo.csv"
69
 
70
  ### Multiple pictures inference
71
  python mllm/caption_demo.py --mode "caption_zh" --input_file "mllm/images/demo.csv" --output_file "mllm/images/demo_res.csv" --model_path "./ckpts/captioner"
72
  ```
73
 
74
+ (Optional) To convert the output csv file to Arrow format, please refer to [Data Preparation #3](https://github.com/Tencent/HunyuanDiT?tab=readme-ov-file#data-preparation) for detailed instructions.
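
This commit replaces the `--img_dir` flag of `mllm/make_csv.py` with `--files`, which in the example takes a comma-separated list of image paths. For a directory with many images, that list can be assembled programmatically. The sketch below is one way to do so, under stated assumptions: `build_files_arg` and `make_csv_command` are hypothetical helpers, the suffix filter is a guess at which files count as images, and it assumes `--files` accepts any number of comma-separated paths as the two-file example suggests.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def build_files_arg(
    img_dir: str, suffixes: Tuple[str, ...] = (".png", ".jpg", ".jpeg")
) -> str:
    """Join the image paths directly under img_dir into one comma-separated string."""
    paths = sorted(
        p for p in Path(img_dir).iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )
    return ",".join(p.as_posix() for p in paths)


def make_csv_command(img_dir: str, csv_path: str) -> List[str]:
    """Build the argument list for the make_csv.py call shown in the diff."""
    return [
        "python", "mllm/make_csv.py",
        "--files", build_files_arg(img_dir),
        "--input_file", csv_path,
    ]


if __name__ == "__main__":
    # Demo on a throwaway directory; the command is printed, not executed.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("demo1.png", "demo2.png"):
            (Path(tmp) / name).touch()
        print(make_csv_command(tmp, "mllm/images/demo.csv"))
```

The returned list can be passed to `subprocess.run` without going through a shell, which avoids quoting problems when paths contain spaces. Paths that themselves contain commas would not survive this encoding.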
 
75
 
76
 
77
  ### Gradio