zhyever committed
Commit 4ac0a25
1 Parent(s): 3555a61

update: add support for grayscale color map

Files changed (1)
  1. app.py +6 -3
app.py CHANGED
@@ -214,6 +214,8 @@ def process(input_image, prompt, a_prompt, n_prompt, num_samples, image_resoluti
 
  if color_map == 'magma':
      colored_depth = colorize(detected_map)
+ elif color_map == 'gray':
+     colored_depth = colorize(detected_map, cmap='gray_r')
  else:
      colored_depth = colorize_depth_maps(detected_map) * 255
 
@@ -281,6 +283,8 @@ PatchFusion is a deep learning model for high-resolution metric depth estimation
 
  Please refer to our [project webpage](https://zhyever.github.io/patchfusion), [paper](https://arxiv.org/abs/2312.02284) or [github](https://github.com/zhyever/PatchFusion) for more details.
 
+ **Running PatchFusion depth estimation pipeline needs about 12GB memory on 4K images.**
+
  # Advanced tips
 
  The overall pipeline: image --> (PatchFusion) --> depth --> (controlnet) --> generated image.
@@ -289,11 +293,10 @@ As for the PatchFusion, it works on default 4k (2160x3840) resolution. All input
 
  The output depth map is resized to the original image resolution. Download for better visualization quality. 16-Bit Raw Depth = (pred_depth * 256).to(uint16).
 
- We provide two color maps to render depth map, which are magma (more common in supervised depth estimation) and spectral (better looking). Please choose from the advanced option.
+ We provide three color maps to render depth map, which are magma (more common in supervised depth estimation), spectral (better looking), and gray (thanks for the suggestion from petermg ). Please choose from the advanced option.
 
  For ControlNet, it works on default 896x896 resolution. Again, all input images will be resized to 896x896 before passing through ControlNet as default. You might be not happy because the 4K->896x896 downsampling, but limited by the GPU resource, this demo could only achieve this. This is the memory bottleneck. The output is not resized back to the image resolution for fast inference (Well... It's still so slow now... :D).
 
- We provide some tips might be helpful: (1) Try our experimental demo (check our github) running on a local 80G gpu (you could try high-resolution generation there, like the one in our paper). But of course, it would be expired soon (in two days maybe); (2) Clone our code repo, and look for a gpu with more than 24G memory; (3) Clone our code repo, run the depth estimation (there are another demos for depth estimation and image-to-3D), and search for another guided high-resolution image generation strategy; (4) Some kind people give this space a stronger gpu support.
  """
 
  with gr.Blocks() as demo:
@@ -310,7 +313,7 @@ with gr.Blocks() as demo:
  patch_number = gr.Slider(1, 1024, label="Please decide the number of random patches (Only useful in mode=R)", step=1, value=256)
  resolution = gr.Textbox(label="(PatchFusion) Proccessing resolution (Default 4K. Use 'x' to split height and width.)", elem_id='mode', value='2160x3840')
  patch_size = gr.Textbox(label="(PatchFusion) Patch size (Default 1/4 of image resolution. Use 'x' to split height and width.)", elem_id='mode', value='540x960')
- color_map = gr.Radio(["magma", "spectral"], label="Colormap used to render depth map", elem_id='mode', value='magma')
+ color_map = gr.Radio(["magma", "spectral", "gray"], label="Colormap used to render depth map", elem_id='mode', value='magma')
 
  num_samples = gr.Slider(label="Images", minimum=1, maximum=12, value=1, step=1)
  image_resolution = gr.Slider(label="ControlNet image resolution (higher resolution will lead to OOM)", minimum=256, maximum=1024, value=896, step=64)
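
For reference, a minimal NumPy sketch of what the new `gray` option and the documented 16-bit raw depth formula amount to. This is not the repository's `colorize` implementation: the helper names `render_gray_r` and `encode_depth_16bit` are hypothetical, and the min/max normalization and the clipping to the `uint16` range are assumptions made for illustration. The only facts taken from the diff are the `gray_r` colormap name (a reversed gray ramp, so small values render light and large values render dark) and the formula `(pred_depth * 256).to(uint16)`.

```python
import numpy as np


def render_gray_r(depth: np.ndarray) -> np.ndarray:
    """Render a depth map as an HxWx3 uint8 image with a reversed gray ramp.

    Hypothetical helper: approximates the effect of a 'gray_r' colormap.
    """
    depth = depth.astype(np.float64)
    d_min, d_max = depth.min(), depth.max()
    span = d_max - d_min
    # Assumption: normalize by the map's own min/max.
    # A constant map would divide by zero, so fall back to all zeros.
    norm = (depth - d_min) / span if span > 0 else np.zeros_like(depth)
    # Reversed ramp: smallest depth -> 255 (white), largest depth -> 0 (black).
    gray = np.round((1.0 - norm) * 255.0).astype(np.uint8)
    # Repeat the single channel three times to get an RGB-shaped image.
    return np.stack([gray, gray, gray], axis=-1)


def encode_depth_16bit(pred_depth: np.ndarray) -> np.ndarray:
    """Encode depth as uint16 using the documented scale factor of 256.

    Hypothetical helper: the clip is an added safeguard, not part of the
    documented formula. Values at or above 256 saturate at 65535.
    """
    scaled = np.clip(pred_depth.astype(np.float64) * 256.0, 0.0, 65535.0)
    # astype truncates toward zero, like an integer cast.
    return scaled.astype(np.uint16)


if __name__ == "__main__":
    depth = np.array([[1.0, 2.0], [3.0, 5.0]])
    print(render_gray_r(depth)[..., 0])
    print(encode_depth_16bit(depth))
```

Under this encoding, a consumer of the 16-bit file would divide the stored values by 256 to recover the depth, with a quantization step of 1/256.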