Zhenyu Li
committed on
Commit
•
9d27c6d
1
Parent(s):
254d800
update
app.py
CHANGED
@@ -198,7 +198,22 @@ description = """Official demo for **PatchFusion: An End-to-End Tile-Based Frame
|
|
198 |
|
199 |
PatchFusion is a deep learning model for high-resolution metric depth estimation from a single image.
|
200 |
|
201 |
-
Please refer to our [project webpage](https://zhyever.github.io/patchfusion), [paper](https://arxiv.org/abs/2312.02284) or [github](https://github.com/zhyever/PatchFusion) for more details.
202 |
|
203 |
with gr.Blocks() as demo:
|
204 |
gr.Markdown(title)
|
@@ -216,8 +231,8 @@ with gr.Blocks() as demo:
|
|
216 |
# mode = gr.Radio(["P49", "R"], label="Tiling mode", info="We recommand using P49 for fast evaluation and R with 1024 patches for best visualization results, respectively", elem_id='mode', value='R'),
|
217 |
mode = gr.Radio(["P49", "R"], label="Tiling mode", info="We recommand using P49 for fast evaluation and R with 1024 patches for best visualization results, respectively", elem_id='mode', value='P49'),
|
218 |
patch_number = gr.Slider(1, 1024, label="Please decide the number of random patches (Only useful in mode=R)", step=1, value=256)
|
219 |
-
resolution = gr.Textbox(label="PatchFusion
|
220 |
-
patch_size = gr.Textbox(label="Patch size (Default 1/4 of image resolution. Use 'x' to split height and width.)", elem_id='mode', value='540x960')
|
221 |
|
222 |
num_samples = gr.Slider(label="Images", minimum=1, maximum=12, value=1, step=1)
|
223 |
image_resolution = gr.Slider(label="ControlNet image resolution (higher resolution will lead to OOM)", minimum=256, maximum=1024, value=896, step=64)
|
|
|
198 |
|
199 |
PatchFusion is a deep learning model for high-resolution metric depth estimation from a single image.
|
200 |
|
201 |
+
Please refer to our [project webpage](https://zhyever.github.io/patchfusion), [paper](https://arxiv.org/abs/2312.02284) or [github](https://github.com/zhyever/PatchFusion) for more details.
|
202 |
+
|
203 |
+
# Advanced tips
|
204 |
+
|
205 |
+
I know people don't like reading introductions, so you can run this demo without changing any settings.
|
206 |
+
|
207 |
+
But if you want to try something more ambitious, I recommend reading the following notes to better understand how this demo works.
|
208 |
+
|
209 |
+
The overall pipeline: image --> (PatchFusion) --> depth --> (controlnet) --> generated image.
|
210 |
+
|
211 |
+
PatchFusion works at 4K (2160x3840) resolution by default: every input image is resized to 4K before it is passed through PatchFusion. If you have a higher-resolution image, you can increase the processing resolution in the advanced options.
|
212 |
+
|
213 |
+
ControlNet works at 896x896 resolution by default. Again, every input is resized to 896x896 before it is passed through ControlNet. You might not be happy about the 4K->896x896 downsampling, but this is the most the demo can manage with its limited GPU resources.
|
214 |
+
|
215 |
+
Some tips that might help: (1) Try our [experimental demo](https://55510c1c829b28b9e3.gradio.live) running on a local 80G GPU. Of course, the link will expire soon (in two days, maybe); (2) Clone our code repo and find a GPU with more than 24G of memory; (3) Clone our code repo, run the depth estimation (there are other demos for depth estimation and image-to-3D), and look for another guided high-resolution image generation strategy; (4) Perhaps some kind people will give this Space stronger GPU support.
|
216 |
+
"""
|
217 |
|
218 |
with gr.Blocks() as demo:
|
219 |
gr.Markdown(title)
|
|
|
231 |
# mode = gr.Radio(["P49", "R"], label="Tiling mode", info="We recommand using P49 for fast evaluation and R with 1024 patches for best visualization results, respectively", elem_id='mode', value='R'),
|
232 |
mode = gr.Radio(["P49", "R"], label="Tiling mode", info="We recommand using P49 for fast evaluation and R with 1024 patches for best visualization results, respectively", elem_id='mode', value='P49'),
|
233 |
patch_number = gr.Slider(1, 1024, label="Please decide the number of random patches (Only useful in mode=R)", step=1, value=256)
|
234 |
+
resolution = gr.Textbox(label="(PatchFusion) Processing resolution (Default 4K. Use 'x' to split height and width.)", elem_id='mode', value='2160x3840')
|
235 |
+
patch_size = gr.Textbox(label="(PatchFusion) Patch size (Default 1/4 of image resolution. Use 'x' to split height and width.)", elem_id='mode', value='540x960')
|
236 |
|
237 |
num_samples = gr.Slider(label="Images", minimum=1, maximum=12, value=1, step=1)
|
238 |
image_resolution = gr.Slider(label="ControlNet image resolution (higher resolution will lead to OOM)", minimum=256, maximum=1024, value=896, step=64)
|
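Note on the two new textboxes: both take a single string with height and width separated by 'x' (for example '2160x3840'), and the patch size label says the default is 1/4 of the image resolution. The commit does not show how app.py turns those strings into numbers, so the following is only a minimal sketch of one way to do it. The helper names `parse_hw` and `default_patch_size` are hypothetical and do not come from the repository.

```python
from typing import Tuple


def parse_hw(text: str) -> Tuple[int, int]:
    """Parse a 'HEIGHTxWIDTH' string such as '2160x3840' into (height, width).

    Hypothetical helper: the real app.py may parse the textbox differently.
    """
    parts = text.strip().lower().split("x")
    if len(parts) != 2:
        raise ValueError(f"expected 'HEIGHTxWIDTH', got {text!r}")
    # int() tolerates surrounding spaces and raises ValueError on non-numbers.
    height, width = int(parts[0]), int(parts[1])
    if height <= 0 or width <= 0:
        raise ValueError(f"height and width must be positive, got {text!r}")
    return height, width


def default_patch_size(resolution: Tuple[int, int]) -> Tuple[int, int]:
    """Return 1/4 of the processing resolution along each axis."""
    height, width = resolution
    return height // 4, width // 4


resolution = parse_hw("2160x3840")       # default value of the resolution textbox
patch = default_patch_size(resolution)   # matches the default '540x960' patch size
print(resolution, patch)
```

With the default 4K setting this reproduces the '540x960' default shown in the patch size textbox, which is a quick way to sanity-check custom values before entering them.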