How to try the remove feature locally
So, I tried running the same script locally, but when I try to remove an object from an image (a complex one), it just fills it in with something else rather than removing the object. But when I try it on this space, it works well.
Is there a special prompt to pass in for removing objects? (I tried leaving my prompt as an empty string but it kept doing the same thing).
Hi, do you mean you cloned this repo locally and it didn't work? You can read the code; the prompt is just "high quality", nothing else. But this uses a custom pipeline and it's just a PoC; you can read about it here: https://huggingface.co/blog/OzzyGT/diffusers-image-fill
Edit: sorry, because of your comment I thought this was the other space. This is a special version that was requested where you can prompt what you want to fill; if you just want to fill the image, maybe you can try the other one?
Thank you for the response.
I actually tried both, the other space and this one. Surprisingly, I was getting better results with this particular space by not inputting any prompt. For the other space (the one without a prompt option), it just generates something random in the mask area.
This particular space, diffusers-fast-inpaint, gives the expected output. The only thing is that when I try to run it locally (with Gradio and all), I don't get the same results.
Yeah, and I also read through the blog and I must say, it is really well put together and insightful. Thank you once again. I just can't understand why it's working in this space but not locally (same image, same mask and all). I'd appreciate the help.
Thanks, and wow, your image is really extreme for this. Did you try testing this with some other tool?
That image has a woman with detailed black hair over a black fence on top of a blurred background. The hair and the black fence are enough to make this really hard to do.
If you crop the image at the bottom, the model doesn't have anything to go by; it will always try to replace it with something else (a blurred background or bokeh means the focus is on a subject in front). Also, you practically have to mask the whole image, which doesn't leave much room for the model to guess how to fill it unless you prompt it about something.
For example, if I crop it a little more upwards, the model starts to fill it in the other space, but it's not that good because of the blurry background.
About the matter with the spaces and your local installation: this space has the same code as the other, the only difference is that you can change the prompt, nothing else. Why it doesn't run the same locally, I really can't tell without more information. I have only tested this on Linux so far, so that could be a reason, or maybe the PyTorch version could also influence the results.
Yeah, I really wanted to push it to its limits; that's why I used this image. I think I've been able to set it up in such a way that it works locally. Apparently, for the local setup, I didn't mask the image properly, so it was generating things that had hair at that spot (humans). Once I masked it completely, it removed the woman without me having to enter a prompt.
Also, I have been able to get the script working now (i.e. the one from the article). I had to modify the way the mask was converted: my mask was already in the right format (white for the focus area, black for the background), so I didn't need to invert it again. I did something like this:
from diffusers.utils import load_image

# Mask is already white-on-black (white = area to fill), so no inversion needed
mask = load_image("mask_woman_3.webp").convert("L")
binary_mask = mask.point(lambda p: p > 128 and 255)

# Black out the masked region on a copy of the source image
cnet_image = source_image.copy()
cnet_image.paste(0, (0, 0), binary_mask)
# (rest of the code)
Also, I needed to ensure my source image was in RGBA. Everything else was kept largely the same.
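As a side note for anyone adapting the snippet above: the mask.point(lambda p: p > 128 and 255) call works because Python's `and` short-circuits, returning 255 for bright pixels and False (which PIL stores as 0) for everything else. A minimal sketch of the same logic in plain Python (the function name and threshold are just illustrative):

```python
def binarize_pixel(p, threshold=128):
    # Same expression as the mask.point() lambda: `and` short-circuits,
    # so this returns 255 when p > threshold and False (== 0) otherwise.
    return p > threshold and 255

# Pixels at or below the threshold stay black; brighter ones become white.
low, high = binarize_pixel(100), binarize_pixel(200)
```

The threshold of 128 just splits the 8-bit grayscale range down the middle, so any anti-aliased gray edge pixels in the mask get snapped to pure black or white.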
I even tried another complex image (yeah, I also worked on the aspect ratio so I'm not stuck with 1024 by 1024)
All in all, I'm really grateful for this pipeline. This is the best inpainting I've worked with. You did a really superb job!
Thank you, and looking forward to your future work too!
I did a test with this space, and if you prompt "black fence, mesh, park in the background, grass, bokeh, depth of field" it works, but because of the bokeh it tries to add a post most of the time; you can get a clean background sometimes, though.
Oh yeah, to answer your first question, I actually tried some other tools like LaMa and MAT. While I find them good for text (removing text from simple-to-medium-complexity backgrounds), they aren't good for complex images like these, as they don't really generate coherent results. Most of the time, they just produce a large smudge or blur where the object used to be.
glad it worked for you, the fence one looks really impressive