Commit 61ad3d2: Add IP-Adapter
adamelliotfields committed (1 parent: c5cf566)

Files changed:
- README.md +6 -5
- app.css +4 -0
- app.py +74 -31
- cli.py +4 -0
- lib/config.py +4 -5
- lib/inference.py +15 -2
- lib/loader.py +49 -11
- usage.md +15 -15
README.md
CHANGED
@@ -16,6 +16,7 @@ license: apache-2.0
 models:
 - ai-forever/Real-ESRGAN
 - fluently/Fluently-v4
+- h94/IP-Adapter
 - Linaqruf/anything-v3-1
 - Lykon/dreamshaper-8
 - prompthero/openjourney-v4
@@ -28,6 +29,9 @@ preload_from_hub:
 - >-
   fluently/Fluently-v4
   text_encoder/model.fp16.safetensors,unet/diffusion_pytorch_model.fp16.safetensors,vae/diffusion_pytorch_model.fp16.safetensors
+- >-
+  h94/IP-Adapter
+  models/ip-adapter-full-face_sd15.safetensors,models/ip-adapter-plus_sd15.safetensors,models/image_encoder/model.safetensors
 - >-
   Linaqruf/anything-v3-1
   text_encoder/model.safetensors,unet/diffusion_pytorch_model.safetensors,vae/diffusion_pytorch_model.safetensors
@@ -48,9 +52,10 @@ preload_from_hub:
 # diffusion
 
 Gradio app for Stable Diffusion 1.5 including:
-* txt2img and img2img pipelines
+* txt2img and img2img pipelines with IP-Adapter
 * Curated models and TI embeddings
 * 100+ styles from sdxl_prompt_styler
+* 150+ prompts from StableStudio
 * Compel prompt weighting
 * Multiple samplers with Karras scheduling
 * DeepCache, FreeU, and Clip Skip available
@@ -80,7 +85,3 @@ python app.py --port 7860
 # cli
 python cli.py 'an astronaut riding a horse on mars'
 ```
-
-## TODO
-
-- [ ] IP-Adapter and T2I-Adapter
app.css
CHANGED
@@ -47,6 +47,10 @@
   max-width: 42px;
 }
 
+.image-container {
+  max-height: 438px;
+}
+
 .popover {
   position: relative;
 }
app.py
CHANGED
@@ -44,27 +44,32 @@ def random_fn():
     return gr.Textbox(value=random.choice(prompts))
 
 
-def gallery_fn(images, image):
-    …
-    if image is not None:
+def create_image_dropdown(images, locked=False):
+    if locked:
         return gr.Dropdown(
             choices=[("🔒", -2)],
             interactive=False,
             value=-2,
         )
-    return gr.Dropdown(
-        choices=[("None", -1)] + [(str(i + 1), i) for i, _ in enumerate(images or [])],
-        interactive=True,
-        filterable=False,
-        value=-1,
+    else:
+        return gr.Dropdown(
+            choices=[("None", -1)] + [(str(i + 1), i) for i, _ in enumerate(images or [])],
+            interactive=True,
+            value=-1,
+        )
+
+
+def gallery_fn(images, image, ip_image):
+    return (
+        create_image_dropdown(images, locked=image is not None),
+        create_image_dropdown(images, locked=ip_image is not None),
     )
 
 
 def image_prompt_fn(images):
-    return …
+    return create_image_dropdown(images)
 
 
-# can't use image input in JS
 def image_select_fn(images, image, i):
     # -2 is the lock icon, -1 is None
     if i == -2:
@@ -278,29 +283,53 @@ with gr.Blocks(
         with gr.TabItem("🖼️ Image"):
             with gr.Row():
                 image_prompt = gr.Image(
+                    show_share_button=False,
                     show_label=False,
                     min_width=320,
                     format="png",
                     type="pil",
-                    scale=0,
                 )
-            image_select = gr.Dropdown(
-                choices=[("None", -1)],
-                label="Gallery Image",
-                interactive=True,
-                filterable=False,
-                value=-1,
-            )
-            denoising_strength = gr.Slider(
-                value=Config.DENOISING_STRENGTH,
-                label="Denoising Strength",
-                minimum=0.0,
-                maximum=1.0,
-                step=0.1,
-            )
+                ip_image = gr.Image(
+                    show_share_button=False,
+                    label="IP-Adapter",
+                    min_width=320,
+                    format="png",
+                    type="pil",
+                )
 
+            with gr.Group():
+                with gr.Row():
+                    image_select = gr.Dropdown(
+                        choices=[("None", -1)],
+                        label="Gallery Image",
+                        interactive=True,
+                        filterable=False,
+                        value=-1,
+                    )
+                    ip_image_select = gr.Dropdown(
+                        choices=[("None", -1)],
+                        label="Gallery Image (IP-Adapter)",
+                        interactive=True,
+                        filterable=False,
+                        value=-1,
+                    )
+
+                with gr.Row():
+                    denoising_strength = gr.Slider(
+                        value=Config.DENOISING_STRENGTH,
+                        label="Denoising Strength",
+                        minimum=0.0,
+                        maximum=1.0,
+                        step=0.1,
+                    )
+
+                with gr.Row():
+                    ip_face = gr.Checkbox(
+                        elem_classes=["checkbox"],
+                        label="IP-Adapter Face",
+                        value=False,
+                    )
+
         with gr.TabItem("ℹ️ Usage"):
             gr.Markdown(read_file("usage.md"), elem_classes=["markdown"])
@@ -358,9 +387,9 @@ with gr.Blocks(
     seed.change(None, inputs=[seed], outputs=[], js=seed_js)
 
     file_format.change(
-        lambda f: (gr.Gallery(format=f), gr.Image(format=f)),
+        lambda f: (gr.Gallery(format=f), gr.Image(format=f), gr.Image(format=f)),
        inputs=[file_format],
-        outputs=[output_images, image_prompt],
+        outputs=[output_images, image_prompt, ip_image],
         show_api=False,
     )
@@ -372,11 +401,11 @@ with gr.Blocks(
         js=aspect_ratio_js,
     )
 
-    # lock the input
+    # lock the input images so you don't lose them when the gallery updates
     output_images.change(
         gallery_fn,
-        inputs=[output_images, image_prompt],
-        outputs=[image_select],
+        inputs=[output_images, image_prompt, ip_image],
+        outputs=[image_select, ip_image_select],
         show_api=False,
     )
@@ -387,6 +416,12 @@ with gr.Blocks(
         outputs=[image_prompt],
         show_api=False,
     )
+    ip_image_select.change(
+        image_select_fn,
+        inputs=[output_images, ip_image, ip_image_select],
+        outputs=[ip_image],
+        show_api=False,
+    )
 
     # reset the dropdown on clear
     image_prompt.clear(
@@ -395,6 +430,12 @@ with gr.Blocks(
         outputs=[image_select],
         show_api=False,
     )
+    ip_image.clear(
+        image_prompt_fn,
+        inputs=[output_images],
+        outputs=[ip_image_select],
+        show_api=False,
+    )
 
     # show "Custom" aspect ratio when manually changing width or height
     gr.on(
@@ -415,6 +456,8 @@ with gr.Blocks(
             prompt,
             negative_prompt,
             image_prompt,
+            ip_image,
+            ip_face,
             embeddings,
             style,
             seed,
cli.py
CHANGED
@@ -31,6 +31,8 @@ def main():
     parser.add_argument("--steps", type=int, metavar="INT", default=Config.INFERENCE_STEPS)
     parser.add_argument("--strength", type=float, metavar="FLOAT", default=Config.DENOISING_STRENGTH)
     parser.add_argument("--image", type=str, metavar="STR")
+    parser.add_argument("--ip-image", type=str, metavar="STR")
+    parser.add_argument("--ip-face", action="store_true")
     parser.add_argument("--taesd", action="store_true")
     parser.add_argument("--clip-skip", action="store_true")
     parser.add_argument("--truncate", action="store_true")
@@ -44,6 +46,8 @@ def main():
         args.prompt,
         args.negative,
         args.image,
+        args.ip_image,
+        args.ip_face,
         args.embedding,
         args.style,
         args.seed,
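The new flags feed straight into `generate()` alongside `--image`, mirroring the web UI's IP-Adapter inputs. A hypothetical invocation, in the style of the README's usage snippet (the reference image path is illustrative):

```
python cli.py 'portrait of a man' --ip-image face.png --ip-face
```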
lib/config.py
CHANGED
@@ -20,12 +20,11 @@ Config = SimpleNamespace(
     ],
     SCHEDULER="DEIS 2M",
     SCHEDULERS=[
+        "DDIM",
         "DEIS 2M",
         "DPM++ 2M",
-        "DPM2 a",
+        "Euler",
         "Euler a",
-        "Heun",
-        "LMS",
         "PNDM",
     ],
     EMBEDDING="fast_negative",
@@ -39,8 +38,8 @@ Config = SimpleNamespace(
     HEIGHT=576,
     NUM_IMAGES=1,
     SEED=-1,
-    GUIDANCE_SCALE=…,
-    INFERENCE_STEPS=…,
+    GUIDANCE_SCALE=6,
+    INFERENCE_STEPS=35,
     DENOISING_STRENGTH=0.6,
     DEEPCACHE_INTERVAL=2,
     SCALE=1,
lib/inference.py
CHANGED
@@ -75,6 +75,8 @@ def generate(
     positive_prompt,
     negative_prompt="",
     image_prompt=None,
+    ip_image=None,
+    ip_face=False,
     embeddings=[],
     style=None,
     seed=None,
@@ -120,11 +122,17 @@ def generate(
 
     KIND = "img2img" if image_prompt is not None else "txt2img"
 
+    IP_ADAPTER = None
+
+    if ip_image:
+        IP_ADAPTER = "full-face" if ip_face else "plus"
+
     with torch.inference_mode():
         start = time.perf_counter()
         loader = Loader()
         pipe, upscaler = loader.load(
             KIND,
+            IP_ADAPTER,
             model,
             scheduler,
             karras,
@@ -146,10 +154,12 @@ def generate(
                 token=f"<{embedding}>",
             )
             negative_prompt = (
-                f"{negative_prompt}, {embedding}" …
+                f"{negative_prompt}, (<{embedding}>)1.1"
+                if negative_prompt
+                else f"(<{embedding}>)1.1"
             )
         except (EnvironmentError, HFValidationError, RepositoryNotFoundError):
-            raise Error(f"Invalid embedding: {embedding}")
+            raise Error(f"Invalid embedding: <{embedding}>")
 
     # prompt embeds
     compel = Compel(
@@ -202,6 +212,9 @@ def generate(
         kwargs["strength"] = denoising_strength
         kwargs["image"] = prepare_image(image_prompt, (width, height))
 
+    if IP_ADAPTER:
+        kwargs["ip_adapter_image"] = prepare_image(ip_image, (width, height))
+
     try:
         image = pipe(**kwargs).images[0]
         if scale > 1:
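The same flow can be reproduced against the public diffusers API. A minimal sketch, assuming a GPU and torch 2; the adapter repo, subfolder, weight name, and scale values are the ones this commit uses, while the checkpoint and input file are illustrative:

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

# load an SD 1.5 checkpoint (any of the models in README.md works)
pipe = StableDiffusionPipeline.from_pretrained(
    "Lykon/dreamshaper-8", torch_dtype=torch.float16
).to("cuda")

# "plus" for general reference images, "full-face" when ip_face is set
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="models",
    weight_name="ip-adapter-plus_sd15.safetensors",
)
pipe.set_ip_adapter_scale(0.5)  # the loader uses 0.6 for full-face

image = pipe(
    prompt="a portrait, best quality",
    ip_adapter_image=load_image("face.png"),  # illustrative input file
    num_inference_steps=35,
    guidance_scale=6,
).images[0]
image.save("output.png")
```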
lib/loader.py
CHANGED
@@ -1,17 +1,17 @@
 import torch
 from DeepCache import DeepCacheSDHelper
 from diffusers import (
+    DDIMScheduler,
     DEISMultistepScheduler,
     DPMSolverMultistepScheduler,
     EulerAncestralDiscreteScheduler,
-    HeunDiscreteScheduler,
-    KDPM2AncestralDiscreteScheduler,
-    LMSDiscreteScheduler,
+    EulerDiscreteScheduler,
     PNDMScheduler,
     StableDiffusionImg2ImgPipeline,
     StableDiffusionPipeline,
 )
 from diffusers.models import AutoencoderKL, AutoencoderTiny
+from diffusers.models.attention_processor import AttnProcessor2_0, IPAdapterAttnProcessor2_0
 from torch._dynamo import OptimizedModule
 
 from .upscaler import RealESRGAN
@@ -29,6 +29,7 @@ class Loader:
         cls._instance = super(Loader, cls).__new__(cls)
         cls._instance.pipe = None
         cls._instance.upscaler = None
+        cls._instance.ip_adapter = None
         return cls._instance
 
     def _load_upscaler(self, device=None, scale=4):
@@ -61,7 +62,38 @@ class Loader:
         # https://github.com/ChenyangSi/FreeU
         self.pipe.enable_freeu(b1=1.5, b2=1.6, s1=0.9, s2=0.2)
 
+    def _load_ip_adapter(self, ip_adapter=None):
+        if self.ip_adapter is None and self.ip_adapter != ip_adapter:
+            self.pipe.load_ip_adapter(
+                "h94/IP-Adapter",
+                subfolder="models",
+                weight_name=f"ip-adapter-{ip_adapter}_sd15.safetensors",
+            )
+            self.pipe.set_ip_adapter_scale(0.6 if ip_adapter == "full-face" else 0.5)
+            self.ip_adapter = ip_adapter
+
+        if self.ip_adapter is not None and ip_adapter is None:
+            if not isinstance(self.pipe, StableDiffusionImg2ImgPipeline):
+                self.pipe.image_encoder = None
+                self.pipe.register_to_config(image_encoder=[None, None])
+
+            self.pipe.feature_extractor = None
+            self.pipe.unet.encoder_hid_proj = None
+            self.pipe.unet.config.encoder_hid_dim_type = None
+            self.pipe.register_to_config(feature_extractor=[None, None])
+
+            attn_procs = {}
+            for name, value in self.pipe.unet.attn_processors.items():
+                attn_processor_class = AttnProcessor2_0()  # raises if not torch 2
+                attn_procs[name] = (
+                    attn_processor_class
+                    if isinstance(value, IPAdapterAttnProcessor2_0)
+                    else value.__class__()
+                )
+            self.pipe.unet.set_attn_processor(attn_procs)
+            self.pipe.ip_adapter = None
+
-    def _load_vae(…):
+    def _load_vae(self, taesd=False, model_name=None, variant=None):
         vae_type = type(self.pipe.vae)
         is_kl = issubclass(vae_type, (AutoencoderKL, OptimizedModule))
         is_tiny = issubclass(vae_type, AutoencoderTiny)
@@ -97,10 +129,12 @@ class Loader:
         self.pipe = pipelines[kind].from_pretrained(model, **kwargs).to(device, dtype)
         if not isinstance(self.pipe, pipelines[kind]):
             self.pipe = pipelines[kind].from_pipe(self.pipe).to(device, dtype)
+            self.ip_adapter = None
 
     def load(
         self,
         kind,
+        ip_adapter,
         model,
         scheduler,
         karras,
@@ -114,26 +148,29 @@ class Loader:
         model_lower = model.lower()
 
         schedulers = {
+            "DDIM": DDIMScheduler,
             "DEIS 2M": DEISMultistepScheduler,
             "DPM++ 2M": DPMSolverMultistepScheduler,
-            "DPM2 a": KDPM2AncestralDiscreteScheduler,
+            "Euler": EulerDiscreteScheduler,
             "Euler a": EulerAncestralDiscreteScheduler,
-            "Heun": HeunDiscreteScheduler,
-            "LMS": LMSDiscreteScheduler,
             "PNDM": PNDMScheduler,
         }
 
         scheduler_kwargs = {
             "beta_schedule": "scaled_linear",
             "timestep_spacing": "leading",
-            "use_karras_sigmas": karras,
             "beta_start": 0.00085,
             "beta_end": 0.012,
             "steps_offset": 1,
         }
 
-        if scheduler in ["Euler a", "PNDM"]:
-            del scheduler_kwargs["use_karras_sigmas"]
+        if scheduler not in ["DDIM", "Euler a", "PNDM"]:
+            scheduler_kwargs["use_karras_sigmas"] = karras
+
+        # https://github.com/huggingface/diffusers/blob/8a3f0c1/scripts/convert_original_stable_diffusion_to_diffusers.py#L939
+        if scheduler == "DDIM":
+            scheduler_kwargs["clip_sample"] = False
+            scheduler_kwargs["set_alpha_to_one"] = False
 
         # no fp16 variant
         if model_lower not in [
@@ -175,7 +212,8 @@ class Loader:
         self.pipe = None
         self._load_pipeline(kind, model_lower, device, dtype, **pipe_kwargs)
 
-        self._load_vae(…)
+        self._load_ip_adapter(ip_adapter)
+        self._load_vae(taesd, model_lower, variant)
         self._load_freeu(freeu)
         self._load_deepcache(deepcache)
         self._load_upscaler(device, scale)
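The DDIM flags added above come from the conversion script referenced in the diff. A standalone sketch of constructing one of these schedulers with the same kwargs; swapping it into an already-loaded pipeline is shown as a comment, since no pipeline is created here:

```python
from diffusers import DDIMScheduler

scheduler = DDIMScheduler(
    beta_schedule="scaled_linear",
    timestep_spacing="leading",
    beta_start=0.00085,
    beta_end=0.012,
    steps_offset=1,
    clip_sample=False,       # DDIM-specific, per the convert script
    set_alpha_to_one=False,  # DDIM-specific, per the convert script
)
# pipe.scheduler = scheduler  # assumes an existing SD 1.5 pipeline
```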
usage.md
CHANGED
@@ -12,6 +12,8 @@ Positive and negative prompts are embedded by [Compel](https://github.com/damian0815/compel)
 
 Note that `++` is `1.1^2` (and so on). See [syntax features](https://github.com/damian0815/compel/blob/main/doc/syntax.md) to learn more and read [Civitai](https://civitai.com)'s guide on [prompting](https://education.civitai.com/civitais-prompt-crafting-guide-part-1-basics/) for best practices.
 
+You can also press the `🎲` button to generate a random prompt.
+
 #### Arrays
 
 Arrays allow you to generate different images from a single prompt. For example, `[[cat,corgi]]` will expand into 2 separate prompts. Make sure `Images` is set accordingly (e.g., 2). Only works for the positive prompt. Inspired by [Fooocus](https://github.com/lllyasviel/Fooocus/pull/1503).
@@ -30,7 +32,7 @@ Styles are prompt templates from twri's [sdxl_prompt_styler](https://github.com/twri/sdxl_prompt_styler)
 
 ### Scale
 
-Rescale up to 4x using [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN).
+Rescale up to 4x using [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) (Wang et al. 2021).
 
 ### Models
 
@@ -45,27 +47,25 @@ Each model checkpoint has a different aesthetic:
 
 ### Schedulers
 
-* [DEIS 2M](https://huggingface.co/docs/diffusers/en/api/schedulers/deis) (default)
-* [DPM++ 2M](https://huggingface.co/docs/diffusers/en/api/schedulers/multistep_dpm_solver)
-* [DPM2 a](https://huggingface.co/docs/diffusers/api/schedulers/dpm_discrete_ancestral)
-* [Euler a](https://huggingface.co/docs/diffusers/en/api/schedulers/euler_ancestral)
-* [Heun](https://huggingface.co/docs/diffusers/api/schedulers/heun)
-* [LMS](https://huggingface.co/docs/diffusers/api/schedulers/lms_discrete)
-* [PNDM](https://huggingface.co/docs/diffusers/api/schedulers/pndm)
+The default is [DEIS 2M](https://huggingface.co/docs/diffusers/en/api/schedulers/deis) with [Karras](https://arxiv.org/abs/2206.00364) enabled. The other multistep scheduler, [DPM++ 2M](https://huggingface.co/docs/diffusers/en/api/schedulers/multistep_dpm_solver), is also good. For realism, [DDIM](https://huggingface.co/docs/diffusers/en/api/schedulers/ddim) is recommended. [Euler a](https://huggingface.co/docs/diffusers/en/api/schedulers/euler_ancestral) is worth trying for a different look.
 
 ### Image-to-Image
 
-The `🖼️ Image` tab enables the image-to-image …
-Denoising strength is essentially how much the generation will differ from the i…
+The `🖼️ Image` tab enables the image-to-image and IP-Adapter pipelines. Either use the image input or select a generation from the gallery. To disable, simply clear the image input (the `x` overlay button).
+
+Denoising strength is essentially how much the generation will differ from the input image. A value of `0` will be identical to the original, while `1` will be a completely new image. You may want to also increase the number of inference steps. Only applies to the image-to-image input.
+
+### IP-Adapter
+
+In an image-to-image pipeline, the input image is used as the initial latent. With [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter) (Ye et al. 2023), the input image is processed by a separate image encoder and the encoded features are used as conditioning along with the text prompt.
+
+For capturing faces, enable `IP-Adapter Face` to use the full-face model. You should use an input image that is mostly a face along with the Realistic Vision model. The input image should also be the same aspect ratio as the output to avoid distortion.
 
 ### Advanced
 
 #### DeepCache
 
-[DeepCache](https://github.com/horseee/DeepCache) (Ma et al. 2023) caches lower …
+[DeepCache](https://github.com/horseee/DeepCache) (Ma et al. 2023) caches lower UNet layers and reuses them every `Interval` steps:
 * `1`: no caching
 * `2`: more quality (default)
 * `3`: balanced
@@ -73,7 +73,7 @@ Denoising strength is essentially how much the generation will differ from the input image.
 
 #### FreeU
 
-[FreeU](https://github.com/ChenyangSi/FreeU) (Si et al. 2023) re-weights the contributions sourced from the …
+[FreeU](https://github.com/ChenyangSi/FreeU) (Si et al. 2023) re-weights the contributions sourced from the UNet's skip connections and backbone feature maps to potentially improve image quality.
 
 #### Clip Skip
 
@@ -81,7 +81,7 @@ When enabled, the last CLIP layer is skipped. This can sometimes improve image quality.
 
 #### Tiny VAE
 
-Enable [madebyollin/taesd](https://github.com/madebyollin/taesd) for …
+Enable [madebyollin/taesd](https://github.com/madebyollin/taesd) for near-instant latent decoding with a minor loss in detail. Useful for development.
 
 #### Prompt Truncation
 
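The Compel weighting described in usage.md (and the `(<{embedding}>)1.1` weighting this commit applies to negative embeddings) can be tried standalone. A minimal sketch; the checkpoint and prompt are illustrative:

```python
from compel import Compel
from transformers import CLIPTextModel, CLIPTokenizer

repo = "Lykon/dreamshaper-8"  # illustrative SD 1.5 checkpoint
tokenizer = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(repo, subfolder="text_encoder")

compel = Compel(tokenizer=tokenizer, text_encoder=text_encoder)

# (word)1.2 scales attention on "word" by 1.2; `word++` is shorthand for (word)1.21
embeds = compel("a (cinematic)1.2 portrait of a corgi, (blurry)0.5")
print(embeds.shape)  # conditioning tensor, passed to the pipeline as prompt_embeds
```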