license: creativeml-openrail-m
tags:
- coreml
- stable-diffusion
- text-to-image
These are Stable Diffusion v1.5 type models and compatible ControlNet v1.1 models that have been converted to Apple's CoreML format.
They are for use with a Swift app or the Swift CLI.
The SD models are all "original" (not split-einsum) and built for CPU and GPU. They are each for the output size noted. They are fp16, with the standard SD-1.5 VAE embedded.
The Stable Diffusion v1.5 model and the other SD 1.5 type models now contain both the standard Unet and the ControlledUnet used for the ControlNet pipeline. The correct one will be used automatically based on whether ControlNet is enabled or not.
They should also include VAEEncoder.mlmodelc bundles that allow Image2Image to operate correctly at all resolutions, with a current Swift CLI pipeline or a current GUI built with ml-stable-diffusion 0.4.0.
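To confirm that an unzipped model folder has the pieces described above, a small check like the following may help. It is only a sketch: `VAEEncoder.mlmodelc` is named in this README, while `Unet.mlmodelc` and `ControlledUnet.mlmodelc` are assumed bundle names inferred from the component names, and `check_model_folder` is a hypothetical helper, not part of ml-stable-diffusion.

```python
from pathlib import Path

# Bundle names are assumptions based on the components named above;
# only VAEEncoder.mlmodelc is spelled out in this README.
FEATURE_BUNDLES = {
    "text-to-image": "Unet.mlmodelc",
    "ControlNet": "ControlledUnet.mlmodelc",
    "image-to-image": "VAEEncoder.mlmodelc",
}


def check_model_folder(model_dir):
    """Return {feature: bool} showing which expected bundles are present.

    Compiled Core ML models (.mlmodelc) are directories, so each bundle
    is tested with is_dir() rather than is_file().
    """
    root = Path(model_dir)
    return {
        feature: (root / bundle).is_dir()
        for feature, bundle in FEATURE_BUNDLES.items()
    }


# Example call (the path is a placeholder for your own unzipped model folder):
# print(check_model_folder("path/to/unzipped/model"))
```

A `False` entry suggests that the corresponding feature may not work with that folder, or that the bundle uses a different name than assumed here.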
All the ControlNet models are also "original" ones, built for CPU and GPU compute units (cpuAndGPU) and for SD-1.5 type models. The smaller files are only 512x512. The larger files each have a set of 4 resolutions. They will not work with split-einsum models or with SD-2.1 type models.
All of the models in this repo will only work with Swift and the current ml-stable-diffusion pipeline (0.4.0). They were not built for a Python diffusers pipeline. For command line use they need apple/ml-stable-diffusion (from GitHub). Otherwise they need a Swift app that supports ControlNet (one is currently in a closed beta test at https://github.com/godly-devotion/MochiDiffusion).
The full SD models are in the "SD" folder here. They are individually zipped and need to be unzipped after downloading.
The ControlNet model files are in the "CN" folder here. They are also zipped and need to be unzipped after downloading. Note that there are 2 file sizes: the smaller files contain a single 512x512 model, and the larger files contain a set of 4: 512x512, 512x768, 768x512, 768x768.
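If you have downloaded several zips, they can be extracted in one pass. The sketch below uses only the Python standard library; `unzip_all` and both folder arguments are hypothetical names chosen for illustration, and Finder or the `unzip` command work just as well.

```python
import zipfile
from pathlib import Path


def unzip_all(download_dir, dest_dir):
    """Extract every .zip found in download_dir into dest_dir.

    Returns the names of the archives that were extracted, in sorted order.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(download_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        extracted.append(archive.name)
    return extracted


# Example call (both paths are placeholders):
# unzip_all("Downloads/SD", "models/SD")
```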
There is also a MISC folder that has text files with my notes and a screencap of my directory structure.
For command line use, it all runs in a miniconda3 environment, covered in one of the notes. If you are using the command line, please read the notes concerning naming and placement of your ControlNet model folder. If you are using a GUI, it will guide you to the correct location/arrangement.
* * * DYSLEXIA ALERT * * * Many of the initially uploaded model files reversed the names on the 512x768 and 768x512 models.
You can just rename them yourself, or download them again as the file names have been corrected.
The sizes are always meant to be WIDTH x HEIGHT. A 512x768 is "portrait" orientation and a 768x512 is "landscape" orientation.
Sorry if my early transposing of sizes messed with your mind.
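The WIDTH x HEIGHT convention can be checked mechanically before renaming anything. This is a minimal sketch: `parse_size` is a hypothetical helper, the set of sizes is taken from the resolutions listed in this README, and the label "square" for equal sides is my own addition.

```python
import re

# The four output sizes offered in this repo, as (width, height).
SUPPORTED_SIZES = {(512, 512), (512, 768), (768, 512), (768, 768)}


def parse_size(label):
    """Parse a 'WIDTHxHEIGHT' label into (width, height, orientation)."""
    match = re.fullmatch(r"(\d+)x(\d+)", label.strip().lower())
    if match is None:
        raise ValueError(f"not a WIDTHxHEIGHT label: {label!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if (width, height) not in SUPPORTED_SIZES:
        raise ValueError(f"size not offered in this repo: {label!r}")
    if width == height:
        orientation = "square"
    elif width < height:
        orientation = "portrait"
    else:
        orientation = "landscape"
    return width, height, orientation


print(parse_size("512x768"))  # (512, 768, 'portrait')
print(parse_size("768x512"))  # (768, 512, 'landscape')
```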
Notes
- There is also a branch to main here called "For-Mochi-Model-Env".
- It was going to be a shortcut version of the conversion and generation pipelines for people who already have a setup for converting models per the Wiki at Mochi Diffusion. Development of a new version of Mochi Diffusion, with ControlNet included, is moving along very quickly, so I don't plan to spend more time on the CLI instructions.
- If you downloaded Stable Diffusion v1.5 Original 768x768 For ControlNet before 4/27/23, or Stable Diffusion v1.5 Original 512x768 before 5/4/23, please re-download. Those models did not support all intended features.
- If you encounter any models that do not work fully with image2image and ControlNet using the current CLI pipeline or Mochi Diffusion 3.2, please leave a report in the Community area here.
Model List
Each zip file contains a single model for the output size indicated: 512x512, 512x768, 768x512 or 768x768.
- Stable Diffusion v1.5, original, for ControlNet & Standard
- MyMerge of 8 1.5-type NSFW models, original, for ControlNet & Standard
- MeinaMix9 1.5-type anime model, original, for ControlNet & Standard
- GhostMix v1.1, 1.5-type anime model, original, for ControlNet & Standard
- Realistic Vision v2.0, 1.5-type model, original, for ControlNet & Standard
- DreamShaper v5.0, 1.5-type model, original, for ControlNet & Standard <<<=== NEW
ControlNet List
The smaller files are 512x512 only. The larger files are a set of 4 resolutions zipped together: 512x512, 512x768, 768x512, 768x768
- Canny -- Edge Detection, Outlines As Input
- Scribble -- Freehand Sketch As Input
- InstrP2P -- Instruct Picture2Picture, Modified By Text ("change dog to cat")
- MLSD -- Find And Reuse Straight Lines And Edges
- InPaint -- Modify An Indicated Area Of An Image (not sure how this works)
- LineArt -- Find And Reuse Small Outlines
- OpenPose -- Copy Body Poses
- SoftEdge -- Find And Reuse Soft Edges
- Tile -- Subtle Variations In Batch Runs
- Depth -- Reproduces Depth Relationships From An Image