File size: 2,913 Bytes
0b5fca3 3bd23fa 15e3502 baba2ad 15e3502 baba2ad 80809cc baba2ad 2be5aaf baba2ad 3ee83de b166b7e 88f7b4c |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
---
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
---
Models better suited for High-Resolution Image Synthesis. The main model (doohickey/doohickey-mega) has been finetuned from [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) near a resolution of 768x768 (suggested method of generating from model is with [Doohickey](https://colab.research.google.com/github/aicrumb/doohickey/blob/main/Doohickey_Diffusion.ipynb)).
Current models:
| name | description | datasets used |
| --- | --- | --- |
| doohickey/doohickey-mega/v1-3000steps.ckpt | first try, rlly good hd, bad results w/ other aspect ratios than 1:1 trained at 704x704 | A-1k|
| doohickey/doohickey-mega/v2-3000steps.ckpt | same as last one but worse | A-1k + ~1k samples from LAION-2b-En-Aesthetic >=768x768 |
| doohickey/doohickey-mega/v3-3000.ckpt | with new CLIP model ([laion/CLIP-ViT-L-14-laion2B-s32B-b82K](https://hf.co/laion/CLIP-ViT-L-14-laion2B-s32B-b82K)) (CLIP model also finetuned the 3k steps), models past this point were trained with various aspect ratios from 640x640 min to 768x768 max resolution. (examples 768x640 or 704x768) | A-1k + E-10k |
| doohickey/doohickey-mega/v3-6000.ckpt | 3k steps on top of v3-3000.ckpt, better at hands! (just UNet finetune, added a RandomHorizontalFlip operation at 50%) | A-1k |
| doohickey/doohickey-mega/v3-7000.ckpt | continuation of last model, I thought Colab would crash after 3k steps but it kept going for a little while saving ckpts every 1k steps. | A-1k |
| doohickey/doohickey-mega/v3-8000.ckpt | see last description, v3-6000 + 2k steps | A-1k |
The currently loaded model for diffusers is doohickey/doohickey-mega/v3-8000.ckpt
Datasets:
| name | description |
| --- | --- |
| A-1K | 1k scraped images, captioned with BLIP (more refined aesthetic) |
| E-10k | 10k scraped images captioned with BLIP (less refined aesthetic) |
_Limitations and Biases from Stable Diffusion also apply to this model._
<div style="font-size:10px">
This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
The CreativeML OpenRAIL License specifies:
1. You can't use the model to deliberately produce nor share illegal or harmful outputs or content
2. The authors claim no rights on the outputs you generate, you are free to use them and are accountable for their use which must not go against the provisions set in the license
3. You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M to all your users (please read the license entirely and carefully)
Please read the full license carefully here: https://huggingface.co/spaces/CompVis/stable-diffusion-license
</div> |