Guizmus committed
Commit 5707811 · Parent: fdd1e7f

Upload 18 files

.gitattributes CHANGED
@@ -32,3 +32,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+showcase.jpg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,115 @@
 ---
+language:
+- en
 license: creativeml-openrail-m
+thumbnail: "https://huggingface.co/Guizmus/SDArt_synesthesia768/resolve/main/showcase.jpg"
+tags:
+- stable-diffusion
+- text-to-image
+- image-to-image
 ---
+# SDArt : Synesthesia (version based on 2.1 768px)
+
+![Showcase](https://huggingface.co/Guizmus/SDArt_synesthesia768/resolve/main/showcase.jpg)
+
+## Theme
+
+Dear @Challengers,
+
+> In a world where colors have flavor, and sounds have texture,
+> The senses intertwine to create a sensory rapture.
+> Welcome to “Synesthesia”, where the ordinary is extraordinary,
+> And art takes on a meaning that’s truly visionary! :Rainbowpink:
+
+Synesthesia /ˌsɪn.əsˈθiː.ʒə/: an anomalous blending of the senses, in which the stimulation of one modality simultaneously produces sensation in a different modality. Synesthetes hear colors, feel sounds, and taste shapes.
+
+* Create an image that captures the essence of the synesthetic experience! Explore the intersections of sound, color, and texture. What’s it like to taste a melody, feel a scent, or see colored words?
+
+* What would it look like to taste the sound of a thunderstorm? How would you visualize the sensation of feeling the warmth of the sun on your skin? Or how about tasting a particular color, such as a bright red apple or a cool blueberry?
+
+* Explore the possibilities of synesthesia and its many interpretations!
+
+
+## Model description
+
+This is a model related to the "Picture of the Week" contest on the Stable Diffusion Discord.
+
+I try to make a model out of all the submissions so that people can keep enjoying the theme after the event, and see a little of their designs in other people's creations. The token stays "SDArt", and I keep the learning on the low side so that the model doesn't just replicate the original creations.
+
+The total dataset is made of 39 pictures. It was trained on [Stable diffusion 2.1 768px](https://huggingface.co/stabilityai/stable-diffusion-2-1). I used [EveryDream](https://github.com/victorchall/EveryDream2trainer) to do the training, with 100 total repeats per picture. The pictures were tagged with the token "SDArt" plus an arbitrary token I chose for each author. The dataset is provided below, as well as a list of usernames and their corresponding tokens.
+
+The recommended sampling is k_Euler_a or DPM++ 2M Karras, 20 steps, CFG scale 7.5 (a diffusers equivalent is sketched after the code example below).
+
+## Trained tokens
+
+* SDArt
+* dyce
+* bnp
+* keel
+* aten
+* fcu
+* lpg
+* mth
+* elio
+* gani
+* pfa
+* kprc
+* cpec
+* kuro
+* asot
+* psst
+* sqm
+* irgc
+* cq
+* utm
+* guin
+* crit
+* mlas
+* isch
+* vedi
+* dds
+* acu
+* oxi
+* kohl
+* maar
+* mako
+* mds
+* mert
+* mgt
+* miki
+* minh
+* mohd
+* mss
+* muc
+* mwf
+
+## Download links
+
+[SafeTensors](https://huggingface.co/Guizmus/SDArt_synesthesia768/resolve/main/SDArt_synesthesia768.safetensors)
+
+[CKPT](https://huggingface.co/Guizmus/SDArt_synesthesia768/resolve/main/SDArt_synesthesia768.ckpt)
+
+[Config (yaml)](https://huggingface.co/Guizmus/SDArt_synesthesia768/resolve/main/SDArt_synesthesia768.yaml)
+
+[Dataset](https://huggingface.co/Guizmus/SDArt_synesthesia768/resolve/main/dataset.zip)
+
+## 🧨 Diffusers
+
+This model can be used just like any other Stable Diffusion model. For more information,
+please have a look at the [Stable Diffusion pipeline documentation](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion).
+
+You can also export the model to [ONNX](https://huggingface.co/docs/diffusers/optimization/onnx), [MPS](https://huggingface.co/docs/diffusers/optimization/mps) and/or [FLAX/JAX]().
+
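+For the MPS backend mentioned above, a minimal sketch (this assumes an Apple Silicon Mac and a PyTorch build with MPS support; the snippet is illustrative, not part of the original card):
+
+```python
+from diffusers import StableDiffusionPipeline
+
+pipe = StableDiffusionPipeline.from_pretrained("Guizmus/SDArt_synesthesia768")
+pipe = pipe.to("mps")  # instead of "cuda"
+
+# the first pass is slow while MPS warms up, so a short warm-up run is common
+_ = pipe("SDArt mako", num_inference_steps=1)
+image = pipe("SDArt mako", num_inference_steps=20).images[0]
+image.save("./SDArt_mps.png")
+```
+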
+```python
+import torch
+from diffusers import StableDiffusionPipeline
+
+# load the fine-tuned pipeline in half precision
+model_id = "Guizmus/SDArt_synesthesia768"
+pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
+pipe = pipe.to("cuda")
+
+# "SDArt" is the shared contest token; "minh" is one of the trained tokens listed above
+prompt = "SDArt minh"
+image = pipe(prompt).images[0]
+
+image.save("./SDArt.png")
+```
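+
+The settings recommended above (DPM++ 2M Karras, 20 steps, CFG 7.5) translate to diffusers roughly as follows. This is a sketch rather than part of the original card: `DPMSolverMultistepScheduler` with Karras sigmas is the usual diffusers counterpart of DPM++ 2M Karras, and the `use_karras_sigmas` option needs a more recent diffusers release than the 0.13 this export targets.
+
+```python
+import torch
+from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
+
+pipe = StableDiffusionPipeline.from_pretrained(
+    "Guizmus/SDArt_synesthesia768", torch_dtype=torch.float16
+).to("cuda")
+
+# swap the default scheduler for DPM++ 2M with the Karras sigma schedule
+pipe.scheduler = DPMSolverMultistepScheduler.from_config(
+    pipe.scheduler.config, use_karras_sigmas=True
+)
+
+# 20 steps and CFG scale 7.5 as recommended; "kuro" is one of the trained tokens
+image = pipe("SDArt kuro", num_inference_steps=20, guidance_scale=7.5).images[0]
+image.save("./SDArt_karras.png")
+```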
SDArt_synesthesia768.ckpt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ccf18e1d4bb60c6373fc6a1b625c4e443d6719ee4ad780f466f32bf82e1d4ed
+size 2580316378
SDArt_synesthesia768.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aed82eed2da89fdb7c2eede893df3c767dcd23e8650c71f933ae573362210ea2
+size 2580068696
SDArt_synesthesia768.yaml ADDED
@@ -0,0 +1,68 @@
+model:
+  base_learning_rate: 1.0e-4
+  target: ldm.models.diffusion.ddpm.LatentDiffusion
+  params:
+    parameterization: "v"
+    linear_start: 0.00085
+    linear_end: 0.0120
+    num_timesteps_cond: 1
+    log_every_t: 200
+    timesteps: 1000
+    first_stage_key: "jpg"
+    cond_stage_key: "txt"
+    image_size: 64
+    channels: 4
+    cond_stage_trainable: false
+    conditioning_key: crossattn
+    monitor: val/loss_simple_ema
+    scale_factor: 0.18215
+    use_ema: False # we set this to false because this is an inference only config
+
+    unet_config:
+      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
+      params:
+        use_checkpoint: True
+        use_fp16: True
+        image_size: 32 # unused
+        in_channels: 4
+        out_channels: 4
+        model_channels: 320
+        attention_resolutions: [ 4, 2, 1 ]
+        num_res_blocks: 2
+        channel_mult: [ 1, 2, 4, 4 ]
+        num_head_channels: 64 # need to fix for flash-attn
+        use_spatial_transformer: True
+        use_linear_in_transformer: True
+        transformer_depth: 1
+        context_dim: 1024
+        legacy: False
+
+    first_stage_config:
+      target: ldm.models.autoencoder.AutoencoderKL
+      params:
+        embed_dim: 4
+        monitor: val/rec_loss
+        ddconfig:
+          #attn_type: "vanilla-xformers"
+          double_z: true
+          z_channels: 4
+          resolution: 256
+          in_channels: 3
+          out_ch: 3
+          ch: 128
+          ch_mult:
+          - 1
+          - 2
+          - 4
+          - 4
+          num_res_blocks: 2
+          attn_resolutions: []
+          dropout: 0.0
+        lossconfig:
+          target: torch.nn.Identity
+
+    cond_stage_config:
+      target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
+      params:
+        freeze: True
+        layer: "penultimate"
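
The yaml above is the standard Stability AI v2 inference config (v-prediction, OpenCLIP text encoder read at the penultimate layer), which tells tools built on the original stablediffusion codebase how to load the .ckpt. As a hedged sketch of that pairing (assumes the stablediffusion repo's `ldm` package is importable; the snippet is illustrative, not part of the upload):

```python
import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config  # helper from the stablediffusion repo

# build the LatentDiffusion model described by the config, then load the weights
config = OmegaConf.load("SDArt_synesthesia768.yaml")
model = instantiate_from_config(config.model)

state = torch.load("SDArt_synesthesia768.ckpt", map_location="cpu")["state_dict"]
model.load_state_dict(state, strict=False)
model.eval()
```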
dataset.zip ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e9e2e39aeb7672c03c12317af8039d4ae100afead12c081b665df1d18c98bf9
+size 6329803
model_index.json ADDED
@@ -0,0 +1,33 @@
+{
+  "_class_name": "StableDiffusionPipeline",
+  "_diffusers_version": "0.13.0",
+  "feature_extractor": [
+    null,
+    null
+  ],
+  "requires_safety_checker": null,
+  "safety_checker": [
+    null,
+    null
+  ],
+  "scheduler": [
+    "diffusers",
+    "DDPMScheduler"
+  ],
+  "text_encoder": [
+    "transformers",
+    "CLIPTextModel"
+  ],
+  "tokenizer": [
+    "transformers",
+    "CLIPTokenizer"
+  ],
+  "unet": [
+    "diffusers",
+    "UNet2DConditionModel"
+  ],
+  "vae": [
+    "diffusers",
+    "AutoencoderKL"
+  ]
+}
scheduler/scheduler_config.json ADDED
@@ -0,0 +1,16 @@
+{
+  "_class_name": "DDPMScheduler",
+  "_diffusers_version": "0.13.0",
+  "beta_end": 0.012,
+  "beta_schedule": "scaled_linear",
+  "beta_start": 0.00085,
+  "clip_sample": false,
+  "clip_sample_range": 1.0,
+  "num_train_timesteps": 1000,
+  "prediction_type": "v_prediction",
+  "set_alpha_to_one": false,
+  "skip_prk_steps": true,
+  "steps_offset": 1,
+  "trained_betas": null,
+  "variance_type": "fixed_small"
+}
showcase.jpg ADDED

Git LFS Details

  • SHA256: ddcfd989ee0b33aa2db76017045dd6a746c51c00c3e2c874a88a6dc46df4b3e5
  • Pointer size: 132 Bytes
  • Size of remote file: 4.56 MB
text_encoder/config.json ADDED
@@ -0,0 +1,25 @@
+{
+  "_name_or_path": "F:\\AI\\Data\\Diffusers\\stable-diffusion-2-1",
+  "architectures": [
+    "CLIPTextModel"
+  ],
+  "attention_dropout": 0.0,
+  "bos_token_id": 0,
+  "dropout": 0.0,
+  "eos_token_id": 2,
+  "hidden_act": "gelu",
+  "hidden_size": 1024,
+  "initializer_factor": 1.0,
+  "initializer_range": 0.02,
+  "intermediate_size": 4096,
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 77,
+  "model_type": "clip_text_model",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 23,
+  "pad_token_id": 1,
+  "projection_dim": 512,
+  "torch_dtype": "float32",
+  "transformers_version": "4.25.1",
+  "vocab_size": 49408
+}
text_encoder/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5a644e09f3adc3db8c3498a33497ef63ea68d156fe9d1939400d0a671004f418
+size 1361677143
tokenizer/merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer/special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+{
+  "bos_token": {
+    "content": "<|startoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "!",
+  "unk_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}
tokenizer/tokenizer_config.json ADDED
@@ -0,0 +1,35 @@
+{
+  "add_prefix_space": false,
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<|startoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "do_lower_case": true,
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "errors": "replace",
+  "model_max_length": 77,
+  "name_or_path": "F:\\AI\\Data\\Diffusers\\stable-diffusion-2-1",
+  "pad_token": "<|endoftext|>",
+  "special_tokens_map_file": "./special_tokens_map.json",
+  "tokenizer_class": "CLIPTokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "use_fast": false
+}
tokenizer/vocab.json ADDED
The diff for this file is too large to render. See raw diff
 
unet/config.json ADDED
@@ -0,0 +1,56 @@
+{
+  "_class_name": "UNet2DConditionModel",
+  "_diffusers_version": "0.13.0",
+  "_name_or_path": "F:\\AI\\Data\\Diffusers\\stable-diffusion-2-1",
+  "act_fn": "silu",
+  "attention_head_dim": [
+    5,
+    10,
+    20,
+    20
+  ],
+  "block_out_channels": [
+    320,
+    640,
+    1280,
+    1280
+  ],
+  "center_input_sample": false,
+  "class_embed_type": null,
+  "conv_in_kernel": 3,
+  "conv_out_kernel": 3,
+  "cross_attention_dim": 1024,
+  "down_block_types": [
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D",
+    "DownBlock2D"
+  ],
+  "downsample_padding": 1,
+  "dual_cross_attention": false,
+  "flip_sin_to_cos": true,
+  "freq_shift": 0,
+  "in_channels": 4,
+  "layers_per_block": 2,
+  "mid_block_scale_factor": 1,
+  "mid_block_type": "UNetMidBlock2DCrossAttn",
+  "norm_eps": 1e-05,
+  "norm_num_groups": 32,
+  "num_class_embeds": null,
+  "only_cross_attention": false,
+  "out_channels": 4,
+  "projection_class_embeddings_input_dim": null,
+  "resnet_time_scale_shift": "default",
+  "sample_size": 96,
+  "time_cond_proj_dim": null,
+  "time_embedding_type": "positional",
+  "timestep_post_act": null,
+  "up_block_types": [
+    "UpBlock2D",
+    "CrossAttnUpBlock2D",
+    "CrossAttnUpBlock2D",
+    "CrossAttnUpBlock2D"
+  ],
+  "upcast_attention": true,
+  "use_linear_projection": true
+}
unet/diffusion_pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:559ecb8d22a434b60d05c6337a103a0b82c08e01de239c8864b57ea657ac6e56
+size 3463923045
vae/config.json ADDED
@@ -0,0 +1,31 @@
+{
+  "_class_name": "AutoencoderKL",
+  "_diffusers_version": "0.13.0",
+  "_name_or_path": "F:\\AI\\Data\\Diffusers\\stable-diffusion-2-1",
+  "act_fn": "silu",
+  "block_out_channels": [
+    128,
+    256,
+    512,
+    512
+  ],
+  "down_block_types": [
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D"
+  ],
+  "in_channels": 3,
+  "latent_channels": 4,
+  "layers_per_block": 2,
+  "norm_num_groups": 32,
+  "out_channels": 3,
+  "sample_size": 768,
+  "scaling_factor": 0.18215,
+  "up_block_types": [
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D"
+  ]
+}
vae/diffusion_pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eb128b1f37e0c381c440128b217d29613b3e08b9e4ea7f20466424145ba538b0
+size 167402961