JFoz committed
Commit
cf8f435
1 Parent(s): 77366b0

Update README.md

Files changed (1)
  1. README.md +11 -139
README.md CHANGED
@@ -14,63 +14,29 @@ library_name: diffusers
14
 
15
  # controlnet- JFoz/dog-cat-pose
16
 
17
- These are controlnet weights trained on runwayml/stable-diffusion-v1-5 with pose conditioning generated using the animalpose model of OpenPifPaf
18
- You can find some example images in the following.
19
 
20
  prompt: a tortoiseshell cat is sitting on a cushion
21
  ![images_0](./images_0.png)
22
  prompt: a yellow dog standing on a lawn
23
  ![images_1](./images_1.png)
24
 
25
 
26
 
27
  # Model Card for dog-cat-pose
28
 
29
- <!-- Provide a quick summary of what the model is/does. [Optional] -->
30
  This is a ControlNet model which allows users to control the pose of a dog or cat. Poses were extracted from images using the animalpose model of OpenPifPaf (https://openpifpaf.github.io/intro.html). Skeleton colouring is as shown in the dataset. See also https://huggingface.co/JFoz/dog-pose
31
 
32
 
33
 
34
-
35
- # Table of Contents
36
-
37
- - [Model Card for dog-cat-pose](#model-card-for--model_id-)
38
- - [Table of Contents](#table-of-contents)
39
- - [Table of Contents](#table-of-contents-1)
40
- - [Model Details](#model-details)
41
- - [Model Description](#model-description)
42
- - [Uses](#uses)
43
- - [Direct Use](#direct-use)
44
- - [Downstream Use [Optional]](#downstream-use-optional)
45
- - [Out-of-Scope Use](#out-of-scope-use)
46
- - [Bias, Risks, and Limitations](#bias-risks-and-limitations)
47
- - [Recommendations](#recommendations)
48
- - [Training Details](#training-details)
49
- - [Training Data](#training-data)
50
- - [Training Procedure](#training-procedure)
51
- - [Preprocessing](#preprocessing)
52
- - [Speeds, Sizes, Times](#speeds-sizes-times)
53
- - [Evaluation](#evaluation)
54
- - [Testing Data, Factors & Metrics](#testing-data-factors--metrics)
55
- - [Testing Data](#testing-data)
56
- - [Factors](#factors)
57
- - [Metrics](#metrics)
58
- - [Results](#results)
59
- - [Model Examination](#model-examination)
60
- - [Environmental Impact](#environmental-impact)
61
- - [Technical Specifications [optional]](#technical-specifications-optional)
62
- - [Model Architecture and Objective](#model-architecture-and-objective)
63
- - [Compute Infrastructure](#compute-infrastructure)
64
- - [Hardware](#hardware)
65
- - [Software](#software)
66
- - [Citation](#citation)
67
- - [Glossary [optional]](#glossary-optional)
68
- - [More Information [optional]](#more-information-optional)
69
- - [Model Card Authors [optional]](#model-card-authors-optional)
70
- - [Model Card Contact](#model-card-contact)
71
- - [How to Get Started with the Model](#how-to-get-started-with-the-model)
72
-
73
-
74
  # Model Details
75
 
76
  ## Model Description
@@ -83,7 +49,7 @@ This is an ControlNet model which allows users to control the pose of a dog or c
83
  - **Language(s) (NLP):** en
84
  - **License:** openrail
85
  - **Parent Model:** https://huggingface.co/runwayml/stable-diffusion-v1-5
86
- - **Resources for more information:** More information needed
87
  - [GitHub Repo](https://github.com/jfozard/animalpose/tree/f1be80ed29886a1314054b87f2a8944ea98997ac)
88
 
89
 
@@ -99,7 +65,6 @@ This is an ControlNet model which allows users to control the pose of a dog or c
99
  Supply a suitable, potentially incomplete pose image along with a relevant text prompt.
100
 
101
 
102
-
103
  ## Out-of-Scope Use
104
 
105
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
@@ -112,7 +77,7 @@ Generating images of non-animals. We advise retaining the stable diffusion safet
112
 
113
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
114
 
115
-
116
 
117
 
118
  ## Recommendations
@@ -131,7 +96,6 @@ Maintain careful supervision of model inputs and outputs.
131
 
132
  Trained on a subset of LAION-5B selected using clip retrieval with the prompts "a photo of a (dog/cat) (standing/walking)".
133
 
134
-
135
  ## Training Procedure
136
 
137
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
@@ -140,96 +104,20 @@ Trained on a subset of Laion-5B using clip retrieval with the prompts &#34;a pho
140
 
141
  Images were rescaled to 512 along their short edge and centrally cropped. The OpenPifPaf pose-detection model was used to extract poses, which were used to generate conditioning images.
142
 
143
- ### Speeds, Sizes, Times
144
-
145
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
146
-
147
- More information needed
148
-
149
- # Evaluation
150
-
151
- <!-- This section describes the evaluation protocols and provides the results. -->
152
-
153
- ## Testing Data, Factors & Metrics
154
-
155
- ### Testing Data
156
-
157
- <!-- This should link to a Data Card if possible. -->
158
-
159
- More information needed
160
-
161
-
162
- ### Factors
163
-
164
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
165
-
166
- More information needed
167
-
168
- ### Metrics
169
-
170
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
171
-
172
- More information needed
173
-
174
- ## Results
175
 
176
- More information needed
177
 
178
- # Model Examination
179
-
180
- More information needed
181
-
182
- # Environmental Impact
183
-
184
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
185
-
186
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
187
-
188
- - **Hardware Type:** More information needed
189
- - **Hours used:** More information needed
190
- - **Cloud Provider:** More information needed
191
- - **Compute Region:** More information needed
192
- - **Carbon Emitted:** More information needed
193
-
194
- # Technical Specifications [optional]
195
-
196
- ## Model Architecture and Objective
197
-
198
- More information needed
199
 
200
  ## Compute Infrastructure
201
 
202
  TPUv4i
203
 
204
- ### Hardware
205
 
206
- More information needed
207
 
208
  ### Software
209
 
210
  Flax Stable Diffusion ControlNet pipeline
211
 
212
- # Citation
213
-
214
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
215
-
216
- **BibTeX:**
217
-
218
- More information needed
219
-
220
- **APA:**
221
 
222
- More information needed
223
-
224
- # Glossary [optional]
225
-
226
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
227
-
228
- More information needed
229
-
230
- # More Information [optional]
231
-
232
- More information needed
233
 
234
  # Model Card Authors [optional]
235
 
@@ -237,19 +125,3 @@ More information needed
237
 
238
  John Fozard
239
 
240
- # Model Card Contact
241
-
242
- More information needed
243
-
244
- # How to Get Started with the Model
245
-
246
- Use the code below to get started with the model.
247
-
248
- <details>
249
- <summary> Click to expand </summary>
250
-
251
- from diffusers import DiffusionPipeline
252
-
253
- pipeline = DiffusionPipeline.from_pretrained("dog-cat-pose")
254
-
255
- </details>
 
14
 
15
  # controlnet- JFoz/dog-cat-pose
16
 
17
+ A simple ControlNet model made as part of the HF JAX/Diffusers community sprint.
18
+
19
+ These are controlnet weights trained on runwayml/stable-diffusion-v1-5 with pose conditioning generated using the animalpose model of OpenPifPaf.
20
+
21
+ Some example images are shown below.
22
 
23
  prompt: a tortoiseshell cat is sitting on a cushion
24
  ![images_0](./images_0.png)
25
  prompt: a yellow dog standing on a lawn
26
  ![images_1](./images_1.png)
27
 
28
+ While this is not the dataset used to train this model, a smaller dataset with the same
29
+ conditioning-image format can be found at https://huggingface.co/datasets/JFoz/dog-poses-controlnet-dataset
30
+
31
+ The dataset was generated using the code at https://github.com/jfozard/animalpose/tree/f1be80ed29886a1314054b87f2a8944ea98997ac
32
 
33
 
34
  # Model Card for dog-cat-pose
35
 
 
36
  This is a ControlNet model which allows users to control the pose of a dog or cat. Poses were extracted from images using the animalpose model of OpenPifPaf (https://openpifpaf.github.io/intro.html). Skeleton colouring is as shown in the dataset. See also https://huggingface.co/JFoz/dog-pose
37
 
38
 
39
 
40
  # Model Details
41
 
42
  ## Model Description
 
49
  - **Language(s) (NLP):** en
50
  - **License:** openrail
51
  - **Parent Model:** https://huggingface.co/runwayml/stable-diffusion-v1-5
52
+ - **Resources for more information:**
53
  - [GitHub Repo](https://github.com/jfozard/animalpose/tree/f1be80ed29886a1314054b87f2a8944ea98997ac)
54
 
55
 
 
65
  Supply a suitable, potentially incomplete pose image along with a relevant text prompt.
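As an illustration of this workflow, a minimal diffusers sketch is shown below. The repo id `JFoz/dog-cat-pose`, the availability of a CUDA device, and the local `pose.png` conditioning image are assumptions made for the example, not details taken from this card.

```python
# Minimal sketch, not the card's official usage example.
# Assumptions: the weights load as a ControlNetModel from "JFoz/dog-cat-pose",
# a CUDA device is available, and "pose.png" is a conditioning image drawn
# with the dataset's skeleton colouring.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained("JFoz/dog-cat-pose", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

pose = load_image("pose.png")  # suitable, possibly incomplete pose image
result = pipe(
    "a yellow dog standing on a lawn",
    image=pose,
    num_inference_steps=30,
).images[0]
result.save("dog.png")
```

`StableDiffusionControlNetPipeline` keeps the base model's safety checker enabled by default, in line with the card's advice to retain it.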
66
 
67
 
 
68
  ## Out-of-Scope Use
69
 
70
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 
77
 
78
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
79
 
80
+ The model was trained on a relatively small dataset and may be overfit to those images.
81
 
82
 
83
  ## Recommendations
 
96
 
97
  Trained on a subset of LAION-5B selected using clip retrieval with the prompts "a photo of a (dog/cat) (standing/walking)".
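For context on the retrieval step, a hedged sketch using the public clip-retrieval client is shown below; the service URL, index name, and client arguments are assumptions, and this is not the script actually used to build the training set.

```python
# Sketch only: endpoint, index name, and arguments are assumptions and may
# need adjusting; this is not the exact retrieval run used for training.
from clip_retrieval.clip_client import ClipClient

client = ClipClient(
    url="https://knn.laion.ai/knn-service",  # public LAION knn service (assumed)
    indice_name="laion5B-L-14",              # LAION-5B CLIP index (assumed)
    num_images=1000,
)

results = []
for animal in ("dog", "cat"):
    for action in ("standing", "walking"):
        # Each hit is a dict with keys such as "url", "caption" and "similarity".
        results += client.query(text=f"a photo of a {animal} {action}")

print(f"retrieved {len(results)} candidate images")
```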
98
 
 
99
  ## Training Procedure
100
 
101
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
104
 
105
  Images were rescaled to 512 along their short edge and centrally cropped. The OpenPifPaf pose-detection model was used to extract poses, which were used to generate conditioning images.
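A minimal sketch of that preprocessing is shown below. The resize and crop follow the description above; the OpenPifPaf checkpoint name is an assumption, and rendering the coloured skeleton onto a conditioning canvas is omitted.

```python
# Sketch of the described preprocessing: short edge to 512, central crop,
# then keypoint extraction with OpenPifPaf. The checkpoint name is assumed.
from PIL import Image
import openpifpaf

def resize_and_center_crop(img, size=512):
    w, h = img.size
    scale = size / min(w, h)
    img = img.resize((round(w * scale), round(h * scale)))
    w, h = img.size
    left, top = (w - size) // 2, (h - size) // 2
    return img.crop((left, top, left + size, top + size))

image = resize_and_center_crop(Image.open("dog.jpg").convert("RGB"))

predictor = openpifpaf.Predictor(checkpoint="shufflenetv2k30-animalpose")
predictions, _, _ = predictor.pil_image(image)
keypoints = [ann.data for ann in predictions]  # one (x, y, confidence) array per detected animal
```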
106
 
107
 
 
108
 
109
 
110
  ## Compute Infrastructure
111
 
112
  TPUv4i
113
 
 
114
 
 
115
 
116
  ### Software
117
 
118
  Flax Stable Diffusion ControlNet pipeline
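For reference, loading these weights back into the Flax pipeline might look like the sketch below; the repo id, dtype, and the `flax` revision of the base model are assumptions, and running inference additionally requires replicating the params and sharding inputs across devices.

```python
# Loading sketch only; ids, dtype and the "flax" revision are assumptions,
# and multi-device inference (replication, sharding, PRNG keys) is omitted.
import jax.numpy as jnp
from diffusers import FlaxControlNetModel, FlaxStableDiffusionControlNetPipeline

controlnet, controlnet_params = FlaxControlNetModel.from_pretrained(
    "JFoz/dog-cat-pose", dtype=jnp.float32
)
pipe, params = FlaxStableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    revision="flax",
    dtype=jnp.float32,
)
params["controlnet"] = controlnet_params
```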
119
 
120
 
121
 
122
  # Model Card Authors [optional]
123
 
 
125
 
126
  John Fozard
127