yhzhai committed on
Commit
8b19867
·
1 Parent(s): ed9c4b4
Files changed (4)
  1. .gitignore +180 -0
  2. README.md +8 -1
  3. app.py +349 -103
  4. requirements.txt +87 -6
.gitignore ADDED
@@ -0,0 +1,180 @@
 
1
+ # Created by https://www.toptal.com/developers/gitignore/api/python
2
+ # Edit at https://www.toptal.com/developers/gitignore?templates=python
3
+
4
+ ### Python ###
5
+ # Byte-compiled / optimized / DLL files
6
+ __pycache__/
7
+ *.py[cod]
8
+ *$py.class
9
+
10
+ # C extensions
11
+ *.so
12
+
13
+ # Distribution / packaging
14
+ .Python
15
+ build/
16
+ develop-eggs/
17
+ dist/
18
+ downloads/
19
+ eggs/
20
+ .eggs/
21
+ lib/
22
+ lib64/
23
+ parts/
24
+ sdist/
25
+ var/
26
+ wheels/
27
+ share/python-wheels/
28
+ *.egg-info/
29
+ .installed.cfg
30
+ *.egg
31
+ MANIFEST
32
+
33
+ # PyInstaller
34
+ # Usually these files are written by a python script from a template
35
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
36
+ *.manifest
37
+ *.spec
38
+
39
+ # Installer logs
40
+ pip-log.txt
41
+ pip-delete-this-directory.txt
42
+
43
+ # Unit test / coverage reports
44
+ htmlcov/
45
+ .tox/
46
+ .nox/
47
+ .coverage
48
+ .coverage.*
49
+ .cache
50
+ nosetests.xml
51
+ coverage.xml
52
+ *.cover
53
+ *.py,cover
54
+ .hypothesis/
55
+ .pytest_cache/
56
+ cover/
57
+
58
+ # Translations
59
+ *.mo
60
+ *.pot
61
+
62
+ # Django stuff:
63
+ *.log
64
+ local_settings.py
65
+ db.sqlite3
66
+ db.sqlite3-journal
67
+
68
+ # Flask stuff:
69
+ instance/
70
+ .webassets-cache
71
+
72
+ # Scrapy stuff:
73
+ .scrapy
74
+
75
+ # Sphinx documentation
76
+ docs/_build/
77
+
78
+ # PyBuilder
79
+ .pybuilder/
80
+ target/
81
+
82
+ # Jupyter Notebook
83
+ .ipynb_checkpoints
84
+
85
+ # IPython
86
+ profile_default/
87
+ ipython_config.py
88
+
89
+ # pyenv
90
+ # For a library or package, you might want to ignore these files since the code is
91
+ # intended to run in multiple environments; otherwise, check them in:
92
+ # .python-version
93
+
94
+ # pipenv
95
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
96
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
97
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
98
+ # install all needed dependencies.
99
+ #Pipfile.lock
100
+
101
+ # poetry
102
+ # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
103
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
104
+ # commonly ignored for libraries.
105
+ # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
106
+ #poetry.lock
107
+
108
+ # pdm
109
+ # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
110
+ #pdm.lock
111
+ # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
112
+ # in version control.
113
+ # https://pdm.fming.dev/#use-with-ide
114
+ .pdm.toml
115
+
116
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
117
+ __pypackages__/
118
+
119
+ # Celery stuff
120
+ celerybeat-schedule
121
+ celerybeat.pid
122
+
123
+ # SageMath parsed files
124
+ *.sage.py
125
+
126
+ # Environments
127
+ .env
128
+ .venv
129
+ env/
130
+ venv/
131
+ ENV/
132
+ env.bak/
133
+ venv.bak/
134
+
135
+ # Spyder project settings
136
+ .spyderproject
137
+ .spyproject
138
+
139
+ # Rope project settings
140
+ .ropeproject
141
+
142
+ # mkdocs documentation
143
+ /site
144
+
145
+ # mypy
146
+ .mypy_cache/
147
+ .dmypy.json
148
+ dmypy.json
149
+
150
+ # Pyre type checker
151
+ .pyre/
152
+
153
+ # pytype static type analyzer
154
+ .pytype/
155
+
156
+ # Cython debug symbols
157
+ cython_debug/
158
+
159
+ # PyCharm
160
+ # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
161
+ # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
162
+ # and can be added to the global gitignore or merged into this file. For a more nuclear
163
+ # option (not recommended) you can uncomment the following to ignore the entire idea folder.
164
+ #.idea/
165
+
166
+ ### Python Patch ###
167
+ # Poetry local configuration file - https://python-poetry.org/docs/configuration/#local-configuration
168
+ poetry.toml
169
+
170
+ # ruff
171
+ .ruff_cache/
172
+
173
+ # LSP config files
174
+ pyrightconfig.json
175
+
176
+ # End of https://www.toptal.com/developers/gitignore/api/python
177
+
178
+ gradio_cached_examples
179
+ *.DS_Store
180
+ samples
README.md CHANGED
@@ -1,5 +1,5 @@
1
  ---
2
- title: Mcm
3
  emoji: 🖼
4
  colorFrom: purple
5
  colorTo: red
@@ -8,6 +8,13 @@ sdk_version: 4.26.0
8
  app_file: app.py
9
  pinned: false
10
  license: apache-2.0
11
  ---
12
 
13
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
1
  ---
2
+ title: Motion Consistency Model - Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
3
  emoji: 🖼
4
  colorFrom: purple
5
  colorTo: red
 
8
  app_file: app.py
9
  pinned: false
10
  license: apache-2.0
11
+ short_description: Detect and locate image manipulations.
12
+ preload_from_hub:
13
+ - yhzhai/mcm
14
+ - ali-vilab/text-to-video-ms-1.7b
15
+ - runwayml/stable-diffusion-v1-5
16
+ - emilianJR/epiCRealism
17
+ - SG161222/Realistic_Vision_V6.0_B1_noVAE
18
  ---
19
 
20
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
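Review note: `preload_from_hub` lists five repositories, while the app.py changes in this commit also load `guoyww/animatediff-motion-adapter-v1-5-2` (the default `motion_module_path`). The sketch below is a minimal consistency check, assuming the two sets are copied by hand from this diff; `missing_from_preload` is a hypothetical helper and not part of the Space.

```python
# Repos listed under preload_from_hub in README.md (copied from this diff).
PRELOADED = {
    "yhzhai/mcm",
    "ali-vilab/text-to-video-ms-1.7b",
    "runwayml/stable-diffusion-v1-5",
    "emilianJR/epiCRealism",
    "SG161222/Realistic_Vision_V6.0_B1_noVAE",
}

# Repos that app.py passes to from_pretrained (copied from this diff).
USED_BY_APP = {
    "yhzhai/mcm",
    "ali-vilab/text-to-video-ms-1.7b",
    "runwayml/stable-diffusion-v1-5",
    "emilianJR/epiCRealism",
    "SG161222/Realistic_Vision_V6.0_B1_noVAE",
    "guoyww/animatediff-motion-adapter-v1-5-2",
}


def missing_from_preload(used, preloaded):
    """Hypothetical helper: repos the app loads that are not preloaded."""
    return sorted(set(used) - set(preloaded))


print(missing_from_preload(USED_BY_APP, PRELOADED))
```

If the motion adapter should also be fetched at build time, it would presumably need its own entry in `preload_from_hub`.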
app.py CHANGED
@@ -1,70 +1,334 @@
1
  import gradio as gr
2
  import numpy as np
3
- import random
4
- from diffusers import DiffusionPipeline
5
  import torch
6
 
7
  device = "cuda" if torch.cuda.is_available() else "cpu"
8
 
9
- if torch.cuda.is_available():
10
- torch.cuda.max_memory_allocated(device=device)
11
- pipe = DiffusionPipeline.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16", use_safetensors=True)
12
- pipe.enable_xformers_memory_efficient_attention()
13
  pipe = pipe.to(device)
14
- else:
15
- pipe = DiffusionPipeline.from_pretrained("stabilityai/sdxl-turbo", use_safetensors=True)
16
  pipe = pipe.to(device)
17
 
18
- MAX_SEED = np.iinfo(np.int32).max
19
- MAX_IMAGE_SIZE = 1024
20
 
21
- def infer(prompt, negative_prompt, seed, randomize_seed, width, height, guidance_scale, num_inference_steps):
22
 
23
  if randomize_seed:
24
  seed = random.randint(0, MAX_SEED)
25
-
26
  generator = torch.Generator().manual_seed(seed)
27
-
28
- image = pipe(
29
- prompt = prompt,
30
- negative_prompt = negative_prompt,
31
- guidance_scale = guidance_scale,
32
- num_inference_steps = num_inference_steps,
33
- width = width,
34
- height = height,
35
- generator = generator
36
- ).images[0]
37
-
38
- return image
39
 
40
  examples = [
41
- "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
42
- "An astronaut riding a green horse",
43
- "A delicious ceviche cheesecake slice",
44
  ]
45
 
46
- css="""
47
  #col-container {
48
  margin: 0 auto;
49
- max-width: 520px;
50
  }
51
  """
52
 
53
- if torch.cuda.is_available():
54
- power_device = "GPU"
55
- else:
56
- power_device = "CPU"
57
 
58
  with gr.Blocks(css=css) as demo:
59
-
60
  with gr.Column(elem_id="col-container"):
61
- gr.Markdown(f"""
62
- # Text-to-Image Gradio Template
63
- Currently running on {power_device}.
64
- """)
65
-
66
  with gr.Row():
67
-
68
  prompt = gr.Text(
69
  label="Prompt",
70
  show_label=False,
@@ -72,75 +336,57 @@ with gr.Blocks(css=css) as demo:
72
  placeholder="Enter your prompt",
73
  container=False,
74
  )
75
-
76
  run_button = gr.Button("Run", scale=0)
77
-
78
- result = gr.Image(label="Result", show_label=False)
79
 
80
- with gr.Accordion("Advanced Settings", open=False):
81
-
82
- negative_prompt = gr.Text(
83
- label="Negative prompt",
84
- max_lines=1,
85
- placeholder="Enter a negative prompt",
86
- visible=False,
87
- )
88
-
89
- seed = gr.Slider(
90
- label="Seed",
91
- minimum=0,
92
- maximum=MAX_SEED,
93
- step=1,
94
- value=0,
95
- )
96
-
97
- randomize_seed = gr.Checkbox(label="Randomize seed", value=True)
98
-
99
- with gr.Row():
100
-
101
- width = gr.Slider(
102
- label="Width",
103
- minimum=256,
104
- maximum=MAX_IMAGE_SIZE,
105
- step=32,
106
- value=512,
107
- )
108
-
109
- height = gr.Slider(
110
- label="Height",
111
- minimum=256,
112
- maximum=MAX_IMAGE_SIZE,
113
- step=32,
114
- value=512,
115
- )
116
-
117
- with gr.Row():
118
-
119
- guidance_scale = gr.Slider(
120
- label="Guidance scale",
121
- minimum=0.0,
122
- maximum=10.0,
123
- step=0.1,
124
- value=0.0,
125
- )
126
-
127
- num_inference_steps = gr.Slider(
128
- label="Number of inference steps",
129
- minimum=1,
130
- maximum=12,
131
- step=1,
132
- value=2,
133
  )
134
-
135
  gr.Examples(
136
- examples = examples,
137
- inputs = [prompt]
138
  )
139
 
140
  run_button.click(
141
- fn = infer,
142
- inputs = [prompt, negative_prompt, seed, randomize_seed, width, height, guidance_scale, num_inference_steps],
143
- outputs = [result]
144
  )
145
 
146
- demo.queue().launch()
 
1
+ import os
2
+ import random
3
+ from datetime import datetime
4
+ from typing import Optional
5
+
6
  import gradio as gr
7
  import numpy as np
8
  import torch
9
+ from diffusers import (
10
+ AnimateDiffPipeline,
11
+ DiffusionPipeline,
12
+ LCMScheduler,
13
+ MotionAdapter,
14
+ )
15
+ from diffusers.utils import export_to_video
16
+ from peft import PeftModel
17
 
18
  device = "cuda" if torch.cuda.is_available() else "cpu"
19
+ mcm_id = "yhzhai/mcm"
20
+ basedir = os.getcwd()
21
+ savedir = os.path.join(
22
+ basedir, "samples", datetime.now().strftime("Gradio-%Y-%m-%dT%H-%M-%S")
23
+ )
24
+
25
+ MAX_SEED = np.iinfo(np.int32).max
26
+
27
+
28
+ def get_modelscope_pipeline(
29
+ mcm_variant: Optional[str] = "WebVid",
30
+ ):
31
+ model_id = "ali-vilab/text-to-video-ms-1.7b"
32
+ pipe = DiffusionPipeline.from_pretrained(
33
+ model_id, torch_dtype=torch.float16, variant="fp16"
34
+ )
35
+ scheduler = LCMScheduler.from_pretrained(
36
+ model_id,
37
+ subfolder="scheduler",
38
+ timestep_scaling=4.0,
39
+ )
40
+ pipe.scheduler = scheduler
41
+ pipe.enable_vae_slicing()
42
+
43
+ if mcm_variant == "WebVid":
44
+ subfolder = "modelscopet2v-webvid"
45
+ elif mcm_variant == "LAION-aes":
46
+ subfolder = "modelscopet2v-laion"
47
+ elif mcm_variant == "Anime":
48
+ subfolder = "modelscopet2v-anime"
49
+ elif mcm_variant == "Realistic":
50
+ subfolder = "modelscopet2v-real"
51
+ elif mcm_variant == "3D Cartoon":
52
+ subfolder = "modelscopet2v-3d-cartoon"
53
+ else:
54
+ subfolder = "modelscopet2v-laion"
55
+
56
+ lora = PeftModel.from_pretrained(
57
+ pipe.unet,
58
+ model_id=mcm_id,
59
+ subfolder=subfolder,
60
+ adapter_name="lora",
61
+ torch_device="cpu",
62
+ )
63
+ lora.merge_and_unload()
64
+ pipe.unet = lora
65
 
66
  pipe = pipe.to(device)
67
+
68
+ return pipe
69
+
70
+
71
+ def get_animatediff_pipeline(
72
+ real_variant: Optional[str] = "realvision",
73
+ motion_module_path: str = "guoyww/animatediff-motion-adapter-v1-5-2",
74
+ mcm_variant: Optional[str] = "WebVid",
75
+ ):
76
+ if real_variant is None:
77
+ model_id = "runwayml/stable-diffusion-v1-5"
78
+ elif real_variant == "epicrealism":
79
+ model_id = "emilianJR/epiCRealism"
80
+ elif real_variant == "realvision":
81
+ model_id = "SG161222/Realistic_Vision_V6.0_B1_noVAE"
82
+ else:
83
+ raise ValueError(f"Unknown real_variant {real_variant}")
84
+
85
+ adapter = MotionAdapter.from_pretrained(
86
+ motion_module_path, torch_dtype=torch.float16
87
+ )
88
+ pipe = AnimateDiffPipeline.from_pretrained(
89
+ model_id,
90
+ motion_adapter=adapter,
91
+ torch_dtype=torch.float16,
92
+ )
93
+ scheduler = LCMScheduler.from_pretrained(
94
+ model_id,
95
+ subfolder="scheduler",
96
+ timestep_scaling=4.0,
97
+ clip_sample=False,
98
+ timestep_spacing="linspace",
99
+ beta_schedule="linear",
100
+ beta_start=0.00085,
101
+ beta_end=0.012,
102
+ steps_offset=1,
103
+ )
104
+ pipe.scheduler = scheduler
105
+ pipe.enable_vae_slicing()
106
+
107
+ if mcm_variant == "WebVid":
108
+ subfolder = "animatediff-webvid"
109
+ elif mcm_variant == "LAION-aes":
110
+ subfolder = "animatediff-laion"
111
+ else:
112
+ subfolder = "animatediff-laion"
113
+
114
+ lora = PeftModel.from_pretrained(
115
+ pipe.unet,
116
+ model_id=mcm_id,
117
+ subfolder=subfolder,
118
+ adapter_name="lora",
119
+ torch_device="cpu",
120
+ )
121
+ lora.merge_and_unload()
122
+ pipe.unet = lora
123
+
124
  pipe = pipe.to(device)
125
+ return pipe
126
 
127
 
128
+ # pipe_dict = {
129
+ # "ModelScope T2V": {"WebVid": None, "LAION-aes": None, "Anime": None, "Realistic": None, "3D Cartoon": None},
130
+ # "AnimateDiff (SD1.5)": {"WebVid": None, "LAION-aes": None},
131
+ # "AnimateDiff (RealisticVision)": {"WebVid": None, "LAION-aes": None},
132
+ # "AnimateDiff (epiCRealism)": {"WebVid": None, "LAION-aes": None},
133
+ # }
134
+ cache_pipeline = {
135
+ "base_model": None,
136
+ "variant": None,
137
+ "pipeline": None,
138
+ }
139
+
140
+
141
+ def infer(
142
+ base_model, variant, prompt, seed=0, randomize_seed=True, num_inference_steps=4
143
+ ):
144
+ # if pipe_dict[base_model][variant] is None:
145
+ # if base_model == "ModelScope T2V":
146
+ # pipe_dict[base_model][variant] = get_modelscope_pipeline(mcm_variant=variant)
147
+ # elif base_model == "AnimateDiff (SD1.5)":
148
+ # pipe_dict[base_model][variant] = get_animatediff_pipeline(
149
+ # real_variant=None,
150
+ # motion_module_path="guoyww/animatediff-motion-adapter-v1-5-2",
151
+ # mcm_variant=variant,
152
+ # )
153
+ # elif base_model == "AnimateDiff (RealisticVision)":
154
+ # pipe_dict[base_model][variant] = get_animatediff_pipeline(
155
+ # real_variant="realvision",
156
+ # motion_module_path="guoyww/animatediff-motion-adapter-v1-5-2",
157
+ # mcm_variant=variant,
158
+ # )
159
+ # elif base_model == "AnimateDiff (epiCRealism)":
160
+ # pipe_dict[base_model][variant] = get_animatediff_pipeline(
161
+ # real_variant="epicrealism",
162
+ # motion_module_path="guoyww/animatediff-motion-adapter-v1-5-2",
163
+ # mcm_variant=variant,
164
+ # )
165
+ # else:
166
+ # raise ValueError(f"Unknown base_model {base_model}")
167
+ if (
168
+ cache_pipeline["base_model"] == base_model
169
+ and cache_pipeline["variant"] == variant
170
+ ):
171
+ pass
172
+ else:
173
+ if base_model == "ModelScope T2V":
174
+ pipeline = get_modelscope_pipeline(mcm_variant=variant)
175
+ elif base_model == "AnimateDiff (SD1.5)":
176
+ pipeline = get_animatediff_pipeline(
177
+ real_variant=None,
178
+ motion_module_path="guoyww/animatediff-motion-adapter-v1-5-2",
179
+ mcm_variant=variant,
180
+ )
181
+ elif base_model == "AnimateDiff (RealisticVision)":
182
+ pipeline = get_animatediff_pipeline(
183
+ real_variant="realvision",
184
+ motion_module_path="guoyww/animatediff-motion-adapter-v1-5-2",
185
+ mcm_variant=variant,
186
+ )
187
+ elif base_model == "AnimateDiff (epiCRealism)":
188
+ pipeline = get_animatediff_pipeline(
189
+ real_variant="epicrealism",
190
+ motion_module_path="guoyww/animatediff-motion-adapter-v1-5-2",
191
+ mcm_variant=variant,
192
+ )
193
+ else:
194
+ raise ValueError(f"Unknown base_model {base_model}")
195
+
196
+ cache_pipeline["base_model"] = base_model
197
+ cache_pipeline["variant"] = variant
198
+ cache_pipeline["pipeline"] = pipeline
199
 
200
  if randomize_seed:
201
  seed = random.randint(0, MAX_SEED)
202
+
203
  generator = torch.Generator().manual_seed(seed)
204
+
205
+ output = cache_pipeline["pipeline"](
206
+ prompt=prompt,
207
+ num_frames=16,
208
+ guidance_scale=1.0,
209
+ num_inference_steps=num_inference_steps,
210
+ generator=generator,
211
+ ).frames
212
+ if not isinstance(output, list):
213
+ output = [output[i] for i in range(output.shape[0])]
214
+
215
+ os.makedirs(savedir, exist_ok=True)
216
+ save_path = os.path.join(
217
+ savedir, f"sample_{base_model}_{variant}_{seed}.mp4".replace(" ", "_")
218
+ )
219
+ export_to_video(
220
+ output[0],
221
+ save_path,
222
+ fps=7,
223
+ )
224
+ print(f"Saved to {save_path}")
225
+ return save_path
226
+
227
 
228
  examples = [
229
+ [
230
+ "ModelScope T2V",
231
+ "LAION-aes",
232
+ "Aerial uhd 4k view. mid-air flight over fresh and clean mountain river at sunny summer morning. Green trees and sun rays on horizon. Direct on sun.",
233
+ ],
234
+ ["ModelScope T2V", "Anime", "Timelapse misty mountain landscape"],
235
+ [
236
+ "ModelScope T2V",
237
+ "WebVid",
238
+ "Back of woman in shorts going near pure creek in beautiful mountains.",
239
+ ],
240
+ [
241
+ "ModelScope T2V",
242
+ "3D Cartoon",
243
+ "A rotating pandoro (a traditional italian sweet yeast bread, most popular around christmas and new year) being eaten in time-lapse.",
244
+ ],
245
+ [
246
+ "ModelScope T2V",
247
+ "Realistic",
248
+ "Slow motion avocado with a stone falls and breaks into 2 parts with splashes",
249
+ ],
250
+ [
251
+ "AnimateDiff (SD1.5)",
252
+ "LAION-aes",
253
+ "Slow motion of delicious salmon sachimi set with green vegetables leaves served on wood plate. make homemade japanese food at home.-dan",
254
+ ],
255
+ [
256
+ "AnimateDiff (SD1.5)",
257
+ "WebVid",
258
+ "Blooming meadow panorama zoom-out shot heavenly clouds and upcoming thunderstorm in mountain range harz, germany.",
259
+ ],
260
+ [
261
+ "AnimateDiff (RealisticVision)",
262
+ "LAION-aes",
263
+ "A young woman in a yellow sweater uses vr glasses, sitting on the shore of a pond on a background of dark waves. a strong wind develops her hair, the sun's rays are reflected from the water.",
264
+ ],
265
+ [
266
+ "AnimateDiff (epiCRealism)",
267
+ "LAION-aes",
268
+ "Female running at sunset. healthy fitness concept",
269
+ ],
270
  ]
271
 
272
+ css = """
273
  #col-container {
274
  margin: 0 auto;
275
  }
276
  """
277
 
278
+ variants = {
279
+ "ModelScope T2V": ["WebVid", "LAION-aes", "Anime", "Realistic", "3D Cartoon"],
280
+ "AnimateDiff (SD1.5)": ["WebVid", "LAION-aes"],
281
+ "AnimateDiff (RealisticVision)": ["WebVid", "LAION-aes"],
282
+ "AnimateDiff (epiCRealism)": ["WebVid", "LAION-aes"],
283
+ }
284
+
285
+
286
+ def update_variant(rs):
287
+ return gr.update(choices=variants[rs], value=None)
288
+
289
 
290
  with gr.Blocks(css=css) as demo:
291
+
292
  with gr.Column(elem_id="col-container"):
293
+ gr.HTML(
294
+ """
295
+ <div style="text-align: center; margin-bottom: 20px;">
296
+ <h1 align="center">
297
+ <a href="https://yhzhai.github.io/mcm/"><b>Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation</b></a>
298
+ </h1>
299
+ <h4>Our motion consistency model not only accelerates text2video diffusion model sampling process, but also can benefit from an additional high-quality image dataset to improve the frame quality of generated videos.</h4>
300
+ <div style="display: flex; justify-content: center; align-items: center; text-align: center;">
301
+ <a href='https://yhzhai.github.io/mcm/'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
302
+ <a href='https://arxiv.org/abs/2406.06890'><img src='https://img.shields.io/badge/Paper-arXiv-red'></a>
303
+ <a href='https://huggingface.co/yhzhai/mcm'><img src='https://img.shields.io/badge/HF-checkpoint-yellow'></a>
304
+ </div>
305
+ </div>
306
+ """
307
+ )
308
+
309
+ with gr.Row():
310
+ base_model = gr.Dropdown(
311
+ label="Base model",
312
+ choices=[
313
+ "ModelScope T2V",
314
+ "AnimateDiff (SD1.5)",
315
+ "AnimateDiff (RealisticVision)",
316
+ "AnimateDiff (epiCRealism)",
317
+ ],
318
+ value="ModelScope T2V",
319
+ interactive=True,
320
+ )
321
+ variant_dropdown = gr.Dropdown(
322
+ variants["ModelScope T2V"],
323
+ label="MCM Variant",
324
+ interactive=True,
325
+ value=None,
326
+ )
327
+ base_model.change(
328
+ update_variant, inputs=[base_model], outputs=[variant_dropdown]
329
+ )
330
+
331
  with gr.Row():
332
  prompt = gr.Text(
333
  label="Prompt",
334
  show_label=False,
 
336
  placeholder="Enter your prompt",
337
  container=False,
338
  )
339
+
340
  run_button = gr.Button("Run", scale=0)
341
 
342
+ with gr.Row():
343
+ with gr.Column():
344
+ with gr.Accordion("Advanced Settings", open=True):
345
+
346
+ seed = gr.Slider(
347
+ label="Seed",
348
+ minimum=0,
349
+ maximum=MAX_SEED,
350
+ step=1,
351
+ value=0,
352
+ )
353
+
354
+ randomize_seed = gr.Checkbox(label="Randomize seed", value=True)
355
+
356
+ with gr.Row():
357
+ num_inference_steps = gr.Slider(
358
+ label="Number of inference steps",
359
+ minimum=1,
360
+ maximum=16,
361
+ step=1,
362
+ value=4,
363
+ )
364
+
365
+ with gr.Column():
366
+ # result = gr.Video(label="Result", show_label=False, interactive=False, height=512, width=512, autoplay=True)
367
+ result = gr.Video(
368
+ label="Result", show_label=False, interactive=False, autoplay=True
369
  )
370
+
371
  gr.Examples(
372
+ examples=examples,
373
+ inputs=[base_model, variant_dropdown, prompt],
374
+ cache_examples=True,
375
+ fn=infer,
376
+ outputs=[result],
377
  )
378
 
379
  run_button.click(
380
+ fn=infer,
381
+ inputs=[
382
+ base_model,
383
+ variant_dropdown,
384
+ prompt,
385
+ seed,
386
+ randomize_seed,
387
+ num_inference_steps,
388
+ ],
389
+ outputs=[result],
390
  )
391
 
392
+ demo.queue().launch()
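Review note: the `if`/`elif` chain that maps an MCM variant to a checkpoint subfolder, and the single-entry `cache_pipeline` dict, could be expressed more compactly. The sketch below is one possible arrangement, not the code in this commit: `SUBFOLDERS`, `resolve_subfolder`, and `PipelineCache` are hypothetical names, and the loader is passed in as a plain callable (an assumption made here) so the example runs without downloading any model.

```python
from typing import Any, Callable, Optional, Tuple

# Variant -> subfolder table for ModelScope T2V, copied from this diff.
SUBFOLDERS = {
    "WebVid": "modelscopet2v-webvid",
    "LAION-aes": "modelscopet2v-laion",
    "Anime": "modelscopet2v-anime",
    "Realistic": "modelscopet2v-real",
    "3D Cartoon": "modelscopet2v-3d-cartoon",
}


def resolve_subfolder(variant: Optional[str]) -> str:
    """Unknown or missing variants fall back to the LAION checkpoint, as in the diff."""
    return SUBFOLDERS.get(variant, "modelscopet2v-laion")


class PipelineCache:
    """Keeps only the most recently built pipeline, like cache_pipeline in app.py."""

    def __init__(self, loader: Callable[[str, Optional[str]], Any]):
        self._loader = loader
        self._key: Optional[Tuple[str, Optional[str]]] = None
        self._pipeline: Any = None

    def get(self, base_model: str, variant: Optional[str]) -> Any:
        key = (base_model, variant)
        if key != self._key:
            # Build first, then store, so a failed load leaves the old entry intact.
            self._pipeline = self._loader(base_model, variant)
            self._key = key
        return self._pipeline


# Stand-in loader (hypothetical) that records calls instead of loading weights.
calls = []


def fake_loader(base_model, variant):
    calls.append((base_model, variant))
    return f"{base_model}/{resolve_subfolder(variant)}"


cache = PipelineCache(fake_loader)
first = cache.get("ModelScope T2V", "WebVid")
second = cache.get("ModelScope T2V", "WebVid")  # same key, served from the cache
third = cache.get("ModelScope T2V", "Anime")    # different key, so the loader runs again
```

As in the committed code, only one pipeline is held at a time, so switching base model or variant triggers a fresh load rather than growing memory use.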
requirements.txt CHANGED
@@ -1,6 +1,87 @@
1
- accelerate
2
- diffusers
3
- invisible_watermark
4
- torch
5
- transformers
6
- xformers
 
1
+ # --extra-index-url https://download.pytorch.org/whl/cu118
2
+ # torch==2.1.2
3
+ torchvision==0.16.2
4
+ git+https://github.com/yhZhai/diffusers.git
5
+ transformers==4.36.2
6
+ wandb
7
+ matplotlib
8
+ torchmetrics==1.3.1
9
+ torch-fidelity==0.3.0
10
+ einops
11
+ azure-storage-blob==12.12.0
12
+ tensorboard
13
+ tensorboardX
14
+ ffmpeg-python
15
+ opencv-python
16
+ timm
17
+ ftfy
18
+ rouge_score
19
+ omegaconf
20
+ decord
21
+ colorlog
22
+ deepdish
23
+ configobj
24
+ json_lines
25
+ albumentations
26
+ pudb
27
+ imageio
28
+ imageio-ffmpeg
29
+ pytorch-lightning
30
+ omegaconf
31
+ test-tube
32
+ streamlit
33
+ setuptools
34
+ kornia
35
+ clean-fid
36
+ pytorch-fid
37
+ h5py
38
+ lpips
39
+ tabulate
40
+ ninja
41
+ matplotlib
42
+ webdataset
43
+ braceexpand
44
+ Pillow
45
+ accelerate==0.29.3
46
+ compel==0.1.8
47
+ datasets
48
+ filelock
49
+ flax>=0.4.1
50
+ hf-doc-builder>=0.3.0
51
+ huggingface-hub>=0.20.2
52
+ requests-mock==1.10.0
53
+ importlib_metadata
54
+ invisible-watermark>=0.2.0
55
+ isort>=5.5.4
56
+ jax>=0.4.1
57
+ jaxlib>=0.4.1
58
+ Jinja2
59
+ k-diffusion>=0.0.12
60
+ torchsde
61
+ note_seq
62
+ librosa
63
+ numpy
64
+ parameterized
65
+ git+https://github.com/yhZhai/peft.git
66
+ protobuf==3.20.3
67
+ pytest
68
+ pytest-timeout
69
+ pytest-xdist
70
+ ruff==0.1.5
71
+ safetensors>=0.3.1
72
+ sentencepiece>=0.1.91,!=0.1.92
73
+ GitPython<3.1.19
74
+ # scipy==1.11.1
75
+ onnx
76
+ regex!=2019.12.17
77
+ requests
78
+ bitsandbytes
79
+ git+https://github.com/microsoft/azfuse.git
80
+ deepspeed==0.11.2
81
+ # deepspeed==0.6.6
82
+ albumentations
83
+ mlflow
84
+ moviepy
85
+ git+https://github.com/openai/CLIP.git
86
+ av
87
+ git+https://github.com/yhZhai/open_clip.git
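Review note: the new requirements.txt lists `omegaconf`, `matplotlib`, and `albumentations` twice each. A small script along the following lines could flag such repeats. It is only a sketch: `duplicate_requirements` is a hypothetical helper, the name normalization is deliberately simplistic (lowercase, underscores to hyphens), and the sample list is a short excerpt rather than the full file.

```python
import re
from collections import Counter


def duplicate_requirements(lines):
    """Return package names that appear more than once (case-insensitive)."""
    names = []
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, pip options, and VCS URLs.
        if not line or line.startswith(("#", "-", "git+")):
            continue
        # The name is everything before the first version specifier, extra, or marker.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        names.append(name.lower().replace("_", "-"))
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Excerpt of entries taken from the requirements.txt diff above.
sample = [
    "matplotlib",
    "omegaconf",
    "albumentations",
    "omegaconf",
    "matplotlib",
    "accelerate==0.29.3",
    "# deepspeed==0.6.6",
    "git+https://github.com/openai/CLIP.git",
    "albumentations",
]
print(duplicate_requirements(sample))
```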