MeYourHint committed
Commit 55262a8 · 1 Parent(s): 599b6c9

sdk version check

Files changed (1)
  1. README.md +2 -224
README.md CHANGED
@@ -4,229 +4,7 @@ emoji: 🎭
  colorFrom: pink
  colorTo: purple
  sdk: gradio
- sdk_version: 3.48.0
  app_file: app.py
  pinned: True
- ---
-
- # MoMask: Generative Masked Modeling of 3D Human Motions
- ## [[Project Page]](https://ericguo5513.github.io/momask) [[Paper]](https://arxiv.org/abs/2312.00063)
- ![teaser_image](https://ericguo5513.github.io/momask/static/images/teaser.png)
-
- If you find our code or paper helpful, please consider citing:
- ```
- @article{guo2023momask,
-   title={MoMask: Generative Masked Modeling of 3D Human Motions},
-   author={Chuan Guo and Yuxuan Mu and Muhammad Gohar Javed and Sen Wang and Li Cheng},
-   year={2023},
-   eprint={2312.00063},
-   archivePrefix={arXiv},
-   primaryClass={cs.CV}
- }
- ```
-
- ## :postbox: News
- 📢 **2023-12-19** --- Release scripts for temporal inpainting.
-
- 📢 **2023-12-15** --- Release code and models for MoMask, including training/eval/generation scripts.
-
- 📢 **2023-11-29** --- Initialized the webpage and git project.
-
-
- ## :round_pushpin: Get You Ready
-
- <details>
-
- ### 1. Conda Environment
- ```
- conda env create -f environment.yml
- conda activate momask
- pip install git+https://github.com/openai/CLIP.git
- ```
- We tested our code on Python 3.7.13 and PyTorch 1.7.1.
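A quick sanity check of the environment, as a minimal sketch; it only assumes the `momask` environment created above is active, and newer library versions may still work but are untested here:

```python
# Confirm the interpreter and library versions roughly match the tested setup
# (Python 3.7.13, PyTorch 1.7.1).
import sys
import torch
import clip  # installed via pip from the OpenAI CLIP repository above

print("Python :", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```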
-
-
- ### 2. Models and Dependencies
-
- #### Download Pre-trained Models
- ```
- bash prepare/download_models.sh
- ```
-
- #### Download Evaluation Models and GloVe
- For evaluation only.
- ```
- bash prepare/download_evaluator.sh
- bash prepare/download_glove.sh
- ```
-
- #### Troubleshooting
- If gdown fails with "Cannot retrieve the public link of the file. You may need to change the permission to 'Anyone with the link', or have had many accesses", a potential fix is to run `pip install --upgrade --no-cache-dir gdown`, as suggested in https://github.com/wkentaro/gdown/issues/43.
-
- #### (Optional) Download Manually
- Visit [[Google Drive]](https://drive.google.com/drive/folders/1b3GnAbERH8jAoO5mdWgZhyxHB73n23sK?usp=drive_link) to download the models and evaluators manually.
-
- ### 3. Get Data
-
- You have two options here:
- * **Skip getting data**, if you just want to generate motions using your *own* descriptions.
- * **Get full data**, if you want to *re-train* and *evaluate* the model.
-
- **(a) Full data (text + motion)**
-
- **HumanML3D** - Follow the instructions in [HumanML3D](https://github.com/EricGuo5513/HumanML3D.git), then copy the resulting dataset to our repository:
- ```
- cp -r ../HumanML3D/HumanML3D ./dataset/HumanML3D
- ```
- **KIT** - Download from [HumanML3D](https://github.com/EricGuo5513/HumanML3D.git), then place the result in `./dataset/KIT-ML`.
-
- </details>
-
- ## :rocket: Demo
- <details>
-
- ### (a) Generate from a single prompt
- ```
- python gen_t2m.py --gpu_id 1 --ext exp1 --text_prompt "A person is running on a treadmill."
- ```
- ### (b) Generate from a prompt file
- An example prompt file is given in `./assets/text_prompt.txt`. Each line follows the format `<text description>#<motion length>`. The motion length is the number of poses; it must be an integer and will be rounded to a multiple of 4. In our work, motion is at 20 fps.
-
- If you write `<text description>#NA`, our model will determine the length. Note that once **one** line uses NA, all the others will be treated as **NA** automatically.
-
- ```
- python gen_t2m.py --gpu_id 1 --ext exp2 --text_path ./assets/text_prompt.txt
- ```
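For reference, a minimal sketch of writing such a prompt file from Python; the file name `my_prompts.txt` and the prompts themselves are illustrative, not part of the repository:

```python
# Write a prompt file in the <text description>#<motion length> format described above.
# Lengths count poses at 20 fps; "NA" instead of a number lets the model pick the
# length, and mixing NA with explicit lengths makes every prompt fall back to NA.
prompts = [
    ("A person is running on a treadmill.", 120),        # 120 poses = 6 seconds at 20 fps
    ("A person walks forward and then sits down.", 160),
]

with open("my_prompts.txt", "w") as f:
    for text, length in prompts:
        f.write(f"{text}#{length}\n")
```

The resulting file can then be passed via `--text_path`, as in the command above.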
-
-
- A few more parameters you may be interested in:
- * `--repeat_times`: number of replications for generation, default `1`.
- * `--motion_length`: specify the number of poses for generation, only applicable in (a).
-
- The output files are stored under the folder `./generation/<ext>/`. They are:
- * `numpy files`: generated motions with shape (nframe, 22, 3), under subfolder `./joints`.
- * `video files`: stick-figure animations in mp4 format, under subfolder `./animation`.
- * `bvh files`: bvh files of the generated motions, under subfolder `./animation`.
-
- We also apply naive foot IK to the generated motions; see files with the suffix `_ik`. It sometimes works well, but sometimes fails.
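To inspect these outputs programmatically, a minimal sketch; it assumes a previous run with `--ext exp1` and does not rely on any particular file name under `./joints`:

```python
# Load one generated motion from ./generation/<ext>/joints and check its shape.
import glob
import numpy as np

files = sorted(glob.glob("./generation/exp1/joints/**/*.npy", recursive=True))
joints = np.load(files[0])
print(files[0], joints.shape)  # expected (nframe, 22, 3): 22 joints in 3D per frame
```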
-
- </details>
-
- ## :dancers: Visualization
- <details>
-
- All the animations are manually rendered in Blender. We use characters from [Mixamo](https://www.mixamo.com/#/). You need to download the characters in T-pose with a skeleton.
-
- ### Retargeting
- For retargeting, we found that Rokoko usually leads to large errors on the feet. On the other hand, [keemap.rig.transfer](https://github.com/nkeeline/Keemap-Blender-Rig-ReTargeting-Addon/releases) gives more precise retargeting. You can watch the [tutorial](https://www.youtube.com/watch?v=EG-VCMkVpxg) here.
-
- Follow these steps:
- * Download keemap.rig.transfer from GitHub and install it in Blender.
- * Import both the motion files (.bvh) and the character files (.fbx) in Blender.
- * `Shift + Select` both the source and target skeletons. (They do not need to be in Rest Position.)
- * Switch to `Pose Mode`, then unfold the `KeeMapRig` tool at the top-right corner of the view window.
- * Load and read the bone mapping file `./assets/mapping.json` (or `mapping6.json` if it doesn't work). This file is manually made by us. It works for most characters in Mixamo. You can make your own.
- * Adjust `Number of Samples`, `Source Rig`, and `Destination Rig Name`.
- * Click `Transfer Animation from Source Destination` and wait a few seconds.
-
- We have not tried other retargeting tools; feel free to comment if you find one that works better.
-
- ### Scene
-
- We use this [scene](https://drive.google.com/file/d/1lg62nugD7RTAIz0Q_YP2iZsxpUzzOkT1/view?usp=sharing) for animation.
-
-
- </details>
-
- ## :clapper: Temporal Inpainting
- <details>
- We conduct mask-based editing in the masked-transformer stage, followed by regeneration of the residual tokens for the entire sequence. To load your own motion, provide its path via `--source_motion`. Use `-msec` to specify the masked section, given either as ratios or as frame indices. For instance, `-msec 0.3,0.6` with `max_motion_length=196` is equivalent to `-msec 59,118`, i.e. editing the frame section [59, 118].
-
- ```
- python edit_t2m.py --gpu_id 1 --ext exp3 --use_res_model -msec 0.4,0.7 --text_prompt "A man picks something from the ground using his right hand."
- ```
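For intuition, a small sketch of the ratio-to-frame arithmetic described above, assuming the ratios are scaled by `max_motion_length` and rounded to the nearest frame, which reproduces the `59,118` example:

```python
# Convert a ratio-style mask section into frame indices.
max_motion_length = 196
start_ratio, end_ratio = 0.3, 0.6

start_frame = round(start_ratio * max_motion_length)  # 59
end_frame = round(end_ratio * max_motion_length)      # 118
print(start_frame, end_frame)  # matches the -msec 59,118 example above
```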
-
- Note: presently, the source motion must be in the HumanML3D dim-263 feature-vector format. An example motion vector from the HumanML3D test set is available in `example_data/000612.npy`. To process your own motion data, you can use the `process_file` function from `utils/motion_process.py`.
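A quick way to check that a source motion is in the expected format is to compare it against the bundled example, as in this minimal sketch; it only assumes the dim-263 shape convention stated above:

```python
# Inspect the example HumanML3D feature vector shipped with the repository.
import numpy as np

motion = np.load("example_data/000612.npy")
print(motion.shape)  # expected (nframes, 263): the HumanML3D dim-263 feature format
```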
-
- </details>
-
- ## :space_invader: Train Your Own Models
- <details>
-
-
- **Note**: You have to train the RVQ **BEFORE** training the masked/residual transformers. The latter two can be trained simultaneously.
-
- ### Train RVQ
- ```
- python train_vq.py --name rvq_name --gpu_id 1 --dataset_name t2m --batch_size 512 --num_quantizers 6 --max_epoch 500 --quantize_drop_prob 0.2
- ```
-
- ### Train Masked Transformer
- ```
- python train_t2m_transformer.py --name mtrans_name --gpu_id 2 --dataset_name t2m --batch_size 64 --vq_name rvq_name
- ```
-
- ### Train Residual Transformer
- ```
- python train_res_transformer.py --name rtrans_name --gpu_id 2 --dataset_name t2m --batch_size 64 --vq_name rvq_name --cond_drop_prob 0.2 --share_weight
- ```
-
- * `--dataset_name`: motion dataset, `t2m` for HumanML3D and `kit` for KIT-ML.
- * `--name`: name of your model. This creates the model folder `./checkpoints/<dataset_name>/<name>`.
- * `--gpu_id`: GPU id.
- * `--batch_size`: we use `512` for RVQ training. For the masked/residual transformers, we use `64` on HumanML3D and `16` on KIT-ML.
- * `--num_quantizers`: number of quantization layers; `6` is used in our case.
- * `--quantize_drop_prob`: quantization dropout ratio; `0.2` is used.
- * `--vq_name`: when training the masked/residual transformers, you need to specify the name of the RVQ model used for tokenization.
- * `--cond_drop_prob`: condition drop ratio for classifier-free guidance; `0.2` is used.
- * `--share_weight`: whether to share the projection/embedding weights in the residual transformer.
-
- All the pre-trained models and intermediate results will be saved in `./checkpoints/<dataset_name>/<name>`.
- </details>
-
- ## :book: Evaluation
- <details>
-
- ### Evaluate RVQ Reconstruction:
- HumanML3D:
- ```
- python eval_t2m_vq.py --gpu_id 0 --name rvq_nq6_dc512_nc512_noshare_qdp0.2 --dataset_name t2m --ext rvq_nq6
- ```
- KIT-ML:
- ```
- python eval_t2m_vq.py --gpu_id 0 --name rvq_nq6_dc512_nc512_noshare_qdp0.2_k --dataset_name kit --ext rvq_nq6
- ```
-
- ### Evaluate Text2motion Generation:
- HumanML3D:
- ```
- python eval_t2m_trans_res.py --res_name tres_nlayer8_ld384_ff1024_rvq6ns_cdp0.2_sw --dataset_name t2m --name t2m_nlayer8_nhead6_ld384_ff1024_cdp0.1_rvq6ns --gpu_id 1 --cond_scale 4 --time_steps 10 --ext evaluation
- ```
- KIT-ML:
- ```
- python eval_t2m_trans_res.py --res_name tres_nlayer8_ld384_ff1024_rvq6ns_cdp0.2_sw_k --dataset_name kit --name t2m_nlayer8_nhead6_ld384_ff1024_cdp0.1_rvq6ns_k --gpu_id 0 --cond_scale 2 --time_steps 10 --ext evaluation
- ```
-
- * `--res_name`: model name of the `residual transformer`.
- * `--name`: model name of the `masked transformer`.
- * `--cond_scale`: scale of classifier-free guidance.
- * `--time_steps`: number of iterations for inference.
- * `--ext`: filename for saving the evaluation results.
-
- The final evaluation results will be saved in `./checkpoints/<dataset_name>/<name>/eval/<ext>.log`.
-
- </details>
-
- ## Acknowledgements
-
- We sincerely thank the authors of the following open-source works, on which our code is based:
-
- [deep-motion-editing](https://github.com/DeepMotionEditing/deep-motion-editing), [Muse](https://github.com/lucidrains/muse-maskgit-pytorch), [vector-quantize-pytorch](https://github.com/lucidrains/vector-quantize-pytorch), [T2M-GPT](https://github.com/Mael-zys/T2M-GPT), [MDM](https://github.com/GuyTevet/motion-diffusion-model/tree/main), and [MLD](https://github.com/ChenFengYe/motion-latent-diffusion/tree/main)
-
- ## License
- This code is distributed under an [MIT LICENSE](https://github.com/EricGuo5513/momask-codes/tree/main?tab=MIT-1-ov-file#readme).
-
- Note that our code depends on other libraries, including SMPL, SMPL-X, and PyTorch3D, and uses datasets that each have their own licenses that must also be followed.
 
  colorFrom: pink
  colorTo: purple
  sdk: gradio
+ sdk_version: 3.24.1
  app_file: app.py
  pinned: True
+ ---