---
license: cdla-permissive-2.0
datasets:
- microsoft/mocapact-data
---

# MoCapAct Model Zoo

Control of simulated humanoid characters is a challenging benchmark for sequential decision-making methods, as it assesses a policy's ability to drive an inherently unstable, discontinuous, and high-dimensional physical system. Motion capture (MoCap) data can be very helpful in learning sophisticated locomotion policies by teaching a humanoid agent low-level skills (e.g., standing, walking, and running) that can then be used to generate high-level behaviors. However, even with MoCap data, controlling simulated humanoids remains difficult, because this data offers only kinematic information. Finding physical control inputs to realize the MoCap-demonstrated motions has required methods like reinforcement learning that need large amounts of compute, which has effectively served as a barrier to entry for this exciting research direction.

In an effort to broaden participation and facilitate evaluation of ideas in humanoid locomotion research, we are releasing MoCapAct (Motion Capture with Actions), a library of high-quality pre-trained agents that can track over three hours of MoCap data for a simulated humanoid in the `dm_control` physics-based environment, along with rollouts from these experts containing proprioceptive observations and actions. MoCapAct allows researchers to sidestep the computationally intensive task of training low-level control policies from MoCap data and instead use MoCapAct's expert agents and demonstrations for learning advanced locomotion behaviors. It also allows improving on our low-level policies by using them and their demonstration data as a starting point.

In our work, we use MoCapAct to train a single hierarchical policy capable of tracking the entire MoCap dataset within `dm_control`. We then re-use the learned low-level component to efficiently learn other high-level tasks. Finally, we use MoCapAct to train an autoregressive GPT model and show that it can perform natural motion completion given a motion prompt.

We encourage the reader to visit our [project website](https://microsoft.github.io/MoCapAct/) to see videos of our results and to get links to our paper and code.

## Model Zoo Structure

The file structure of the model zoo is:
```
├── all
│   └── experts
│       ├── experts_1.tar.gz
│       ├── experts_2.tar.gz
│       ...
│       └── experts_8.tar.gz
│
├── sample
│   └── experts.tar.gz
│
├── multiclip_policy.tar.gz
│   ├── full_dataset
│   └── locomotion_dataset
│
├── transfer.tar.gz
│   ├── go_to_target
│   │   ├── general_low_level
│   │   ├── locomotion_low_level
│   │   └── no_low_level
│   │
│   └── velocity_control
│       ├── general_low_level
│       ├── locomotion_low_level
│       └── no_low_level
│
├── gpt.ckpt
│
└── videos
    ├── full_clip_videos.tar.gz
    └── snippet_videos.tar.gz
```

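If you only need part of the model zoo, one option is the `huggingface_hub` Python library. The snippet below is a sketch, assuming the files are hosted in the `microsoft/mocapact-data` dataset repository listed in the metadata above; adjust `allow_patterns` to the portions you need.

```python
# Sketch: download selected model zoo files with huggingface_hub.
# Assumes the files live in the microsoft/mocapact-data dataset repository.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="microsoft/mocapact-data",
    repo_type="dataset",
    allow_patterns=["sample/*", "gpt.ckpt"],  # e.g., the sample experts and the GPT checkpoint
)
print(local_dir)
```
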
## Experts Tarball Files

The expert tarball files have the following structure:

- `all/experts/experts_*.tar.gz`: Contains all of the clip snippet experts. Due to file size limitations, we split the experts among multiple tarball files.
- `sample/experts.tar.gz`: Contains the clip snippet experts used to run the examples on the [dataset website](https://microsoft.github.io/MoCapAct/).

The expert structure is detailed in Appendix A.1 of the paper as well as at https://github.com/microsoft/MoCapAct#description.

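Before an expert can be loaded, the corresponding tarball(s) must be extracted. A minimal sketch using Python's standard library, assuming the tarballs were downloaded into an `experts_tarballs` directory (the directory names are only examples):

```python
# Sketch: unpack the downloaded expert tarballs into an "experts" directory.
import glob
import tarfile

for path in sorted(glob.glob("experts_tarballs/experts*.tar.gz")):
    with tarfile.open(path, "r:gz") as tar:
        tar.extractall("experts")
```
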
An expert can be loaded and rolled out in Python as in the following example:
```python
from mocapact import observables
from mocapact.sb3 import utils

# Load the expert for the given clip snippet.
expert_path = "/path/to/experts/CMU_083_33/CMU_083_33-0-194/eval_rsi/model"
expert = utils.load_policy(expert_path, observables.TIME_INDEX_OBSERVABLES)

from mocapact.envs import tracking
from dm_control.locomotion.tasks.reference_pose import types

# Roll the expert out in the clip-tracking environment and print the per-step rewards.
dataset = types.ClipCollection(ids=['CMU_083_33'], start_steps=[0], end_steps=[194])
env = tracking.MocapTrackingGymEnv(dataset)
obs, done = env.reset(), False
while not done:
    action, _ = expert.predict(obs, deterministic=True)
    obs, rew, done, _ = env.step(action)
    print(rew)
```

Alternatively, an expert can be rolled out from the command line:
```bash
python -m mocapact.clip_expert.evaluate \
    --policy_root /path/to/experts/CMU_016_22/CMU_016_22-0-82/eval_rsi/model \
    --act_noise 0 \
    --ghost_offset 1 \
    --always_init_at_clip_start
```

## GPT

The GPT policy is contained in `gpt.ckpt` and can be loaded using PyTorch Lightning:
```python
from mocapact.distillation import model
policy = model.GPTPolicy.load_from_checkpoint('/path/to/gpt.ckpt', map_location='cpu')
```

This policy can be used with `mocapact/distillation/motion_completion.py`, as in the following example:
```bash
python -m mocapact.distillation.motion_completion \
    --policy_path /path/to/gpt.ckpt \
    --nodeterministic \
    --ghost_offset 1 \
    --expert_root /path/to/experts/CMU_016_25 \
    --max_steps 500 \
    --always_init_at_clip_start \
    --prompt_length 32 \
    --min_steps 32 \
    --device cuda \
    --clip_snippet CMU_016_25
```

## Multi-Clip Policy

The `multiclip_policy.tar.gz` file contains two policies:

- `full_dataset`: Trained on the entire MoCapAct dataset
- `locomotion_dataset`: Trained on the `locomotion_small` portion of the MoCapAct dataset

Taking `full_dataset` as an example, a multi-clip policy can be loaded using PyTorch Lightning:
```python
from mocapact.distillation import model
policy = model.NpmpPolicy.load_from_checkpoint('/path/to/multiclip_policy/full_dataset/model/model.ckpt', map_location='cpu')
```

The policy can be used with `mocapact/distillation/evaluate.py`, as in the following example:
```bash
python -m mocapact.distillation.evaluate \
    --policy_path /path/to/multiclip_policy/full_dataset/model/model.ckpt \
    --act_noise 0 \
    --ghost_offset 1 \
    --always_init_at_clip_start \
    --termination_error_threshold 10 \
    --clip_snippets CMU_016_22
```

## Transfer

The `transfer.tar.gz` file contains policies for downstream tasks. The main difference between the contained folders is which low-level policy is used:

- `general_low_level`: Low-level policy comes from `multiclip_policy/full_dataset`
- `locomotion_low_level`: Low-level policy comes from `multiclip_policy/locomotion_dataset`
- `no_low_level`: No low-level policy used

The policy structure is as follows:
```
├── best_model.zip
├── low_level_policy.ckpt
└── vecnormalize.pkl
```
The `low_level_policy.ckpt` file (only present in `general_low_level` and `locomotion_low_level`) contains the low-level policy and is loaded with PyTorch Lightning. The `best_model.zip` file contains the task policy parameters, and the `vecnormalize.pkl` file contains the observation normalizer; these latter two files are loaded with Stable-Baselines3.

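For reference, these files can also be loaded directly in Python. The snippet below is only a sketch: the paths are placeholders, the task policy is assumed to have been trained with Stable-Baselines3's PPO (swap in the appropriate algorithm class otherwise), and the low-level checkpoint is assumed to load with the same `NpmpPolicy` interface used for the multi-clip policy above.

```python
# Minimal sketch of loading the transfer-policy files (paths are placeholders).
import pickle

from stable_baselines3 import PPO

from mocapact.distillation import model

root = "/path/to/transfer/go_to_target/general_low_level"

# Task policy parameters, saved by Stable-Baselines3 (PPO assumed here).
task_policy = PPO.load(f"{root}/best_model.zip")

# Observation normalizer: a pickled Stable-Baselines3 VecNormalize object.
with open(f"{root}/vecnormalize.pkl", "rb") as f:
    obs_normalizer = pickle.load(f)

# Low-level policy, assumed to load like the multi-clip policy shown above.
low_level_policy = model.NpmpPolicy.load_from_checkpoint(
    f"{root}/low_level_policy.ckpt", map_location="cpu"
)
```
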
The policy can be used with `mocapact/transfer/evaluate.py`, as in the following example:
```bash
python -m mocapact.transfer.evaluate \
    --model_root /path/to/transfer/go_to_target/general_low_level \
    --task /path/to/mocapact/transfer/config.py:go_to_target
```

## MoCap Videos

There are two tarball files containing videos of the MoCap clips in the dataset:

- `full_clip_videos.tar.gz` contains videos of the full MoCap clips.
- `snippet_videos.tar.gz` contains videos of the snippets that were used to train the experts.

Note that these videos are playbacks of the clips themselves, not rollouts of the corresponding experts.