diff --git a/INSTALL.md b/INSTALL.md new file mode 100644 index 0000000000000000000000000000000000000000..3a5f788467b61d0f0d96f9731e47cfbfe852f17b --- /dev/null +++ b/INSTALL.md @@ -0,0 +1,46 @@ +### Set up the Python environment + +``` +conda create -n neuralbody python=3.7 +conda activate neuralbody + +# make sure that the PyTorch CUDA version is consistent with the system CUDA +# e.g., if your system CUDA is 10.0, install torch 1.4 built with CUDA 10.0 +pip install torch==1.4.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html + +pip install -r requirements.txt + +# install spconv +cd +git clone https://github.com/traveller59/spconv --recursive +cd spconv +git checkout abf0acf30f5526ea93e687e3f424f62d9cd8313a +git submodule update --init --recursive +export CUDA_HOME="/usr/local/cuda-10.0" +python setup.py bdist_wheel +cd dist +pip install spconv-1.2.1-cp37-cp37m-linux_x86_64.whl  # the wheel name matches your Python version (cp37 for the Python 3.7 environment above) +``` + +### Set up datasets + +#### People-Snapshot dataset + +1. Download the People-Snapshot dataset [here](https://graphics.tu-bs.de/people-snapshot). +2. Process the People-Snapshot dataset using the [script](https://github.com/zju3dv/neuralbody#process-people-snapshot). +3. Create a soft link: + ``` + ROOT=/path/to/neuralbody + cd $ROOT/data + ln -s /path/to/people_snapshot people_snapshot + ``` + +#### ZJU-MoCap dataset + +1. To download the ZJU-MoCap dataset, please fill in the [agreement](https://zjueducn-my.sharepoint.com/:b:/g/personal/pengsida_zju_edu_cn/EUPiybrcFeNEhdQROx4-LNEBm4lzLxDwkk1SBcNWFgeplA?e=BGDiQh), then email me (pengsida@zju.edu.cn) and cc Xiaowei Zhou (xwzhou@zju.edu.cn) to request the download link. +2. Create a soft link: + ``` + ROOT=/path/to/neuralbody + cd $ROOT/data + ln -s /path/to/zju_mocap zju_mocap + ``` diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..cc617cef060665082ecc839a47bd6a9b1c5c7688 --- /dev/null +++ b/LICENSE @@ -0,0 +1,18 @@ +//////////////////////////////////////////////////////////////////////////// +// Copyright 2020-2021 the 3D Vision Group at the State Key Lab of CAD&CG, +// Zhejiang University. All Rights Reserved. +// +// For more information see +// If you use this code, please cite the corresponding publications as +// listed on the above website. +// +// Permission to use, copy, modify and distribute this software and its +// documentation for educational, research and non-profit purposes only. +// Any modification based on this work must be open source and prohibited +// for commercial use. +// You must retain, in the source form of any derivative works that you +// distribute, all copyright, patent, trademark, and attribution notices +// from the source form of this work.
+// +// +//////////////////////////////////////////////////////////////////////////// diff --git a/README.md b/README.md index 2879e1c4ae7e2aff42b50d10efaa9c2772152728..2af28cfd822d9327f9e2a13917eff721cd80cfff 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,202 @@ ---- -title: NeuralBody -emoji: 📚 -colorFrom: yellow -colorTo: indigo -sdk: gradio -sdk_version: 3.0.10 -app_file: app.py -pinned: false ---- - -Check out the configuration reference at https://huggingface.co/docs/hub/spaces#reference +**News** + +* `05/17/2021` To make the comparison on ZJU-MoCap easier, we have saved the quantitative and qualitative results of other methods [here](https://github.com/zju3dv/neuralbody/blob/master/supplementary_material.md#results-of-other-methods-on-zju-mocap), including Neural Volumes, Multi-view Neural Human Rendering, and Deferred Neural Human Rendering. +* `05/13/2021` To make it easier for follow-up works to compare with our model, we have saved our rendering results on ZJU-MoCap [here](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/Ea3VOUy204VAiVJ-V-OGd9YBxdhbtfpS-U6icD_rDq0mUQ?e=cAcylK) and written a [document](supplementary_material.md) that describes the training and test protocols. +* `05/12/2021` The code now supports testing and visualization on unseen human poses. +* `05/12/2021` We have updated the ZJU-MoCap dataset with better-fitted SMPL parameters obtained with [EasyMocap](https://github.com/zju3dv/EasyMocap). We have also released a [website](https://zju3dv.github.io/zju_mocap/) for visualization. Please see [here](https://github.com/zju3dv/neuralbody#potential-problems-of-provided-smpl-parameters) for the usage of the provided SMPL parameters. + +# Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans +### [Project Page](https://zju3dv.github.io/neuralbody) | [Video](https://www.youtube.com/watch?v=BPCAMeBCE-8) | [Paper](https://arxiv.org/pdf/2012.15838.pdf) | [Data](https://github.com/zju3dv/neuralbody/blob/master/INSTALL.md#zju-mocap-dataset) + +![monocular](https://zju3dv.github.io/neuralbody/images/monocular.gif) + +> [Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans](https://arxiv.org/pdf/2012.15838.pdf) +> Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, Xiaowei Zhou +> CVPR 2021 + +Any questions or discussions are welcome! + +## Installation + +Please see [INSTALL.md](INSTALL.md) for manual installation. + +### Installation using docker + +Please see [docker/README.md](docker/README.md). + +Thanks to [Zhaoyi Wan](https://github.com/wanzysky) for providing the docker implementation. + +## Run the code on a custom dataset + +Please see [CUSTOM](tools/custom). + +## Run the code on People-Snapshot + +Please see [INSTALL.md](INSTALL.md) to download the dataset. + +We provide the pretrained models [here](https://zjueducn-my.sharepoint.com/:f:/g/personal/pengsida_zju_edu_cn/Enn43YWDHwBEg-XBqnetFYcBLr3cItZ0qUFU-oKUpDHKXw?e=FObjE9). + +### Process People-Snapshot + +We already provide some processed data. If you want to process more People-Snapshot videos, you can use [tools/process_snapshot.py](tools/process_snapshot.py). + +You can also visualize the SMPL parameters of People-Snapshot with [tools/vis_snapshot.py](tools/vis_snapshot.py). + +### Visualization on People-Snapshot + +Take the visualization on `female-3-casual` as an example. The command lines for visualization are recorded in [visualize.sh](visualize.sh). + +1.
Download the corresponding pretrained model and put it at `$ROOT/data/trained_model/if_nerf/female3c/latest.pth`. +2. Visualization: + * Visualize novel views of a single frame + ``` + python run.py --type visualize --cfg_file configs/snapshot_exp/snapshot_f3c.yaml exp_name female3c vis_novel_view True num_render_views 144 + ``` + + ![monocular](https://zju3dv.github.io/neuralbody/images/monocular_render.gif) + + * Visualize views of dynamic humans with a fixed camera + ``` + python run.py --type visualize --cfg_file configs/snapshot_exp/snapshot_f3c.yaml exp_name female3c vis_novel_pose True + ``` + + ![monocular](https://zju3dv.github.io/neuralbody/images/monocular_perform.gif) + + * Visualize mesh + ``` + # generate meshes + python run.py --type visualize --cfg_file configs/snapshot_exp/snapshot_f3c.yaml exp_name female3c vis_mesh True train.num_workers 0 + # visualize a specific mesh + python tools/render_mesh.py --exp_name female3c --dataset people_snapshot --mesh_ind 226 + ``` + + ![monocular](https://zju3dv.github.io/neuralbody/images/monocular_mesh.gif) + +3. The results of visualization are located at `$ROOT/data/render/female3c` and `$ROOT/data/perform/female3c`. + +### Training on People-Snapshot + +Take the training on `female-3-casual` as an example. The command lines for training are recorded in [train.sh](train.sh). + +1. Train: + ``` + # training + python train_net.py --cfg_file configs/snapshot_exp/snapshot_f3c.yaml exp_name female3c resume False + # distributed training + python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/snapshot_exp/snapshot_f3c.yaml exp_name female3c resume False gpus "0, 1, 2, 3" distributed True + ``` +2. Train with white background: + ``` + # training + python train_net.py --cfg_file configs/snapshot_exp/snapshot_f3c.yaml exp_name female3c resume False white_bkgd True + ``` +3. Tensorboard: + ``` + tensorboard --logdir data/record/if_nerf + ``` + +## Run the code on ZJU-MoCap + +Please see [INSTALL.md](INSTALL.md) to download the dataset. + +We provide the pretrained models [here](https://zjueducn-my.sharepoint.com/:f:/g/personal/pengsida_zju_edu_cn/Enn43YWDHwBEg-XBqnetFYcBLr3cItZ0qUFU-oKUpDHKXw?e=FObjE9). + +### Potential problems of provided smpl parameters + +1. The newly fitted parameters are located in `new_params`. Currently, the released pretrained models are trained on the previously fitted parameters, which are located in `params`. +2. The SMPL parameters of ZJU-MoCap are defined differently from those of MPI's smplx. + * If you want to extract vertices from the provided SMPL parameters, please use `zju_smpl/extract_vertices.py`. + * The reason we use the current definition is described [here](https://github.com/zju3dv/EasyMocap/blob/master/doc/02_output.md#attention-for-smplsmpl-x-users). + +It is okay to train Neural Body with SMPL parameters fitted by smplx. + +### Test on ZJU-MoCap + +The command lines for testing are recorded in [test.sh](test.sh). + +Take the test on `sequence 313` as an example. + +1. Download the corresponding pretrained model and put it at `$ROOT/data/trained_model/if_nerf/xyzc_313/latest.pth`. +2. Test on training human poses: + ``` + python run.py --type evaluate --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 + ``` +3.
Test on unseen human poses: + ``` + python run.py --type evaluate --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 test_novel_pose True + ``` + +### Visualization on ZJU-MoCap + +Take the visualization on `sequence 313` as an example. The command lines for visualization are recorded in [visualize.sh](visualize.sh). + +1. Download the corresponding pretrained model and put it at `$ROOT/data/trained_model/if_nerf/xyzc_313/latest.pth`. +2. Visualization: + * Visualize novel views of a single frame + ``` + python run.py --type visualize --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 vis_novel_view True + ``` + ![zju_mocap](https://zju3dv.github.io/neuralbody/images/zju_mocap_render_313.gif) + + * Visualize novel views of a single frame by rotating the SMPL model + ``` + python run.py --type visualize --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 vis_novel_view True num_render_views 100 + ``` + ![zju_mocap](https://zju3dv.github.io/neuralbody/images/rotate_smpl.gif) + + * Visualize views of dynamic humans with a fixed camera + ``` + python run.py --type visualize --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 vis_novel_pose True num_render_frame 1000 num_render_views 1 + ``` + ![zju_mocap](https://zju3dv.github.io/neuralbody/images/zju_mocap_perform_fixed_313.gif) + + * Visualize views of dynamic humans with a rotated camera + ``` + python run.py --type visualize --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 vis_novel_pose True num_render_frame 1000 + ``` + ![zju_mocap](https://zju3dv.github.io/neuralbody/images/zju_mocap_perform_313.gif) + + * Visualize mesh + ``` + # generate meshes + python run.py --type visualize --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 vis_mesh True train.num_workers 0 + # visualize a specific mesh + python tools/render_mesh.py --exp_name xyzc_313 --dataset zju_mocap --mesh_ind 0 + ``` + ![zju_mocap](https://zju3dv.github.io/neuralbody/images/zju_mocap_mesh.gif) + +3. The results of visualization are located at `$ROOT/data/render/xyzc_313` and `$ROOT/data/perform/xyzc_313`. + +### Training on ZJU-MoCap + +Take the training on `sequence 313` as an example. The command lines for training are recorded in [train.sh](train.sh). + +1. Train: + ``` + # training + python train_net.py --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 resume False + # distributed training + python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 resume False gpus "0, 1, 2, 3" distributed True + ``` +2. Train with white background: + ``` + # training + python train_net.py --cfg_file configs/zju_mocap_exp/latent_xyzc_313.yaml exp_name xyzc_313 resume False white_bkgd True + ``` +3. Tensorboard: + ``` + tensorboard --logdir data/record/if_nerf + ``` + +## Citation + +If you find this code useful for your research, please use the following BibTeX entry.
+ +``` +@inproceedings{peng2021neural, + title={Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans}, + author={Peng, Sida and Zhang, Yuanqing and Xu, Yinghao and Wang, Qianqian and Shuai, Qing and Bao, Hujun and Zhou, Xiaowei}, + booktitle={CVPR}, + year={2021} +} +``` diff --git a/configs/default.yaml b/configs/default.yaml new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/configs/h36m_exp/latent_xyzc_s11g.yaml b/configs/h36m_exp/latent_xyzc_s11g.yaml new file mode 100644 index 0000000000000000000000000000000000000000..ae1f6274d971f0f3fbe0fb44e91925db7d96b5d5 --- /dev/null +++ b/configs/h36m_exp/latent_xyzc_s11g.yaml @@ -0,0 +1,28 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +train_dataset: + data_root: 'data/h36m/S11/Greeting' + human: 'S11' + ann_file: 'data/h36m/S11/Greeting/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/h36m/S11/Greeting' + human: 'S11' + ann_file: 'data/h36m/S11/Greeting/annots.npy' + split: 'test' + +# data options +H: 1002 +W: 1000 +ratio: 1. +training_view: [0, 1, 2, 3] +begin_ith_frame: 1200 +num_train_frame: 400 +smpl: 'smpl' +vertices: 'vertices' +params: 'params' +big_box: True diff --git a/configs/h36m_exp/latent_xyzc_s9p.yaml b/configs/h36m_exp/latent_xyzc_s9p.yaml new file mode 100644 index 0000000000000000000000000000000000000000..7697454fdc24517cb1df13f7ef9f87e1da7b8828 --- /dev/null +++ b/configs/h36m_exp/latent_xyzc_s9p.yaml @@ -0,0 +1,28 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +train_dataset: + data_root: 'data/h36m/S9/Posing' + human: 'S9' + ann_file: 'data/h36m/S9/Posing/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/h36m/S9/Posing' + human: 'S9' + ann_file: 'data/h36m/S9/Posing/annots.npy' + split: 'test' + +# data options +H: 1002 +W: 1000 +ratio: 1. +training_view: [0, 1, 2, 3] +begin_ith_frame: 1000 +num_train_frame: 300 +smpl: 'smpl' +vertices: 'vertices' +params: 'params' +big_box: True diff --git a/configs/monocular_custom.yaml b/configs/monocular_custom.yaml new file mode 100644 index 0000000000000000000000000000000000000000..84a4e729ce95be1fb9205772bf8f16f485be52de --- /dev/null +++ b/configs/monocular_custom.yaml @@ -0,0 +1,25 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'path/to/custom_data' + human: 'custom' + ann_file: 'path/to/custom_data/params.npy' + split: 'train' + +test_dataset: + data_root: 'path/to/custom_data' + human: 'custom' + ann_file: 'path/to/custom_data/params.npy' + split: 'test' + +# data options +ratio: 1. +training_view: [0, 6, 12, 18] +num_train_frame: 300 +smpl: 'smpl' +vertices: 'vertices' +params: 'params' +big_box: True diff --git a/configs/multi_view_custom.yaml b/configs/multi_view_custom.yaml new file mode 100644 index 0000000000000000000000000000000000000000..db208461a392429782a68d49b515a708f2aa6da0 --- /dev/null +++ b/configs/multi_view_custom.yaml @@ -0,0 +1,25 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +train_dataset: + data_root: 'path/to/custom_data' + human: 'custom' + ann_file: 'path/to/custom_data/annots.npy' + split: 'train' + +test_dataset: + data_root: 'path/to/custom_data' + human: 'custom' + ann_file: 'path/to/custom_data/annots.npy' + split: 'test' + +# data options +ratio: 1.
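+# training_view lists the indices of the cameras used for training (the remaining views can be held out for evaluation), +# and num_train_frame is the number of frames used for training; adjust both to match your own capture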
+training_view: [0, 6, 12, 18] +num_train_frame: 300 +smpl: 'smpl' +vertices: 'vertices' +params: 'params' +big_box: True diff --git a/configs/nerf/nerf_313.yaml b/configs/nerf/nerf_313.yaml new file mode 100644 index 0000000000000000000000000000000000000000..137f5ef1c198a15dda4f21139a03a2c44272b0e2 --- /dev/null +++ b/configs/nerf/nerf_313.yaml @@ -0,0 +1,145 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.multi_view_dataset' +train_dataset_path: 'lib/datasets/light_stage/multi_view_dataset.py' +test_dataset_module: 'lib.datasets.light_stage.multi_view_dataset' +test_dataset_path: 'lib/datasets/light_stage/multi_view_dataset.py' + +network_module: 'lib.networks.nerf' +network_path: 'lib/networks/nerf.py' +renderer_module: 'lib.networks.renderer.volume_renderer' +renderer_path: 'lib/networks/renderer/volume_renderer.py' + +trainer_module: 'lib.train.trainers.nerf.py' +trainer_path: 'lib/train/trainers/nerf.py' + +evaluator_module: 'lib.evaluators.if_nerf' +evaluator_path: 'lib/evaluators/if_nerf.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 313 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'test' + +train: + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 +lindisp: False + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +perturb: 1 +white_bkgd: False + +num_render_views: 50 + +# data options +ratio: 0.5 +num_train_frame: 1 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 + + +novel_view_cfg: + train_dataset_module: 'lib.datasets.light_stage.multi_view_demo_dataset' + train_dataset_path: 'lib/datasets/light_stage/multi_view_demo_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.multi_view_demo_dataset' + test_dataset_path: 'lib/datasets/light_stage/multi_view_demo_dataset.py' + + renderer_module: 'lib.networks.renderer.volume_renderer' + renderer_path: 'lib/networks/renderer/volume_renderer.py' + + visualizer_module: 'lib.visualizers.if_nerf_demo' + visualizer_path: 'lib/visualizers/if_nerf_demo.py' + + test: + sampler: '' + +novel_pose_cfg: + train_dataset_module: 'lib.datasets.light_stage.multi_view_perform_dataset' + train_dataset_path: 'lib/datasets/light_stage/multi_view_perform_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.multi_view_perform_dataset' + test_dataset_path: 'lib/datasets/light_stage/multi_view_perform_dataset.py' + + renderer_module: 'lib.networks.renderer.volume_renderer' + renderer_path: 'lib/networks/renderer/volume_renderer.py' + + visualizer_module: 'lib.visualizers.if_nerf_perform' + visualizer_path: 'lib/visualizers/if_nerf_perform.py' + + test: + sampler: '' + +mesh_cfg: + train_dataset_module: 'lib.datasets.light_stage.multi_view_mesh_dataset' + train_dataset_path: 
'lib/datasets/light_stage/multi_view_mesh_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.multi_view_mesh_dataset' + test_dataset_path: 'lib/datasets/light_stage/multi_view_mesh_dataset.py' + + network_module: 'lib.networks.latent_xyzc' + network_path: 'lib/networks/latent_xyzc.py' + renderer_module: 'lib.networks.renderer.volume_mesh_renderer' + renderer_path: 'lib/networks/renderer/volume_mesh_renderer.py' + + visualizer_module: 'lib.visualizers.if_nerf_mesh' + visualizer_path: 'lib/visualizers/if_nerf_mesh.py' + + mesh_th: 5 + + test: + sampler: 'FrameSampler' + frame_sampler_interval: 1 diff --git a/configs/nerf/nerf_315.yaml b/configs/nerf/nerf_315.yaml new file mode 100644 index 0000000000000000000000000000000000000000..edf71733fcd58072586fb68b741a354b4010a20e --- /dev/null +++ b/configs/nerf/nerf_315.yaml @@ -0,0 +1,18 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/nerf/nerf_313.yaml' + +human: 315 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_315' + human: 'CoreView_315' + ann_file: 'data/zju_mocap/CoreView_315/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_315' + human: 'CoreView_315' + ann_file: 'data/zju_mocap/CoreView_315/annots.npy' + split: 'test' diff --git a/configs/nerf/nerf_377.yaml b/configs/nerf/nerf_377.yaml new file mode 100644 index 0000000000000000000000000000000000000000..8c769fee0b24f21cfeaaecd96868cc9c4f97ff65 --- /dev/null +++ b/configs/nerf/nerf_377.yaml @@ -0,0 +1,18 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/nerf/nerf_313.yaml' + +human: 377 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_377' + human: 'CoreView_377' + ann_file: 'data/zju_mocap/CoreView_377/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_377' + human: 'CoreView_377' + ann_file: 'data/zju_mocap/CoreView_377/annots.npy' + split: 'test' diff --git a/configs/nerf/nerf_386.yaml b/configs/nerf/nerf_386.yaml new file mode 100644 index 0000000000000000000000000000000000000000..11aac13c57eeb2f35c317ad7efd19296402b11d1 --- /dev/null +++ b/configs/nerf/nerf_386.yaml @@ -0,0 +1,18 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/nerf/nerf_313.yaml' + +human: 386 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_386' + human: 'CoreView_386' + ann_file: 'data/zju_mocap/CoreView_386/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_386' + human: 'CoreView_386' + ann_file: 'data/zju_mocap/CoreView_386/annots.npy' + split: 'test' diff --git a/configs/nerf/nerf_387.yaml b/configs/nerf/nerf_387.yaml new file mode 100644 index 0000000000000000000000000000000000000000..3655c6af3fc7606565860489af88ce84965d91b1 --- /dev/null +++ b/configs/nerf/nerf_387.yaml @@ -0,0 +1,18 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/nerf/nerf_313.yaml' + +human: 387 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_387' + human: 'CoreView_387' + ann_file: 'data/zju_mocap/CoreView_387/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_387' + human: 'CoreView_387' + ann_file: 'data/zju_mocap/CoreView_387/annots.npy' + split: 'test' diff --git a/configs/nerf/nerf_390.yaml b/configs/nerf/nerf_390.yaml new file mode 100644 index 0000000000000000000000000000000000000000..08fa6e05c75883abb400d846bf23dc5dd32a5fdf --- /dev/null +++ b/configs/nerf/nerf_390.yaml @@ -0,0 +1,18 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/nerf/nerf_313.yaml' + +human: 390 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_390' + 
human: 'CoreView_390' + ann_file: 'data/zju_mocap/CoreView_390/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_390' + human: 'CoreView_390' + ann_file: 'data/zju_mocap/CoreView_390/annots.npy' + split: 'test' diff --git a/configs/nerf/nerf_392.yaml b/configs/nerf/nerf_392.yaml new file mode 100644 index 0000000000000000000000000000000000000000..144f5b48e2d05e52c6b475e8d5988852a927a554 --- /dev/null +++ b/configs/nerf/nerf_392.yaml @@ -0,0 +1,18 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/nerf/nerf_313.yaml' + +human: 392 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_392' + human: 'CoreView_392' + ann_file: 'data/zju_mocap/CoreView_392/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_392' + human: 'CoreView_392' + ann_file: 'data/zju_mocap/CoreView_392/annots.npy' + split: 'test' diff --git a/configs/nerf/nerf_393.yaml b/configs/nerf/nerf_393.yaml new file mode 100644 index 0000000000000000000000000000000000000000..347f536ac4e76269dc6366b23764877599844755 --- /dev/null +++ b/configs/nerf/nerf_393.yaml @@ -0,0 +1,18 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/nerf/nerf_313.yaml' + +human: 393 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_393' + human: 'CoreView_393' + ann_file: 'data/zju_mocap/CoreView_393/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_393' + human: 'CoreView_393' + ann_file: 'data/zju_mocap/CoreView_393/annots.npy' + split: 'test' diff --git a/configs/nerf/nerf_394.yaml b/configs/nerf/nerf_394.yaml new file mode 100644 index 0000000000000000000000000000000000000000..951e54297adbc61f4dac1d4b21c802f504b12ac1 --- /dev/null +++ b/configs/nerf/nerf_394.yaml @@ -0,0 +1,18 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/nerf/nerf_313.yaml' + +human: 394 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_394' + human: 'CoreView_394' + ann_file: 'data/zju_mocap/CoreView_394/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_394' + human: 'CoreView_394' + ann_file: 'data/zju_mocap/CoreView_394/annots.npy' + split: 'test' diff --git a/configs/neural_volumes/neural_volumes_313.yaml b/configs/neural_volumes/neural_volumes_313.yaml new file mode 100644 index 0000000000000000000000000000000000000000..a0c78eab93af9ac5a1f2bc878a23444fdaba5794 --- /dev/null +++ b/configs/neural_volumes/neural_volumes_313.yaml @@ -0,0 +1,94 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 313 + +train: + dataset: Human313_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human313_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: 
'' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 60 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/neural_volumes/neural_volumes_315.yaml b/configs/neural_volumes/neural_volumes_315.yaml new file mode 100644 index 0000000000000000000000000000000000000000..f95ecbea891fa14538d49af19ffbebb47ab87280 --- /dev/null +++ b/configs/neural_volumes/neural_volumes_315.yaml @@ -0,0 +1,94 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 315 + +train: + dataset: Human315_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human315_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 400 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/neural_volumes/neural_volumes_377.yaml b/configs/neural_volumes/neural_volumes_377.yaml new file mode 100644 index 0000000000000000000000000000000000000000..5f425ca7d557312fb06216565f818f16e462ce2a --- /dev/null +++ b/configs/neural_volumes/neural_volumes_377.yaml @@ -0,0 +1,94 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' 
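+# each *_module entry names a Python module and the matching *_path entry gives the file it is loaded from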
+trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 377 + +train: + dataset: Human377_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human377_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 300 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/neural_volumes/neural_volumes_386.yaml b/configs/neural_volumes/neural_volumes_386.yaml new file mode 100644 index 0000000000000000000000000000000000000000..156c535c65cbb0020ddd1bf7ef833ea2a968ad54 --- /dev/null +++ b/configs/neural_volumes/neural_volumes_386.yaml @@ -0,0 +1,94 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 386 + +train: + dataset: Human386_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human386_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 300 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/neural_volumes/neural_volumes_387.yaml b/configs/neural_volumes/neural_volumes_387.yaml new file mode 100644 index 0000000000000000000000000000000000000000..e8e1f94e83eb44ad0f2385cd966ce5a9d4eef4ae --- /dev/null +++ 
b/configs/neural_volumes/neural_volumes_387.yaml @@ -0,0 +1,94 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 387 + +train: + dataset: Human387_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human387_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 300 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/neural_volumes/neural_volumes_390.yaml b/configs/neural_volumes/neural_volumes_390.yaml new file mode 100644 index 0000000000000000000000000000000000000000..6cf68ef5180fa67d7216c05e1399a3d6c533ff34 --- /dev/null +++ b/configs/neural_volumes/neural_volumes_390.yaml @@ -0,0 +1,95 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 390 + +train: + dataset: Human390_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human390_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 
10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +begin_i: 700 +ni: 300 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/neural_volumes/neural_volumes_392.yaml b/configs/neural_volumes/neural_volumes_392.yaml new file mode 100644 index 0000000000000000000000000000000000000000..09bac447898aab83160ad6516a9420aaaa5ceb50 --- /dev/null +++ b/configs/neural_volumes/neural_volumes_392.yaml @@ -0,0 +1,94 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 392 + +train: + dataset: Human392_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human392_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 300 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/neural_volumes/neural_volumes_393.yaml b/configs/neural_volumes/neural_volumes_393.yaml new file mode 100644 index 0000000000000000000000000000000000000000..1ef356038f35d0d689ec02023c95bc7948359672 --- /dev/null +++ b/configs/neural_volumes/neural_volumes_393.yaml @@ -0,0 +1,94 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 393 + +train: + dataset: Human393_0001_Train + 
batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human393_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 300 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/neural_volumes/neural_volumes_394.yaml b/configs/neural_volumes/neural_volumes_394.yaml new file mode 100644 index 0000000000000000000000000000000000000000..bc89a2de71b5e5a972567f1294e43df90ff07f66 --- /dev/null +++ b/configs/neural_volumes/neural_volumes_394.yaml @@ -0,0 +1,94 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl' +train_dataset_path: 'lib/datasets/light_stage/can_smpl.py' +test_dataset_module: 'lib.datasets.light_stage.can_smpl' +test_dataset_path: 'lib/datasets/light_stage/can_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.neural_volume' +evaluator_path: 'lib/evaluators/neural_volume.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 394 + +train: + dataset: Human394_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human394_0001_Test + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 300 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/snapshot_exp/snapshot_f1c.yaml b/configs/snapshot_exp/snapshot_f1c.yaml new file mode 100644 index 0000000000000000000000000000000000000000..a37af9bc24051f7cc00bef60553c16899b7de10b --- /dev/null +++ b/configs/snapshot_exp/snapshot_f1c.yaml @@ -0,0 +1,20 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/female-1-casual' + human: 'female-1-casual' + ann_file: 'data/people_snapshot/female-1-casual/params.npy' + split: 'train' + +test_dataset: + data_root: 
'data/people_snapshot/female-1-casual' + human: 'female-1-casual' + ann_file: 'data/people_snapshot/female-1-casual/params.npy' + split: 'test' + +# data options +ratio: 1. +num_train_frame: 250 diff --git a/configs/snapshot_exp/snapshot_f3c.yaml b/configs/snapshot_exp/snapshot_f3c.yaml new file mode 100644 index 0000000000000000000000000000000000000000..e1cbcdf3045209869ea8d15f6d0ae6cb4edb8d23 --- /dev/null +++ b/configs/snapshot_exp/snapshot_f3c.yaml @@ -0,0 +1,134 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.monocular_dataset' +train_dataset_path: 'lib/datasets/light_stage/monocular_dataset.py' +test_dataset_module: 'lib.datasets.light_stage.monocular_dataset' +test_dataset_path: 'lib/datasets/light_stage/monocular_dataset.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.if_nerf' +evaluator_path: 'lib/evaluators/if_nerf.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +train_dataset: + data_root: 'data/people_snapshot/female-3-casual' + human: 'female-3-casual' + ann_file: 'data/people_snapshot/female-3-casual/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/female-3-casual' + human: 'female-3-casual' + ann_file: 'data/people_snapshot/female-3-casual/params.npy' + split: 'test' + +train: + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 100 +eval_ep: 1000 + +# rendering options +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +perturb: 1 +white_bkgd: False + +num_render_views: 50 + +# data options +H: 1080 +W: 1080 +ratio: 1. 
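+# H and W above give the input image size and ratio scales the images when loading (1. keeps full resolution); +# num_train_frame below is the number of video frames used to train this sequence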
+num_train_frame: 230 + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 + + +novel_view_cfg: + train_dataset_module: 'lib.datasets.light_stage.monocular_demo_dataset' + train_dataset_path: 'lib/datasets/light_stage/monocular_demo_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.monocular_demo_dataset' + test_dataset_path: 'lib/datasets/light_stage/monocular_demo_dataset.py' + + renderer_module: 'lib.networks.renderer.if_clight_renderer_msk' + renderer_path: 'lib/networks/renderer/if_clight_renderer_msk.py' + + visualizer_module: 'lib.visualizers.if_nerf_demo' + visualizer_path: 'lib/visualizers/if_nerf_demo.py' + + ratio: 0.5 + + test: + sampler: '' + +novel_pose_cfg: + train_dataset_module: 'lib.datasets.light_stage.monocular_dataset' + train_dataset_path: 'lib/datasets/light_stage/monocular_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.monocular_dataset' + test_dataset_path: 'lib/datasets/light_stage/monocular_dataset.py' + + renderer_module: 'lib.networks.renderer.if_clight_renderer_msk' + renderer_path: 'lib/networks/renderer/if_clight_renderer_msk.py' + + visualizer_module: 'lib.visualizers.if_nerf_perform' + visualizer_path: 'lib/visualizers/if_nerf_perform.py' + + ratio: 0.5 + + test: + sampler: '' + +mesh_cfg: + train_dataset_module: 'lib.datasets.light_stage.monocular_mesh_dataset' + train_dataset_path: 'lib/datasets/light_stage/monocular_mesh_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.monocular_mesh_dataset' + test_dataset_path: 'lib/datasets/light_stage/monocular_mesh_dataset.py' + + network_module: 'lib.networks.latent_xyzc' + network_path: 'lib/networks/latent_xyzc.py' + renderer_module: 'lib.networks.renderer.if_mesh_renderer' + renderer_path: 'lib/networks/renderer/if_mesh_renderer.py' + + visualizer_module: 'lib.visualizers.if_nerf_mesh' + visualizer_path: 'lib/visualizers/if_nerf_mesh.py' + + mesh_th: 5 + + test: + sampler: 'FrameSampler' + frame_sampler_interval: 1 diff --git a/configs/snapshot_exp/snapshot_f4c.yaml b/configs/snapshot_exp/snapshot_f4c.yaml new file mode 100644 index 0000000000000000000000000000000000000000..0de1a9fe0660641cbf0f5cc551d7a28f9883024f --- /dev/null +++ b/configs/snapshot_exp/snapshot_f4c.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/female-4-casual' + human: 'female-4-casual' + ann_file: 'data/people_snapshot/female-4-casual/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/female-4-casual' + human: 'female-4-casual' + ann_file: 'data/people_snapshot/female-4-casual/params.npy' + split: 'test' + +# data options +ratio: 1. +num_train_frame: 200 +begin_ith_frame: 10 diff --git a/configs/snapshot_exp/snapshot_f6p.yaml b/configs/snapshot_exp/snapshot_f6p.yaml new file mode 100644 index 0000000000000000000000000000000000000000..b8428026fe2521e5f15d3917623edce4a5f69dc4 --- /dev/null +++ b/configs/snapshot_exp/snapshot_f6p.yaml @@ -0,0 +1,20 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/female-6-plaza' + human: 'female-6-plaza' + ann_file: 'data/people_snapshot/female-6-plaza/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/female-6-plaza' + human: 'female-6-plaza' + ann_file: 'data/people_snapshot/female-6-plaza/params.npy' + split: 'test' + +# data options +ratio: 1. 
+num_train_frame: 240 diff --git a/configs/snapshot_exp/snapshot_f7p.yaml b/configs/snapshot_exp/snapshot_f7p.yaml new file mode 100644 index 0000000000000000000000000000000000000000..3f6a4ca9806528af79bee27b0e9258162fe83986 --- /dev/null +++ b/configs/snapshot_exp/snapshot_f7p.yaml @@ -0,0 +1,20 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/female-7-plaza' + human: 'female-7-plaza' + ann_file: 'data/people_snapshot/female-7-plaza/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/female-7-plaza' + human: 'female-7-plaza' + ann_file: 'data/people_snapshot/female-7-plaza/params.npy' + split: 'test' + +# data options +ratio: 1. +num_train_frame: 185 diff --git a/configs/snapshot_exp/snapshot_f8p.yaml b/configs/snapshot_exp/snapshot_f8p.yaml new file mode 100644 index 0000000000000000000000000000000000000000..36bbe46f15ef75356804014ac4dadd00791d1d3f --- /dev/null +++ b/configs/snapshot_exp/snapshot_f8p.yaml @@ -0,0 +1,20 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/female-8-plaza' + human: 'female-8-plaza' + ann_file: 'data/people_snapshot/female-8-plaza/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/female-8-plaza' + human: 'female-8-plaza' + ann_file: 'data/people_snapshot/female-8-plaza/params.npy' + split: 'test' + +# data options +ratio: 1. +num_train_frame: 200 diff --git a/configs/snapshot_exp/snapshot_m2c.yaml b/configs/snapshot_exp/snapshot_m2c.yaml new file mode 100644 index 0000000000000000000000000000000000000000..a525079164b8a854dbcebc9acb1b3b911686c236 --- /dev/null +++ b/configs/snapshot_exp/snapshot_m2c.yaml @@ -0,0 +1,20 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/male-2-casual' + human: 'male-2-casual' + ann_file: 'data/people_snapshot/male-2-casual/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/male-2-casual' + human: 'male-2-casual' + ann_file: 'data/people_snapshot/male-2-casual/params.npy' + split: 'test' + +# data options +ratio: 1. +num_train_frame: 180 diff --git a/configs/snapshot_exp/snapshot_m2o.yaml b/configs/snapshot_exp/snapshot_m2o.yaml new file mode 100644 index 0000000000000000000000000000000000000000..c03fb0f76e8b922961e25a15abb7d8d8a288efe4 --- /dev/null +++ b/configs/snapshot_exp/snapshot_m2o.yaml @@ -0,0 +1,20 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/male-2-outdoor' + human: 'male-2-outdoor' + ann_file: 'data/people_snapshot/male-2-outdoor/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/male-2-outdoor' + human: 'male-2-outdoor' + ann_file: 'data/people_snapshot/male-2-outdoor/params.npy' + split: 'test' + +# data options +ratio: 1. 
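+# num_train_frame differs per People-Snapshot sequence because the captured videos have different lengths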
+num_train_frame: 150 diff --git a/configs/snapshot_exp/snapshot_m3c.yaml b/configs/snapshot_exp/snapshot_m3c.yaml new file mode 100644 index 0000000000000000000000000000000000000000..12df3c3fdef6c06dd7e2eca0d5ed19b5fc6ff48d --- /dev/null +++ b/configs/snapshot_exp/snapshot_m3c.yaml @@ -0,0 +1,20 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/male-3-casual' + human: 'male-3-casual' + ann_file: 'data/people_snapshot/male-3-casual/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/male-3-casual' + human: 'male-3-casual' + ann_file: 'data/people_snapshot/male-3-casual/params.npy' + split: 'test' + +# data options +ratio: 1. +num_train_frame: 235 diff --git a/configs/snapshot_exp/snapshot_m5o.yaml b/configs/snapshot_exp/snapshot_m5o.yaml new file mode 100644 index 0000000000000000000000000000000000000000..14b15205ace86ffa24f22f4d21fbc034e5c7fa34 --- /dev/null +++ b/configs/snapshot_exp/snapshot_m5o.yaml @@ -0,0 +1,20 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/snapshot_exp/snapshot_f3c.yaml' + +train_dataset: + data_root: 'data/people_snapshot/male-5-outdoor' + human: 'male-5-outdoor' + ann_file: 'data/people_snapshot/male-5-outdoor/params.npy' + split: 'train' + +test_dataset: + data_root: 'data/people_snapshot/male-5-outdoor' + human: 'male-5-outdoor' + ann_file: 'data/people_snapshot/male-5-outdoor/params.npy' + split: 'test' + +# data options +ratio: 1. +num_train_frame: 295 diff --git a/configs/zju_mocap_exp/latent_xyzc_313.yaml b/configs/zju_mocap_exp/latent_xyzc_313.yaml new file mode 100644 index 0000000000000000000000000000000000000000..a8a0f93c3ae5986d1862b53d14da0809690b6f82 --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_313.yaml @@ -0,0 +1,152 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.multi_view_dataset' +train_dataset_path: 'lib/datasets/light_stage/multi_view_dataset.py' +test_dataset_module: 'lib.datasets.light_stage.multi_view_dataset' +test_dataset_path: 'lib/datasets/light_stage/multi_view_dataset.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.if_nerf' +evaluator_path: 'lib/evaluators/if_nerf.py' + +visualizer_module: 'lib.visualizers.if_nerf' +visualizer_path: 'lib/visualizers/if_nerf.py' + +human: 313 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'test' + +train: + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + sampler: 'FrameSampler' + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# rendering options +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +perturb: 1 +white_bkgd: False + +num_render_views: 50 + +# data options +H: 1024 +W: 1024 +ratio: 0.5 +training_view: [0, 6, 12, 18] +num_train_frame: 60 +num_novel_pose_frame: 
1000 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 + + +novel_view_cfg: + train_dataset_module: 'lib.datasets.light_stage.multi_view_demo_dataset' + train_dataset_path: 'lib/datasets/light_stage/multi_view_demo_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.multi_view_demo_dataset' + test_dataset_path: 'lib/datasets/light_stage/multi_view_demo_dataset.py' + + renderer_module: 'lib.networks.renderer.if_clight_renderer_mmsk' + renderer_path: 'lib/networks/renderer/if_clight_renderer_mmsk.py' + + visualizer_module: 'lib.visualizers.if_nerf_demo' + visualizer_path: 'lib/visualizers/if_nerf_demo.py' + + test: + sampler: '' + +rotate_smpl_cfg: + train_dataset_module: 'lib.datasets.light_stage.rotate_smpl_dataset' + train_dataset_path: 'lib/datasets/light_stage/rotate_smpl_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.rotate_smpl_dataset' + test_dataset_path: 'lib/datasets/light_stage/rotate_smpl_dataset.py' + + renderer_module: 'lib.networks.renderer.if_clight_renderer' + renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + + visualizer_module: 'lib.visualizers.if_nerf_demo' + visualizer_path: 'lib/visualizers/if_nerf_demo.py' + + test: + sampler: '' + +novel_pose_cfg: + train_dataset_module: 'lib.datasets.light_stage.multi_view_perform_dataset' + train_dataset_path: 'lib/datasets/light_stage/multi_view_perform_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.multi_view_perform_dataset' + test_dataset_path: 'lib/datasets/light_stage/multi_view_perform_dataset.py' + + renderer_module: 'lib.networks.renderer.if_clight_renderer_mmsk' + renderer_path: 'lib/networks/renderer/if_clight_renderer_mmsk.py' + + visualizer_module: 'lib.visualizers.if_nerf_perform' + visualizer_path: 'lib/visualizers/if_nerf_perform.py' + + test: + sampler: '' + +mesh_cfg: + train_dataset_module: 'lib.datasets.light_stage.multi_view_mesh_dataset' + train_dataset_path: 'lib/datasets/light_stage/multi_view_mesh_dataset.py' + test_dataset_module: 'lib.datasets.light_stage.multi_view_mesh_dataset' + test_dataset_path: 'lib/datasets/light_stage/multi_view_mesh_dataset.py' + + network_module: 'lib.networks.latent_xyzc' + network_path: 'lib/networks/latent_xyzc.py' + renderer_module: 'lib.networks.renderer.if_mesh_renderer' + renderer_path: 'lib/networks/renderer/if_mesh_renderer.py' + + visualizer_module: 'lib.visualizers.if_nerf_mesh' + visualizer_path: 'lib/visualizers/if_nerf_mesh.py' + + mesh_th: 5 + + test: + sampler: 'FrameSampler' + frame_sampler_interval: 1 diff --git a/configs/zju_mocap_exp/latent_xyzc_315.yaml b/configs/zju_mocap_exp/latent_xyzc_315.yaml new file mode 100644 index 0000000000000000000000000000000000000000..8313dea92567fca79a979010ecc399f6a5515c03 --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_315.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 315 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_315' + human: 'CoreView_315' + ann_file: 'data/zju_mocap/CoreView_315/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_315' + human: 'CoreView_315' + ann_file: 'data/zju_mocap/CoreView_315/annots.npy' + split: 'test' + +# data options +num_train_frame: 400 diff --git a/configs/zju_mocap_exp/latent_xyzc_377.yaml b/configs/zju_mocap_exp/latent_xyzc_377.yaml new file mode 100644 index 0000000000000000000000000000000000000000..a9c37b4c3c1363ad383608fe60f0c86fce52bca0 --- /dev/null 
+++ b/configs/zju_mocap_exp/latent_xyzc_377.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 377 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_377' + human: 'CoreView_377' + ann_file: 'data/zju_mocap/CoreView_377/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_377' + human: 'CoreView_377' + ann_file: 'data/zju_mocap/CoreView_377/annots.npy' + split: 'test' + +# data options +num_train_frame: 300 diff --git a/configs/zju_mocap_exp/latent_xyzc_386.yaml b/configs/zju_mocap_exp/latent_xyzc_386.yaml new file mode 100644 index 0000000000000000000000000000000000000000..15dd80bd26087d065e528cda9849cf1856991390 --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_386.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 386 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_386' + human: 'CoreView_386' + ann_file: 'data/zju_mocap/CoreView_386/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_386' + human: 'CoreView_386' + ann_file: 'data/zju_mocap/CoreView_386/annots.npy' + split: 'test' + +# data options +num_train_frame: 300 diff --git a/configs/zju_mocap_exp/latent_xyzc_387.yaml b/configs/zju_mocap_exp/latent_xyzc_387.yaml new file mode 100644 index 0000000000000000000000000000000000000000..870335af7974c0ac076df21ccc2922adf33f78a7 --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_387.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 387 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_387' + human: 'CoreView_387' + ann_file: 'data/zju_mocap/CoreView_387/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_387' + human: 'CoreView_387' + ann_file: 'data/zju_mocap/CoreView_387/annots.npy' + split: 'test' + +# data options +num_train_frame: 300 diff --git a/configs/zju_mocap_exp/latent_xyzc_390.yaml b/configs/zju_mocap_exp/latent_xyzc_390.yaml new file mode 100644 index 0000000000000000000000000000000000000000..38f09f5079ae8d665babf515fc4128650e9c5bac --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_390.yaml @@ -0,0 +1,23 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 390 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_390' + human: 'CoreView_390' + ann_file: 'data/zju_mocap/CoreView_390/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_390' + human: 'CoreView_390' + ann_file: 'data/zju_mocap/CoreView_390/annots.npy' + split: 'test' + +# data options +num_train_frame: 300 +begin_ith_frame: 700 +num_novel_pose_frame: 700 diff --git a/configs/zju_mocap_exp/latent_xyzc_392.yaml b/configs/zju_mocap_exp/latent_xyzc_392.yaml new file mode 100644 index 0000000000000000000000000000000000000000..c4f604aca47c50650b35d172ef48ec2ed45f9581 --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_392.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 392 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_392' + human: 'CoreView_392' + ann_file: 'data/zju_mocap/CoreView_392/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_392' + human: 'CoreView_392' + ann_file: 'data/zju_mocap/CoreView_392/annots.npy' + split: 'test' + +# data options +num_train_frame: 300 diff --git 
a/configs/zju_mocap_exp/latent_xyzc_393.yaml b/configs/zju_mocap_exp/latent_xyzc_393.yaml new file mode 100644 index 0000000000000000000000000000000000000000..1352505fb18a4a16372105503d8385301e3f15f8 --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_393.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 393 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_393' + human: 'CoreView_393' + ann_file: 'data/zju_mocap/CoreView_393/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_393' + human: 'CoreView_393' + ann_file: 'data/zju_mocap/CoreView_393/annots.npy' + split: 'test' + +# data options +num_train_frame: 300 diff --git a/configs/zju_mocap_exp/latent_xyzc_394.yaml b/configs/zju_mocap_exp/latent_xyzc_394.yaml new file mode 100644 index 0000000000000000000000000000000000000000..bc40e416c47cafa17c490116a989b704c3efa881 --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_394.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 394 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_394' + human: 'CoreView_394' + ann_file: 'data/zju_mocap/CoreView_394/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_394' + human: 'CoreView_394' + ann_file: 'data/zju_mocap/CoreView_394/annots.npy' + split: 'test' + +# data options +num_train_frame: 300 diff --git a/configs/zju_mocap_exp/latent_xyzc_395.yaml b/configs/zju_mocap_exp/latent_xyzc_395.yaml new file mode 100644 index 0000000000000000000000000000000000000000..13d4002fc3748e528cd28ea196f237b69765310f --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_395.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 395 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_395' + human: 'CoreView_395' + ann_file: 'data/zju_mocap/CoreView_395/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_395' + human: 'CoreView_395' + ann_file: 'data/zju_mocap/CoreView_395/annots.npy' + split: 'test' + +# data options +num_train_frame: 300 diff --git a/configs/zju_mocap_exp/latent_xyzc_396.yaml b/configs/zju_mocap_exp/latent_xyzc_396.yaml new file mode 100644 index 0000000000000000000000000000000000000000..e9ab931e290f1551ac9d87e5bfebe117ac5c835c --- /dev/null +++ b/configs/zju_mocap_exp/latent_xyzc_396.yaml @@ -0,0 +1,22 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 396 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_396' + human: 'CoreView_396' + ann_file: 'data/zju_mocap/CoreView_396/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_396' + human: 'CoreView_396' + ann_file: 'data/zju_mocap/CoreView_396/annots.npy' + split: 'test' + +# data options +num_train_frame: 540 +begin_ith_frame: 810 diff --git a/configs/zju_mocap_exp/xyzc_rotate_demo_313.yaml b/configs/zju_mocap_exp/xyzc_rotate_demo_313.yaml new file mode 100644 index 0000000000000000000000000000000000000000..0dc691f9b6bba711f0cdda5d903b1a745275b7a6 --- /dev/null +++ b/configs/zju_mocap_exp/xyzc_rotate_demo_313.yaml @@ -0,0 +1,93 @@ +task: 'if_nerf' +gpus: [0] + +train_dataset_module: 'lib.datasets.light_stage.can_smpl_demo' +train_dataset_path: 'lib/datasets/light_stage/can_smpl_demo.py' +test_dataset_module: 'lib.datasets.light_stage.rotate_smpl' +test_dataset_path: 
'lib/datasets/light_stage/rotate_smpl.py' + +network_module: 'lib.networks.latent_xyzc' +network_path: 'lib/networks/latent_xyzc.py' +renderer_module: 'lib.networks.renderer.if_clight_renderer' +renderer_path: 'lib/networks/renderer/if_clight_renderer.py' + +trainer_module: 'lib.train.trainers.if_nerf_clight' +trainer_path: 'lib/train/trainers/if_nerf_clight.py' + +evaluator_module: 'lib.evaluators.if_nerf' +evaluator_path: 'lib/evaluators/if_nerf.py' + +visualizer_module: 'lib.visualizers.if_nerf_demo' +visualizer_path: 'lib/visualizers/if_nerf_demo.py' + +human: 313 + +train: + dataset: Human313_0001_Train + batch_size: 1 + collator: '' + lr: 5e-4 + weight_decay: 0 + epoch: 400 + scheduler: + type: 'exponential' + gamma: 0.1 + decay_epochs: 1000 + num_workers: 16 + +test: + dataset: Human313_0001_Test + batch_size: 1 + collator: '' + +ep_iter: 500 +save_ep: 1000 +eval_ep: 1000 + +# training options +netdepth: 8 +netwidth: 256 +netdepth_fine: 8 +netwidth_fine: 256 +netchunk: 65536 +chunk: 32768 + +no_batching: True + +precrop_iters: 500 +precrop_frac: 0.5 + +# network options +point_feature: 6 + +# rendering options +use_viewdirs: True +i_embed: 0 +xyz_res: 10 +view_res: 4 +raw_noise_std: 0 + +N_samples: 64 +N_importance: 128 +N_rand: 1024 + +near: 1 +far: 3 + +perturb: 1 +white_bkgd: False + +render_views: 50 + +# data options +res: 256 +ratio: 0.5 +intv: 6 +ni: 60 +smpl: 'smpl' +params: 'params' + +voxel_size: [0.005, 0.005, 0.005] # dhw + +# record options +log_interval: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_313_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_313_ni1.yaml new file mode 100644 index 0000000000000000000000000000000000000000..c151973aa4a0ca57cc1fb640f53917cf0d2339fc --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_313_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 313 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_315_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_315_ni1.yaml new file mode 100644 index 0000000000000000000000000000000000000000..aab379ac3e32efc40123e0df15cf081de68e6c53 --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_315_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 315 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_315' + human: 'CoreView_315' + ann_file: 'data/zju_mocap/CoreView_315/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_315' + human: 'CoreView_315' + ann_file: 'data/zju_mocap/CoreView_315/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_377_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_377_ni1.yaml new file mode 100644 index 0000000000000000000000000000000000000000..b8917d299dca1b729ea1dba181a81345017ab06e --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_377_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 377 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_377' + human: 
'CoreView_377' + ann_file: 'data/zju_mocap/CoreView_377/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_377' + human: 'CoreView_377' + ann_file: 'data/zju_mocap/CoreView_377/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_386_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_386_ni1.yaml new file mode 100644 index 0000000000000000000000000000000000000000..8b5a50fa3652e9dd5c2efb35b63da3e3d1017651 --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_386_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 386 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_386' + human: 'CoreView_386' + ann_file: 'data/zju_mocap/CoreView_386/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_386' + human: 'CoreView_386' + ann_file: 'data/zju_mocap/CoreView_386/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_387_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_387_ni1.yaml new file mode 100644 index 0000000000000000000000000000000000000000..7786600295e09f027fad6b2e13f0636b2986b397 --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_387_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 387 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_387' + human: 'CoreView_387' + ann_file: 'data/zju_mocap/CoreView_387/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_387' + human: 'CoreView_387' + ann_file: 'data/zju_mocap/CoreView_387/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_390_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_390_ni1.yaml new file mode 100644 index 0000000000000000000000000000000000000000..b8917d299dca1b729ea1dba181a81345017ab06e --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_390_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 377 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_377' + human: 'CoreView_377' + ann_file: 'data/zju_mocap/CoreView_377/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_377' + human: 'CoreView_377' + ann_file: 'data/zju_mocap/CoreView_377/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_392_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_392_ni1.yaml new file mode 100644 index 0000000000000000000000000000000000000000..b3d3589c6c0cf60abf58b770988e4d1215acd7b1 --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_392_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 392 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_392' + human: 'CoreView_392' + ann_file: 'data/zju_mocap/CoreView_392/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_392' + human: 'CoreView_392' + ann_file: 'data/zju_mocap/CoreView_392/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_393_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_393_ni1.yaml new file mode 100644 index 
0000000000000000000000000000000000000000..21e3530c2a165a9cf501e2f6a139524cfc61f86e --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_393_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 393 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_393' + human: 'CoreView_393' + ann_file: 'data/zju_mocap/CoreView_393/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_393' + human: 'CoreView_393' + ann_file: 'data/zju_mocap/CoreView_393/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_frame1_exp/latent_xyzc_394_ni1.yaml b/configs/zju_mocap_frame1_exp/latent_xyzc_394_ni1.yaml new file mode 100644 index 0000000000000000000000000000000000000000..8a187597b7ee6cf894e76dbe6f022d7896b2e112 --- /dev/null +++ b/configs/zju_mocap_frame1_exp/latent_xyzc_394_ni1.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 394 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_394' + human: 'CoreView_394' + ann_file: 'data/zju_mocap/CoreView_394/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_394' + human: 'CoreView_394' + ann_file: 'data/zju_mocap/CoreView_394/annots.npy' + split: 'test' + +# data options +num_train_frame: 1 diff --git a/configs/zju_mocap_view_exp/latent_xyzc_313_1view.yaml b/configs/zju_mocap_view_exp/latent_xyzc_313_1view.yaml new file mode 100644 index 0000000000000000000000000000000000000000..a483353cc5c2edab57a80b7a6f8b0260c8af9b7f --- /dev/null +++ b/configs/zju_mocap_view_exp/latent_xyzc_313_1view.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 313 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'test' + +# data options +training_view: [0] diff --git a/configs/zju_mocap_view_exp/latent_xyzc_313_2view.yaml b/configs/zju_mocap_view_exp/latent_xyzc_313_2view.yaml new file mode 100644 index 0000000000000000000000000000000000000000..2134663691323ab786e51c2f3e6c814d74dee9c7 --- /dev/null +++ b/configs/zju_mocap_view_exp/latent_xyzc_313_2view.yaml @@ -0,0 +1,21 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 313 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'train' + +test_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'test' + +# data options +training_view: [0, 12] diff --git a/configs/zju_mocap_view_exp/latent_xyzc_313_6view.yaml b/configs/zju_mocap_view_exp/latent_xyzc_313_6view.yaml new file mode 100644 index 0000000000000000000000000000000000000000..10f9610d1857d26f9a8152ae56381bac3e04bedf --- /dev/null +++ b/configs/zju_mocap_view_exp/latent_xyzc_313_6view.yaml @@ -0,0 +1,24 @@ +task: 'if_nerf' +gpus: [0] + +parent_cfg: 'configs/zju_mocap_exp/latent_xyzc_313.yaml' + +human: 313 + +train_dataset: + data_root: 'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'train' + +test_dataset: + data_root: 
'data/zju_mocap/CoreView_313' + human: 'CoreView_313' + ann_file: 'data/zju_mocap/CoreView_313/annots.npy' + split: 'test' + +# data options +smpl: 'smpl_6view' +params: 'params_6view' +vertices: 'vertices_6view' +training_view: [0, 3, 6, 12, 15, 18] diff --git a/docker/.condarc b/docker/.condarc new file mode 100644 index 0000000000000000000000000000000000000000..598af8b87fd0c607f55ae148d2cb7843a9362f36 --- /dev/null +++ b/docker/.condarc @@ -0,0 +1,14 @@ +channels: + - defaults +show_channel_urls: true +default_channels: + - https://mirrors.bfsu.edu.cn/anaconda/pkgs/main + - https://mirrors.bfsu.edu.cn/anaconda/pkgs/r + - https://mirrors.bfsu.edu.cn/anaconda/pkgs/msys2 +custom_channels: + conda-forge: https://mirrors.bfsu.edu.cn/anaconda/cloud + msys2: https://mirrors.bfsu.edu.cn/anaconda/cloud + bioconda: https://mirrors.bfsu.edu.cn/anaconda/cloud + menpo: https://mirrors.bfsu.edu.cn/anaconda/cloud + pytorch: https://mirrors.bfsu.edu.cn/anaconda/cloud + simpleitk: https://mirrors.bfsu.edu.cn/anaconda/cloud diff --git a/docker/Dockerfile b/docker/Dockerfile new file mode 100644 index 0000000000000000000000000000000000000000..5f4a69a270cd7cdf1b9f453e8d14c40455a8630e --- /dev/null +++ b/docker/Dockerfile @@ -0,0 +1,75 @@ +FROM nvidia/cuda:11.1.1-cudnn8-devel-ubuntu18.04 + +# For the convenience of users in mainland China +COPY docker/apt-sources.list /etc/apt/sources.list + +# Install some basic utilities +RUN rm /etc/apt/sources.list.d/nvidia-ml.list \ + && rm /etc/apt/sources.list.d/cuda.list \ + && apt-get update && apt-get install -y \ + curl \ + ca-certificates \ + sudo \ + git \ + bzip2 \ + libx11-6 \ + gcc \ + g++ \ + libusb-1.0-0 \ + cmake \ + libssl-dev \ + && DEBIAN_FRONTEND=noninteractive apt-get install -y python3-opencv \ + && rm -rf /var/lib/apt/lists/* + +# Create a working directory +RUN mkdir /app +WORKDIR /app + +# Create a non-root user and switch to it +RUN adduser --disabled-password --gecos '' --shell /bin/bash user \ + && chown -R user:user /app +RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user +USER user + +# All users can use /home/user as their home directory +ENV HOME=/home/user +RUN chmod 777 /home/user + +# Install Miniconda and Python 3.8 +ENV CONDA_AUTO_UPDATE_CONDA=false +ENV PATH=/home/user/miniconda/bin:$PATH +RUN curl -sLo ~/miniconda.sh https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py38_4.8.3-Linux-x86_64.sh \ + && chmod +x ~/miniconda.sh \ + && ~/miniconda.sh -b -p ~/miniconda \ + && rm ~/miniconda.sh \ + && conda install -y python==3.8.3 \ + && conda clean -ya +COPY --chown=user docker/.condarc /home/user/.condarc + +# CUDA 11.1-specific steps +RUN conda install -y -c conda-forge cudatoolkit=11.1.1 \ + && conda install -y -c pytorch \ + "pytorch=1.8.1=py3.8_cuda11.1_cudnn8.0.5_0" \ + "torchvision=0.9.1=py38_cu111" \ + && conda clean -ya + +# Alter sources for the convenience of users located in mainland China. +RUN pip config set global.index-url https://pypi.douban.com/simple +COPY requirements.txt requirements.txt +RUN pip install -r requirements.txt + +ENV CUDA_HOME=/usr/local/cuda +RUN bash -c "git clone --recursive https://github.com/traveller59/spconv.git" +# We manually download and install cmake since the version required by spconv is newer than +# the one available from apt on Ubuntu 18.
+RUN curl -sLo cmake.tar.gz https://github.com/Kitware/CMake/releases/download/v3.20.1/cmake-3.20.1.tar.gz \ + && tar -xvf cmake.tar.gz \ + && cd cmake-3.20.1 \ + && ./configure \ + && make -j4 && sudo make install +RUN sudo apt-get update && sudo apt-get install -y libboost-dev \ + && sudo rm -rf /var/lib/apt/lists/* +COPY docker/spconv.sh spconv.sh +RUN bash spconv.sh + +CMD ["python3"] diff --git a/docker/README.md b/docker/README.md new file mode 100644 index 0000000000000000000000000000000000000000..8f0f09544d0cc3b7047569f7eb940aff78339f22 --- /dev/null +++ b/docker/README.md @@ -0,0 +1,32 @@ +## 1. Build the image + +From the root path of the project: +```shell +docker build -f docker/Dockerfile -t neuralbody . +``` + +You may need to try several times, since many packages are downloaded over the Internet and http(s) errors can occur. + +## 2. Data preparation + +The docker image contains the environment needed to run the project, but you still need to manually download the data as described in [INSTALL.md](https://github.com/zju3dv/neuralbody/blob/master/INSTALL.md). + +Note that the downloaded files are tar.gz archives, so you need to extract each of them. + +For example: + +```shell +for name in $(ls *.tar.gz); do tar -xvf $name; done +``` + +## 3. Execution using docker containers + + +Assuming you are at the root path of the project, run a docker container like this: +```shell +docker run -it --rm --gpus=all \ +--mount type=bind,source="$(pwd)",target=/app \ +--mount type=bind,source=,target=/app/data \ +neuralbody +``` +where `` can be obtained from [README.md](https://github.com/zju3dv/neuralbody/blob/master/README.md) and `` is the path to your data. diff --git a/docker/apt-sources.list b/docker/apt-sources.list new file mode 100644 index 0000000000000000000000000000000000000000..3bba875c06ef839fce868313013d3776c8a85483 --- /dev/null +++ b/docker/apt-sources.list @@ -0,0 +1,10 @@ +deb http://mirrors.163.com/ubuntu/ bionic main restricted universe multiverse +deb http://mirrors.163.com/ubuntu/ bionic-security main restricted universe multiverse +deb http://mirrors.163.com/ubuntu/ bionic-updates main restricted universe multiverse +deb http://mirrors.163.com/ubuntu/ bionic-proposed main restricted universe multiverse +deb http://mirrors.163.com/ubuntu/ bionic-backports main restricted universe multiverse +deb-src http://mirrors.163.com/ubuntu/ bionic main restricted universe multiverse +deb-src http://mirrors.163.com/ubuntu/ bionic-security main restricted universe multiverse +deb-src http://mirrors.163.com/ubuntu/ bionic-updates main restricted universe multiverse +deb-src http://mirrors.163.com/ubuntu/ bionic-proposed main restricted universe multiverse +deb-src http://mirrors.163.com/ubuntu/ bionic-backports main restricted universe multiverse diff --git a/docker/spconv.sh b/docker/spconv.sh new file mode 100644 index 0000000000000000000000000000000000000000..b923242a5d1b9f61bdec7d367ad9a28a0767629c --- /dev/null +++ b/docker/spconv.sh @@ -0,0 +1,4 @@ +cd spconv +git checkout abf0acf30f5526ea93e687e3f424f62d9cd8313a +python setup.py bdist_wheel +pip install dist/spconv-1.2.1-cp38-cp38-linux_x86_64.whl diff --git a/eval_whole_img.sh b/eval_whole_img.sh new file mode 100644 index 0000000000000000000000000000000000000000..ee5edcbd895b9091b15aa704f69b22ad4a7fefea --- /dev/null +++ b/eval_whole_img.sh @@ -0,0 +1,26 @@ +python run.py --type evaluate --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313 eval_whole_img True gpus "3," +python run.py
--type evaluate --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 1000 + +python run.py --type evaluate --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 eval_whole_img True gpus "3," +python run.py --type evaluate --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 1000 + +python run.py --type evaluate --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 eval_whole_img True gpus "3," +python run.py --type evaluate --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 1000 + +python run.py --type evaluate --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 eval_whole_img True gpus "3," +python run.py --type evaluate --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 1000 + +python run.py --type evaluate --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 eval_whole_img True gpus "3," +python run.py --type evaluate --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 1000 + +python run.py --type evaluate --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 eval_whole_img True gpus "3," +python run.py --type evaluate --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 1000 + +python run.py --type evaluate --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 eval_whole_img True gpus "3," +python run.py --type evaluate --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 1000 + +python run.py --type evaluate --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 eval_whole_img True gpus "3," +python run.py --type evaluate --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 700 + +python run.py --type evaluate --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 eval_whole_img True gpus "3," +python run.py --type evaluate --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 eval_whole_img True gpus "3," test_novel_pose True novel_pose_ni 1000 diff --git a/lib/__init__.py b/lib/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/lib/config/__init__.py b/lib/config/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..a99d514f4d40f8962f334d1dc03b8df5a4b6d925 --- /dev/null +++ b/lib/config/__init__.py @@ -0,0 +1 @@ +from .config import cfg, args diff --git a/lib/config/config.py b/lib/config/config.py new file mode 100644 index 0000000000000000000000000000000000000000..4a25aea108d354eb24ae62973df7d174f45951df --- /dev/null +++ b/lib/config/config.py @@ -0,0 +1,187 @@ +import open3d as o3d +from . 
import yacs +from .yacs import CfgNode as CN +import argparse +import os +import numpy as np +import pprint + +cfg = CN() + +# experiment name +cfg.exp_name = 'hello' + +# network +cfg.point_feature = 9 +cfg.distributed = False + +# data +cfg.human = 313 +cfg.training_view = [0, 6, 12, 18] +cfg.intv = 1 +cfg.begin_ith_frame = 0 # the first smpl +cfg.num_train_frame = 1 # number of smpls +cfg.num_render_frame = -1 # number of frames to render +cfg.ith_frame = 0 # the i-th smpl +cfg.frame_interval = 1 +cfg.nv = 6890 # number of vertices +cfg.smpl = 'smpl_4views_5e-4' +cfg.vertices = 'vertices' +cfg.params = 'params_4views_5e-4' +cfg.mask_bkgd = True +cfg.sample_smpl = False +cfg.sample_grid = False +cfg.sample_fg_ratio = 0.7 +cfg.H = 1024 +cfg.W = 1024 +cfg.add_pointcloud = False + +cfg.big_box = False + +cfg.rot_ratio = 0. +cfg.rot_range = np.pi / 32 + +# mesh +cfg.mesh_th = 50 # threshold of alpha + +# task +cfg.task = 'nerf4d' + +# gpus +cfg.gpus = list(range(8)) +# if load the pretrained network +cfg.resume = True + +# epoch +cfg.ep_iter = -1 +cfg.save_ep = 100 +cfg.save_latest_ep = 5 +cfg.eval_ep = 100 + +# ----------------------------------------------------------------------------- +# train +# ----------------------------------------------------------------------------- +cfg.train = CN() + +cfg.train.dataset = 'CocoTrain' +cfg.train.epoch = 10000 +cfg.train.num_workers = 8 +cfg.train.collator = '' +cfg.train.batch_sampler = 'default' +cfg.train.sampler_meta = CN({'min_hw': [256, 256], 'max_hw': [480, 640], 'strategy': 'range'}) +cfg.train.shuffle = True + +# use adam as default +cfg.train.optim = 'adam' +cfg.train.lr = 1e-4 +cfg.train.weight_decay = 0 + +cfg.train.scheduler = CN({'type': 'multi_step', 'milestones': [80, 120, 200, 240], 'gamma': 0.5}) + +cfg.train.batch_size = 4 + +cfg.train.acti_func = 'relu' + +cfg.train.use_vgg = False +cfg.train.vgg_pretrained = '' +cfg.train.vgg_layer_name = [0,0,0,0,0] + +cfg.train.use_ssim = False +cfg.train.use_d = False + +# test +cfg.test = CN() +cfg.test.dataset = 'CocoVal' +cfg.test.batch_size = 1 +cfg.test.epoch = -1 +cfg.test.sampler = 'default' +cfg.test.batch_sampler = 'default' +cfg.test.sampler_meta = CN({'min_hw': [480, 640], 'max_hw': [480, 640], 'strategy': 'origin'}) +cfg.test.frame_sampler_interval = 30 + +# trained model +cfg.trained_model_dir = 'data/trained_model' + +# recorder +cfg.record_dir = 'data/record' +cfg.log_interval = 20 +cfg.record_interval = 20 + +# result +cfg.result_dir = 'data/result' + +# evaluation +cfg.skip_eval = False +cfg.test_novel_pose = False +cfg.novel_pose_ni = 100 +cfg.vis_novel_pose = False +cfg.vis_novel_view = False +cfg.vis_rotate_smpl = False +cfg.vis_mesh = False +cfg.eval_whole_img = False + +cfg.fix_random = False + +cfg.vis = 'mesh' + +# data +cfg.body_sample_ratio = 0.5 +cfg.face_sample_ratio = 0. 
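The defaults above form the base layer of the configuration; `make_cfg` and `parse_cfg` below layer an experiment YAML on top of them, merging first the file named by `parent_cfg` (if any), then the experiment file itself, and finally any `KEY VALUE` pairs passed on the command line, so later sources take precedence. Below is a minimal sketch of that merge order, not part of the patch itself: the script name is hypothetical, it assumes it is run from the repository root after this patch is applied, and it loads the vendored `lib/config/yacs.py` (shown later in this diff) directly so that importing the `lib.config` package does not trigger its command-line parsing.

```python
# merge_order_sketch.py -- illustrative only, not part of the repository.
import importlib.util

# Load lib/config/yacs.py straight from its file path to avoid the argument
# parsing that runs when the lib.config package is imported.
spec = importlib.util.spec_from_file_location("yacs_local", "lib/config/yacs.py")
yacs = importlib.util.module_from_spec(spec)
spec.loader.exec_module(yacs)

cfg = yacs.CfgNode()
cfg.exp_name = 'hello'        # library default, as defined above
cfg.num_train_frame = 1       # library default, as defined above

# Experiment config: latent_xyzc_315.yaml inherits from latent_xyzc_313.yaml.
with open('configs/zju_mocap_exp/latent_xyzc_315.yaml', 'r') as f:
    child = yacs.load_cfg(f)

if 'parent_cfg' in child:
    with open(child.parent_cfg, 'r') as f:
        cfg.merge_from_other_cfg(yacs.load_cfg(f))   # 1) parent YAML (313)

cfg.merge_from_other_cfg(child)                      # 2) experiment YAML (315)
cfg.merge_from_list(['exp_name', 'xyzc_315'])        # 3) command-line overrides

print(cfg.exp_name, cfg.num_train_frame)             # -> xyzc_315 400
```

This mirrors what `make_cfg` does before `parse_cfg` finally rewrites the output directories and sets `CUDA_VISIBLE_DEVICES` from `cfg.gpus`.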
+ + +def parse_cfg(cfg, args): + if len(cfg.task) == 0: + raise ValueError('task must be specified') + + # assign the gpus + os.environ['CUDA_VISIBLE_DEVICES'] = ', '.join([str(gpu) for gpu in cfg.gpus]) + cfg.trained_model_dir = os.path.join(cfg.trained_model_dir, cfg.task, cfg.exp_name) + cfg.record_dir = os.path.join(cfg.record_dir, cfg.task, cfg.exp_name) + cfg.result_dir = os.path.join(cfg.result_dir, cfg.task, cfg.exp_name) + cfg.local_rank = args.local_rank + cfg.distributed = cfg.distributed or args.launcher not in ['none'] + + +def make_cfg(args): + with open(args.cfg_file, 'r') as f: + current_cfg = yacs.load_cfg(f) + + if 'parent_cfg' in current_cfg.keys(): + with open(current_cfg.parent_cfg, 'r') as f: + parent_cfg = yacs.load_cfg(f) + cfg.merge_from_other_cfg(parent_cfg) + + cfg.merge_from_other_cfg(current_cfg) + cfg.merge_from_list(args.opts) + + if cfg.vis_novel_pose: + cfg.merge_from_other_cfg(cfg.novel_pose_cfg) + + if cfg.vis_novel_view: + cfg.merge_from_other_cfg(cfg.novel_view_cfg) + + if cfg.vis_rotate_smpl: + cfg.merge_from_other_cfg(cfg.rotate_smpl_cfg) + + if cfg.vis_mesh: + cfg.merge_from_other_cfg(cfg.mesh_cfg) + + cfg.merge_from_list(args.opts) + + parse_cfg(cfg, args) + # pprint.pprint(cfg) + return cfg + + +parser = argparse.ArgumentParser() +parser.add_argument("--cfg_file", default="configs/default.yaml", type=str) +parser.add_argument('--test', action='store_true', dest='test', default=False) +parser.add_argument("--type", type=str, default="") +parser.add_argument('--det', type=str, default='') +parser.add_argument('--local_rank', type=int, default=0) +parser.add_argument('--launcher', type=str, default='none', choices=['none', 'pytorch']) +parser.add_argument("opts", default=None, nargs=argparse.REMAINDER) +args = parser.parse_args() +if len(args.type) > 0: + cfg.task = "run" +cfg = make_cfg(args) diff --git a/lib/config/yacs.py b/lib/config/yacs.py new file mode 100644 index 0000000000000000000000000000000000000000..6a14540840a3283912b089dc327487bb19705f8f --- /dev/null +++ b/lib/config/yacs.py @@ -0,0 +1,498 @@ +# Copyright (c) 2018-present, Facebook, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +############################################################################## + +"""YACS -- Yet Another Configuration System is designed to be a simple +configuration management system for academic and industrial research +projects. + +See README.md for usage and examples. 
+""" + +import copy +import io +import logging +import os +from ast import literal_eval + +import yaml + + +# Flag for py2 and py3 compatibility to use when separate code paths are necessary +# When _PY2 is False, we assume Python 3 is in use +_PY2 = False + +# Filename extensions for loading configs from files +_YAML_EXTS = {"", ".yaml", ".yml"} +_PY_EXTS = {".py"} + +# py2 and py3 compatibility for checking file object type +# We simply use this to infer py2 vs py3 +try: + _FILE_TYPES = (file, io.IOBase) + _PY2 = True +except NameError: + _FILE_TYPES = (io.IOBase,) + +# CfgNodes can only contain a limited set of valid types +_VALID_TYPES = {tuple, list, str, int, float, bool} +# py2 allow for str and unicode +if _PY2: + _VALID_TYPES = _VALID_TYPES.union({unicode}) # noqa: F821 + +# Utilities for importing modules from file paths +if _PY2: + # imp is available in both py2 and py3 for now, but is deprecated in py3 + import imp +else: + import importlib.util + +logger = logging.getLogger(__name__) + + +class CfgNode(dict): + """ + CfgNode represents an internal node in the configuration tree. It's a simple + dict-like container that allows for attribute-based access to keys. + """ + + IMMUTABLE = "__immutable__" + DEPRECATED_KEYS = "__deprecated_keys__" + RENAMED_KEYS = "__renamed_keys__" + + def __init__(self, init_dict=None, key_list=None): + # Recursively convert nested dictionaries in init_dict into CfgNodes + init_dict = {} if init_dict is None else init_dict + key_list = [] if key_list is None else key_list + for k, v in init_dict.items(): + if type(v) is dict: + # Convert dict to CfgNode + init_dict[k] = CfgNode(v, key_list=key_list + [k]) + else: + # Check for valid leaf type or nested CfgNode + _assert_with_logging( + _valid_type(v, allow_cfg_node=True), + "Key {} with value {} is not a valid type; valid types: {}".format( + ".".join(key_list + [k]), type(v), _VALID_TYPES + ), + ) + super(CfgNode, self).__init__(init_dict) + # Manage if the CfgNode is frozen or not + self.__dict__[CfgNode.IMMUTABLE] = False + # Deprecated options + # If an option is removed from the code and you don't want to break existing + # yaml configs, you can add the full config key as a string to the set below. + self.__dict__[CfgNode.DEPRECATED_KEYS] = set() + # Renamed options + # If you rename a config option, record the mapping from the old name to the new + # name in the dictionary below. Optionally, if the type also changed, you can + # make the value a tuple that specifies first the renamed key and then + # instructions for how to edit the config file. 
+ self.__dict__[CfgNode.RENAMED_KEYS] = { + # 'EXAMPLE.OLD.KEY': 'EXAMPLE.NEW.KEY', # Dummy example to follow + # 'EXAMPLE.OLD.KEY': ( # A more complex example to follow + # 'EXAMPLE.NEW.KEY', + # "Also convert to a tuple, e.g., 'foo' -> ('foo',) or " + # + "'foo:bar' -> ('foo', 'bar')" + # ), + } + + def __getattr__(self, name): + if name in self: + return self[name] + else: + raise AttributeError(name) + + def __setattr__(self, name, value): + if self.is_frozen(): + raise AttributeError( + "Attempted to set {} to {}, but CfgNode is immutable".format( + name, value + ) + ) + + _assert_with_logging( + name not in self.__dict__, + "Invalid attempt to modify internal CfgNode state: {}".format(name), + ) + _assert_with_logging( + _valid_type(value, allow_cfg_node=True), + "Invalid type {} for key {}; valid types = {}".format( + type(value), name, _VALID_TYPES + ), + ) + + self[name] = value + + def __str__(self): + def _indent(s_, num_spaces): + s = s_.split("\n") + if len(s) == 1: + return s_ + first = s.pop(0) + s = [(num_spaces * " ") + line for line in s] + s = "\n".join(s) + s = first + "\n" + s + return s + + r = "" + s = [] + for k, v in sorted(self.items()): + seperator = "\n" if isinstance(v, CfgNode) else " " + attr_str = "{}:{}{}".format(str(k), seperator, str(v)) + attr_str = _indent(attr_str, 2) + s.append(attr_str) + r += "\n".join(s) + return r + + def __repr__(self): + return "{}({})".format(self.__class__.__name__, super(CfgNode, self).__repr__()) + + def dump(self): + """Dump to a string.""" + self_as_dict = _to_dict(self) + return yaml.safe_dump(self_as_dict) + + def merge_from_file(self, cfg_filename): + """Load a yaml config file and merge it this CfgNode.""" + with open(cfg_filename, "r") as f: + cfg = load_cfg(f) + self.merge_from_other_cfg(cfg) + + def merge_from_other_cfg(self, cfg_other): + """Merge `cfg_other` into this CfgNode.""" + _merge_a_into_b(cfg_other, self, self, []) + + def merge_from_list(self, cfg_list): + """Merge config (keys, values) in a list (e.g., from command line) into + this CfgNode. For example, `cfg_list = ['FOO.BAR', 0.5]`. + """ + _assert_with_logging( + len(cfg_list) % 2 == 0, + "Override list has odd length: {}; it must be a list of pairs".format( + cfg_list + ), + ) + root = self + for full_key, v in zip(cfg_list[0::2], cfg_list[1::2]): + if root.key_is_deprecated(full_key): + continue + if root.key_is_renamed(full_key): + root.raise_key_rename_error(full_key) + key_list = full_key.split(".") + d = self + for subkey in key_list[:-1]: + _assert_with_logging( + subkey in d, "Non-existent key: {}".format(full_key) + ) + d = d[subkey] + subkey = key_list[-1] + _assert_with_logging(subkey in d, "Non-existent key: {}".format(full_key)) + value = _decode_cfg_value(v) + value = _check_and_coerce_cfg_value_type(value, d[subkey], subkey, full_key) + d[subkey] = value + + def freeze(self): + """Make this CfgNode and all of its children immutable.""" + self._immutable(True) + + def defrost(self): + """Make this CfgNode and all of its children mutable.""" + self._immutable(False) + + def is_frozen(self): + """Return mutability.""" + return self.__dict__[CfgNode.IMMUTABLE] + + def _immutable(self, is_immutable): + """Set immutability to is_immutable and recursively apply the setting + to all nested CfgNodes. 
+ """ + self.__dict__[CfgNode.IMMUTABLE] = is_immutable + # Recursively set immutable state + for v in self.__dict__.values(): + if isinstance(v, CfgNode): + v._immutable(is_immutable) + for v in self.values(): + if isinstance(v, CfgNode): + v._immutable(is_immutable) + + def clone(self): + """Recursively copy this CfgNode.""" + return copy.deepcopy(self) + + def register_deprecated_key(self, key): + """Register key (e.g. `FOO.BAR`) a deprecated option. When merging deprecated + keys a warning is generated and the key is ignored. + """ + _assert_with_logging( + key not in self.__dict__[CfgNode.DEPRECATED_KEYS], + "key {} is already registered as a deprecated key".format(key), + ) + self.__dict__[CfgNode.DEPRECATED_KEYS].add(key) + + def register_renamed_key(self, old_name, new_name, message=None): + """Register a key as having been renamed from `old_name` to `new_name`. + When merging a renamed key, an exception is thrown alerting to user to + the fact that the key has been renamed. + """ + _assert_with_logging( + old_name not in self.__dict__[CfgNode.RENAMED_KEYS], + "key {} is already registered as a renamed cfg key".format(old_name), + ) + value = new_name + if message: + value = (new_name, message) + self.__dict__[CfgNode.RENAMED_KEYS][old_name] = value + + def key_is_deprecated(self, full_key): + """Test if a key is deprecated.""" + if full_key in self.__dict__[CfgNode.DEPRECATED_KEYS]: + logger.warning("Deprecated config key (ignoring): {}".format(full_key)) + return True + return False + + def key_is_renamed(self, full_key): + """Test if a key is renamed.""" + return full_key in self.__dict__[CfgNode.RENAMED_KEYS] + + def raise_key_rename_error(self, full_key): + new_key = self.__dict__[CfgNode.RENAMED_KEYS][full_key] + if isinstance(new_key, tuple): + msg = " Note: " + new_key[1] + new_key = new_key[0] + else: + msg = "" + raise KeyError( + "Key {} was renamed to {}; please update your config.{}".format( + full_key, new_key, msg + ) + ) + + +def load_cfg(cfg_file_obj_or_str): + """Load a cfg. 
Supports loading from: + - A file object backed by a YAML file + - A file object backed by a Python source file that exports an attribute + "cfg" that is either a dict or a CfgNode + - A string that can be parsed as valid YAML + """ + _assert_with_logging( + isinstance(cfg_file_obj_or_str, _FILE_TYPES + (str,)), + "Expected first argument to be of type {} or {}, but it was {}".format( + _FILE_TYPES, str, type(cfg_file_obj_or_str) + ), + ) + if isinstance(cfg_file_obj_or_str, str): + return _load_cfg_from_yaml_str(cfg_file_obj_or_str) + elif isinstance(cfg_file_obj_or_str, _FILE_TYPES): + return _load_cfg_from_file(cfg_file_obj_or_str) + else: + raise NotImplementedError("Impossible to reach here (unless there's a bug)") + + +def _load_cfg_from_file(file_obj): + """Load a config from a YAML file or a Python source file.""" + _, file_extension = os.path.splitext(file_obj.name) + if file_extension in _YAML_EXTS: + return _load_cfg_from_yaml_str(file_obj.read()) + elif file_extension in _PY_EXTS: + return _load_cfg_py_source(file_obj.name) + else: + raise Exception( + "Attempt to load from an unsupported file type {}; " + "only {} are supported".format(file_obj, _YAML_EXTS.union(_PY_EXTS)) + ) + + +def _load_cfg_from_yaml_str(str_obj): + """Load a config from a YAML string encoding.""" + cfg_as_dict = yaml.safe_load(str_obj) + return CfgNode(cfg_as_dict) + + +def _load_cfg_py_source(filename): + """Load a config from a Python source file.""" + module = _load_module_from_file("yacs.config.override", filename) + _assert_with_logging( + hasattr(module, "cfg"), + "Python module from file {} must have 'cfg' attr".format(filename), + ) + VALID_ATTR_TYPES = {dict, CfgNode} + _assert_with_logging( + type(module.cfg) in VALID_ATTR_TYPES, + "Imported module 'cfg' attr must be in {} but is {} instead".format( + VALID_ATTR_TYPES, type(module.cfg) + ), + ) + if type(module.cfg) is dict: + return CfgNode(module.cfg) + else: + return module.cfg + + +def _to_dict(cfg_node): + """Recursively convert all CfgNode objects to dict objects.""" + + def convert_to_dict(cfg_node, key_list): + if not isinstance(cfg_node, CfgNode): + _assert_with_logging( + _valid_type(cfg_node), + "Key {} with value {} is not a valid type; valid types: {}".format( + ".".join(key_list), type(cfg_node), _VALID_TYPES + ), + ) + return cfg_node + else: + cfg_dict = dict(cfg_node) + for k, v in cfg_dict.items(): + cfg_dict[k] = convert_to_dict(v, key_list + [k]) + return cfg_dict + + return convert_to_dict(cfg_node, []) + + +def _valid_type(value, allow_cfg_node=False): + return (type(value) in _VALID_TYPES) or (allow_cfg_node and type(value) == CfgNode) + + +def _merge_a_into_b(a, b, root, key_list): + """Merge config dictionary a into config dictionary b, clobbering the + options in b whenever they are also specified in a. 
+ """ + _assert_with_logging( + isinstance(a, CfgNode), + "`a` (cur type {}) must be an instance of {}".format(type(a), CfgNode), + ) + _assert_with_logging( + isinstance(b, CfgNode), + "`b` (cur type {}) must be an instance of {}".format(type(b), CfgNode), + ) + + for k, v_ in a.items(): + full_key = ".".join(key_list + [k]) + # a must specify keys that are in b + if k not in b: + if root.key_is_deprecated(full_key): + continue + elif root.key_is_renamed(full_key): + root.raise_key_rename_error(full_key) + else: + v = copy.deepcopy(v_) + v = _decode_cfg_value(v) + b.update({k: v}) + else: + v = copy.deepcopy(v_) + v = _decode_cfg_value(v) + v = _check_and_coerce_cfg_value_type(v, b[k], k, full_key) + + # Recursively merge dicts + if isinstance(v, CfgNode): + try: + _merge_a_into_b(v, b[k], root, key_list + [k]) + except BaseException: + raise + else: + b[k] = v + + +def _decode_cfg_value(v): + """Decodes a raw config value (e.g., from a yaml config files or command + line argument) into a Python object. + """ + # Configs parsed from raw yaml will contain dictionary keys that need to be + # converted to CfgNode objects + if isinstance(v, dict): + return CfgNode(v) + # All remaining processing is only applied to strings + if not isinstance(v, str): + return v + # Try to interpret `v` as a: + # string, number, tuple, list, dict, boolean, or None + try: + v = literal_eval(v) + # The following two excepts allow v to pass through when it represents a + # string. + # + # Longer explanation: + # The type of v is always a string (before calling literal_eval), but + # sometimes it *represents* a string and other times a data structure, like + # a list. In the case that v represents a string, what we got back from the + # yaml parser is 'foo' *without quotes* (so, not '"foo"'). literal_eval is + # ok with '"foo"', but will raise a ValueError if given 'foo'. In other + # cases, like paths (v = 'foo/bar' and not v = '"foo/bar"'), literal_eval + # will raise a SyntaxError. + except ValueError: + pass + except SyntaxError: + pass + return v + + +def _check_and_coerce_cfg_value_type(replacement, original, key, full_key): + """Checks that `replacement`, which is intended to replace `original` is of + the right type. The type is correct if it matches exactly or is one of a few + cases in which the type can be easily coerced. + """ + original_type = type(original) + replacement_type = type(replacement) + + # The types must match (with some exceptions) + if replacement_type == original_type: + return replacement + + # Cast replacement from from_type to to_type if the replacement and original + # types match from_type and to_type + def conditional_cast(from_type, to_type): + if replacement_type == from_type and original_type == to_type: + return True, to_type(replacement) + else: + return False, None + + # Conditionally casts + # list <-> tuple + casts = [(tuple, list), (list, tuple)] + # For py2: allow converting from str (bytes) to a unicode string + try: + casts.append((str, unicode)) # noqa: F821 + except Exception: + pass + + for (from_type, to_type) in casts: + converted, converted_value = conditional_cast(from_type, to_type) + if converted: + return converted_value + + raise ValueError( + "Type mismatch ({} vs. {}) with values ({} vs. 
{}) for config " + "key: {}".format( + original_type, replacement_type, original, replacement, full_key + ) + ) + + +def _assert_with_logging(cond, msg): + if not cond: + logger.debug(msg) + assert cond, msg + + +def _load_module_from_file(name, filename): + if _PY2: + module = imp.load_source(name, filename) + else: + spec = importlib.util.spec_from_file_location(name, filename) + module = importlib.util.module_from_spec(spec) + spec.loader.exec_module(module) + return module diff --git a/lib/datasets/__init__.py b/lib/datasets/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..89206de2afb3ca658f00cfd3c691a23d82aa7ce5 --- /dev/null +++ b/lib/datasets/__init__.py @@ -0,0 +1 @@ +from .make_dataset import make_data_loader diff --git a/lib/datasets/collate_batch.py b/lib/datasets/collate_batch.py new file mode 100644 index 0000000000000000000000000000000000000000..8a0927a1b66f6b6d778b986df4bc4b29a87f9dbb --- /dev/null +++ b/lib/datasets/collate_batch.py @@ -0,0 +1,15 @@ +from torch.utils.data.dataloader import default_collate +import torch +import numpy as np + + +_collators = { +} + + +def make_collator(cfg, is_train): + collator = cfg.train.collator if is_train else cfg.test.collator + if collator in _collators: + return _collators[collator] + else: + return default_collate diff --git a/lib/datasets/light_stage/monocular_dataset.py b/lib/datasets/light_stage/monocular_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..1be66a0e7bca8f4ea381a18cb7be7c47d79cb157 --- /dev/null +++ b/lib/datasets/light_stage/monocular_dataset.py @@ -0,0 +1,141 @@ +import torch.utils.data as data +from lib.utils import base_utils +from PIL import Image +import numpy as np +import json +import os +import imageio +import cv2 +from lib.config import cfg +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils +from plyfile import PlyData +from lib.utils import snapshot_data_utils as snapshot_dutils + + +class Dataset(data.Dataset): + def __init__(self, data_root, human, ann_file, split): + super(Dataset, self).__init__() + + self.data_root = data_root + self.human = human + self.split = split + + camera_path = os.path.join(self.data_root, 'camera.pkl') + self.cam = snapshot_dutils.get_camera(camera_path) + self.num_train_frame = cfg.num_train_frame + + params_path = ann_file + self.params = np.load(params_path, allow_pickle=True).item() + + self.nrays = cfg.N_rand + + def prepare_input(self, i): + # read xyz, normal, color from the npy file + vertices_path = os.path.join(self.data_root, 'vertices', + '{}.npy'.format(i)) + xyz = np.load(vertices_path).astype(np.float32) + nxyz = np.zeros_like(xyz).astype(np.float32) + + # obtain the original bounds for point sampling + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + min_xyz[1] -= 0.1 + max_xyz[1] += 0.1 + can_bounds = np.stack([min_xyz, max_xyz], axis=0) + + # transform smpl from the world coordinate to the smpl coordinate + Rh = self.params['pose'][i][:3] + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + Th = self.params['trans'][i].astype(np.float32) + xyz = np.dot(xyz - Th, R) + + # obtain the bounds for coord construction + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + min_xyz[1] -= 0.1 + max_xyz[1] += 0.1 + bounds = np.stack([min_xyz, max_xyz], axis=0) + + # construct the coordinate + dhw = xyz[:, [2, 1, 0]] + min_dhw = min_xyz[[2, 1, 0]] + max_dhw = max_xyz[[2, 1, 0]] + voxel_size = np.array(cfg.voxel_size) + coord = np.round((dhw - min_dhw) / 
voxel_size).astype(np.int32) + + # construct the output shape + out_sh = np.ceil((max_dhw - min_dhw) / voxel_size).astype(np.int32) + x = 32 + out_sh = (out_sh | (x - 1)) + 1 + + return coord, out_sh, can_bounds, bounds, Rh, Th + + def __getitem__(self, index): + img_path = os.path.join(self.data_root, 'image', + '{}.jpg'.format(index)) + img = imageio.imread(img_path).astype(np.float32) / 255. + msk_path = os.path.join(self.data_root, 'mask', '{}.png'.format(index)) + msk = imageio.imread(msk_path) + + frame_index = index + latent_index = index + + K = self.cam['K'] + D = self.cam['D'] + img = cv2.undistort(img, K, D) + msk = cv2.undistort(msk, K, D) + + R = self.cam['R'] + T = self.cam['T'][:, None] + RT = np.concatenate([R, T], axis=1).astype(np.float32) + + coord, out_sh, can_bounds, bounds, Rh, Th = self.prepare_input( + frame_index) + + # reduce the image resolution by ratio + H, W = int(img.shape[0] * cfg.ratio), int(img.shape[1] * cfg.ratio) + img = cv2.resize(img, (W, H), interpolation=cv2.INTER_AREA) + msk = cv2.resize(msk, (W, H), interpolation=cv2.INTER_NEAREST) + if cfg.mask_bkgd: + img[msk == 0] = 0 + if cfg.white_bkgd: + img[msk == 0] = 1 + K = K.copy().astype(np.float32) + K[:2] = K[:2] * cfg.ratio + + rgb, ray_o, ray_d, near, far, coord_, mask_at_box = if_nerf_dutils.sample_ray( + img, msk, K, R, T, can_bounds, self.nrays, self.split) + + ret = { + 'coord': coord, + 'out_sh': out_sh, + 'rgb': rgb, + 'ray_o': ray_o, + 'ray_d': ray_d, + 'near': near, + 'far': far, + 'mask_at_box': mask_at_box, + 'msk': msk + } + + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + meta = { + 'bounds': bounds, + 'R': R, + 'Th': Th, + 'latent_index': latent_index, + 'frame_index': frame_index, + 'view_index': 0 + } + ret.update(meta) + + Rh0 = self.params['pose'][index][:3] + R0 = cv2.Rodrigues(Rh0)[0].astype(np.float32) + Th0 = self.params['trans'][index].astype(np.float32) + meta = {'R0_snap': R0, 'Th0_snap': Th0, 'K': K, 'RT': RT} + ret.update(meta) + + return ret + + def __len__(self): + return self.num_train_frame diff --git a/lib/datasets/light_stage/monocular_demo_dataset.py b/lib/datasets/light_stage/monocular_demo_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..5aea0f13c69a978fc28664625f9004d521354015 --- /dev/null +++ b/lib/datasets/light_stage/monocular_demo_dataset.py @@ -0,0 +1,147 @@ +import torch.utils.data as data +from lib.utils import base_utils +from PIL import Image +import numpy as np +import json +import os +import imageio +import cv2 +from lib.config import cfg +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils +from plyfile import PlyData +from lib.utils import snapshot_data_utils as snapshot_dutils +from lib.utils import render_utils + + +class Dataset(data.Dataset): + def __init__(self, data_root, human, ann_file, split): + super(Dataset, self).__init__() + + self.data_root = data_root + self.split = split + + camera_path = os.path.join(self.data_root, 'camera.pkl') + self.cam = snapshot_dutils.get_camera(camera_path) + self.ts = np.arange(0, np.pi * 2, np.pi / 72) + self.nt = len(self.ts) + + params_path = ann_file + self.params = np.load(params_path, allow_pickle=True).item() + + self.nrays = cfg.N_rand + + def prepare_input(self, i, index): + # read xyz, normal, color from the ply file + vertices_path = os.path.join(self.data_root, 'vertices', + '{}.npy'.format(i)) + xyz = np.load(vertices_path).astype(np.float32) + nxyz = np.zeros_like(xyz).astype(np.float32) + + t = self.ts[index] + rot_ = np.array([[np.cos(t), 
-np.sin(t)], [np.sin(t), np.cos(t)]]) + rot = np.eye(3) + rot[[0, 0, 2, 2], [0, 2, 0, 2]] = rot_.ravel() + center = np.mean(xyz, axis=0) + xyz = xyz - center + xyz = np.dot(xyz, rot.T) + xyz = xyz + center + xyz = xyz.astype(np.float32) + + # obtain the original bounds for point sampling + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + min_xyz[1] -= 0.1 + max_xyz[1] += 0.1 + can_bounds = np.stack([min_xyz, max_xyz], axis=0) + + # transform smpl from the world coordinate to the smpl coordinate + Rh = self.params['pose'][i][:3] + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + Th = self.params['trans'][i].astype(np.float32) + R = np.dot(rot, R) + Rh = cv2.Rodrigues(R)[0] + Th = np.sum(rot * (Th - center), axis=1) + center + Th = Th.astype(np.float32) + xyz = np.dot(xyz - Th, R).astype(np.float32) + + # obtain the bounds for coord construction + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + min_xyz[1] -= 0.1 + max_xyz[1] += 0.1 + bounds = np.stack([min_xyz, max_xyz], axis=0) + + # construct the coordinate + dhw = xyz[:, [2, 1, 0]] + min_dhw = min_xyz[[2, 1, 0]] + max_dhw = max_xyz[[2, 1, 0]] + voxel_size = np.array(cfg.voxel_size) + coord = np.round((dhw - min_dhw) / voxel_size).astype(np.int32) + + # construct the output shape + out_sh = np.ceil((max_dhw - min_dhw) / voxel_size).astype(np.int32) + x = 32 + out_sh = (out_sh | (x - 1)) + 1 + + return coord, out_sh, can_bounds, bounds, Rh, Th + + def __getitem__(self, index): + K = self.cam['K'] + D = self.cam['D'] + + R = self.cam['R'] + T = self.cam['T'][:, None] + + i = 0 + frame_index = i + latent_index = i + view_index = index + coord, out_sh, can_bounds, bounds, Rh, Th = self.prepare_input( + i, index) + + msk_path = os.path.join(self.data_root, 'mask', '{}.png'.format(i)) + msk = imageio.imread(msk_path) + msk = cv2.undistort(msk, K, D) + + # reduce the image resolution by ratio + H, W = int(msk.shape[0] * cfg.ratio), int(msk.shape[1] * cfg.ratio) + msk = cv2.resize(msk, (W, H), interpolation=cv2.INTER_NEAREST) + K = K.copy().astype(np.float32) + K[:2] = K[:2] * cfg.ratio + + RT = np.concatenate([R, T], axis=1).astype(np.float32) + ray_o, ray_d, near, far, _, _, mask_at_box = render_utils.image_rays( + RT, K, can_bounds) + + ret = { + 'coord': coord, + 'out_sh': out_sh, + 'ray_o': ray_o, + 'ray_d': ray_d, + 'near': near, + 'far': far, + 'mask_at_box': mask_at_box, + 'msk': msk + } + + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + meta = { + 'bounds': bounds, + 'R': R, + 'Th': Th, + 'latent_index': latent_index, + 'frame_index': frame_index, + 'view_index': view_index + } + ret.update(meta) + + Rh0 = self.params['pose'][i][:3] + R0 = cv2.Rodrigues(Rh0)[0].astype(np.float32) + Th0 = self.params['trans'][i].astype(np.float32) + meta = {'R0_snap': R0, 'Th0_snap': Th0, 'K': K, 'RT': RT} + ret.update(meta) + + return ret + + def __len__(self): + return self.nt diff --git a/lib/datasets/light_stage/monocular_mesh_dataset.py b/lib/datasets/light_stage/monocular_mesh_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..e45f17e09b8641efa6ab697da5305c86ffafdd39 --- /dev/null +++ b/lib/datasets/light_stage/monocular_mesh_dataset.py @@ -0,0 +1,113 @@ +import torch.utils.data as data +from lib.utils import base_utils +from PIL import Image +import numpy as np +import json +import os +import imageio +import cv2 +from lib.config import cfg +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils +from plyfile import PlyData +from lib.utils import snapshot_data_utils as 
snapshot_dutils +from . import monocular_dataset + + +class Dataset(monocular_dataset.Dataset): + def __init__(self, data_root, human, ann_file, split): + super(Dataset, self).__init__(data_root, human, ann_file, split) + + self.data_root = data_root + self.split = split + + camera_path = os.path.join(self.data_root, 'camera.pkl') + self.cam = snapshot_dutils.get_camera(camera_path) + self.begin_ith_frame = cfg.begin_ith_frame + self.num_train_frame = cfg.num_train_frame + + self.ims = np.arange(self.num_train_frame) + self.num_cams = 1 + + params_path = ann_file + self.params = np.load(params_path, allow_pickle=True).item() + + self.nrays = cfg.N_rand + + def prepare_inside_pts(self, pts, msk, K, R, T): + sh = pts.shape + pts3d = pts.reshape(-1, 3) + RT = np.concatenate([R, T], axis=1) + pts2d = base_utils.project(pts3d, K, RT) + + H, W = msk.shape + pts2d = np.round(pts2d).astype(np.int32) + pts2d[:, 0] = np.clip(pts2d[:, 0], 0, W - 1) + pts2d[:, 1] = np.clip(pts2d[:, 1], 0, H - 1) + inside = msk[pts2d[:, 1], pts2d[:, 0]] + inside = inside.reshape(*sh[:-1]) + + return inside + + def __getitem__(self, index): + latent_index = index + index = index + self.begin_ith_frame + frame_index = index + + img_path = os.path.join(self.data_root, 'image', + '{}.jpg'.format(index)) + img = imageio.imread(img_path).astype(np.float32) / 255. + msk_path = os.path.join(self.data_root, 'mask', '{}.png'.format(index)) + msk = imageio.imread(msk_path) + + K = self.cam['K'] + D = self.cam['D'] + img = cv2.undistort(img, K, D) + msk = cv2.undistort(msk, K, D) + + R = self.cam['R'] + T = self.cam['T'][:, None] + + coord, out_sh, can_bounds, bounds, Rh, Th = self.prepare_input( + index) + + # reduce the image resolution by ratio + H, W = int(img.shape[0] * cfg.ratio), int(img.shape[1] * cfg.ratio) + img = cv2.resize(img, (W, H), interpolation=cv2.INTER_AREA) + msk = cv2.resize(msk, (W, H), interpolation=cv2.INTER_NEAREST) + img[msk == 0] = 0 + K = K.copy() + K[:2] = K[:2] * cfg.ratio + + voxel_size = cfg.voxel_size + x = np.arange(can_bounds[0, 0], can_bounds[1, 0] + voxel_size[0], + voxel_size[0]) + y = np.arange(can_bounds[0, 1], can_bounds[1, 1] + voxel_size[1], + voxel_size[1]) + z = np.arange(can_bounds[0, 2], can_bounds[1, 2] + voxel_size[2], + voxel_size[2]) + pts = np.stack(np.meshgrid(x, y, z, indexing='ij'), axis=-1) + pts = pts.astype(np.float32) + + inside = self.prepare_inside_pts(pts, msk, K, R, T) + + ret = { + 'coord': coord, + 'out_sh': out_sh, + 'pts': pts, + 'inside': inside + } + + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + meta = { + 'bounds': bounds, + 'R': R, + 'Th': Th, + 'latent_index': latent_index, + 'frame_index': frame_index + } + ret.update(meta) + + return ret + + def __len__(self): + return self.num_train_frame diff --git a/lib/datasets/light_stage/multi_view_dataset.py b/lib/datasets/light_stage/multi_view_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..ba050237f534328623b59185a3d750521441d797 --- /dev/null +++ b/lib/datasets/light_stage/multi_view_dataset.py @@ -0,0 +1,185 @@ +import torch.utils.data as data +from lib.utils import base_utils +from PIL import Image +import numpy as np +import json +import os +import imageio +import cv2 +from lib.config import cfg +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils +from plyfile import PlyData + + +class Dataset(data.Dataset): + def __init__(self, data_root, human, ann_file, split): + super(Dataset, self).__init__() + + self.data_root = data_root + self.human = human + self.split 
= split + + annots = np.load(ann_file, allow_pickle=True).item() + self.cams = annots['cams'] + + num_cams = len(self.cams['K']) + test_view = [i for i in range(num_cams) if i not in cfg.training_view] + view = cfg.training_view if split == 'train' else test_view + if len(view) == 0: + view = [0] + + # prepare input images + i = 0 + i = i + cfg.begin_ith_frame + i_intv = cfg.frame_interval + ni = cfg.num_train_frame + if cfg.test_novel_pose: + i = (i + cfg.num_train_frame) * i_intv + ni = cfg.num_novel_pose_frame + if self.human == 'CoreView_390': + i = 0 + + self.ims = np.array([ + np.array(ims_data['ims'])[view] + for ims_data in annots['ims'][i:i + ni * i_intv][::i_intv] + ]).ravel() + self.cam_inds = np.array([ + np.arange(len(ims_data['ims']))[view] + for ims_data in annots['ims'][i:i + ni * i_intv][::i_intv] + ]).ravel() + self.num_cams = len(view) + + self.nrays = cfg.N_rand + + def get_mask(self, index): + msk_path = os.path.join(self.data_root, 'mask_cihp', + self.ims[index])[:-4] + '.png' + msk_cihp = imageio.imread(msk_path) + msk = (msk_cihp != 0).astype(np.uint8) + + border = 5 + kernel = np.ones((border, border), np.uint8) + msk_erode = cv2.erode(msk.copy(), kernel) + msk_dilate = cv2.dilate(msk.copy(), kernel) + msk[(msk_dilate - msk_erode) == 1] = 100 + + return msk + + def prepare_input(self, i): + # read xyz, normal, color from the ply file + vertices_path = os.path.join(self.data_root, cfg.vertices, + '{}.npy'.format(i)) + xyz = np.load(vertices_path).astype(np.float32) + nxyz = np.zeros_like(xyz).astype(np.float32) + + # obtain the original bounds for point sampling + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + can_bounds = np.stack([min_xyz, max_xyz], axis=0) + + # transform smpl from the world coordinate to the smpl coordinate + params_path = os.path.join(self.data_root, cfg.params, + '{}.npy'.format(i)) + params = np.load(params_path, allow_pickle=True).item() + Rh = params['Rh'] + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + Th = params['Th'].astype(np.float32) + xyz = np.dot(xyz - Th, R) + + # obtain the bounds for coord construction + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + bounds = np.stack([min_xyz, max_xyz], axis=0) + + # construct the coordinate + dhw = xyz[:, [2, 1, 0]] + min_dhw = min_xyz[[2, 1, 0]] + max_dhw = max_xyz[[2, 1, 0]] + voxel_size = np.array(cfg.voxel_size) + coord = np.round((dhw - min_dhw) / voxel_size).astype(np.int32) + + # construct the output shape + out_sh = np.ceil((max_dhw - min_dhw) / voxel_size).astype(np.int32) + x = 32 + out_sh = (out_sh | (x - 1)) + 1 + + return coord, out_sh, can_bounds, bounds, Rh, Th + + def __getitem__(self, index): + img_path = os.path.join(self.data_root, self.ims[index]) + img = imageio.imread(img_path).astype(np.float32) / 255. + img = cv2.resize(img, (cfg.W, cfg.H)) + msk = self.get_mask(index) + + cam_ind = self.cam_inds[index] + K = np.array(self.cams['K'][cam_ind]) + D = np.array(self.cams['D'][cam_ind]) + img = cv2.undistort(img, K, D) + msk = cv2.undistort(msk, K, D) + + R = np.array(self.cams['R'][cam_ind]) + T = np.array(self.cams['T'][cam_ind]) / 1000. 
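+        # the annotation file stores camera translations in millimeters;
+        # dividing by 1000 converts them to meters so they match the SMPL
+        # vertices loaded in prepare_input (unit assumption based on the /1000.)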
+ + # reduce the image resolution by ratio + H, W = int(img.shape[0] * cfg.ratio), int(img.shape[1] * cfg.ratio) + img = cv2.resize(img, (W, H), interpolation=cv2.INTER_AREA) + msk = cv2.resize(msk, (W, H), interpolation=cv2.INTER_NEAREST) + if cfg.mask_bkgd: + img[msk == 0] = 0 + if cfg.white_bkgd: + img[msk == 0] = 1 + K[:2] = K[:2] * cfg.ratio + + if self.human in ['CoreView_313', 'CoreView_315']: + i = int(os.path.basename(img_path).split('_')[4]) + frame_index = i - 1 + else: + i = int(os.path.basename(img_path)[:-4]) + frame_index = i + coord, out_sh, can_bounds, bounds, Rh, Th = self.prepare_input( + i) + + rgb, ray_o, ray_d, near, far, coord_, mask_at_box = if_nerf_dutils.sample_ray_h36m( + img, msk, K, R, T, can_bounds, self.nrays, self.split) + + ret = { + 'coord': coord, + 'out_sh': out_sh, + 'rgb': rgb, + 'ray_o': ray_o, + 'ray_d': ray_d, + 'near': near, + 'far': far, + 'mask_at_box': mask_at_box + } + + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + latent_index = frame_index - cfg.begin_ith_frame + if cfg.test_novel_pose: + latent_index = cfg.num_train_frame - 1 + meta = { + 'bounds': bounds, + 'R': R, + 'Th': Th, + 'latent_index': latent_index, + 'frame_index': frame_index, + 'cam_ind': cam_ind + } + ret.update(meta) + + return ret + + def __len__(self): + return len(self.ims) diff --git a/lib/datasets/light_stage/multi_view_demo_dataset.py b/lib/datasets/light_stage/multi_view_demo_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..60367f2f77198e80354c848b53fcb23cd0dac5d3 --- /dev/null +++ b/lib/datasets/light_stage/multi_view_demo_dataset.py @@ -0,0 +1,182 @@ +import torch.utils.data as data +from lib.utils import base_utils +from PIL import Image +import numpy as np +import json +import os +import imageio +import cv2 +from lib.config import cfg +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils +from plyfile import PlyData +from lib.utils import render_utils + + +class Dataset(data.Dataset): + def __init__(self, data_root, human, ann_file, split): + super(Dataset, self).__init__() + + self.data_root = data_root + self.human = human + self.split = split + + annots = np.load(ann_file, allow_pickle=True).item() + self.cams = annots['cams'] + + K, RT = render_utils.load_cam(ann_file) + render_w2c = render_utils.gen_path(RT) + + i = 0 + i = i + cfg.begin_ith_frame + ni = cfg.num_train_frame + i_intv = cfg.frame_interval + self.ims = np.array([ + np.array(ims_data['ims'])[cfg.training_view] + for ims_data in annots['ims'][i:i + ni * i_intv] + ]) + + self.K = K[0] + self.render_w2c = render_w2c + + self.Ks = np.array(K)[cfg.training_view].astype(np.float32) + self.RT = np.array(RT)[cfg.training_view].astype(np.float32) + self.center_rayd = [ + render_utils.get_center_rayd(K_, RT_) + for K_, RT_ in zip(self.Ks, self.RT) + ] + + self.Ds = np.array(self.cams['D'])[cfg.training_view].astype( + np.float32) + + self.nrays = cfg.N_rand + + def prepare_input(self, i): + if self.human in ['CoreView_313', 'CoreView_315']: + i = i + 1 + + # read xyz, normal, color from the ply file + vertices_path = os.path.join(self.data_root, cfg.vertices, + '{}.npy'.format(i)) + xyz = np.load(vertices_path).astype(np.float32) + nxyz = np.zeros_like(xyz).astype(np.float32) + + # obtain the origin bounds for point sampling + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + can_bounds = np.stack([min_xyz, max_xyz], axis=0) + + # transform smpl 
from the world coordinate to the smpl coordinate + params_path = os.path.join(self.data_root, cfg.params, + '{}.npy'.format(i)) + params = np.load(params_path, allow_pickle=True).item() + Rh = params['Rh'] + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + Th = params['Th'].astype(np.float32) + xyz = np.dot(xyz - Th, R) + + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + bounds = np.stack([min_xyz, max_xyz], axis=0) + + # construct the coordinate + dhw = xyz[:, [2, 1, 0]] + min_dhw = min_xyz[[2, 1, 0]] + max_dhw = max_xyz[[2, 1, 0]] + voxel_size = np.array(cfg.voxel_size) + coord = np.round((dhw - min_dhw) / voxel_size).astype(np.int32) + + # construct the output shape + out_sh = np.ceil((max_dhw - min_dhw) / voxel_size).astype(np.int32) + x = 32 + out_sh = (out_sh | (x - 1)) + 1 + + return coord, out_sh, can_bounds, bounds, Rh, Th + + def get_mask(self, i): + ims = self.ims[i] + msks = [] + + for nv in range(len(ims)): + im = ims[nv] + + msk_path = os.path.join(self.data_root, 'mask_cihp', + im)[:-4] + '.png' + msk_cihp = imageio.imread(msk_path) + msk = (msk_cihp != 0).astype(np.uint8) + + K = self.Ks[nv].copy() + K[:2] = K[:2] / cfg.ratio + msk = cv2.undistort(msk, K, self.Ds[nv]) + + border = 5 + kernel = np.ones((border, border), np.uint8) + msk = cv2.dilate(msk.copy(), kernel) + + msks.append(msk) + + return msks + + def __getitem__(self, index): + i = cfg.ith_frame + latent_index = i + frame_index = i + cfg.begin_ith_frame + view_index = index + + coord, out_sh, can_bounds, bounds, Rh, Th = self.prepare_input( + frame_index) + + msks = self.get_mask(i) + + # reduce the image resolution by ratio + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + msks = [ + cv2.resize(msk, (W, H), interpolation=cv2.INTER_NEAREST) + for msk in msks + ] + msks = np.array(msks) + K = self.K + + ray_o, ray_d, near, far, center, scale, mask_at_box = render_utils.image_rays( + self.render_w2c[index], K, can_bounds) + + ret = { + 'coord': coord, + 'out_sh': out_sh, + 'ray_o': ray_o, + 'ray_d': ray_d, + 'near': near, + 'far': far, + 'mask_at_box': mask_at_box + } + + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + latent_index = min(latent_index, cfg.num_train_frame - 1) + meta = { + 'bounds': bounds, + 'R': R, + 'Th': Th, + 'latent_index': latent_index, + 'frame_index': frame_index, + 'view_index': view_index, + } + ret.update(meta) + + meta = {'msks': msks, 'Ks': self.Ks, 'RT': self.RT} + ret.update(meta) + + return ret + + def __len__(self): + return len(self.render_w2c) diff --git a/lib/datasets/light_stage/multi_view_mesh_dataset.py b/lib/datasets/light_stage/multi_view_mesh_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..740782101a0670c5e5450bc42e9cd0f513f1c885 --- /dev/null +++ b/lib/datasets/light_stage/multi_view_mesh_dataset.py @@ -0,0 +1,184 @@ +import torch.utils.data as data +from lib.utils import base_utils +from PIL import Image +import numpy as np +import json +import os +import imageio +import cv2 +from lib.config import cfg +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils +from plyfile import PlyData + + +class Dataset(data.Dataset): + def __init__(self, data_root, human, ann_file, split): + super(Dataset, self).__init__() + + self.data_root = data_root + self.human = human + self.split = split + + annots = np.load(ann_file, allow_pickle=True).item() + self.cams = annots['cams'] + + i = 0 + i = i + cfg.begin_ith_frame + 
ni = cfg.num_train_frame + if cfg.num_render_frame > 0: + ni = cfg.num_render_frame + self.ims = np.array([ + np.array(ims_data['ims'])[cfg.training_view] + for ims_data in annots['ims'][i:i + ni] + ]) + self.num_cams = 1 + + self.Ks = np.array(self.cams['K'])[cfg.training_view].astype( + np.float32) + self.Rs = np.array(self.cams['R'])[cfg.training_view].astype( + np.float32) + self.Ts = np.array(self.cams['T'])[cfg.training_view].astype( + np.float32) / 1000. + self.Ds = np.array(self.cams['D'])[cfg.training_view].astype( + np.float32) + + self.ni = ni + + def prepare_input(self, i): + if self.human in ['CoreView_313', 'CoreView_315']: + i = i + 1 + + # read xyz, normal, color from the ply file + vertices_path = os.path.join(self.data_root, cfg.vertices, + '{}.npy'.format(i)) + xyz = np.load(vertices_path).astype(np.float32) + nxyz = np.zeros_like(xyz).astype(np.float32) + + # obtain the original bounds for point sampling + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + can_bounds = np.stack([min_xyz, max_xyz], axis=0) + + # transform smpl from the world coordinate to the smpl coordinate + params_path = os.path.join(self.data_root, cfg.params, + '{}.npy'.format(i)) + params = np.load(params_path, allow_pickle=True).item() + Rh = params['Rh'] + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + Th = params['Th'].astype(np.float32) + xyz = np.dot(xyz - Th, R) + + # obtain the bounds for coord construction + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if self.human in ['CoreView_362']: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + bounds = np.stack([min_xyz, max_xyz], axis=0) + + # construct the coordinate + dhw = xyz[:, [2, 1, 0]] + min_dhw = min_xyz[[2, 1, 0]] + max_dhw = max_xyz[[2, 1, 0]] + voxel_size = np.array(cfg.voxel_size) + coord = np.round((dhw - min_dhw) / voxel_size).astype(np.int32) + + # construct the output shape + out_sh = np.ceil((max_dhw - min_dhw) / voxel_size).astype(np.int32) + x = 32 + out_sh = (out_sh | (x - 1)) + 1 + + return coord, out_sh, can_bounds, bounds, Rh, Th + + def get_mask(self, i, nv): + im = self.ims[i, nv] + + msk_path = os.path.join(self.data_root, 'mask_cihp', im)[:-4] + '.png' + msk_cihp = imageio.imread(msk_path) + msk = (msk_cihp != 0).astype(np.uint8) + + msk = cv2.undistort(msk, self.Ks[nv], self.Ds[nv]) + + border = 5 + kernel = np.ones((border, border), np.uint8) + msk = cv2.dilate(msk.copy(), kernel) + + return msk + + def prepare_inside_pts(self, pts, i): + sh = pts.shape + pts3d = pts.reshape(-1, 3) + + inside = np.ones([len(pts3d)]).astype(np.uint8) + for nv in range(self.ims.shape[1]): + ind = inside == 1 + pts3d_ = pts3d[ind] + + RT = np.concatenate([self.Rs[nv], self.Ts[nv]], axis=1) + pts2d = base_utils.project(pts3d_, self.Ks[nv], RT) + + msk = self.get_mask(i, nv) + H, W = msk.shape + pts2d = np.round(pts2d).astype(np.int32) + pts2d[:, 0] = np.clip(pts2d[:, 0], 0, W - 1) + pts2d[:, 1] = np.clip(pts2d[:, 1], 0, H - 1) + msk_ = msk[pts2d[:, 1], pts2d[:, 0]] + + inside[ind] = msk_ + + inside = inside.reshape(*sh[:-1]) + + return inside + + def __getitem__(self, index): + i = index + latent_index = index + frame_index = index + cfg.begin_ith_frame + + coord, out_sh, can_bounds, bounds, Rh, Th = self.prepare_input( + frame_index) + + voxel_size = cfg.voxel_size + x = np.arange(can_bounds[0, 0], can_bounds[1, 0] + voxel_size[0], + voxel_size[0]) + y = 
np.arange(can_bounds[0, 1], can_bounds[1, 1] + voxel_size[1], + voxel_size[1]) + z = np.arange(can_bounds[0, 2], can_bounds[1, 2] + voxel_size[2], + voxel_size[2]) + pts = np.stack(np.meshgrid(x, y, z, indexing='ij'), axis=-1) + pts = pts.astype(np.float32) + + inside = self.prepare_inside_pts(pts, i) + + ret = { + 'coord': coord, + 'out_sh': out_sh, + 'pts': pts, + 'inside': inside + } + + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + latent_index = min(latent_index, cfg.num_train_frame - 1) + meta = { + 'wbounds': can_bounds, + 'bounds': bounds, + 'R': R, + 'Th': Th, + 'latent_index': latent_index, + 'frame_index': frame_index + } + ret.update(meta) + + return ret + + def __len__(self): + return self.ni diff --git a/lib/datasets/light_stage/multi_view_perform_dataset.py b/lib/datasets/light_stage/multi_view_perform_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..867068d0298a6692b430cbc5487ce4213608783c --- /dev/null +++ b/lib/datasets/light_stage/multi_view_perform_dataset.py @@ -0,0 +1,179 @@ +import torch.utils.data as data +from lib.utils import base_utils +from PIL import Image +import numpy as np +import json +import os +import imageio +import cv2 +from lib.config import cfg +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils +from plyfile import PlyData +from lib.utils import render_utils + + +class Dataset(data.Dataset): + def __init__(self, data_root, human, ann_file, split): + super(Dataset, self).__init__() + + self.data_root = data_root + self.human = human + self.split = split + + annots = np.load(ann_file, allow_pickle=True).item() + self.cams = annots['cams'] + + K, RT = render_utils.load_cam(ann_file) + render_w2c = render_utils.gen_path(RT) + + i = 0 + i = i + cfg.begin_ith_frame + ni = cfg.num_train_frame + if cfg.num_render_frame > 0: + ni = cfg.num_render_frame + i_intv = cfg.frame_interval + self.ims = np.array([ + np.array(ims_data['ims'])[cfg.training_view] + for ims_data in annots['ims'][i:i + ni * i_intv] + ]) + + self.K = K[0] + self.render_w2c = render_w2c + + self.Ks = np.array(K)[cfg.training_view].astype(np.float32) + self.RT = np.array(RT)[cfg.training_view].astype(np.float32) + + self.Ds = np.array(self.cams['D'])[cfg.training_view].astype( + np.float32) + + self.ni = ni * cfg.frame_interval + + def prepare_input(self, i): + if self.human in ['CoreView_313', 'CoreView_315']: + i = i + 1 + + # read xyz, normal, color from the ply file + vertices_path = os.path.join(self.data_root, cfg.vertices, + '{}.npy'.format(i)) + xyz = np.load(vertices_path).astype(np.float32) + nxyz = np.zeros_like(xyz).astype(np.float32) + + # obtain the origin bounds for point sampling + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + can_bounds = np.stack([min_xyz, max_xyz], axis=0) + + # transform smpl from the world coordinate to the smpl coordinate + params_path = os.path.join(self.data_root, cfg.params, + '{}.npy'.format(i)) + params = np.load(params_path, allow_pickle=True).item() + Rh = params['Rh'] + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + Th = params['Th'].astype(np.float32) + xyz = np.dot(xyz - Th, R) + + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + bounds = np.stack([min_xyz, max_xyz], axis=0) + + # construct the coordinate + dhw = xyz[:, [2, 1, 0]] + min_dhw = min_xyz[[2, 1, 0]] + 
max_dhw = max_xyz[[2, 1, 0]] + voxel_size = np.array(cfg.voxel_size) + coord = np.round((dhw - min_dhw) / voxel_size).astype(np.int32) + + # construct the output shape + out_sh = np.ceil((max_dhw - min_dhw) / voxel_size).astype(np.int32) + x = 32 + out_sh = (out_sh | (x - 1)) + 1 + + return coord, out_sh, can_bounds, bounds, Rh, Th + + def get_mask(self, i): + ims = self.ims[i] + msks = [] + + for nv in range(len(ims)): + im = ims[nv] + + msk_path = os.path.join(self.data_root, 'mask_cihp', + im)[:-4] + '.png' + msk_cihp = imageio.imread(msk_path) + msk = (msk_cihp != 0).astype(np.uint8) + + K = self.Ks[nv].copy() + K[:2] = K[:2] / cfg.ratio + msk = cv2.undistort(msk, K, self.Ds[nv]) + + border = 5 + kernel = np.ones((border, border), np.uint8) + msk = cv2.dilate(msk.copy(), kernel) + + msks.append(msk) + + return msks + + def __getitem__(self, index): + frame_index = index + cfg.begin_ith_frame + latent_index = index + coord, out_sh, can_bounds, bounds, Rh, Th = self.prepare_input( + frame_index) + + msks = self.get_mask(index) + + # reduce the image resolution by ratio + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + msks = [ + cv2.resize(msk, (W, H), interpolation=cv2.INTER_NEAREST) + for msk in msks + ] + msks = np.array(msks) + K = self.K + + cam_ind = index % len(self.render_w2c) + # cam_ind = 50 + ray_o, ray_d, near, far, center, scale, mask_at_box = render_utils.image_rays( + self.render_w2c[cam_ind], K, can_bounds) + + ret = { + 'coord': coord, + 'out_sh': out_sh, + 'ray_o': ray_o, + 'ray_d': ray_d, + 'near': near, + 'far': far, + 'mask_at_box': mask_at_box + } + + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + latent_index = min(latent_index, cfg.num_train_frame - 1) + meta = { + 'bounds': bounds, + 'R': R, + 'Th': Th, + 'latent_index': latent_index, + 'frame_index': frame_index, + 'view_index': cam_ind + } + ret.update(meta) + + meta = {'msks': msks, 'Ks': self.Ks, 'RT': self.RT} + ret.update(meta) + + return ret + + def __len__(self): + return self.ni diff --git a/lib/datasets/light_stage/rotate_smpl_dataset.py b/lib/datasets/light_stage/rotate_smpl_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..a81329118c5889fe0064d283ee6da549d3d2df6f --- /dev/null +++ b/lib/datasets/light_stage/rotate_smpl_dataset.py @@ -0,0 +1,198 @@ +import torch.utils.data as data +from lib.utils import base_utils +from PIL import Image +import numpy as np +import json +import os +import imageio +import cv2 +from lib.config import cfg +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils +from plyfile import PlyData +from lib.utils import render_utils + + +class Dataset(data.Dataset): + def __init__(self, data_root, human, ann_file, split): + super(Dataset, self).__init__() + + self.data_root = data_root + self.human = human + self.split = split + + annots = np.load(ann_file, allow_pickle=True).item() + self.cams = annots['cams'] + + K, RT = render_utils.load_cam(ann_file) + render_w2c = RT + + self.ts = np.arange(0, np.pi * 2, np.pi / 72) + self.nt = len(self.ts) + + i = 0 + i = i + cfg.begin_ith_frame + ni = cfg.num_train_frame + i_intv = cfg.frame_interval + self.ims = np.array([ + np.array(ims_data['ims'])[cfg.training_view] + for ims_data in annots['ims'][i:i + ni * i_intv] + ]) + + self.K = K[0] + self.render_w2c = render_w2c + img_root = 'data/render/{}'.format(cfg.exp_name) + # base_utils.write_K_pose_inf(self.K, self.render_w2c, img_root) + + self.Ks = np.array(K)[cfg.training_view].astype(np.float32) + self.RT = 
np.array(RT)[cfg.training_view].astype(np.float32) + self.center_rayd = [ + render_utils.get_center_rayd(K_, RT_) + for K_, RT_ in zip(self.Ks, self.RT) + ] + + self.Ds = np.array(self.cams['D'])[cfg.training_view].astype( + np.float32) + + self.nrays = cfg.N_rand + + def prepare_input(self, i, index): + if self.human in ['CoreView_313', 'CoreView_315']: + i = i + 1 + + # read xyz, normal, color from the ply file + vertices_path = os.path.join(self.data_root, cfg.vertices, + '{}.npy'.format(i)) + xyz = np.load(vertices_path).astype(np.float32) + nxyz = np.zeros_like(xyz).astype(np.float32) + + # rotate smpl + t = self.ts[index] + rot_ = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]) + rot = np.eye(3) + rot[[0, 0, 1, 1], [0, 1, 0, 1]] = rot_.ravel() + center = np.mean(xyz, axis=0) + xyz = xyz - center + xyz = np.dot(xyz, rot.T) + xyz = xyz + center + xyz = xyz.astype(np.float32) + + # obtain the origin bounds for point sampling + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + can_bounds = np.stack([min_xyz, max_xyz], axis=0) + + # transform smpl from the world coordinate to the smpl coordinate + params_path = os.path.join(self.data_root, cfg.params, + '{}.npy'.format(i)) + params = np.load(params_path, allow_pickle=True).item() + Rh = params['Rh'] + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + R = np.dot(rot, R) + Rh = cv2.Rodrigues(R)[0] + Th = params['Th'].astype(np.float32) + Th = np.sum(rot * (Th - center), axis=1) + center + Th = Th.astype(np.float32) + xyz = np.dot(xyz - Th, R).astype(np.float32) + + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + if cfg.big_box: + min_xyz -= 0.05 + max_xyz += 0.05 + else: + min_xyz[2] -= 0.05 + max_xyz[2] += 0.05 + bounds = np.stack([min_xyz, max_xyz], axis=0) + + # construct the coordinate + dhw = xyz[:, [2, 1, 0]] + min_dhw = min_xyz[[2, 1, 0]] + max_dhw = max_xyz[[2, 1, 0]] + voxel_size = np.array(cfg.voxel_size) + coord = np.round((dhw - min_dhw) / voxel_size).astype(np.int32) + + # construct the output shape + out_sh = np.ceil((max_dhw - min_dhw) / voxel_size).astype(np.int32) + x = 32 + out_sh = (out_sh | (x - 1)) + 1 + + return coord, out_sh, can_bounds, bounds, Rh, Th + + def get_mask(self, i): + ims = self.ims[i] + msks = [] + + for nv in range(len(ims)): + im = ims[nv] + + msk_path = os.path.join(self.data_root, 'mask', im)[:-4] + '.png' + msk = imageio.imread(msk_path) + msk = (msk != 0).astype(np.uint8) + + msk_path = os.path.join(self.data_root, 'mask_cihp', + im)[:-4] + '.png' + msk_cihp = imageio.imread(msk_path) + msk_cihp = (msk_cihp != 0).astype(np.uint8) + + msk = (msk | msk_cihp).astype(np.uint8) + + K = self.Ks[nv].copy() + K[:2] = K[:2] / cfg.ratio + msk = cv2.undistort(msk, K, self.Ds[nv]) + + border = 5 + kernel = np.ones((border, border), np.uint8) + msk = cv2.dilate(msk.copy(), kernel) + + msks.append(msk) + + return msks + + def __getitem__(self, index): + i = cfg.ith_frame + latent_index = i + frame_index = i + cfg.begin_ith_frame + view_index = index + + coord, out_sh, can_bounds, bounds, Rh, Th = self.prepare_input( + frame_index, view_index) + + # reduce the image resolution by ratio + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + K = self.K + + ray_o, ray_d, near, far, center, scale, mask_at_box = render_utils.image_rays( + self.render_w2c[0], K, can_bounds) + + ret = { + 'coord': coord, + 'out_sh': out_sh, + 'ray_o': ray_o, + 'ray_d': ray_d, + 'near': near, + 
'far': far, + 'mask_at_box': mask_at_box + } + + R = cv2.Rodrigues(Rh)[0].astype(np.float32) + latent_index = min(latent_index, cfg.num_train_frame - 1) + meta = { + 'bounds': bounds, + 'R': R, + 'Th': Th, + 'latent_index': latent_index, + 'frame_index': frame_index, + 'view_index': view_index + } + ret.update(meta) + + return ret + + def __len__(self): + return self.nt diff --git a/lib/datasets/make_dataset.py b/lib/datasets/make_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..79672d934447c3aacbab79b290891102557d9fc0 --- /dev/null +++ b/lib/datasets/make_dataset.py @@ -0,0 +1,96 @@ +from .transforms import make_transforms +from . import samplers +import torch +import torch.utils.data +import imp +import os +from .collate_batch import make_collator +import numpy as np +import time +from lib.config.config import cfg + + +def _dataset_factory(is_train): + if is_train: + module = cfg.train_dataset_module + path = cfg.train_dataset_path + args = cfg.train_dataset + else: + module = cfg.test_dataset_module + path = cfg.test_dataset_path + args = cfg.test_dataset + dataset = imp.load_source(module, path).Dataset(**args) + return dataset + + +def make_dataset(cfg, dataset_name, transforms, is_train=True): + dataset = _dataset_factory(is_train) + return dataset + + +def make_data_sampler(dataset, shuffle, is_distributed, is_train): + if not is_train and cfg.test.sampler == 'FrameSampler': + sampler = samplers.FrameSampler(dataset) + return sampler + if is_distributed: + return samplers.DistributedSampler(dataset, shuffle=shuffle) + if shuffle: + sampler = torch.utils.data.sampler.RandomSampler(dataset) + else: + sampler = torch.utils.data.sampler.SequentialSampler(dataset) + return sampler + + +def make_batch_data_sampler(cfg, sampler, batch_size, drop_last, max_iter, + is_train): + if is_train: + batch_sampler = cfg.train.batch_sampler + sampler_meta = cfg.train.sampler_meta + else: + batch_sampler = cfg.test.batch_sampler + sampler_meta = cfg.test.sampler_meta + + if batch_sampler == 'default': + batch_sampler = torch.utils.data.sampler.BatchSampler( + sampler, batch_size, drop_last) + elif batch_sampler == 'image_size': + batch_sampler = samplers.ImageSizeBatchSampler(sampler, batch_size, + drop_last, sampler_meta) + + if max_iter != -1: + batch_sampler = samplers.IterationBasedBatchSampler( + batch_sampler, max_iter) + return batch_sampler + + +def worker_init_fn(worker_id): + np.random.seed(worker_id + (int(round(time.time() * 1000) % (2**16)))) + + +def make_data_loader(cfg, is_train=True, is_distributed=False, max_iter=-1): + if is_train: + batch_size = cfg.train.batch_size + # shuffle = True + shuffle = cfg.train.shuffle + drop_last = False + else: + batch_size = cfg.test.batch_size + shuffle = True if is_distributed else False + drop_last = False + + dataset_name = cfg.train.dataset if is_train else cfg.test.dataset + + transforms = make_transforms(cfg, is_train) + dataset = make_dataset(cfg, dataset_name, transforms, is_train) + sampler = make_data_sampler(dataset, shuffle, is_distributed, is_train) + batch_sampler = make_batch_data_sampler(cfg, sampler, batch_size, + drop_last, max_iter, is_train) + num_workers = cfg.train.num_workers + collator = make_collator(cfg, is_train) + data_loader = torch.utils.data.DataLoader(dataset, + batch_sampler=batch_sampler, + num_workers=num_workers, + collate_fn=collator, + worker_init_fn=worker_init_fn) + + return data_loader diff --git a/lib/datasets/samplers.py b/lib/datasets/samplers.py new file mode 100644 index 
0000000000000000000000000000000000000000..06da62a68f35b50d5e7ae1196156e1c12a224f97 --- /dev/null +++ b/lib/datasets/samplers.py @@ -0,0 +1,148 @@ +from torch.utils.data.sampler import Sampler +from torch.utils.data.sampler import BatchSampler +import numpy as np +import torch +import math +import torch.distributed as dist +from lib.config import cfg + + +class ImageSizeBatchSampler(Sampler): + def __init__(self, sampler, batch_size, drop_last, sampler_meta): + self.sampler = sampler + self.batch_size = batch_size + self.drop_last = drop_last + self.strategy = sampler_meta.strategy + self.hmin, self.wmin = sampler_meta.min_hw + self.hmax, self.wmax = sampler_meta.max_hw + self.divisor = 32 + if cfg.fix_random: + np.random.seed(0) + + def generate_height_width(self): + if self.strategy == 'origin': + return -1, -1 + h = np.random.randint(self.hmin, self.hmax + 1) + w = np.random.randint(self.wmin, self.wmax + 1) + h = (h | (self.divisor - 1)) + 1 + w = (w | (self.divisor - 1)) + 1 + return h, w + + def __iter__(self): + batch = [] + h, w = self.generate_height_width() + for idx in self.sampler: + batch.append((idx, h, w)) + if len(batch) == self.batch_size: + h, w = self.generate_height_width() + yield batch + batch = [] + if len(batch) > 0 and not self.drop_last: + yield batch + + def __len__(self): + if self.drop_last: + return len(self.sampler) // self.batch_size + else: + return (len(self.sampler) + self.batch_size - 1) // self.batch_size + + +class IterationBasedBatchSampler(BatchSampler): + """ + Wraps a BatchSampler, resampling from it until + a specified number of iterations have been sampled + """ + + def __init__(self, batch_sampler, num_iterations, start_iter=0): + self.batch_sampler = batch_sampler + self.sampler = self.batch_sampler.sampler + self.num_iterations = num_iterations + self.start_iter = start_iter + + def __iter__(self): + iteration = self.start_iter + while iteration <= self.num_iterations: + for batch in self.batch_sampler: + iteration += 1 + if iteration > self.num_iterations: + break + yield batch + + def __len__(self): + return self.num_iterations + + +class DistributedSampler(Sampler): + """Sampler that restricts data loading to a subset of the dataset. + It is especially useful in conjunction with + :class:`torch.nn.parallel.DistributedDataParallel`. In such case, each + process can pass a DistributedSampler instance as a DataLoader sampler, + and load a subset of the original dataset that is exclusive to it. + .. note:: + Dataset is assumed to be of constant size. + Arguments: + dataset: Dataset used for sampling. + num_replicas (optional): Number of processes participating in + distributed training. + rank (optional): Rank of the current process within num_replicas. 
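+        shuffle (optional): If True (default), the indices are reshuffled
+            deterministically every epoch; call set_epoch(epoch) before each
+            epoch so that the permutation changes.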
+ """ + + def __init__(self, dataset, num_replicas=None, rank=None, shuffle=True): + if num_replicas is None: + if not dist.is_available(): + raise RuntimeError("Requires distributed package to be available") + num_replicas = dist.get_world_size() + if rank is None: + if not dist.is_available(): + raise RuntimeError("Requires distributed package to be available") + rank = dist.get_rank() + self.dataset = dataset + self.num_replicas = num_replicas + self.rank = rank + self.epoch = 0 + self.num_samples = int(math.ceil(len(self.dataset) * 1.0 / self.num_replicas)) + self.total_size = self.num_samples * self.num_replicas + self.shuffle = shuffle + + def __iter__(self): + if self.shuffle: + # deterministically shuffle based on epoch + g = torch.Generator() + g.manual_seed(self.epoch) + indices = torch.randperm(len(self.dataset), generator=g).tolist() + else: + indices = torch.arange(len(self.dataset)).tolist() + + # add extra samples to make it evenly divisible + indices += indices[: (self.total_size - len(indices))] + assert len(indices) == self.total_size + + # subsample + offset = self.num_samples * self.rank + indices = indices[offset:offset+self.num_samples] + assert len(indices) == self.num_samples + + return iter(indices) + + def __len__(self): + return self.num_samples + + def set_epoch(self, epoch): + self.epoch = epoch + + +class FrameSampler(Sampler): + """Sampler certain frames for test + """ + + def __init__(self, dataset): + inds = np.arange(0, len(dataset.ims)) + ni = len(dataset.ims) // dataset.num_cams + inds = inds.reshape(ni, -1)[::cfg.test.frame_sampler_interval] + self.inds = inds.ravel() + + def __iter__(self): + return iter(self.inds) + + def __len__(self): + return len(self.inds) diff --git a/lib/datasets/transforms.py b/lib/datasets/transforms.py new file mode 100644 index 0000000000000000000000000000000000000000..da06ef69f1426ea95f1f3cc1371588c74050d033 --- /dev/null +++ b/lib/datasets/transforms.py @@ -0,0 +1,54 @@ +class Compose(object): + def __init__(self, transforms): + self.transforms = transforms + + def __call__(self, img, kpts=None): + for t in self.transforms: + img, kpts = t(img, kpts) + if kpts is None: + return img + else: + return img, kpts + + def __repr__(self): + format_string = self.__class__.__name__ + "(" + for t in self.transforms: + format_string += "\n" + format_string += " {0}".format(t) + format_string += "\n)" + return format_string + + +class ToTensor(object): + def __call__(self, img, kpts): + return img / 255., kpts + + +class Normalize(object): + def __init__(self, mean, std): + self.mean = mean + self.std = std + + def __call__(self, img, kpts): + img -= self.mean + img /= self.std + return img, kpts + + +def make_transforms(cfg, is_train): + if is_train is True: + transform = Compose( + [ + ToTensor(), + Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), + ] + ) + else: + transform = Compose( + [ + ToTensor(), + Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), + ] + ) + + return transform diff --git a/lib/evaluators/__init__.py b/lib/evaluators/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..73b03fb6582b35f74d19179d2e4ebfbfb4fc4572 --- /dev/null +++ b/lib/evaluators/__init__.py @@ -0,0 +1 @@ +from .make_evaluator import make_evaluator diff --git a/lib/evaluators/if_nerf.py b/lib/evaluators/if_nerf.py new file mode 100644 index 0000000000000000000000000000000000000000..0ed113cd059736bd9dbd0238a89647e5c764ce31 --- /dev/null +++ b/lib/evaluators/if_nerf.py @@ -0,0 +1,91 @@ 
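The evaluator added below accumulates per-image MSE, PSNR, and SSIM over the test set and optionally writes prediction/ground-truth comparison images to disk. For reference, here is a minimal, self-contained sketch of the two image metrics it computes for float images in [0, 1]. It assumes a recent scikit-image that exposes skimage.metrics.structural_similarity with channel_axis, whereas the file itself relies on the older compare_ssim alias from skimage.measure; the helper names psnr and ssim are illustrative only and not part of the repository.

```
import numpy as np
from skimage.metrics import structural_similarity


def psnr(img_pred, img_gt):
    # PSNR for float images in [0, 1]: -10 * log10(MSE)
    mse = np.mean((img_pred - img_gt) ** 2)
    return -10 * np.log10(mse)


def ssim(img_pred, img_gt):
    # multichannel SSIM over H x W x 3 float images in [0, 1]
    return structural_similarity(img_pred, img_gt,
                                 channel_axis=-1, data_range=1.0)
```

The expression -10 * np.log(mse) / np.log(10) in psnr_metric below is the same -10 * log10(MSE) written with natural logarithms.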
+import numpy as np +from lib.config import cfg +from skimage.measure import compare_ssim +import os +import cv2 +from termcolor import colored + + +class Evaluator: + def __init__(self): + self.mse = [] + self.psnr = [] + self.ssim = [] + + def psnr_metric(self, img_pred, img_gt): + mse = np.mean((img_pred - img_gt)**2) + psnr = -10 * np.log(mse) / np.log(10) + return psnr + + def ssim_metric(self, img_pred, img_gt, batch): + if not cfg.eval_whole_img: + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + # crop the object region + x, y, w, h = cv2.boundingRect(mask_at_box.astype(np.uint8)) + img_pred = img_pred[y:y + h, x:x + w] + img_gt = img_gt[y:y + h, x:x + w] + + result_dir = os.path.join(cfg.result_dir, 'comparison') + os.system('mkdir -p {}'.format(result_dir)) + frame_index = batch['frame_index'].item() + view_index = batch['cam_ind'].item() + cv2.imwrite( + '{}/frame{:04d}_view{:04d}.png'.format(result_dir, frame_index, + view_index), + (img_pred[..., [2, 1, 0]] * 255)) + cv2.imwrite( + '{}/frame{:04d}_view{:04d}_gt.png'.format(result_dir, frame_index, + view_index), + (img_gt[..., [2, 1, 0]] * 255)) + + # compute the ssim + ssim = compare_ssim(img_pred, img_gt, multichannel=True) + return ssim + + def evaluate(self, output, batch): + rgb_pred = output['rgb_map'][0].detach().cpu().numpy() + rgb_gt = batch['rgb'][0].detach().cpu().numpy() + + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + # convert the pixels into an image + white_bkgd = int(cfg.white_bkgd) + img_pred = np.zeros((H, W, 3)) + white_bkgd + img_pred[mask_at_box] = rgb_pred + img_gt = np.zeros((H, W, 3)) + white_bkgd + img_gt[mask_at_box] = rgb_gt + + if cfg.eval_whole_img: + rgb_pred = img_pred + rgb_gt = img_gt + + mse = np.mean((rgb_pred - rgb_gt)**2) + self.mse.append(mse) + + psnr = self.psnr_metric(rgb_pred, rgb_gt) + self.psnr.append(psnr) + + rgb_pred = img_pred + rgb_gt = img_gt + ssim = self.ssim_metric(rgb_pred, rgb_gt, batch) + self.ssim.append(ssim) + + def summarize(self): + result_dir = cfg.result_dir + print( + colored('the results are saved at {}'.format(result_dir), + 'yellow')) + + result_path = os.path.join(cfg.result_dir, 'metrics.npy') + os.system('mkdir -p {}'.format(os.path.dirname(result_path))) + metrics = {'mse': self.mse, 'psnr': self.psnr, 'ssim': self.ssim} + np.save(result_path, metrics) + print('mse: {}'.format(np.mean(self.mse))) + print('psnr: {}'.format(np.mean(self.psnr))) + print('ssim: {}'.format(np.mean(self.ssim))) + self.mse = [] + self.psnr = [] + self.ssim = [] diff --git a/lib/evaluators/if_nerf_mesh.py b/lib/evaluators/if_nerf_mesh.py new file mode 100644 index 0000000000000000000000000000000000000000..a1715d96452a0d952a331f146703de5343c0f0ae --- /dev/null +++ b/lib/evaluators/if_nerf_mesh.py @@ -0,0 +1,21 @@ +import numpy as np +from lib.config import cfg +import os + + +class Evaluator: + def evaluate(self, output, batch): + cube = output['cube'] + cube = cube[10:-10, 10:-10, 10:-10] + + pts = batch['pts'][0].detach().cpu().numpy() + pts = pts[cube > cfg.mesh_th] + + i = batch['i'].item() + result_dir = os.path.join(cfg.result_dir, 'pts') + os.system('mkdir -p {}'.format(result_dir)) + result_path = os.path.join(result_dir, '{}.npy'.format(i)) + np.save(result_path, pts) + + def summarize(self): + return {} diff --git a/lib/evaluators/make_evaluator.py 
b/lib/evaluators/make_evaluator.py new file mode 100644 index 0000000000000000000000000000000000000000..c8242fbf99da5778e82195b39021db3fde01a8ff --- /dev/null +++ b/lib/evaluators/make_evaluator.py @@ -0,0 +1,16 @@ +import imp +import os + + +def _evaluator_factory(cfg): + module = cfg.evaluator_module + path = cfg.evaluator_path + evaluator = imp.load_source(module, path).Evaluator() + return evaluator + + +def make_evaluator(cfg): + if cfg.skip_eval: + return None + else: + return _evaluator_factory(cfg) diff --git a/lib/evaluators/neural_volume.py b/lib/evaluators/neural_volume.py new file mode 100644 index 0000000000000000000000000000000000000000..9a5776e23d8e9636121c0cc211d80a01d249fe8b --- /dev/null +++ b/lib/evaluators/neural_volume.py @@ -0,0 +1,96 @@ +import numpy as np +from lib.config import cfg +from skimage.measure import compare_ssim +import os +import cv2 +import imageio + + +class Evaluator: + def __init__(self): + self.mse = [] + self.psnr = [] + self.ssim = [] + + def psnr_metric(self, img_pred, img_gt): + mse = np.mean((img_pred - img_gt)**2) + psnr = -10 * np.log(mse) / np.log(10) + return psnr + + def ssim_metric(self, rgb_pred, rgb_gt, batch): + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + # convert the pixels into an image + img_pred = np.zeros((H, W, 3)) + img_pred[mask_at_box] = rgb_pred + img_gt = np.zeros((H, W, 3)) + img_gt[mask_at_box] = rgb_gt + # crop the object region + x, y, w, h = cv2.boundingRect(mask_at_box.astype(np.uint8)) + img_pred = img_pred[y:y + h, x:x + w] + img_gt = img_gt[y:y + h, x:x + w] + # compute the ssim + ssim = compare_ssim(img_pred, img_gt, multichannel=True) + return ssim + + def evaluate(self, batch): + if cfg.human in [302, 313, 315]: + i = batch['i'].item() + 1 + else: + i = batch['i'].item() + i = i + cfg.begin_i + cam_ind = batch['cam_ind'].item() + + # obtain the image path + result_dir = 'data/result/neural_volumes/{}_nv'.format(cfg.human) + frame_dir = os.path.join(result_dir, 'frame_{}'.format(i)) + gt_img_path = os.path.join(frame_dir, 'gt_{}.jpg'.format(cam_ind + 1)) + pred_img_path = os.path.join(frame_dir, + 'pred_{}.jpg'.format(cam_ind + 1)) + + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + + # convert the pixels into an image + rgb_gt = batch['rgb'][0].detach().cpu().numpy() + img_gt = np.zeros((H, W, 3)) + img_gt[mask_at_box] = rgb_gt + + # gt_img_path = gt_img_path.replace('neural_volumes', 'gt') + # os.system('mkdir -p {}'.format(os.path.dirname(gt_img_path))) + # img_gt = img_gt[..., [2, 1, 0]] * 255 + # cv2.imwrite(gt_img_path, img_gt) + + img_pred = imageio.imread(pred_img_path).astype(np.float32) / 255. 
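+        # zero out everything outside the bounding-box mask so the prediction
+        # is compared with rgb_gt on exactly the same foreground pixels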
+ img_pred[mask_at_box != 1] = 0 + rgb_pred = img_pred[mask_at_box] + + # import matplotlib.pyplot as plt + # _, (ax1, ax2) = plt.subplots(1, 2) + # ax1.imshow(img_gt) + # ax2.imshow(img_pred) + # plt.show() + # return + + mse = np.mean((rgb_pred - rgb_gt)**2) + self.mse.append(mse) + + psnr = self.psnr_metric(rgb_pred, rgb_gt) + self.psnr.append(psnr) + + ssim = self.ssim_metric(rgb_pred, rgb_gt, batch) + self.ssim.append(ssim) + + def summarize(self): + result_path = os.path.join(cfg.result_dir, 'metrics.npy') + os.system('mkdir -p {}'.format(os.path.dirname(result_path))) + metrics = {'mse': self.mse, 'psnr': self.psnr, 'ssim': self.ssim} + np.save(result_path, self.mse) + print('mse: {}'.format(np.mean(self.mse))) + print('psnr: {}'.format(np.mean(self.psnr))) + print('ssim: {}'.format(np.mean(self.ssim))) + self.mse = [] + self.psnr = [] + self.ssim = [] diff --git a/lib/networks/__init__.py b/lib/networks/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..2f8c8acec90dc65af35aae93895a52fe13f6952a --- /dev/null +++ b/lib/networks/__init__.py @@ -0,0 +1 @@ +from .make_network import make_network diff --git a/lib/networks/embedder.py b/lib/networks/embedder.py new file mode 100644 index 0000000000000000000000000000000000000000..1f844e787d763be49ed5bc1abc07ff10baab2c8f --- /dev/null +++ b/lib/networks/embedder.py @@ -0,0 +1,54 @@ +import torch +from lib.config import cfg + + +class Embedder: + def __init__(self, **kwargs): + self.kwargs = kwargs + self.create_embedding_fn() + + def create_embedding_fn(self): + embed_fns = [] + d = self.kwargs['input_dims'] + out_dim = 0 + if self.kwargs['include_input']: + embed_fns.append(lambda x: x) + out_dim += d + + max_freq = self.kwargs['max_freq_log2'] + N_freqs = self.kwargs['num_freqs'] + + if self.kwargs['log_sampling']: + freq_bands = 2.**torch.linspace(0., max_freq, steps=N_freqs) + else: + freq_bands = torch.linspace(2.**0., 2.**max_freq, steps=N_freqs) + + for freq in freq_bands: + for p_fn in self.kwargs['periodic_fns']: + embed_fns.append( + lambda x, p_fn=p_fn, freq=freq: p_fn(x * freq)) + out_dim += d + + self.embed_fns = embed_fns + self.out_dim = out_dim + + def embed(self, inputs): + return torch.cat([fn(inputs) for fn in self.embed_fns], -1) + + +def get_embedder(multires, input_dims=3): + embed_kwargs = { + 'include_input': True, + 'input_dims': input_dims, + 'max_freq_log2': multires - 1, + 'num_freqs': multires, + 'log_sampling': True, + 'periodic_fns': [torch.sin, torch.cos], + } + embedder_obj = Embedder(**embed_kwargs) + embed = lambda x, eo=embedder_obj: eo.embed(x) + return embed, embedder_obj.out_dim + + +xyz_embedder, xyz_dim = get_embedder(cfg.xyz_res) +view_embedder, view_dim = get_embedder(cfg.view_res) diff --git a/lib/networks/latent_xyzc.py b/lib/networks/latent_xyzc.py new file mode 100644 index 0000000000000000000000000000000000000000..5d7ea1c9a837c5fd71618f544717ac64af7f038e --- /dev/null +++ b/lib/networks/latent_xyzc.py @@ -0,0 +1,274 @@ +import torch.nn as nn +import spconv +import torch.nn.functional as F +import torch +from lib.config import cfg +from . 
import embedder + + +class Network(nn.Module): + def __init__(self): + super(Network, self).__init__() + + self.c = nn.Embedding(6890, 16) + self.xyzc_net = SparseConvNet() + + self.latent = nn.Embedding(cfg.num_train_frame, 128) + + self.actvn = nn.ReLU() + + self.fc_0 = nn.Conv1d(352, 256, 1) + self.fc_1 = nn.Conv1d(256, 256, 1) + self.fc_2 = nn.Conv1d(256, 256, 1) + self.alpha_fc = nn.Conv1d(256, 1, 1) + + self.feature_fc = nn.Conv1d(256, 256, 1) + self.latent_fc = nn.Conv1d(384, 256, 1) + self.view_fc = nn.Conv1d(346, 128, 1) + self.rgb_fc = nn.Conv1d(128, 3, 1) + + def encode_sparse_voxels(self, sp_input): + coord = sp_input['coord'] + out_sh = sp_input['out_sh'] + batch_size = sp_input['batch_size'] + + code = self.c(torch.arange(0, 6890).to(coord.device)) + xyzc = spconv.SparseConvTensor(code, coord, out_sh, batch_size) + feature_volume = self.xyzc_net(xyzc) + + return feature_volume + + def pts_to_can_pts(self, pts, sp_input): + """transform pts from the world coordinate to the smpl coordinate""" + Th = sp_input['Th'] + pts = pts - Th + R = sp_input['R'] + pts = torch.matmul(pts, R) + return pts + + def get_grid_coords(self, pts, sp_input): + # convert xyz to the voxel coordinate dhw + dhw = pts[..., [2, 1, 0]] + min_dhw = sp_input['bounds'][:, 0, [2, 1, 0]] + dhw = dhw - min_dhw[:, None] + dhw = dhw / torch.tensor(cfg.voxel_size).to(dhw) + # convert the voxel coordinate to [-1, 1] + out_sh = torch.tensor(sp_input['out_sh']).to(dhw) + dhw = dhw / out_sh * 2 - 1 + # convert dhw to whd, since the occupancy is indexed by dhw + grid_coords = dhw[..., [2, 1, 0]] + return grid_coords + + def interpolate_features(self, grid_coords, feature_volume): + features = [] + for volume in feature_volume: + feature = F.grid_sample(volume, + grid_coords, + padding_mode='zeros', + align_corners=True) + features.append(feature) + features = torch.cat(features, dim=1) + features = features.view(features.size(0), -1, features.size(4)) + return features + + def calculate_density(self, wpts, feature_volume, sp_input): + # interpolate features + ppts = self.pts_to_can_pts(wpts, sp_input) + grid_coords = self.get_grid_coords(ppts, sp_input) + grid_coords = grid_coords[:, None, None] + xyzc_features = self.interpolate_features(grid_coords, feature_volume) + + # calculate density + net = self.actvn(self.fc_0(xyzc_features)) + net = self.actvn(self.fc_1(net)) + net = self.actvn(self.fc_2(net)) + + alpha = self.alpha_fc(net) + alpha = alpha.transpose(1, 2) + + return alpha + + def calculate_density_color(self, wpts, viewdir, feature_volume, sp_input): + # interpolate features + ppts = self.pts_to_can_pts(wpts, sp_input) + grid_coords = self.get_grid_coords(ppts, sp_input) + grid_coords = grid_coords[:, None, None] + xyzc_features = self.interpolate_features(grid_coords, feature_volume) + + # calculate density + net = self.actvn(self.fc_0(xyzc_features)) + net = self.actvn(self.fc_1(net)) + net = self.actvn(self.fc_2(net)) + + alpha = self.alpha_fc(net) + + # calculate color + features = self.feature_fc(net) + + latent = self.latent(sp_input['latent_index']) + latent = latent[..., None].expand(*latent.shape, net.size(2)) + features = torch.cat((features, latent), dim=1) + features = self.latent_fc(features) + + viewdir = embedder.view_embedder(viewdir) + viewdir = viewdir.transpose(1, 2) + light_pts = embedder.xyz_embedder(wpts) + light_pts = light_pts.transpose(1, 2) + + features = torch.cat((features, viewdir, light_pts), dim=1) + + net = self.actvn(self.view_fc(features)) + rgb = self.rgb_fc(net) + + raw = 
torch.cat((rgb, alpha), dim=1) + raw = raw.transpose(1, 2) + + return raw + + def forward(self, sp_input, grid_coords, viewdir, light_pts): + coord = sp_input['coord'] + out_sh = sp_input['out_sh'] + batch_size = sp_input['batch_size'] + + p_features = grid_coords.transpose(1, 2) + grid_coords = grid_coords[:, None, None] + + code = self.c(torch.arange(0, 6890).to(p_features.device)) + xyzc = spconv.SparseConvTensor(code, coord, out_sh, batch_size) + + xyzc_features = self.xyzc_net(xyzc, grid_coords) + + net = self.actvn(self.fc_0(xyzc_features)) + net = self.actvn(self.fc_1(net)) + net = self.actvn(self.fc_2(net)) + + alpha = self.alpha_fc(net) + + features = self.feature_fc(net) + + latent = self.latent(sp_input['latent_index']) + latent = latent[..., None].expand(*latent.shape, net.size(2)) + features = torch.cat((features, latent), dim=1) + features = self.latent_fc(features) + + viewdir = viewdir.transpose(1, 2) + light_pts = light_pts.transpose(1, 2) + features = torch.cat((features, viewdir, light_pts), dim=1) + net = self.actvn(self.view_fc(features)) + rgb = self.rgb_fc(net) + + raw = torch.cat((rgb, alpha), dim=1) + raw = raw.transpose(1, 2) + + return raw + + +class SparseConvNet(nn.Module): + def __init__(self): + super(SparseConvNet, self).__init__() + + self.conv0 = double_conv(16, 16, 'subm0') + self.down0 = stride_conv(16, 32, 'down0') + + self.conv1 = double_conv(32, 32, 'subm1') + self.down1 = stride_conv(32, 64, 'down1') + + self.conv2 = triple_conv(64, 64, 'subm2') + self.down2 = stride_conv(64, 128, 'down2') + + self.conv3 = triple_conv(128, 128, 'subm3') + self.down3 = stride_conv(128, 128, 'down3') + + self.conv4 = triple_conv(128, 128, 'subm4') + + def forward(self, x): + net = self.conv0(x) + net = self.down0(net) + + net = self.conv1(net) + net1 = net.dense() + net = self.down1(net) + + net = self.conv2(net) + net2 = net.dense() + net = self.down2(net) + + net = self.conv3(net) + net3 = net.dense() + net = self.down3(net) + + net = self.conv4(net) + net4 = net.dense() + + volumes = [net1, net2, net3, net4] + + return volumes + + +def single_conv(in_channels, out_channels, indice_key=None): + return spconv.SparseSequential( + spconv.SubMConv3d(in_channels, + out_channels, + 1, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + ) + + +def double_conv(in_channels, out_channels, indice_key=None): + return spconv.SparseSequential( + spconv.SubMConv3d(in_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + spconv.SubMConv3d(out_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + ) + + +def triple_conv(in_channels, out_channels, indice_key=None): + return spconv.SparseSequential( + spconv.SubMConv3d(in_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + spconv.SubMConv3d(out_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + spconv.SubMConv3d(out_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + ) + + +def stride_conv(in_channels, out_channels, indice_key=None): + return spconv.SparseSequential( + spconv.SparseConv3d(in_channels, + out_channels, + 3, + 2, + 
padding=1, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), nn.ReLU()) diff --git a/lib/networks/make_network.py b/lib/networks/make_network.py new file mode 100644 index 0000000000000000000000000000000000000000..38a69d56a2e7b2a287d5467de0884fd9c842fc5f --- /dev/null +++ b/lib/networks/make_network.py @@ -0,0 +1,9 @@ +import os +import imp + + +def make_network(cfg): + module = cfg.network_module + path = cfg.network_path + network = imp.load_source(module, path).Network() + return network diff --git a/lib/networks/nerf.py b/lib/networks/nerf.py new file mode 100644 index 0000000000000000000000000000000000000000..2da4fcdfaaee6d77d06f2e168519ebd9a2eba175 --- /dev/null +++ b/lib/networks/nerf.py @@ -0,0 +1,158 @@ +import torch.nn as nn +import torch +from lib.config import cfg +from .embedder import get_embedder +import torch.nn.functional as F + + +class Nerf(nn.Module): + def __init__(self, + D=8, + W=256, + input_ch=3, + input_ch_views=3, + skips=[4], + use_viewdirs=False): + """ + """ + super(Nerf, self).__init__() + + self.D = D + self.W = W + self.input_ch = input_ch + self.input_ch_views = input_ch_views + self.skips = skips + self.use_viewdirs = use_viewdirs + + self.pts_linears = nn.ModuleList([nn.Linear(input_ch, W)] + [ + nn.Linear(W, W) if i not in + self.skips else nn.Linear(W + input_ch, W) for i in range(D - 1) + ]) + + ### Implementation according to the official code release (https://github.com/bmild/nerf/blob/master/run_nerf_helpers.py#L104-L105) + self.views_linears = nn.ModuleList( + [nn.Linear(input_ch_views + W, W // 2)]) + + ### Implementation according to the paper + # self.views_linears = nn.ModuleList( + # [nn.Linear(input_ch_views + W, W//2)] + [nn.Linear(W//2, W//2) for i in range(D//2)]) + + if self.use_viewdirs: + self.feature_linear = nn.Linear(W, W) + self.alpha_linear = nn.Linear(W, 1) + self.rgb_linear = nn.Linear(W // 2, 3) + + def forward(self, x): + input_pts, input_views = torch.split( + x, [self.input_ch, self.input_ch_views], dim=-1) + h = input_pts + for i, l in enumerate(self.pts_linears): + h = self.pts_linears[i](h) + h = F.relu(h) + if i in self.skips: + h = torch.cat([input_pts, h], -1) + + if self.use_viewdirs: + alpha = self.alpha_linear(h) + feature = self.feature_linear(h) + h = torch.cat([feature, input_views], -1) + + for i, l in enumerate(self.views_linears): + h = self.views_linears[i](h) + h = F.relu(h) + + rgb = self.rgb_linear(h) + outputs = torch.cat([rgb, alpha], -1) + else: + outputs = self.output_linear(h) + + return outputs + + def load_weights_from_keras(self, weights): + assert self.use_viewdirs, "Not implemented if use_viewdirs=False" + + # Load pts_linears + for i in range(self.D): + idx_pts_linears = 2 * i + self.pts_linears[i].weight.data = torch.from_numpy( + np.transpose(weights[idx_pts_linears])) + self.pts_linears[i].bias.data = torch.from_numpy( + np.transpose(weights[idx_pts_linears + 1])) + + # Load feature_linear + idx_feature_linear = 2 * self.D + self.feature_linear.weight.data = torch.from_numpy( + np.transpose(weights[idx_feature_linear])) + self.feature_linear.bias.data = torch.from_numpy( + np.transpose(weights[idx_feature_linear + 1])) + + # Load views_linears + idx_views_linears = 2 * self.D + 2 + self.views_linears[0].weight.data = torch.from_numpy( + np.transpose(weights[idx_views_linears])) + self.views_linears[0].bias.data = torch.from_numpy( + np.transpose(weights[idx_views_linears + 1])) + + # Load rgb_linear + idx_rbg_linear = 2 * self.D + 4 + 
self.rgb_linear.weight.data = torch.from_numpy( + np.transpose(weights[idx_rbg_linear])) + self.rgb_linear.bias.data = torch.from_numpy( + np.transpose(weights[idx_rbg_linear + 1])) + + # Load alpha_linear + idx_alpha_linear = 2 * self.D + 6 + self.alpha_linear.weight.data = torch.from_numpy( + np.transpose(weights[idx_alpha_linear])) + self.alpha_linear.bias.data = torch.from_numpy( + np.transpose(weights[idx_alpha_linear + 1])) + + +class Network(nn.Module): + def __init__(self): + super(Network, self).__init__() + + self.embed_fn, input_ch = get_embedder(cfg.xyz_res) + self.embeddirs_fn, input_ch_views = get_embedder(cfg.view_res) + + skips = [4] + self.model = Nerf(D=cfg.netdepth, + W=cfg.netwidth, + input_ch=input_ch, + skips=skips, + input_ch_views=input_ch_views, + use_viewdirs=cfg.use_viewdirs) + + self.model_fine = Nerf(D=cfg.netdepth_fine, + W=cfg.netwidth_fine, + input_ch=input_ch, + skips=skips, + input_ch_views=input_ch_views, + use_viewdirs=cfg.use_viewdirs) + + def batchify(self, fn, chunk): + """Constructs a version of 'fn' that applies to smaller batches. + """ + def ret(inputs): + return torch.cat([fn(inputs[i:i+chunk]) for i in range(0, inputs.shape[0], chunk)], 0) + return ret + + def forward(self, inputs, viewdirs, model=''): + """Prepares inputs and applies network 'fn'. + """ + if model == 'fine': + fn = self.model_fine + else: + fn = self.model + + inputs_flat = torch.reshape(inputs, [-1, inputs.shape[-1]]) + embedded = self.embed_fn(inputs_flat) + + input_dirs = viewdirs[:,None].expand(inputs.shape) + input_dirs_flat = torch.reshape(input_dirs, [-1, input_dirs.shape[-1]]) + embedded_dirs = self.embeddirs_fn(input_dirs_flat) + embedded = torch.cat([embedded, embedded_dirs], -1) + + outputs_flat = self.batchify(fn, cfg.netchunk)(embedded) + outputs = torch.reshape(outputs_flat, list(inputs.shape[:-1]) + [outputs_flat.shape[-1]]) + return outputs diff --git a/lib/networks/nerf_mesh.py b/lib/networks/nerf_mesh.py new file mode 100644 index 0000000000000000000000000000000000000000..4d3d478d80ad2e545762739e6e8548e76f065ee1 --- /dev/null +++ b/lib/networks/nerf_mesh.py @@ -0,0 +1,138 @@ +import torch.nn as nn +import torch +from lib.config import cfg +from .embedder import get_embedder +import torch.nn.functional as F + + +class Nerf(nn.Module): + def __init__(self, + D=8, + W=256, + input_ch=3, + input_ch_views=3, + skips=[4], + use_viewdirs=False): + """ + """ + super(Nerf, self).__init__() + + self.D = D + self.W = W + self.input_ch = input_ch + self.input_ch_views = input_ch_views + self.skips = skips + self.use_viewdirs = use_viewdirs + + self.pts_linears = nn.ModuleList([nn.Linear(input_ch, W)] + [ + nn.Linear(W, W) if i not in + self.skips else nn.Linear(W + input_ch, W) for i in range(D - 1) + ]) + + ### Implementation according to the official code release (https://github.com/bmild/nerf/blob/master/run_nerf_helpers.py#L104-L105) + self.views_linears = nn.ModuleList( + [nn.Linear(input_ch_views + W, W // 2)]) + + ### Implementation according to the paper + # self.views_linears = nn.ModuleList( + # [nn.Linear(input_ch_views + W, W//2)] + [nn.Linear(W//2, W//2) for i in range(D//2)]) + + if self.use_viewdirs: + self.feature_linear = nn.Linear(W, W) + self.alpha_linear = nn.Linear(W, 1) + self.rgb_linear = nn.Linear(W // 2, 3) + + def forward(self, x): + input_pts = x + h = input_pts + for i, l in enumerate(self.pts_linears): + h = self.pts_linears[i](h) + h = F.relu(h) + if i in self.skips: + h = torch.cat([input_pts, h], -1) + alpha = self.alpha_linear(h) + return 
alpha + + def load_weights_from_keras(self, weights): + assert self.use_viewdirs, "Not implemented if use_viewdirs=False" + + # Load pts_linears + for i in range(self.D): + idx_pts_linears = 2 * i + self.pts_linears[i].weight.data = torch.from_numpy( + np.transpose(weights[idx_pts_linears])) + self.pts_linears[i].bias.data = torch.from_numpy( + np.transpose(weights[idx_pts_linears + 1])) + + # Load feature_linear + idx_feature_linear = 2 * self.D + self.feature_linear.weight.data = torch.from_numpy( + np.transpose(weights[idx_feature_linear])) + self.feature_linear.bias.data = torch.from_numpy( + np.transpose(weights[idx_feature_linear + 1])) + + # Load views_linears + idx_views_linears = 2 * self.D + 2 + self.views_linears[0].weight.data = torch.from_numpy( + np.transpose(weights[idx_views_linears])) + self.views_linears[0].bias.data = torch.from_numpy( + np.transpose(weights[idx_views_linears + 1])) + + # Load rgb_linear + idx_rbg_linear = 2 * self.D + 4 + self.rgb_linear.weight.data = torch.from_numpy( + np.transpose(weights[idx_rbg_linear])) + self.rgb_linear.bias.data = torch.from_numpy( + np.transpose(weights[idx_rbg_linear + 1])) + + # Load alpha_linear + idx_alpha_linear = 2 * self.D + 6 + self.alpha_linear.weight.data = torch.from_numpy( + np.transpose(weights[idx_alpha_linear])) + self.alpha_linear.bias.data = torch.from_numpy( + np.transpose(weights[idx_alpha_linear + 1])) + + +class Network(nn.Module): + def __init__(self): + super(Network, self).__init__() + + self.embed_fn, input_ch = get_embedder(cfg.xyz_res) + self.embeddirs_fn, input_ch_views = get_embedder(cfg.view_res) + + skips = [4] + self.model = Nerf(D=cfg.netdepth, + W=cfg.netwidth, + input_ch=input_ch, + skips=skips, + input_ch_views=input_ch_views, + use_viewdirs=cfg.use_viewdirs) + + # self.model_fine = Nerf(D=cfg.netdepth_fine, + # W=cfg.netwidth_fine, + # input_ch=input_ch, + # skips=skips, + # input_ch_views=input_ch_views, + # use_viewdirs=cfg.use_viewdirs) + + def batchify(self, fn, chunk): + """Constructs a version of 'fn' that applies to smaller batches. + """ + def ret(inputs): + return torch.cat([fn(inputs[i:i+chunk]) for i in range(0, inputs.shape[0], chunk)], 0) + return ret + + def forward(self, inputs, model=''): + """Prepares inputs and applies network 'fn'. + """ + if model == 'fine': + fn = self.model_fine + else: + fn = self.model + + inputs_flat = torch.reshape(inputs, [-1, inputs.shape[-1]]) + embedded = self.embed_fn(inputs_flat) + outputs_flat = self.batchify(fn, cfg.netchunk)(embedded) + outputs = torch.reshape(outputs_flat, list(inputs.shape[:-1]) + [outputs_flat.shape[-1]]) + + return outputs diff --git a/lib/networks/renderer/__init__.py b/lib/networks/renderer/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..1cd8a49ee60761d80ecac326f0cf778db051cc37 --- /dev/null +++ b/lib/networks/renderer/__init__.py @@ -0,0 +1 @@ +from .make_renderer import make_renderer \ No newline at end of file diff --git a/lib/networks/renderer/if_clight_renderer.py b/lib/networks/renderer/if_clight_renderer.py new file mode 100644 index 0000000000000000000000000000000000000000..d44c11a50cad34d802d0b6b95fc1add0d84dc5cb --- /dev/null +++ b/lib/networks/renderer/if_clight_renderer.py @@ -0,0 +1,122 @@ +import torch +from lib.config import cfg +from .nerf_net_utils import * +from .. 
import embedder + + +class Renderer: + def __init__(self, net): + self.net = net + + def get_sampling_points(self, ray_o, ray_d, near, far): + # calculate the steps for each ray + t_vals = torch.linspace(0., 1., steps=cfg.N_samples).to(near) + z_vals = near[..., None] * (1. - t_vals) + far[..., None] * t_vals + + if cfg.perturb > 0. and self.net.training: + # get intervals between samples + mids = .5 * (z_vals[..., 1:] + z_vals[..., :-1]) + upper = torch.cat([mids, z_vals[..., -1:]], -1) + lower = torch.cat([z_vals[..., :1], mids], -1) + # stratified samples in those intervals + t_rand = torch.rand(z_vals.shape).to(upper) + z_vals = lower + (upper - lower) * t_rand + + pts = ray_o[:, :, None] + ray_d[:, :, None] * z_vals[..., None] + + return pts, z_vals + + def prepare_sp_input(self, batch): + # feature, coordinate, shape, batch size + sp_input = {} + + # coordinate: [N, 4], batch_idx, z, y, x + sh = batch['coord'].shape + idx = [torch.full([sh[1]], i) for i in range(sh[0])] + idx = torch.cat(idx).to(batch['coord']) + coord = batch['coord'].view(-1, sh[-1]) + sp_input['coord'] = torch.cat([idx[:, None], coord], dim=1) + + out_sh, _ = torch.max(batch['out_sh'], dim=0) + sp_input['out_sh'] = out_sh.tolist() + sp_input['batch_size'] = sh[0] + + # used for feature interpolation + sp_input['bounds'] = batch['bounds'] + sp_input['R'] = batch['R'] + sp_input['Th'] = batch['Th'] + + # used for color function + sp_input['latent_index'] = batch['latent_index'] + + return sp_input + + def get_density_color(self, wpts, viewdir, raw_decoder): + n_batch, n_pixel, n_sample = wpts.shape[:3] + wpts = wpts.view(n_batch, n_pixel * n_sample, -1) + viewdir = viewdir[:, :, None].repeat(1, 1, n_sample, 1).contiguous() + viewdir = viewdir.view(n_batch, n_pixel * n_sample, -1) + raw = raw_decoder(wpts, viewdir) + return raw + + def get_pixel_value(self, ray_o, ray_d, near, far, feature_volume, + sp_input, batch): + # sampling points along camera rays + wpts, z_vals = self.get_sampling_points(ray_o, ray_d, near, far) + + # viewing direction + viewdir = ray_d / torch.norm(ray_d, dim=2, keepdim=True) + + raw_decoder = lambda x_point, viewdir_val: self.net.calculate_density_color( + x_point, viewdir_val, feature_volume, sp_input) + + # compute the color and density + wpts_raw = self.get_density_color(wpts, viewdir, raw_decoder) + + # volume rendering for wpts + n_batch, n_pixel, n_sample = wpts.shape[:3] + raw = wpts_raw.reshape(-1, n_sample, 4) + z_vals = z_vals.view(-1, n_sample) + ray_d = ray_d.view(-1, 3) + rgb_map, disp_map, acc_map, weights, depth_map = raw2outputs( + raw, z_vals, ray_d, cfg.raw_noise_std, cfg.white_bkgd) + + ret = { + 'rgb_map': rgb_map.view(n_batch, n_pixel, -1), + 'disp_map': disp_map.view(n_batch, n_pixel), + 'acc_map': acc_map.view(n_batch, n_pixel), + 'weights': weights.view(n_batch, n_pixel, -1), + 'depth_map': depth_map.view(n_batch, n_pixel) + } + + return ret + + def render(self, batch): + ray_o = batch['ray_o'] + ray_d = batch['ray_d'] + near = batch['near'] + far = batch['far'] + sh = ray_o.shape + + # encode neural body + sp_input = self.prepare_sp_input(batch) + feature_volume = self.net.encode_sparse_voxels(sp_input) + + # volume rendering for each pixel + n_batch, n_pixel = ray_o.shape[:2] + chunk = 2048 + ret_list = [] + for i in range(0, n_pixel, chunk): + ray_o_chunk = ray_o[:, i:i + chunk] + ray_d_chunk = ray_d[:, i:i + chunk] + near_chunk = near[:, i:i + chunk] + far_chunk = far[:, i:i + chunk] + pixel_value = self.get_pixel_value(ray_o_chunk, ray_d_chunk, + near_chunk, 
far_chunk, + feature_volume, sp_input, batch) + ret_list.append(pixel_value) + + keys = ret_list[0].keys() + ret = {k: torch.cat([r[k] for r in ret_list], dim=1) for k in keys} + + return ret diff --git a/lib/networks/renderer/if_clight_renderer_mmsk.py b/lib/networks/renderer/if_clight_renderer_mmsk.py new file mode 100644 index 0000000000000000000000000000000000000000..ec0cdb5df4ca1afa68813875a396b34cb3c15217 --- /dev/null +++ b/lib/networks/renderer/if_clight_renderer_mmsk.py @@ -0,0 +1,94 @@ +import torch +from lib.config import cfg +from .nerf_net_utils import * +from .. import embedder +from . import if_clight_renderer + + +class Renderer(if_clight_renderer.Renderer): + def __init__(self, net): + super(Renderer, self).__init__(net) + + def prepare_inside_pts(self, pts, batch): + if 'Ks' not in batch: + __import__('ipdb').set_trace() + return raw + + sh = pts.shape + pts = pts.view(sh[0], -1, sh[3]) + + insides = [] + for nv in range(batch['Ks'].size(1)): + # project pts to image space + R = batch['RT'][:, nv, :3, :3] + T = batch['RT'][:, nv, :3, 3] + pts_ = torch.matmul(pts, R.transpose(2, 1)) + T[:, None] + pts_ = torch.matmul(pts_, batch['Ks'][:, nv].transpose(2, 1)) + pts2d = pts_[..., :2] / pts_[..., 2:] + + # ensure that pts2d is inside the image + pts2d = pts2d.round().long() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + pts2d[..., 0] = torch.clamp(pts2d[..., 0], 0, W - 1) + pts2d[..., 1] = torch.clamp(pts2d[..., 1], 0, H - 1) + + # remove the points outside the mask + pts2d = pts2d[0] + msk = batch['msks'][0, nv] + inside = msk[pts2d[:, 1], pts2d[:, 0]][None].bool() + insides.append(inside) + + inside = insides[0] + for i in range(1, len(insides)): + inside = inside * insides[i] + + return inside + + def get_density_color(self, wpts, viewdir, inside, raw_decoder): + n_batch, n_pixel, n_sample = wpts.shape[:3] + wpts = wpts.view(n_batch, n_pixel * n_sample, -1) + viewdir = viewdir[:, :, None].repeat(1, 1, n_sample, 1).contiguous() + viewdir = viewdir.view(n_batch, n_pixel * n_sample, -1) + wpts = wpts[inside][None] + viewdir = viewdir[inside][None] + full_raw = torch.zeros([n_batch, n_pixel * n_sample, 4]).to(wpts) + if inside.sum() == 0: + return full_raw + + raw = raw_decoder(wpts, viewdir) + full_raw[inside] = raw[0] + + return full_raw + + def get_pixel_value(self, ray_o, ray_d, near, far, feature_volume, + sp_input, batch): + # sampling points along camera rays + wpts, z_vals = self.get_sampling_points(ray_o, ray_d, near, far) + inside = self.prepare_inside_pts(wpts, batch) + + # viewing direction + viewdir = ray_d / torch.norm(ray_d, dim=2, keepdim=True) + + raw_decoder = lambda x_point, viewdir_val: self.net.calculate_density_color( + x_point, viewdir_val, feature_volume, sp_input) + + # compute the color and density + wpts_raw = self.get_density_color(wpts, viewdir, inside, raw_decoder) + + # volume rendering for wpts + n_batch, n_pixel, n_sample = wpts.shape[:3] + raw = wpts_raw.reshape(-1, n_sample, 4) + z_vals = z_vals.view(-1, n_sample) + ray_d = ray_d.view(-1, 3) + rgb_map, disp_map, acc_map, weights, depth_map = raw2outputs( + raw, z_vals, ray_d, cfg.raw_noise_std, cfg.white_bkgd) + + ret = { + 'rgb_map': rgb_map.view(n_batch, n_pixel, -1), + 'disp_map': disp_map.view(n_batch, n_pixel), + 'acc_map': acc_map.view(n_batch, n_pixel), + 'weights': weights.view(n_batch, n_pixel, -1), + 'depth_map': depth_map.view(n_batch, n_pixel) + } + + return ret diff --git a/lib/networks/renderer/if_clight_renderer_msk.py 
b/lib/networks/renderer/if_clight_renderer_msk.py new file mode 100644 index 0000000000000000000000000000000000000000..04d3d86baf53eec569794c63a9e3ced1be4000a0 --- /dev/null +++ b/lib/networks/renderer/if_clight_renderer_msk.py @@ -0,0 +1,49 @@ +import torch +from lib.config import cfg +from .nerf_net_utils import * +from .. import embedder +from . import if_clight_renderer_mmsk + + +class Renderer(if_clight_renderer_mmsk.Renderer): + def __init__(self, net): + super(Renderer, self).__init__(net) + + def prepare_inside_pts(self, wpts, batch): + if 'R0_snap' not in batch: + __import__('ipdb').set_trace() + return raw + + # transform points from the world space to the smpl space + Th = batch['Th'] + can_pts = wpts - Th[:, None, None] + R = batch['R'] + can_pts = torch.matmul(can_pts, R) + + R0 = batch['R0_snap'] + Th0 = batch['Th0_snap'] + + # transform pts from smpl coordinate to the world coordinate + sh = can_pts.shape + can_pts = can_pts.view(sh[0], -1, sh[3]) + pts = torch.matmul(can_pts, R0.transpose(2, 1)) + Th0[:, None] + + # project pts to image space + R = batch['RT'][..., :3] + T = batch['RT'][..., 3] + pts = torch.matmul(pts, R.transpose(2, 1)) + T[:, None] + pts = torch.matmul(pts, batch['K'].transpose(2, 1)) + pts2d = pts[..., :2] / pts[..., 2:] + + # ensure that pts2d is inside the image + pts2d = pts2d.round().long() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + pts2d[..., 0] = torch.clamp(pts2d[..., 0], 0, W - 1) + pts2d[..., 1] = torch.clamp(pts2d[..., 1], 0, H - 1) + + # remove the points outside the mask + pts2d = pts2d[0] + msk = batch['msk'][0] + inside = msk[pts2d[:, 1], pts2d[:, 0]][None].bool() + + return inside diff --git a/lib/networks/renderer/if_mesh_renderer.py b/lib/networks/renderer/if_mesh_renderer.py new file mode 100644 index 0000000000000000000000000000000000000000..c62bcdcb1e942976e6e3bc921b916fadd34d97ab --- /dev/null +++ b/lib/networks/renderer/if_mesh_renderer.py @@ -0,0 +1,56 @@ +import torch +from lib.config import cfg +from .nerf_net_utils import * +from .. import embedder +import numpy as np +import mcubes +import trimesh +from . import if_clight_renderer + + +class Renderer(if_clight_renderer.Renderer): + def __init__(self, net): + super(Renderer, self).__init__(net) + + def batchify_rays(self, wpts, alpha_decoder, chunk=1024 * 32): + """Render rays in smaller minibatches to avoid OOM. 
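+ Evaluates alpha_decoder on `chunk` points at a time and concatenates the per-chunk outputs along the point dimension.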
+ """ + n_batch, n_point = wpts.shape[:2] + all_ret = [] + for i in range(0, n_point, chunk): + ret = alpha_decoder(wpts[:, i:i + chunk]) + all_ret.append(ret) + all_ret = torch.cat(all_ret, 1) + return all_ret + + def render(self, batch): + pts = batch['pts'] + sh = pts.shape + + inside = batch['inside'][0].bool() + pts = pts[0][inside][None] + + # encode neural body + sp_input = self.prepare_sp_input(batch) + feature_volume = self.net.encode_sparse_voxels(sp_input) + alpha_decoder = lambda x: self.net.calculate_density( + x, feature_volume, sp_input) + + alpha = self.batchify_rays(pts, alpha_decoder, 2048 * 64) + + alpha = alpha[0, :, 0].detach().cpu().numpy() + cube = np.zeros(sh[1:-1]) + inside = inside.detach().cpu().numpy() + cube[inside == 1] = alpha + + cube = np.pad(cube, 10, mode='constant') + vertices, triangles = mcubes.marching_cubes(cube, cfg.mesh_th) + + # vertices = (vertices - 10) * 0.005 + # vertices = vertices + batch['wbounds'][0, 0].detach().cpu().numpy() + + mesh = trimesh.Trimesh(vertices, triangles) + + ret = {'cube': cube, 'mesh': mesh} + + return ret diff --git a/lib/networks/renderer/make_renderer.py b/lib/networks/renderer/make_renderer.py new file mode 100644 index 0000000000000000000000000000000000000000..5c3f1d040f1dacc064e1d12115efce1c9e3b8503 --- /dev/null +++ b/lib/networks/renderer/make_renderer.py @@ -0,0 +1,9 @@ +import os +import imp + + +def make_renderer(cfg, network): + module = cfg.renderer_module + path = cfg.renderer_path + renderer = imp.load_source(module, path).Renderer(network) + return renderer diff --git a/lib/networks/renderer/nerf_net_utils.py b/lib/networks/renderer/nerf_net_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..5516c14a375066ebabea0c53c81f9802787190e0 --- /dev/null +++ b/lib/networks/renderer/nerf_net_utils.py @@ -0,0 +1,90 @@ +import torch.nn.functional as F +import torch +from lib.config import cfg + + +def raw2outputs(raw, z_vals, rays_d, raw_noise_std=0, white_bkgd=False): + """Transforms model's predictions to semantically meaningful values. + Args: + raw: [num_rays, num_samples along ray, 4]. Prediction from model. + z_vals: [num_rays, num_samples along ray]. Integration time. + rays_d: [num_rays, 3]. Direction of each ray. + Returns: + rgb_map: [num_rays, 3]. Estimated RGB color of a ray. + disp_map: [num_rays]. Disparity map. Inverse of depth map. + acc_map: [num_rays]. Sum of weights along each ray. + weights: [num_rays, num_samples]. Weights assigned to each sampled color. + depth_map: [num_rays]. Estimated distance to object. + """ + raw2alpha = lambda raw, dists, act_fn=F.relu: 1. - torch.exp(-act_fn(raw) * + dists) + + dists = z_vals[..., 1:] - z_vals[..., :-1] + dists = torch.cat( + [dists, + torch.Tensor([1e10]).expand(dists[..., :1].shape).to(dists)], + -1) # [N_rays, N_samples] + + dists = dists * torch.norm(rays_d[..., None, :], dim=-1) + + rgb = torch.sigmoid(raw[..., :3]) # [N_rays, N_samples, 3] + noise = 0. + if raw_noise_std > 0.: + noise = torch.randn(raw[..., 3].shape) * raw_noise_std + + alpha = raw2alpha(raw[..., 3] + noise, dists) # [N_rays, N_samples] + # weights = alpha * tf.math.cumprod(1.-alpha + 1e-10, -1, exclusive=True) + weights = alpha * torch.cumprod( + torch.cat( + [torch.ones((alpha.shape[0], 1)).to(alpha), 1. - alpha + 1e-10], + -1), -1)[:, :-1] + rgb_map = torch.sum(weights[..., None] * rgb, -2) # [N_rays, 3] + + depth_map = torch.sum(weights * z_vals, -1) + disp_map = 1. 
/ torch.max(1e-10 * torch.ones_like(depth_map).to(depth_map), + depth_map / torch.sum(weights, -1)) + acc_map = torch.sum(weights, -1) + + if white_bkgd: + rgb_map = rgb_map + (1. - acc_map[..., None]) + + return rgb_map, disp_map, acc_map, weights, depth_map + + +# Hierarchical sampling (section 5.2) +def sample_pdf(bins, weights, N_samples, det=False): + from torchsearchsorted import searchsorted + + # Get pdf + weights = weights + 1e-5 # prevent nans + pdf = weights / torch.sum(weights, -1, keepdim=True) + cdf = torch.cumsum(pdf, -1) + cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], + -1) # (batch, len(bins)) + + # Take uniform samples + if det: + u = torch.linspace(0., 1., steps=N_samples).to(cdf) + u = u.expand(list(cdf.shape[:-1]) + [N_samples]) + else: + u = torch.rand(list(cdf.shape[:-1]) + [N_samples]).to(cdf) + + # Invert CDF + u = u.contiguous() + inds = searchsorted(cdf, u, side='right') + below = torch.max(torch.zeros_like(inds - 1), inds - 1) + above = torch.min((cdf.shape[-1] - 1) * torch.ones_like(inds), inds) + inds_g = torch.stack([below, above], -1) # (batch, N_samples, 2) + + # cdf_g = tf.gather(cdf, inds_g, axis=-1, batch_dims=len(inds_g.shape)-2) + # bins_g = tf.gather(bins, inds_g, axis=-1, batch_dims=len(inds_g.shape)-2) + matched_shape = [inds_g.shape[0], inds_g.shape[1], cdf.shape[-1]] + cdf_g = torch.gather(cdf.unsqueeze(1).expand(matched_shape), 2, inds_g) + bins_g = torch.gather(bins.unsqueeze(1).expand(matched_shape), 2, inds_g) + + denom = (cdf_g[..., 1] - cdf_g[..., 0]) + denom = torch.where(denom < 1e-5, torch.ones_like(denom), denom) + t = (u - cdf_g[..., 0]) / denom + samples = bins_g[..., 0] + t * (bins_g[..., 1] - bins_g[..., 0]) + + return samples diff --git a/lib/networks/renderer/tpose_renderer.py b/lib/networks/renderer/tpose_renderer.py new file mode 100644 index 0000000000000000000000000000000000000000..f29562677cc704c3f79836de6376e586ab0aea0a --- /dev/null +++ b/lib/networks/renderer/tpose_renderer.py @@ -0,0 +1,174 @@ +import torch +from lib.config import cfg +from .nerf_net_utils import * +from .. import embedder + + +class Renderer: + def __init__(self, net): + self.net = net + + def get_sampling_points(self, ray_o, ray_d, near, far): + # calculate the steps for each ray + t_vals = torch.linspace(0., 1., steps=cfg.N_samples).to(near) + z_vals = near[..., None] * (1. - t_vals) + far[..., None] * t_vals + + if cfg.perturb > 0. 
and self.net.training: + # get intervals between samples + mids = .5 * (z_vals[..., 1:] + z_vals[..., :-1]) + upper = torch.cat([mids, z_vals[..., -1:]], -1) + lower = torch.cat([z_vals[..., :1], mids], -1) + # stratified samples in those intervals + t_rand = torch.rand(z_vals.shape).to(upper) + z_vals = lower + (upper - lower) * t_rand + + pts = ray_o[:, :, None] + ray_d[:, :, None] * z_vals[..., None] + + return pts, z_vals + + def pts_to_can_pts(self, pts, batch): + """transform pts from the world coordinate to the smpl coordinate""" + Th = batch['Th'][:, None] + pts = pts - Th + R = batch['R'] + sh = pts.shape + pts = torch.matmul(pts.view(sh[0], -1, sh[3]), R) + pts = pts.view(*sh) + return pts + + def transform_sampling_points(self, pts, batch): + if not self.net.training: + return pts + center = batch['center'][:, None, None] + pts = pts - center + rot = batch['rot'] + pts_ = pts[..., [0, 2]].clone() + sh = pts_.shape + pts_ = torch.matmul(pts_.view(sh[0], -1, sh[3]), rot.permute(0, 2, 1)) + pts[..., [0, 2]] = pts_.view(*sh) + pts = pts + center + trans = batch['trans'][:, None, None] + pts = pts + trans + return pts + + def prepare_sp_input(self, batch): + # feature, coordinate, shape, batch size + sp_input = {} + + # coordinate: [N, 4], batch_idx, x, y, z + sh = batch['tcoord'].shape + idx = [torch.full([sh[1]], i) for i in range(sh[0])] + idx = torch.cat(idx).to(batch['tcoord']) + coord = batch['tcoord'].view(-1, sh[-1]) + sp_input['coord'] = torch.cat([idx[:, None], coord], dim=1) + + out_sh, _ = torch.max(batch['tout_sh'], dim=0) + sp_input['out_sh'] = out_sh.tolist() + sp_input['batch_size'] = sh[0] + + sp_input['i'] = batch['i'] + + return sp_input + + def get_ptot_grid_coords(self, pts, out_sh, bounds): + # pts: [batch_size, x, y, z, 3], x, y, z + min_xyz = bounds[:, 0] + pts = pts - min_xyz[:, None, None, None] + pts = pts / torch.tensor(cfg.voxel_size).to(pts) + # convert the voxel coordinate to [-1, 1] + out_sh = torch.tensor(out_sh).to(pts) + pts = pts / out_sh * 2 - 1 + # convert xyz to zyx, since the occupancy is indexed by xyz + grid_coords = pts[..., [2, 1, 0]] + return grid_coords + + def get_grid_coords(self, pts, ptot_pts, bounds): + out_sh = torch.tensor(ptot_pts.shape[1:-1]).to(pts) + # pts: [batch_size, N, 3], x, y, z + min_xyz = bounds[:, 0] + pts = pts - min_xyz[:, None] + pts = pts / torch.tensor(cfg.ptot_vsize).to(pts) + # convert the voxel coordinate to [-1, 1] + pts = pts / out_sh * 2 - 1 + # convert xyz to zyx, since the occupancy is indexed by xyz + grid_coords = pts[..., [2, 1, 0]] + return grid_coords + + # def batchify_rays(self, rays_flat, chunk=1024 * 32, net_c=None): + def batchify_rays(self, + sp_input, + tgrid_coords, + pgrid_coords, + viewdir, + light_pts, + chunk=1024 * 32, + net_c=None): + """Render rays in smaller minibatches to avoid OOM. 
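+ Splits the grid coordinates, view directions and positional encodings into chunks, runs the network on each chunk and concatenates the raw predictions.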
+ """ + all_ret = [] + for i in range(0, tgrid_coords.shape[1], chunk): + # ret = self.render_rays(rays_flat[i:i + chunk], net_c) + ret = self.net(sp_input, tgrid_coords[:, i:i + chunk], + pgrid_coords[:, i:i + chunk], + viewdir[:, i:i + chunk], light_pts[:, i:i + chunk]) + # for k in ret: + # if k not in all_ret: + # all_ret[k] = [] + # all_ret[k].append(ret[k]) + all_ret.append(ret) + # all_ret = {k: torch.cat(all_ret[k], 0) for k in all_ret} + all_ret = torch.cat(all_ret, 1) + return all_ret + + def render(self, batch): + ray_o = batch['ray_o'] + ray_d = batch['ray_d'] + near = batch['near'] + far = batch['far'] + sh = ray_o.shape + + pts, z_vals = self.get_sampling_points(ray_o, ray_d, near, far) + # light intensity varies with 3D location + light_pts = embedder.xyz_embedder(pts) + ppts = self.pts_to_can_pts(pts, batch) + + ray_d0 = batch['ray_d'] + viewdir = ray_d0 / torch.norm(ray_d0, dim=2, keepdim=True) + viewdir = embedder.view_embedder(viewdir) + viewdir = viewdir[:, :, None].repeat(1, 1, pts.size(2), 1).contiguous() + + sp_input = self.prepare_sp_input(batch) + + # reshape to [batch_size, n, 3] + light_pts = light_pts.view(sh[0], -1, embedder.xyz_dim) + viewdir = viewdir.view(sh[0], -1, embedder.view_dim) + ppts = ppts.view(sh[0], -1, 3) + + # create grid coords for sampling feature volume at t pose + ptot_pts = batch['ptot_pts'] + tgrid_coords = self.get_ptot_grid_coords(ptot_pts, sp_input['out_sh'], + batch['tbounds']) + + # create grid coords for sampling feature volume at i-th frame + pgrid_coords = self.get_grid_coords(ppts, ptot_pts, batch['pbounds']) + + if ray_o.size(1) <= 2048: + raw = self.net(sp_input, tgrid_coords, pgrid_coords, viewdir, + light_pts) + else: + raw = self.batchify_rays(sp_input, tgrid_coords, pgrid_coords, + viewdir, light_pts, 1024 * 32, None) + + # reshape to [num_rays, num_samples along ray, 4] + raw = raw.reshape(-1, z_vals.size(2), 4) + z_vals = z_vals.view(-1, z_vals.size(2)) + ray_d = ray_d.view(-1, 3) + rgb_map, disp_map, acc_map, weights, depth_map = raw2outputs( + raw, z_vals, ray_d, cfg.raw_noise_std, cfg.white_bkgd) + rgb_map = rgb_map.view(*sh[:-1], -1) + acc_map = acc_map.view(*sh[:-1]) + depth_map = depth_map.view(*sh[:-1]) + + ret = {'rgb_map': rgb_map, 'acc_map': acc_map, 'depth_map': depth_map} + + return ret diff --git a/lib/networks/renderer/volume_mesh_renderer.py b/lib/networks/renderer/volume_mesh_renderer.py new file mode 100644 index 0000000000000000000000000000000000000000..6ffbe7eaf62da723dde235b9335079d6cf243b2e --- /dev/null +++ b/lib/networks/renderer/volume_mesh_renderer.py @@ -0,0 +1,107 @@ +import torch +from lib.config import cfg +from .nerf_net_utils import * +import numpy as np +import mcubes +import trimesh + + +class Renderer: + def __init__(self, net): + self.net = net + + def render_rays(self, ray_batch, net_c=None, pytest=False): + """Volumetric rendering. + Args: + ray_batch: array of shape [batch_size, ...]. All information necessary + for sampling along a ray, including: ray origin, ray direction, min + dist, max dist, and unit-magnitude viewing direction. + network_fn: function. Model for predicting RGB and density at each point + in space. + network_query_fn: function used for passing queries to network_fn. + N_samples: int. Number of different times to sample along each ray. + retraw: bool. If True, include model's raw, unprocessed predictions. + lindisp: bool. If True, sample linearly in inverse depth rather than in depth. + perturb: float, 0 or 1. 
If non-zero, each ray is sampled at stratified + random points in time. + N_importance: int. Number of additional times to sample along each ray. + These samples are only passed to network_fine. + network_fine: "fine" network with same spec as network_fn. + white_bkgd: bool. If True, assume a white background. + raw_noise_std: ... + verbose: bool. If True, print more debugging info. + Returns: + rgb_map: [num_rays, 3]. Estimated RGB color of a ray. Comes from fine model. + disp_map: [num_rays]. Disparity map. 1 / depth. + acc_map: [num_rays]. Accumulated opacity along each ray. Comes from fine model. + raw: [num_rays, num_samples, 4]. Raw predictions from model. + rgb0: See rgb_map. Output for coarse model. + disp0: See disp_map. Output for coarse model. + acc0: See acc_map. Output for coarse model. + z_std: [num_rays]. Standard deviation of distances along ray for each + sample. + """ + pts = ray_batch + if net_c is None: + alpha = self.net(pts) + else: + alpha = self.net(pts, net_c) + + if cfg.N_importance > 0: + alpha_0 = alpha + if net_c is None: + alpha = self.net(pts, model='fine') + else: + alpha = self.net(pts, net_c, model='fine') + + ret = { + 'alpha': alpha + } + if cfg.N_importance > 0: + ret['alpha0'] = alpha_0 + + for k in ret: + DEBUG = False + if (torch.isnan(ret[k]).any() + or torch.isinf(ret[k]).any()) and DEBUG: + print(f"! [Numerical Error] {k} contains nan or inf.") + + return ret + + def batchify_rays(self, rays_flat, chunk=1024 * 32): + """Render rays in smaller minibatches to avoid OOM. + """ + all_ret = {} + for i in range(0, rays_flat.shape[0], chunk): + ret = self.render_rays(rays_flat[i:i + chunk]) + for k in ret: + if k not in all_ret: + all_ret[k] = [] + all_ret[k].append(ret[k]) + all_ret = {k: torch.cat(all_ret[k], 0) for k in all_ret} + return all_ret + + def render(self, batch): + pts = batch['pts'] + sh = pts.shape + + inside = batch['inside'][0].bool() + pts = pts[0][inside][None] + + pts = pts.view(sh[0], -1, 1, 3) + + ret = self.batchify_rays(pts, cfg.chunk) + + alpha = ret['alpha'] + alpha = alpha[0, :, 0, 0].detach().cpu().numpy() + cube = np.zeros(sh[1:-1]) + inside = inside.detach().cpu().numpy() + cube[inside == 1] = alpha + + cube = np.pad(cube, 10, mode='constant') + vertices, triangles = mcubes.marching_cubes(cube, cfg.mesh_th) + mesh = trimesh.Trimesh(vertices, triangles) + + ret = {'cube': cube, 'mesh': mesh} + + return ret diff --git a/lib/networks/renderer/volume_renderer.py b/lib/networks/renderer/volume_renderer.py new file mode 100644 index 0000000000000000000000000000000000000000..45756de6b89b78467db4d5b9426b636a890c2110 --- /dev/null +++ b/lib/networks/renderer/volume_renderer.py @@ -0,0 +1,156 @@ +import torch +from lib.config import cfg +from .nerf_net_utils import * + + +class Renderer: + def __init__(self, net): + self.net = net + + def render_rays(self, ray_batch, net_c=None, pytest=False): + """Volumetric rendering. + Args: + ray_batch: array of shape [batch_size, ...]. All information necessary + for sampling along a ray, including: ray origin, ray direction, min + dist, max dist, and unit-magnitude viewing direction. + network_fn: function. Model for predicting RGB and density at each point + in space. + network_query_fn: function used for passing queries to network_fn. + N_samples: int. Number of different times to sample along each ray. + retraw: bool. If True, include model's raw, unprocessed predictions. + lindisp: bool. If True, sample linearly in inverse depth rather than in depth. + perturb: float, 0 or 1. 
If non-zero, each ray is sampled at stratified + random points in time. + N_importance: int. Number of additional times to sample along each ray. + These samples are only passed to network_fine. + network_fine: "fine" network with same spec as network_fn. + white_bkgd: bool. If True, assume a white background. + raw_noise_std: ... + verbose: bool. If True, print more debugging info. + Returns: + rgb_map: [num_rays, 3]. Estimated RGB color of a ray. Comes from fine model. + disp_map: [num_rays]. Disparity map. 1 / depth. + acc_map: [num_rays]. Accumulated opacity along each ray. Comes from fine model. + raw: [num_rays, num_samples, 4]. Raw predictions from model. + rgb0: See rgb_map. Output for coarse model. + disp0: See disp_map. Output for coarse model. + acc0: See acc_map. Output for coarse model. + z_std: [num_rays]. Standard deviation of distances along ray for each + sample. + """ + N_rays = ray_batch.shape[0] + rays_o, rays_d = ray_batch[:, 0:3], ray_batch[:, + 3:6] # [N_rays, 3] each + viewdirs = ray_batch[:, -3:] if ray_batch.shape[-1] > 8 else None + bounds = torch.reshape(ray_batch[..., 6:8], [-1, 1, 2]) + near, far = bounds[..., 0], bounds[..., 1] # [-1,1] + + t_vals = torch.linspace(0., 1., steps=cfg.N_samples).to(near) + if not cfg.lindisp: + z_vals = near * (1. - t_vals) + far * (t_vals) + else: + z_vals = 1. / (1. / near * (1. - t_vals) + 1. / far * (t_vals)) + + z_vals = z_vals.expand([N_rays, cfg.N_samples]) + + if cfg.perturb > 0. and self.net.training: + # get intervals between samples + mids = .5 * (z_vals[..., 1:] + z_vals[..., :-1]) + upper = torch.cat([mids, z_vals[..., -1:]], -1) + lower = torch.cat([z_vals[..., :1], mids], -1) + # stratified samples in those intervals + t_rand = torch.rand(z_vals.shape).to(upper) + + # Pytest, overwrite u with numpy's fixed random numbers + if pytest: + np.random.seed(0) + t_rand = np.random.rand(*list(z_vals.shape)) + t_rand = torch.Tensor(t_rand) + + z_vals = lower + (upper - lower) * t_rand + + pts = rays_o[..., None, :] + rays_d[..., None, :] * z_vals[ + ..., :, None] # [N_rays, N_samples, 3] + + if net_c is None: + raw = self.net(pts, viewdirs) + else: + raw = self.net(pts, viewdirs, net_c) + rgb_map, disp_map, acc_map, weights, depth_map = raw2outputs( + raw, z_vals, rays_d, cfg.raw_noise_std, cfg.white_bkgd) + + if cfg.N_importance > 0: + + rgb_map_0, disp_map_0, acc_map_0 = rgb_map, disp_map, acc_map + + z_vals_mid = .5 * (z_vals[..., 1:] + z_vals[..., :-1]) + z_samples = sample_pdf(z_vals_mid, + weights[..., 1:-1], + cfg.N_importance, + det=(cfg.perturb == 0.)) + z_samples = z_samples.detach() + + z_vals, _ = torch.sort(torch.cat([z_vals, z_samples], -1), -1) + pts = rays_o[..., None, :] + rays_d[..., None, :] * z_vals[ + ..., :, None] # [N_rays, N_samples + N_importance, 3] + + # raw = run_network(pts, fn=run_fn) + if net_c is None: + raw = self.net(pts, viewdirs, model='fine') + else: + raw = self.net(pts, viewdirs, net_c, model='fine') + + rgb_map, disp_map, acc_map, weights, depth_map = raw2outputs( + raw, z_vals, rays_d, cfg.raw_noise_std, cfg.white_bkgd) + + ret = { + 'rgb_map': rgb_map, + 'disp_map': disp_map, + 'acc_map': acc_map, + 'depth_map': depth_map + } + ret['raw'] = raw + if cfg.N_importance > 0: + ret['rgb0'] = rgb_map_0 + ret['disp0'] = disp_map_0 + ret['acc0'] = acc_map_0 + ret['z_std'] = torch.std(z_samples, dim=-1, + unbiased=False) # [N_rays] + + for k in ret: + DEBUG = False + if (torch.isnan(ret[k]).any() + or torch.isinf(ret[k]).any()) and DEBUG: + print(f"! 
[Numerical Error] {k} contains nan or inf.") + + return ret + + def batchify_rays(self, rays_flat, chunk=1024 * 32): + """Render rays in smaller minibatches to avoid OOM. + """ + all_ret = {} + for i in range(0, rays_flat.shape[0], chunk): + ret = self.render_rays(rays_flat[i:i + chunk]) + for k in ret: + if k not in all_ret: + all_ret[k] = [] + all_ret[k].append(ret[k]) + all_ret = {k: torch.cat(all_ret[k], 0) for k in all_ret} + return all_ret + + def render(self, batch): + rays_o = batch['ray_o'] + rays_d = batch['ray_d'] + near = batch['near'] + far = batch['far'] + + sh = rays_o.shape + rays_o, rays_d = rays_o.view(-1, 3), rays_d.view(-1, 3) + near, far = near.transpose(0, 1), far.transpose(0, 1) + viewdirs = rays_d + viewdirs = viewdirs / torch.norm(viewdirs, dim=-1, keepdim=True) + rays = torch.cat([rays_o, rays_d, near, far, viewdirs], dim=-1) + ret = self.batchify_rays(rays, cfg.chunk) + ret = {k: v.view(*sh[:-1], -1) for k, v in ret.items()} + ret['depth_map'] = ret['depth_map'].view(*sh[:-1]) + return ret diff --git a/lib/networks/tpose_xyzc.py b/lib/networks/tpose_xyzc.py new file mode 100644 index 0000000000000000000000000000000000000000..ced33a672639383d1dc7a6c673994e0b1065ca2f --- /dev/null +++ b/lib/networks/tpose_xyzc.py @@ -0,0 +1,209 @@ +import torch.nn as nn +import spconv +import torch.nn.functional as F +import torch +from lib.config import cfg + + +class Network(nn.Module): + def __init__(self): + super(Network, self).__init__() + + self.c = nn.Embedding(6890, 16) + self.xyzc_net = SparseConvNet() + + self.latent = nn.Embedding(cfg.ni, 128) + + self.actvn = nn.ReLU() + + self.fc_0 = nn.Conv1d(352, 256, 1) + self.fc_1 = nn.Conv1d(256, 256, 1) + self.fc_2 = nn.Conv1d(256, 256, 1) + self.alpha_fc = nn.Conv1d(256, 1, 1) + + self.feature_fc = nn.Conv1d(256, 256, 1) + self.latent_fc = nn.Conv1d(384, 256, 1) + self.view_fc = nn.Conv1d(346, 128, 1) + self.rgb_fc = nn.Conv1d(128, 3, 1) + + def forward(self, sp_input, tgrid_coords, pgrid_coords, viewdir, + light_pts): + coord = sp_input['coord'] + out_sh = sp_input['out_sh'] + batch_size = sp_input['batch_size'] + + pgrid_coords = pgrid_coords[:, None, None] + + code = self.c(torch.arange(0, 6890).to(tgrid_coords.device)) + xyzc = spconv.SparseConvTensor(code, coord, out_sh, batch_size) + + xyzc_features = self.xyzc_net(xyzc, tgrid_coords, pgrid_coords) + + net = self.actvn(self.fc_0(xyzc_features)) + net = self.actvn(self.fc_1(net)) + net = self.actvn(self.fc_2(net)) + + alpha = self.alpha_fc(net) + + features = self.feature_fc(net) + + latent = self.latent(sp_input['i']) + latent = latent[..., None].expand(*latent.shape, net.size(2)) + features = torch.cat((features, latent), dim=1) + features = self.latent_fc(features) + + viewdir = viewdir.transpose(1, 2) + light_pts = light_pts.transpose(1, 2) + features = torch.cat((features, viewdir, light_pts), dim=1) + net = self.actvn(self.view_fc(features)) + rgb = self.rgb_fc(net) + + raw = torch.cat((rgb, alpha), dim=1) + raw = raw.transpose(1, 2) + + return raw + + +class SparseConvNet(nn.Module): + def __init__(self): + super(SparseConvNet, self).__init__() + + self.conv0 = double_conv(16, 16, 'subm0') + self.down0 = stride_conv(16, 32, 'down0') + + self.conv1 = double_conv(32, 32, 'subm1') + self.down1 = stride_conv(32, 64, 'down1') + + self.conv2 = triple_conv(64, 64, 'subm2') + self.down2 = stride_conv(64, 128, 'down2') + + self.conv3 = triple_conv(128, 128, 'subm3') + self.down3 = stride_conv(128, 128, 'down3') + + self.conv4 = triple_conv(128, 128, 'subm4') + + def 
forward(self, x, tgrid_coords, pgrid_coords): + net = self.conv0(x) + net = self.down0(net) + + net = self.conv1(net) + net1 = net.dense() + feature_1 = F.grid_sample(net1, + tgrid_coords, + padding_mode='zeros', + align_corners=True) + feature_1 = F.grid_sample(feature_1, + pgrid_coords, + padding_mode='zeros', + align_corners=True) + net = self.down1(net) + + net = self.conv2(net) + net2 = net.dense() + feature_2 = F.grid_sample(net2, + tgrid_coords, + padding_mode='zeros', + align_corners=True) + feature_2 = F.grid_sample(feature_2, + pgrid_coords, + padding_mode='zeros', + align_corners=True) + net = self.down2(net) + + net = self.conv3(net) + net3 = net.dense() + feature_3 = F.grid_sample(net3, + tgrid_coords, + padding_mode='zeros', + align_corners=True) + feature_3 = F.grid_sample(feature_3, + pgrid_coords, + padding_mode='zeros', + align_corners=True) + net = self.down3(net) + + net = self.conv4(net) + net4 = net.dense() + feature_4 = F.grid_sample(net4, + tgrid_coords, + padding_mode='zeros', + align_corners=True) + feature_4 = F.grid_sample(feature_4, + pgrid_coords, + padding_mode='zeros', + align_corners=True) + + features = torch.cat((feature_1, feature_2, feature_3, feature_4), + dim=1) + features = features.view(features.size(0), -1, features.size(4)) + + return features + + +def single_conv(in_channels, out_channels, indice_key=None): + return spconv.SparseSequential( + spconv.SubMConv3d(in_channels, + out_channels, + 1, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + ) + + +def double_conv(in_channels, out_channels, indice_key=None): + return spconv.SparseSequential( + spconv.SubMConv3d(in_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + spconv.SubMConv3d(out_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + ) + + +def triple_conv(in_channels, out_channels, indice_key=None): + return spconv.SparseSequential( + spconv.SubMConv3d(in_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + spconv.SubMConv3d(out_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + spconv.SubMConv3d(out_channels, + out_channels, + 3, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), + nn.ReLU(), + ) + + +def stride_conv(in_channels, out_channels, indice_key=None): + return spconv.SparseSequential( + spconv.SparseConv3d(in_channels, + out_channels, + 3, + 2, + padding=1, + bias=False, + indice_key=indice_key), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01), nn.ReLU()) diff --git a/lib/train/__init__.py b/lib/train/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..218deaca451d6a5fffd9dd73681e9d33a2cf26c0 --- /dev/null +++ b/lib/train/__init__.py @@ -0,0 +1,5 @@ +from .trainers import make_trainer +from .optimizer import make_optimizer +from .scheduler import make_lr_scheduler, set_lr_scheduler +from .recorder import make_recorder + diff --git a/lib/train/optimizer.py b/lib/train/optimizer.py new file mode 100644 index 0000000000000000000000000000000000000000..430475bc98df158ed7678117ada7b1bc30793e51 --- /dev/null +++ b/lib/train/optimizer.py @@ -0,0 +1,27 @@ +import torch +from 
lib.utils.optimizer.radam import RAdam + + +_optimizer_factory = { + 'adam': torch.optim.Adam, + 'radam': RAdam, + 'sgd': torch.optim.SGD +} + + +def make_optimizer(cfg, net, lr=None, weight_decay=None): + params = [] + lr = cfg.train.lr if lr is None else lr + weight_decay = cfg.train.weight_decay if weight_decay is None else weight_decay + + for key, value in net.named_parameters(): + if not value.requires_grad: + continue + params += [{"params": [value], "lr": lr, "weight_decay": weight_decay}] + + if 'adam' in cfg.train.optim: + optimizer = _optimizer_factory[cfg.train.optim](params, lr, weight_decay=weight_decay) + else: + optimizer = _optimizer_factory[cfg.train.optim](params, lr, momentum=0.9) + + return optimizer diff --git a/lib/train/recorder.py b/lib/train/recorder.py new file mode 100644 index 0000000000000000000000000000000000000000..c93c44936838e3524d4ad3253c739d469bac1d41 --- /dev/null +++ b/lib/train/recorder.py @@ -0,0 +1,125 @@ +from collections import deque, defaultdict +import torch +from tensorboardX import SummaryWriter +import os +from lib.config.config import cfg + +from termcolor import colored + + +class SmoothedValue(object): + """Track a series of values and provide access to smoothed values over a + window or the global series average. + """ + + def __init__(self, window_size=20): + self.deque = deque(maxlen=window_size) + self.total = 0.0 + self.count = 0 + + def update(self, value): + self.deque.append(value) + self.count += 1 + self.total += value + + @property + def median(self): + d = torch.tensor(list(self.deque)) + return d.median().item() + + @property + def avg(self): + d = torch.tensor(list(self.deque)) + return d.mean().item() + + @property + def global_avg(self): + return self.total / self.count + + +class Recorder(object): + def __init__(self, cfg): + if cfg.local_rank > 0: + return + + log_dir = cfg.record_dir + if not cfg.resume: + print(colored('remove contents of directory %s' % log_dir, 'red')) + os.system('rm -r %s/*' % log_dir) + self.writer = SummaryWriter(log_dir=log_dir) + + # scalars + self.epoch = 0 + self.step = 0 + self.loss_stats = defaultdict(SmoothedValue) + self.batch_time = SmoothedValue() + self.data_time = SmoothedValue() + + # images + self.image_stats = defaultdict(object) + if 'process_' + cfg.task in globals(): + self.processor = globals()['process_' + cfg.task] + else: + self.processor = None + + def update_loss_stats(self, loss_dict): + if cfg.local_rank > 0: + return + for k, v in loss_dict.items(): + self.loss_stats[k].update(v.detach().cpu()) + + def update_image_stats(self, image_stats): + if cfg.local_rank > 0: + return + if self.processor is None: + return + image_stats = self.processor(image_stats) + for k, v in image_stats.items(): + self.image_stats[k] = v.detach().cpu() + + def record(self, prefix, step=-1, loss_stats=None, image_stats=None): + if cfg.local_rank > 0: + return + + pattern = prefix + '/{}' + step = step if step >= 0 else self.step + loss_stats = loss_stats if loss_stats else self.loss_stats + + for k, v in loss_stats.items(): + if isinstance(v, SmoothedValue): + self.writer.add_scalar(pattern.format(k), v.median, step) + else: + self.writer.add_scalar(pattern.format(k), v, step) + + if self.processor is None: + return + image_stats = self.processor(image_stats) if image_stats else self.image_stats + for k, v in image_stats.items(): + self.writer.add_image(pattern.format(k), v, step) + + def state_dict(self): + if cfg.local_rank > 0: + return + scalar_dict = {} + scalar_dict['step'] = self.step + 
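+ # only the global step is checkpointed; the smoothed loss and timing statistics are rebuilt after resuming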
return scalar_dict + + def load_state_dict(self, scalar_dict): + if cfg.local_rank > 0: + return + self.step = scalar_dict['step'] + + def __str__(self): + if cfg.local_rank > 0: + return + loss_state = [] + for k, v in self.loss_stats.items(): + loss_state.append('{}: {:.4f}'.format(k, v.avg)) + loss_state = ' '.join(loss_state) + + recording_state = ' '.join(['epoch: {}', 'step: {}', '{}', 'data: {:.4f}', 'batch: {:.4f}']) + return recording_state.format(self.epoch, self.step, loss_state, self.data_time.avg, self.batch_time.avg) + + +def make_recorder(cfg): + return Recorder(cfg) diff --git a/lib/train/scheduler.py b/lib/train/scheduler.py new file mode 100644 index 0000000000000000000000000000000000000000..33f2ba8f47b721de492c7899b0e64b8e264ed7a4 --- /dev/null +++ b/lib/train/scheduler.py @@ -0,0 +1,24 @@ +from collections import Counter +from lib.utils.optimizer.lr_scheduler import WarmupMultiStepLR, MultiStepLR, ExponentialLR + + +def make_lr_scheduler(cfg, optimizer): + cfg_scheduler = cfg.train.scheduler + if cfg_scheduler.type == 'multi_step': + scheduler = MultiStepLR(optimizer, + milestones=cfg_scheduler.milestones, + gamma=cfg_scheduler.gamma) + elif cfg_scheduler.type == 'exponential': + scheduler = ExponentialLR(optimizer, + decay_epochs=cfg_scheduler.decay_epochs, + gamma=cfg_scheduler.gamma) + return scheduler + + +def set_lr_scheduler(cfg, scheduler): + cfg_scheduler = cfg.train.scheduler + if cfg_scheduler.type == 'multi_step': + scheduler.milestones = Counter(cfg_scheduler.milestones) + elif cfg_scheduler.type == 'exponential': + scheduler.decay_epochs = cfg_scheduler.decay_epochs + scheduler.gamma = cfg_scheduler.gamma diff --git a/lib/train/trainers/__init__.py b/lib/train/trainers/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..55c9e08127e6a9ea98f389384e989f08b52e0671 --- /dev/null +++ b/lib/train/trainers/__init__.py @@ -0,0 +1 @@ +from .make_trainer import make_trainer diff --git a/lib/train/trainers/if_nerf_clight.py b/lib/train/trainers/if_nerf_clight.py new file mode 100644 index 0000000000000000000000000000000000000000..91fd3e663b079f98f2ea0567b8edb8646a7770ce --- /dev/null +++ b/lib/train/trainers/if_nerf_clight.py @@ -0,0 +1,37 @@ +import torch.nn as nn +from lib.config import cfg +import torch +from lib.networks.renderer import if_clight_renderer +from lib.train import make_optimizer + + +class NetworkWrapper(nn.Module): + def __init__(self, net): + super(NetworkWrapper, self).__init__() + + self.net = net + self.renderer = if_clight_renderer.Renderer(self.net) + + self.img2mse = lambda x, y : torch.mean((x - y) ** 2) + self.acc_crit = torch.nn.functional.smooth_l1_loss + + def forward(self, batch): + ret = self.renderer.render(batch) + + scalar_stats = {} + loss = 0 + + mask = batch['mask_at_box'] + img_loss = self.img2mse(ret['rgb_map'][mask], batch['rgb'][mask]) + scalar_stats.update({'img_loss': img_loss}) + loss += img_loss + + if 'rgb0' in ret: + img_loss0 = self.img2mse(ret['rgb0'], batch['rgb']) + scalar_stats.update({'img_loss0': img_loss0}) + loss += img_loss0 + + scalar_stats.update({'loss': loss}) + image_stats = {} + + return ret, loss, scalar_stats, image_stats diff --git a/lib/train/trainers/make_trainer.py b/lib/train/trainers/make_trainer.py new file mode 100644 index 0000000000000000000000000000000000000000..e8fb62e3189b19577db37e0c24007f7001ae9818 --- /dev/null +++ b/lib/train/trainers/make_trainer.py @@ -0,0 +1,14 @@ +from .trainer import Trainer +import imp + + +def _wrapper_factory(cfg, network): + module 
= cfg.trainer_module + path = cfg.trainer_path + network_wrapper = imp.load_source(module, path).NetworkWrapper(network) + return network_wrapper + + +def make_trainer(cfg, network): + network = _wrapper_factory(cfg, network) + return Trainer(network) diff --git a/lib/train/trainers/nerf.py b/lib/train/trainers/nerf.py new file mode 100644 index 0000000000000000000000000000000000000000..239bd89fbe588607f27ecabef193b85e10303119 --- /dev/null +++ b/lib/train/trainers/nerf.py @@ -0,0 +1,37 @@ +import torch.nn as nn +from lib.config import cfg +import torch +from lib.networks.renderer import volume_renderer +from lib.train import make_optimizer + + +class NetworkWrapper(nn.Module): + def __init__(self, net): + super(NetworkWrapper, self).__init__() + + self.net = net + self.renderer = volume_renderer.Renderer(self.net) + + self.img2mse = lambda x, y : torch.mean((x - y) ** 2) + self.acc_crit = torch.nn.functional.smooth_l1_loss + + def forward(self, batch): + ret = self.renderer.render(batch) + + scalar_stats = {} + loss = 0 + + mask = batch['mask_at_box'] + img_loss = self.img2mse(ret['rgb_map'][mask], batch['rgb'][mask]) + scalar_stats.update({'img_loss': img_loss}) + loss += img_loss + + if 'rgb0' in ret: + img_loss0 = self.img2mse(ret['rgb0'], batch['rgb']) + scalar_stats.update({'img_loss0': img_loss0}) + loss += img_loss0 + + scalar_stats.update({'loss': loss}) + image_stats = {} + + return ret, loss, scalar_stats, image_stats diff --git a/lib/train/trainers/tpose.py b/lib/train/trainers/tpose.py new file mode 100644 index 0000000000000000000000000000000000000000..f7405d14157fd69aad57b6336a3324bdad06f6b5 --- /dev/null +++ b/lib/train/trainers/tpose.py @@ -0,0 +1,37 @@ +import torch.nn as nn +from lib.config import cfg +import torch +from lib.networks.renderer import tpose_renderer +from lib.train import make_optimizer + + +class NetworkWrapper(nn.Module): + def __init__(self, net): + super(NetworkWrapper, self).__init__() + + self.net = net + self.renderer = tpose_renderer.Renderer(self.net) + + self.img2mse = lambda x, y : torch.mean((x - y) ** 2) + self.acc_crit = torch.nn.functional.smooth_l1_loss + + def forward(self, batch): + ret = self.renderer.render(batch) + + scalar_stats = {} + loss = 0 + + mask = batch['mask_at_box'] + img_loss = self.img2mse(ret['rgb_map'][mask], batch['rgb'][mask]) + scalar_stats.update({'img_loss': img_loss}) + loss += img_loss + + if 'rgb0' in ret: + img_loss0 = self.img2mse(ret['rgb0'], batch['rgb']) + scalar_stats.update({'img_loss0': img_loss0}) + loss += img_loss0 + + scalar_stats.update({'loss': loss}) + image_stats = {} + + return ret, loss, scalar_stats, image_stats diff --git a/lib/train/trainers/trainer.py b/lib/train/trainers/trainer.py new file mode 100644 index 0000000000000000000000000000000000000000..03e0dfa9b74462c23bb3b86990cf063198987a3f --- /dev/null +++ b/lib/train/trainers/trainer.py @@ -0,0 +1,113 @@ +import time +import datetime +import torch +import tqdm +from torch.nn import DataParallel +from lib.config import cfg + + +class Trainer(object): + def __init__(self, network): + device = torch.device('cuda:{}'.format(cfg.local_rank)) + network = network.to(device) + if cfg.distributed: + network = torch.nn.parallel.DistributedDataParallel( + network, + device_ids=[cfg.local_rank], + output_device=cfg.local_rank + ) + self.network = network + self.local_rank = cfg.local_rank + self.device = device + + def reduce_loss_stats(self, loss_stats): + reduced_losses = {k: torch.mean(v) for k, v in loss_stats.items()} + return reduced_losses + + 
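+ # move every tensor (or list/tuple of tensors) in the batch to the training device; the 'meta' entry is skipped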
def to_cuda(self, batch): + for k in batch: + if k == 'meta': + continue + if isinstance(batch[k], tuple) or isinstance(batch[k], list): + batch[k] = [b.to(self.device) for b in batch[k]] + else: + batch[k] = batch[k].to(self.device) + return batch + + def train(self, epoch, data_loader, optimizer, recorder): + max_iter = len(data_loader) + self.network.train() + end = time.time() + for iteration, batch in enumerate(data_loader): + data_time = time.time() - end + iteration = iteration + 1 + + batch = self.to_cuda(batch) + output, loss, loss_stats, image_stats = self.network(batch) + + # training stage: loss; optimizer; scheduler + optimizer.zero_grad() + loss = loss.mean() + loss.backward() + torch.nn.utils.clip_grad_value_(self.network.parameters(), 40) + optimizer.step() + + if cfg.local_rank > 0: + continue + + # data recording stage: loss_stats, time, image_stats + recorder.step += 1 + + loss_stats = self.reduce_loss_stats(loss_stats) + recorder.update_loss_stats(loss_stats) + + batch_time = time.time() - end + end = time.time() + recorder.batch_time.update(batch_time) + recorder.data_time.update(data_time) + + if iteration % cfg.log_interval == 0 or iteration == (max_iter - 1): + # print training state + eta_seconds = recorder.batch_time.global_avg * (max_iter - iteration) + eta_string = str(datetime.timedelta(seconds=int(eta_seconds))) + lr = optimizer.param_groups[0]['lr'] + memory = torch.cuda.max_memory_allocated() / 1024.0 / 1024.0 + + training_state = ' '.join(['eta: {}', '{}', 'lr: {:.6f}', 'max_mem: {:.0f}']) + training_state = training_state.format(eta_string, str(recorder), lr, memory) + print(training_state) + + if iteration % cfg.record_interval == 0 or iteration == (max_iter - 1): + # record loss_stats and image_dict + recorder.update_image_stats(image_stats) + recorder.record('train') + + def val(self, epoch, data_loader, evaluator=None, recorder=None): + self.network.eval() + torch.cuda.empty_cache() + val_loss_stats = {} + data_size = len(data_loader) + for batch in tqdm.tqdm(data_loader): + batch = self.to_cuda(batch) + with torch.no_grad(): + output, loss, loss_stats, image_stats = self.network(batch) + if evaluator is not None: + evaluator.evaluate(output, batch) + + loss_stats = self.reduce_loss_stats(loss_stats) + for k, v in loss_stats.items(): + val_loss_stats.setdefault(k, 0) + val_loss_stats[k] += v + + loss_state = [] + for k in val_loss_stats.keys(): + val_loss_stats[k] /= data_size + loss_state.append('{}: {:.4f}'.format(k, val_loss_stats[k])) + print(loss_state) + + if evaluator is not None: + result = evaluator.summarize() + val_loss_stats.update(result) + + if recorder: + recorder.record('val', epoch, val_loss_stats, image_stats) diff --git a/lib/utils/base_utils.py b/lib/utils/base_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..1432d682af0d366d9e876f9686b258d92439435f --- /dev/null +++ b/lib/utils/base_utils.py @@ -0,0 +1,46 @@ +import pickle +import os +import numpy as np + + +def read_pickle(pkl_path): + with open(pkl_path, 'rb') as f: + return pickle.load(f) + + +def save_pickle(data, pkl_path): + os.system('mkdir -p {}'.format(os.path.dirname(pkl_path))) + with open(pkl_path, 'wb') as f: + pickle.dump(data, f) + + +def project(xyz, K, RT): + """ + xyz: [N, 3] + K: [3, 3] + RT: [3, 4] + """ + xyz = np.dot(xyz, RT[:, :3].T) + RT[:, 3:].T + xyz = np.dot(xyz, K.T) + xy = xyz[:, :2] / xyz[:, 2:] + return xy + + +def write_K_pose_inf(K, poses, img_root): + K = K.copy() + K[:2] = K[:2] * 8 + K_inf = os.path.join(img_root, 
'Intrinsic.inf') + os.system('mkdir -p {}'.format(os.path.dirname(K_inf))) + with open(K_inf, 'w') as f: + for i in range(len(poses)): + f.write('%d\n'%i) + f.write('%f %f %f\n %f %f %f\n %f %f %f\n' % tuple(K.reshape(9).tolist())) + f.write('\n') + + pose_inf = os.path.join(img_root, 'CamPose.inf') + with open(pose_inf, 'w') as f: + for pose in poses: + pose = np.linalg.inv(pose) + A = pose[0:3,:] + tmp = np.concatenate([A[0:3,2].T, A[0:3,0].T,A[0:3,1].T,A[0:3,3].T]) + f.write('%f %f %f %f %f %f %f %f %f %f %f %f\n' % tuple(tmp.tolist())) diff --git a/lib/utils/blend_utils.py b/lib/utils/blend_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..c1679739ddc64f41961ca60ef9c3055bb644541e --- /dev/null +++ b/lib/utils/blend_utils.py @@ -0,0 +1,82 @@ +import torch +import torch.nn.functional as F +import numpy as np + + +def ppts_to_pts(ppts, bw, A): + """transform points from the pose space to the zero space""" + sh = ppts.shape + bw = bw.permute(0, 2, 1) + A = torch.bmm(bw, A.view(sh[0], 24, -1)) + A = A.view(sh[0], -1, 4, 4) + pts = ppts - A[..., :3, 3] + R_inv = torch.inverse(A[..., :3, :3]) + pts = torch.sum(R_inv * pts[:, :, None], dim=3) + return pts + + +def grid_sample_blend_weights(grid_coords, bw): + # the blend weight is indexed by xyz + grid_coords = grid_coords[:, None, None] + bw = F.grid_sample(bw, + grid_coords, + padding_mode='border', + align_corners=True) + bw = bw[:, :, 0, 0] + return bw + + +def bounds_grid_sample_blend_weights(pts, bw, bounds): + """grid sample blend weights""" + pts = pts.clone() + + # interpolate blend weights + min_xyz = bounds[:, 0] + max_xyz = bounds[:, 1] + bounds = max_xyz[:, None] - min_xyz[:, None] + grid_coords = (pts - min_xyz[:, None]) / bounds + grid_coords = grid_coords * 2 - 1 + # convert xyz to zyx, since the blend weight is indexed by xyz + grid_coords = grid_coords[..., [2, 1, 0]] + + # the blend weight is indexed by xyz + bw = bw.permute(0, 4, 1, 2, 3) + grid_coords = grid_coords[:, None, None] + bw = F.grid_sample(bw, + grid_coords, + padding_mode='border', + align_corners=True) + bw = bw[:, :, 0, 0] + + return bw + + +def grid_sample_A_blend_weights(nf_grid_coords, bw): + """ + nf_grid_coords: batch_size x N_samples x 24 x 3 + bw: batch_size x 24 x 64 x 64 x 64 + """ + bws = [] + for i in range(24): + nf_grid_coords_ = nf_grid_coords[:, :, i] + nf_grid_coords_ = nf_grid_coords_[:, None, None] + bw_ = F.grid_sample(bw[:, i:i + 1], + nf_grid_coords_, + padding_mode='border', + align_corners=True) + bw_ = bw_[:, :, 0, 0] + bws.append(bw_) + bw = torch.cat(bws, dim=1) + return bw + + +def ppts_to_pts(pts, bw, A): + """transform points from the pose space to the t pose""" + sh = pts.shape + bw = bw.permute(0, 2, 1) + A = torch.bmm(bw, A.view(sh[0], 24, -1)) + A = A.view(sh[0], -1, 4, 4) + pts = pts - A[..., :3, 3] + R_inv = torch.inverse(A[..., :3, :3]) + pts = torch.sum(R_inv * pts[:, :, None], dim=3) + return pts diff --git a/lib/utils/data_utils.py b/lib/utils/data_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..8a701747a1e7d27e8c62390d6f0731113d51fa10 --- /dev/null +++ b/lib/utils/data_utils.py @@ -0,0 +1,394 @@ +import numpy as np +import cv2 +import random +from torch import nn +import torch +from imgaug import augmenters as iaa +from lib.config import cfg +from plyfile import PlyData + + +def gaussian_radius(det_size, min_overlap=0.7): + height, width = det_size + + a1 = 1 + b1 = (height + width) + c1 = width * height * (1 - min_overlap) / (1 + min_overlap) + sq1 = np.sqrt(b1 
** 2 - 4 * a1 * c1) + r1 = (b1 + sq1) / 2 + + a2 = 4 + b2 = 2 * (height + width) + c2 = (1 - min_overlap) * width * height + sq2 = np.sqrt(b2 ** 2 - 4 * a2 * c2) + r2 = (b2 + sq2) / 2 + + a3 = 4 * min_overlap + b3 = -2 * min_overlap * (height + width) + c3 = (min_overlap - 1) * width * height + if b3 ** 2 - 4 * a3 * c3 < 0: + r3 = min(r1, r2) + else: + sq3 = np.sqrt(b3 ** 2 - 4 * a3 * c3) + r3 = (b3 + sq3) / 2 + return min(r1, r2, r3) + + +def gaussian2D(shape, sigma=(1, 1), rho=0): + if not isinstance(sigma, tuple): + sigma = (sigma, sigma) + sigma_x, sigma_y = sigma + + m, n = [(ss - 1.) / 2. for ss in shape] + y, x = np.ogrid[-m:m+1, -n:n+1] + + energy = (x * x) / (sigma_x * sigma_x) - 2 * rho * x * y / (sigma_x * sigma_y) + (y * y) / (sigma_y * sigma_y) + h = np.exp(-energy / (2 * (1 - rho * rho))) + h[h < np.finfo(h.dtype).eps * h.max()] = 0 + return h + + +def draw_umich_gaussian(heatmap, center, radius, k=1): + diameter = 2 * radius + 1 + gaussian = gaussian2D((diameter, diameter), sigma=diameter / 6) + + x, y = int(center[0]), int(center[1]) + + height, width = heatmap.shape[0:2] + + left, right = min(x, radius), min(width - x, radius + 1) + top, bottom = min(y, radius), min(height - y, radius + 1) + + masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right] + masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right] + if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0: # TODO debug + np.maximum(masked_heatmap, masked_gaussian * k, out=masked_heatmap) + return heatmap + + +def draw_distribution(heatmap, center, sigma_x, sigma_y, rho, radius, k=1): + diameter = 2 * radius + 1 + gaussian = gaussian2D((diameter, diameter), (sigma_x/3, sigma_y/3), rho) + + x, y = int(center[0]), int(center[1]) + + height, width = heatmap.shape[0:2] + + left, right = min(x, radius), min(width - x, radius + 1) + top, bottom = min(y, radius), min(height - y, radius + 1) + + masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right] + masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right] + if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0: # TODO debug + np.maximum(masked_heatmap, masked_gaussian * k, out=masked_heatmap) + return heatmap + + +def draw_heatmap_np(hm, point, box_size): + """point: [x, y]""" + # radius = gaussian_radius(box_size) + radius = box_size[0] + radius = max(0, int(radius)) + ct_int = np.array(point, dtype=np.int32) + draw_umich_gaussian(hm, ct_int, radius) + return hm + + +def get_edge(mask): + kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)) + return mask - cv2.erode(mask, kernel) + + +def compute_gaussian_1d(dmap, sigma=1): + """dmap: each entry means a distance""" + prob = np.exp(-dmap / (2 * sigma * sigma)) + prob[prob < np.finfo(prob.dtype).eps * prob.max()] = 0 + return prob + + +def get_3rd_point(a, b): + direct = a - b + return b + np.array([-direct[1], direct[0]], dtype=np.float32) + + +def get_dir(src_point, rot_rad): + sn, cs = np.sin(rot_rad), np.cos(rot_rad) + + src_result = [0, 0] + src_result[0] = src_point[0] * cs - src_point[1] * sn + src_result[1] = src_point[0] * sn + src_point[1] * cs + + return src_result + + +def get_affine_transform(center, + scale, + rot, + output_size, + shift=np.array([0, 0], dtype=np.float32), + inv=0): + if not isinstance(scale, np.ndarray) and not isinstance(scale, list): + scale = np.array([scale, scale], dtype=np.float32) + + scale_tmp = scale + src_w = scale_tmp[0] + dst_w = output_size[0] + dst_h = output_size[1] + 
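+    # Build three point correspondences (the box center, a point offset by the
+    # rotated scale vector, and a third point constructed to be perpendicular),
+    # then let cv2.getAffineTransform solve for the 2x3 warp; inv=1 returns the
+    # inverse mapping instead.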
+ rot_rad = np.pi * rot / 180 + src_dir = get_dir([0, src_w * -0.5], rot_rad) + dst_dir = np.array([0, dst_w * -0.5], np.float32) + + src = np.zeros((3, 2), dtype=np.float32) + dst = np.zeros((3, 2), dtype=np.float32) + src[0, :] = center + scale_tmp * shift + src[1, :] = center + src_dir + scale_tmp * shift + dst[0, :] = [dst_w * 0.5, dst_h * 0.5] + dst[1, :] = np.array([dst_w * 0.5, dst_h * 0.5], np.float32) + dst_dir + + src[2:, :] = get_3rd_point(src[0, :], src[1, :]) + dst[2:, :] = get_3rd_point(dst[0, :], dst[1, :]) + + if inv: + trans = cv2.getAffineTransform(np.float32(dst), np.float32(src)) + else: + trans = cv2.getAffineTransform(np.float32(src), np.float32(dst)) + + return trans + + +def affine_transform(pt, t): + """pt: [n, 2]""" + new_pt = np.dot(np.array(pt), t[:, :2].T) + t[:, 2] + return new_pt + + +def homography_transform(pt, H): + """pt: [n, 2]""" + pt = np.concatenate([pt, np.ones([len(pt), 1])], axis=1) + pt = np.dot(pt, H.T) + pt = pt[..., :2] / pt[..., 2:] + return pt + + +def get_border(border, size): + i = 1 + while np.any(size - border // i <= border // i): + i *= 2 + return border // i + + +def grayscale(image): + return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) + + +def lighting_(data_rng, image, alphastd, eigval, eigvec): + alpha = data_rng.normal(scale=alphastd, size=(3, )) + image += np.dot(eigvec, eigval * alpha) + + +def blend_(alpha, image1, image2): + image1 *= alpha + image2 *= (1 - alpha) + image1 += image2 + + +def saturation_(data_rng, image, gs, gs_mean, var): + alpha = 1. + data_rng.uniform(low=-var, high=var) + blend_(alpha, image, gs[:, :, None]) + + +def brightness_(data_rng, image, gs, gs_mean, var): + alpha = 1. + data_rng.uniform(low=-var, high=var) + image *= alpha + + +def contrast_(data_rng, image, gs, gs_mean, var): + alpha = 1. 
+ data_rng.uniform(low=-var, high=var) + blend_(alpha, image, gs_mean) + + +def color_aug(data_rng, image, eig_val, eig_vec): + functions = [brightness_, contrast_, saturation_] + random.shuffle(functions) + + gs = grayscale(image) + gs_mean = gs.mean() + for f in functions: + f(data_rng, image, gs, gs_mean, 0.4) + lighting_(data_rng, image, 0.1, eig_val, eig_vec) + + +def blur_aug(inp): + if np.random.random() < 0.1: + if np.random.random() < 0.8: + inp = iaa.blur_gaussian_(inp, abs(np.clip(np.random.normal(0, 1.5), -3, 3))) + else: + inp = iaa.MotionBlur((3, 15), (-45, 45))(images=[inp])[0] + + +def gaussian_blur(image, sigma): + from scipy import ndimage + if image.ndim == 2: + image[:, :] = ndimage.gaussian_filter(image[:, :], sigma, mode="mirror") + else: + nb_channels = image.shape[2] + for channel in range(nb_channels): + image[:, :, channel] = ndimage.gaussian_filter(image[:, :, channel], sigma, mode="mirror") + + +def inter_from_mask(pred, gt): + pred = pred.astype(np.bool) + gt = gt.astype(np.bool) + intersection = np.logical_and(gt, pred).sum() + return intersection + + +def draw_poly(mask, poly): + cv2.fillPoly(mask, [poly], 255) + return mask + + +def inter_from_poly(poly, gt, width, height): + mask_small = np.zeros((1, height, width), dtype=np.uint8) + mask_small = draw_poly(mask_small, poly) + mask_gt = gt[..., 0] + + return inter_from_mask(mask_small, mask_gt) + + +def inter_from_polys(poly, w, h, gt_mask): + inter = inter_from_poly(poly, gt_mask, w, h) + if inter > 0: + return False + return True + + +def select_point(shape, poly, gt_mask): + for i in range(cfg.max_iter): + y = np.random.randint(shape[0] - poly['bbox'][3]) + x = np.random.randint(shape[1] - poly['bbox'][2]) + delta = np.array([poly['bbox'][0] - x, poly['bbox'][1] - y]) + poly_move = np.array(poly['poly']) - delta + inter = inter_from_polys(poly_move, shape[1], shape[0], gt_mask) + if inter: + return x, y + x, y = -1, -1 + return x, y + + +def transform_small_gt(poly, box, x, y): + delta = np.array([poly['bbox'][0] - x, poly['bbox'][1] - y]) + poly['poly'] -= delta + box[:2] -= delta + box[2:] -= delta + return poly, box + + +def get_mask_img(img, poly): + mask = np.zeros(img.shape[:2])[..., np.newaxis] + cv2.fillPoly(mask, [np.round(poly['poly']).astype(int)], 1) + poly_img = img * mask + mask = mask[..., 0] + return poly_img, mask + + +def add_small_obj(img, gt_mask, poly, box, polys_gt): + poly_img, mask = get_mask_img(img, poly) + x, y = select_point(img.shape, poly.copy(), gt_mask) + if x == -1: + box = [] + return img, poly, box + poly, box = transform_small_gt(poly, box, x, y) + _, mask_ori = get_mask_img(img, poly) + gt_mask += mask_ori[..., np.newaxis] + img[mask_ori == 1] = poly_img[mask == 1] + return img, poly, box[np.newaxis, :], gt_mask + + +def get_gt_mask(img, poly): + mask = np.zeros(img.shape[:2])[..., np.newaxis] + for i in range(len(poly)): + for j in range(len(poly[i])): + cv2.fillPoly(mask, [np.round(poly[i][j]['poly']).astype(int)], 1) + return mask + + +def small_aug(img, poly, box, label, num): + N = len(poly) + gt_mask = get_gt_mask(img, poly) + for i in range(N): + if len(poly[i]) > 1: + continue + if poly[i][0]['area'] < 32*32: + for k in range(num): + img, poly_s, box_s, gt_mask = add_small_obj(img, gt_mask, poly[i][0].copy(), box[i].copy(), poly) + if len(box_s) == 0: + continue + poly.append([poly_s]) + box = np.concatenate((box, box_s)) + label.append(label[i]) + return img, poly, box, label + + +def truncated_normal(mean, sigma, low, high, data_rng=None): + if data_rng is 
None: + data_rng = np.random.RandomState() + value = data_rng.normal(mean, sigma) + return np.clip(value, low, high) + + +def _nms(heat, kernel=3): + """heat: [b, c, h, w]""" + pad = (kernel - 1) // 2 + + # find the local minimum of heat within the neighborhood kernel x kernel + hmax = nn.functional.max_pool2d( + heat, (kernel, kernel), stride=1, padding=pad) + keep = (hmax == heat).float() + return heat * keep + + +def _gather_feat(feat, ind, mask=None): + dim = feat.size(2) + ind = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), dim) + feat = feat.gather(1, ind) + if mask is not None: + mask = mask.unsqueeze(2).expand_as(feat) + feat = feat[mask] + feat = feat.view(-1, dim) + return feat + + +def _topk(scores, K=40): + batch, cat, height, width = scores.size() + + topk_scores, topk_inds = torch.topk(scores.view(batch, cat, -1), K) + + topk_inds = topk_inds % (height * width) + topk_ys = (topk_inds / width).int().float() + topk_xs = (topk_inds % width).int().float() + + topk_score, topk_ind = torch.topk(topk_scores.view(batch, -1), K) + topk_clses = (topk_ind / K).int() + topk_inds = _gather_feat( + topk_inds.view(batch, -1, 1), topk_ind).view(batch, K) + topk_ys = _gather_feat(topk_ys.view(batch, -1, 1), topk_ind).view(batch, K) + topk_xs = _gather_feat(topk_xs.view(batch, -1, 1), topk_ind).view(batch, K) + + return topk_score, topk_inds, topk_clses, topk_ys, topk_xs + + +def clip_to_image(bbox, h, w): + bbox[..., :2] = torch.clamp(bbox[..., :2], min=0) + bbox[..., 2] = torch.clamp(bbox[..., 2], max=w-1) + bbox[..., 3] = torch.clamp(bbox[..., 3], max=h-1) + return bbox + + +def load_ply(path): + ply = PlyData.read(path) + data = ply.elements[0].data + x, y, z = data['x'], data['y'], data['z'] + model = np.stack([x, y, z], axis=-1) + return model diff --git a/lib/utils/if_nerf/if_nerf_data_utils.py b/lib/utils/if_nerf/if_nerf_data_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..47bb233c4e8dc0589d93b16c7e4f2a8088077d7f --- /dev/null +++ b/lib/utils/if_nerf/if_nerf_data_utils.py @@ -0,0 +1,406 @@ +import numpy as np +from lib.utils import base_utils +import cv2 +from lib.config import cfg +import trimesh + + +def get_rays(H, W, K, R, T): + # calculate the camera origin + rays_o = -np.dot(R.T, T).ravel() + # calculate the world coodinates of pixels + i, j = np.meshgrid(np.arange(W, dtype=np.float32), + np.arange(H, dtype=np.float32), + indexing='xy') + xy1 = np.stack([i, j, np.ones_like(i)], axis=2) + pixel_camera = np.dot(xy1, np.linalg.inv(K).T) + pixel_world = np.dot(pixel_camera - T.ravel(), R) + # calculate the ray direction + rays_d = pixel_world - rays_o[None, None] + rays_o = np.broadcast_to(rays_o, rays_d.shape) + return rays_o, rays_d + + +def get_bound_corners(bounds): + min_x, min_y, min_z = bounds[0] + max_x, max_y, max_z = bounds[1] + corners_3d = np.array([ + [min_x, min_y, min_z], + [min_x, min_y, max_z], + [min_x, max_y, min_z], + [min_x, max_y, max_z], + [max_x, min_y, min_z], + [max_x, min_y, max_z], + [max_x, max_y, min_z], + [max_x, max_y, max_z], + ]) + return corners_3d + + +def get_bound_2d_mask(bounds, K, pose, H, W): + corners_3d = get_bound_corners(bounds) + corners_2d = base_utils.project(corners_3d, K, pose) + corners_2d = np.round(corners_2d).astype(int) + mask = np.zeros((H, W), dtype=np.uint8) + cv2.fillPoly(mask, [corners_2d[[0, 1, 3, 2, 0]]], 1) + cv2.fillPoly(mask, [corners_2d[[4, 5, 7, 6, 5]]], 1) + cv2.fillPoly(mask, [corners_2d[[0, 1, 5, 4, 0]]], 1) + cv2.fillPoly(mask, [corners_2d[[2, 3, 7, 6, 2]]], 1) + 
cv2.fillPoly(mask, [corners_2d[[0, 2, 6, 4, 0]]], 1) + cv2.fillPoly(mask, [corners_2d[[1, 3, 7, 5, 1]]], 1) + return mask + + +def get_near_far(bounds, ray_o, ray_d): + """calculate intersections with 3d bounding box""" + norm_d = np.linalg.norm(ray_d, axis=-1, keepdims=True) + viewdir = ray_d / norm_d + viewdir[(viewdir < 1e-5) & (viewdir > -1e-10)] = 1e-5 + viewdir[(viewdir > -1e-5) & (viewdir < 1e-10)] = -1e-5 + tmin = (bounds[:1] - ray_o[:1]) / viewdir + tmax = (bounds[1:2] - ray_o[:1]) / viewdir + t1 = np.minimum(tmin, tmax) + t2 = np.maximum(tmin, tmax) + near = np.max(t1, axis=-1) + far = np.min(t2, axis=-1) + mask_at_box = near < far + near = near[mask_at_box] / norm_d[mask_at_box, 0] + far = far[mask_at_box] / norm_d[mask_at_box, 0] + return near, far, mask_at_box + + +def sample_ray(img, msk, K, R, T, bounds, nrays, split): + H, W = img.shape[:2] + ray_o, ray_d = get_rays(H, W, K, R, T) + + pose = np.concatenate([R, T], axis=1) + bound_mask = get_bound_2d_mask(bounds, K, pose, H, W) + + msk = msk * bound_mask + + if split == 'train': + nsampled_rays = 0 + face_sample_ratio = cfg.face_sample_ratio + body_sample_ratio = cfg.body_sample_ratio + ray_o_list = [] + ray_d_list = [] + rgb_list = [] + near_list = [] + far_list = [] + coord_list = [] + mask_at_box_list = [] + + while nsampled_rays < nrays: + n_body = int((nrays - nsampled_rays) * body_sample_ratio) + n_face = int((nrays - nsampled_rays) * face_sample_ratio) + n_rand = (nrays - nsampled_rays) - n_body - n_face + + # sample rays on body + coord_body = np.argwhere(msk != 0) + coord_body = coord_body[np.random.randint(0, len(coord_body), + n_body)] + # sample rays on face + coord_face = np.argwhere(msk == 13) + if len(coord_face) > 0: + coord_face = coord_face[np.random.randint( + 0, len(coord_face), n_face)] + # sample rays in the bound mask + coord = np.argwhere(bound_mask == 1) + coord = coord[np.random.randint(0, len(coord), n_rand)] + + if len(coord_face) > 0: + coord = np.concatenate([coord_body, coord_face, coord], axis=0) + else: + coord = np.concatenate([coord_body, coord], axis=0) + + ray_o_ = ray_o[coord[:, 0], coord[:, 1]] + ray_d_ = ray_d[coord[:, 0], coord[:, 1]] + rgb_ = img[coord[:, 0], coord[:, 1]] + + near_, far_, mask_at_box = get_near_far(bounds, ray_o_, ray_d_) + + ray_o_list.append(ray_o_[mask_at_box]) + ray_d_list.append(ray_d_[mask_at_box]) + rgb_list.append(rgb_[mask_at_box]) + near_list.append(near_) + far_list.append(far_) + coord_list.append(coord[mask_at_box]) + mask_at_box_list.append(mask_at_box[mask_at_box]) + nsampled_rays += len(near_) + + ray_o = np.concatenate(ray_o_list).astype(np.float32) + ray_d = np.concatenate(ray_d_list).astype(np.float32) + rgb = np.concatenate(rgb_list).astype(np.float32) + near = np.concatenate(near_list).astype(np.float32) + far = np.concatenate(far_list).astype(np.float32) + coord = np.concatenate(coord_list) + mask_at_box = np.concatenate(mask_at_box_list) + else: + rgb = img.reshape(-1, 3).astype(np.float32) + ray_o = ray_o.reshape(-1, 3).astype(np.float32) + ray_d = ray_d.reshape(-1, 3).astype(np.float32) + near, far, mask_at_box = get_near_far(bounds, ray_o, ray_d) + near = near.astype(np.float32) + far = far.astype(np.float32) + rgb = rgb[mask_at_box] + ray_o = ray_o[mask_at_box] + ray_d = ray_d[mask_at_box] + coord = np.zeros([len(rgb), 2]).astype(np.int64) + + return rgb, ray_o, ray_d, near, far, coord, mask_at_box + + +def sample_ray_h36m(img, msk, K, R, T, bounds, nrays, split): + H, W = img.shape[:2] + ray_o, ray_d = get_rays(H, W, K, R, T) + + pose = 
np.concatenate([R, T], axis=1) + bound_mask = get_bound_2d_mask(bounds, K, pose, H, W) + + msk = msk * bound_mask + bound_mask[msk == 100] = 0 + + if split == 'train': + nsampled_rays = 0 + face_sample_ratio = cfg.face_sample_ratio + body_sample_ratio = cfg.body_sample_ratio + ray_o_list = [] + ray_d_list = [] + rgb_list = [] + near_list = [] + far_list = [] + coord_list = [] + mask_at_box_list = [] + + while nsampled_rays < nrays: + n_body = int((nrays - nsampled_rays) * body_sample_ratio) + n_face = int((nrays - nsampled_rays) * face_sample_ratio) + n_rand = (nrays - nsampled_rays) - n_body - n_face + + # sample rays on body + coord_body = np.argwhere(msk == 1) + coord_body = coord_body[np.random.randint(0, len(coord_body), + n_body)] + # sample rays on face + coord_face = np.argwhere(msk == 13) + if len(coord_face) > 0: + coord_face = coord_face[np.random.randint( + 0, len(coord_face), n_face)] + # sample rays in the bound mask + coord = np.argwhere(bound_mask == 1) + coord = coord[np.random.randint(0, len(coord), n_rand)] + + if len(coord_face) > 0: + coord = np.concatenate([coord_body, coord_face, coord], axis=0) + else: + coord = np.concatenate([coord_body, coord], axis=0) + + ray_o_ = ray_o[coord[:, 0], coord[:, 1]] + ray_d_ = ray_d[coord[:, 0], coord[:, 1]] + rgb_ = img[coord[:, 0], coord[:, 1]] + + near_, far_, mask_at_box = get_near_far(bounds, ray_o_, ray_d_) + + ray_o_list.append(ray_o_[mask_at_box]) + ray_d_list.append(ray_d_[mask_at_box]) + rgb_list.append(rgb_[mask_at_box]) + near_list.append(near_) + far_list.append(far_) + coord_list.append(coord[mask_at_box]) + mask_at_box_list.append(mask_at_box[mask_at_box]) + nsampled_rays += len(near_) + + ray_o = np.concatenate(ray_o_list).astype(np.float32) + ray_d = np.concatenate(ray_d_list).astype(np.float32) + rgb = np.concatenate(rgb_list).astype(np.float32) + near = np.concatenate(near_list).astype(np.float32) + far = np.concatenate(far_list).astype(np.float32) + coord = np.concatenate(coord_list) + mask_at_box = np.concatenate(mask_at_box_list) + else: + rgb = img.reshape(-1, 3).astype(np.float32) + ray_o = ray_o.reshape(-1, 3).astype(np.float32) + ray_d = ray_d.reshape(-1, 3).astype(np.float32) + near, far, mask_at_box = get_near_far(bounds, ray_o, ray_d) + near = near.astype(np.float32) + far = far.astype(np.float32) + rgb = rgb[mask_at_box] + ray_o = ray_o[mask_at_box] + ray_d = ray_d[mask_at_box] + coord = np.zeros([len(rgb), 2]).astype(np.int64) + + return rgb, ray_o, ray_d, near, far, coord, mask_at_box + + +def get_smpl_data(ply_path): + ply = trimesh.load(ply_path) + xyz = np.array(ply.vertices) + nxyz = np.array(ply.vertex_normals) + + if cfg.add_pointcloud: + # add random points + xyz_, ind_ = trimesh.sample.sample_surface_even(ply, 5000) + nxyz_ = ply.face_normals[ind_] + xyz = np.concatenate([xyz, xyz_], axis=0) + nxyz = np.concatenate([nxyz, nxyz_], axis=0) + + xyz = xyz.astype(np.float32) + nxyz = nxyz.astype(np.float32) + + return xyz, nxyz + + +def get_acc(coord, msk): + border = 25 + kernel = np.ones((border, border), np.uint8) + msk = cv2.dilate(msk.copy(), kernel) + acc = msk[coord[:, 0], coord[:, 1]] + acc = (acc != 0).astype(np.uint8) + return acc + + +def rotate_smpl(xyz, nxyz, t): + """ + t: rotation angle + """ + xyz = xyz.copy() + nxyz = nxyz.copy() + center = (np.min(xyz, axis=0) + np.max(xyz, axis=0)) / 2 + xyz = xyz - center + R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]) + R = R.astype(np.float32) + xyz[:, :2] = np.dot(xyz[:, :2], R.T) + xyz = xyz + center + # nxyz[:, :2] = 
np.dot(nxyz[:, :2], R.T) + return xyz, nxyz, center + + +def transform_can_smpl(xyz): + center = np.array([0, 0, 0]).astype(np.float32) + rot = np.array([[np.cos(0), -np.sin(0)], [np.sin(0), np.cos(0)]]) + rot = rot.astype(np.float32) + trans = np.array([0, 0, 0]).astype(np.float32) + if np.random.uniform() > cfg.rot_ratio: + return xyz, center, rot, trans + + xyz = xyz.copy() + + # rotate the smpl + rot_range = np.pi / 32 + t = np.random.uniform(-rot_range, rot_range) + rot = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]) + rot = rot.astype(np.float32) + center = np.mean(xyz, axis=0) + xyz = xyz - center + xyz[:, [0, 2]] = np.dot(xyz[:, [0, 2]], rot.T) + xyz = xyz + center + + # translate the smpl + x_range = 0.05 + z_range = 0.025 + x_trans = np.random.uniform(-x_range, x_range) + z_trans = np.random.uniform(-z_range, z_range) + trans = np.array([x_trans, 0, z_trans]).astype(np.float32) + xyz = xyz + trans + + return xyz, center, rot, trans + + +def unproject(depth, K, R, T): + H, W = depth.shape + i, j = np.meshgrid(np.arange(W, dtype=np.float32), + np.arange(H, dtype=np.float32), + indexing='xy') + xy1 = np.stack([i, j, np.ones_like(i)], axis=2) + xyz = xy1 * depth[..., None] + pts3d = np.dot(xyz, np.linalg.inv(K).T) + pts3d = np.dot(pts3d - T.ravel(), R) + return pts3d + + +def sample_world_points(ray_o, ray_d, near, far, split): + # calculate the steps for each ray + t_vals = np.linspace(0., 1., num=cfg.N_samples) + z_vals = near[..., None] * (1. - t_vals) + far[..., None] * t_vals + + if cfg.perturb > 0. and split == 'train': + # get intervals between samples + mids = .5 * (z_vals[..., 1:] + z_vals[..., :-1]) + upper = np.concatenate([mids, z_vals[..., -1:]], -1) + lower = np.concatenate([z_vals[..., :1], mids], -1) + # stratified samples in those intervals + t_rand = np.random.rand(*z_vals.shape) + z_vals = lower + (upper - lower) * t_rand + + pts = ray_o[:, None] + ray_d[:, None] * z_vals[..., None] + pts = pts.astype(np.float32) + z_vals = z_vals.astype(np.float32) + + return pts, z_vals + + +def barycentric_interpolation(val, coords): + """ + :param val: verts x 3 x d input matrix + :param coords: verts x 3 barycentric weights array + :return: verts x d weighted matrix + """ + t = val * coords[..., np.newaxis] + ret = t.sum(axis=1) + return ret + + +def batch_rodrigues(poses): + """ poses: N x 3 + """ + batch_size = poses.shape[0] + angle = np.linalg.norm(poses + 1e-8, axis=1, keepdims=True) + rot_dir = poses / angle + + cos = np.cos(angle)[:, None] + sin = np.sin(angle)[:, None] + + rx, ry, rz = np.split(rot_dir, 3, axis=1) + zeros = np.zeros([batch_size, 1]) + K = np.concatenate([zeros, -rz, ry, rz, zeros, -rx, -ry, rx, zeros], axis=1) + K = K.reshape([batch_size, 3, 3]) + + ident = np.eye(3)[None] + rot_mat = ident + sin * K + (1 - cos) * np.matmul(K, K) + + return rot_mat + + +def get_rigid_transformation(poses, joints, parents): + """ + poses: 24 x 3 + joints: 24 x 3 + parents: 24 + """ + rot_mats = batch_rodrigues(poses) + + # obtain the relative joints + rel_joints = joints.copy() + rel_joints[1:] -= joints[parents[1:]] + + # create the transformation matrix + transforms_mat = np.concatenate([rot_mats, rel_joints[..., None]], axis=2) + padding = np.zeros([24, 1, 4]) + padding[..., 3] = 1 + transforms_mat = np.concatenate([transforms_mat, padding], axis=1) + + # rotate each part + transform_chain = [transforms_mat[0]] + for i in range(1, parents.shape[0]): + curr_res = np.dot(transform_chain[parents[i]], transforms_mat[i]) + transform_chain.append(curr_res) + 
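+    # Stack the 24 chained joint transforms, then subtract the rotated joint
+    # locations from the translation column so each 4x4 matrix maps rest-pose
+    # points directly to their posed positions (removing the rest-pose offset).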
transforms = np.stack(transform_chain, axis=0) + + # obtain the rigid transformation + padding = np.zeros([24, 1]) + joints_homogen = np.concatenate([joints, padding], axis=1) + transformed_joints = np.sum(transforms * joints_homogen[:, None], axis=2) + transforms[..., 3] = transforms[..., 3] - transformed_joints + transforms = transforms.astype(np.float32) + + return transforms diff --git a/lib/utils/if_nerf/if_nerf_net_utils.py b/lib/utils/if_nerf/if_nerf_net_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..8676b1ac587fb3ab9969ea717fa556a5a14603ad --- /dev/null +++ b/lib/utils/if_nerf/if_nerf_net_utils.py @@ -0,0 +1,57 @@ +import torch +import numpy as np +import os +from lib.config import cfg +import trimesh + + +def update_loss_img(output, batch): + mse = torch.mean((output['rgb_map'] - batch['rgb'])**2, dim=2)[0] + mse = mse.detach().cpu().numpy().astype(np.float32) + + # load the loss img + img_path = batch['meta']['img_path'][0] + paths = img_path.split('/') + paths[-1] = os.path.basename(img_path).replace('.jpg', '.npy') + loss_img_path = os.path.join(paths[0], 'loss', *paths[1:]) + if os.path.exists(loss_img_path): + loss_img = np.load(loss_img_path) + else: + os.system("mkdir -p '{}'".format(os.path.dirname(loss_img_path))) + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + loss_img = mse.mean() * np.ones([H, W]).astype(np.float32) + + coord = batch['img_coord'][0] + coord = coord.detach().cpu().numpy() + loss_img[coord[:, 0], coord[:, 1]] = mse + np.save(loss_img_path, loss_img) + + +def init_smpl(smpl): + data_root = 'data/light_stage' + smpl_dir = os.path.join(data_root, cfg.smpl, cfg.human) + for i in range(cfg.ni): + smpl_path = os.path.join(smpl_dir, '{}.ply'.format(i + 1)) + ply = trimesh.load(smpl_path) + xyz = np.array(ply.vertices).ravel() + smpl.weight.data[i] = torch.FloatTensor(xyz) + return smpl + + +def pts_to_can_pts(pts, batch): + """transform pts from the world coordinate to the smpl coordinate""" + Th = batch['Th'] + pts = pts - Th + R = batch['R'] + pts = torch.matmul(pts, batch['R']) + return pts + + +def pts_to_coords(pts, min_xyz): + pts = pts.clone().detach() + # convert xyz to the voxel coordinate dhw + dhw = pts[..., [2, 1, 0]] + min_dhw = min_xyz[:, [2, 1, 0]] + dhw = dhw - min_dhw[:, None] + dhw = dhw / torch.tensor(cfg.voxel_size).to(dhw) + return dhw diff --git a/lib/utils/if_nerf/voxels.py b/lib/utils/if_nerf/voxels.py new file mode 100644 index 0000000000000000000000000000000000000000..0ffc48c71eaac5b725beb723d08c19c9a53502dd --- /dev/null +++ b/lib/utils/if_nerf/voxels.py @@ -0,0 +1,196 @@ +import numpy as np +import trimesh + + +class VoxelGrid: + def __init__(self, data, loc=(0., 0., 0.), scale=1): + assert(data.shape[0] == data.shape[1] == data.shape[2]) + data = np.asarray(data, dtype=np.bool) + loc = np.asarray(loc) + self.data = data + self.loc = loc + self.scale = scale + + @classmethod + def from_mesh(cls, mesh, resolution, loc=None, scale=None, method='ray'): + bounds = mesh.bounds + # Default location is center + if loc is None: + loc = (bounds[0] + bounds[1]) / 2 + + # Default scale, scales the mesh to [-0.45, 0.45]^3 + if scale is None: + scale = (bounds[1] - bounds[0]).max()/0.9 + + loc = np.asarray(loc) + scale = float(scale) + + # Transform mesh + mesh = mesh.copy() + mesh.apply_translation(-loc) + mesh.apply_scale(1/scale) + + # Apply method + if method == 'ray': + voxel_data = voxelize_ray(mesh, resolution) + elif method == 'fill': + voxel_data = voxelize_fill(mesh, resolution) + + voxels = 
cls(voxel_data, loc, scale) + return voxels + + def down_sample(self, factor=2): + if not (self.resolution % factor) == 0: + raise ValueError('Resolution must be divisible by factor.') + new_data = block_reduce(self.data, (factor,) * 3, np.max) + return VoxelGrid(new_data, self.loc, self.scale) + + def to_mesh(self): + # Shorthand + occ = self.data + + # Shape of voxel grid + nx, ny, nz = occ.shape + # Shape of corresponding occupancy grid + grid_shape = (nx + 1, ny + 1, nz + 1) + + # Convert values to occupancies + occ = np.pad(occ, 1, 'constant') + + # Determine if face present + f1_r = (occ[:-1, 1:-1, 1:-1] & ~occ[1:, 1:-1, 1:-1]) + f2_r = (occ[1:-1, :-1, 1:-1] & ~occ[1:-1, 1:, 1:-1]) + f3_r = (occ[1:-1, 1:-1, :-1] & ~occ[1:-1, 1:-1, 1:]) + + f1_l = (~occ[:-1, 1:-1, 1:-1] & occ[1:, 1:-1, 1:-1]) + f2_l = (~occ[1:-1, :-1, 1:-1] & occ[1:-1, 1:, 1:-1]) + f3_l = (~occ[1:-1, 1:-1, :-1] & occ[1:-1, 1:-1, 1:]) + + f1 = f1_r | f1_l + f2 = f2_r | f2_l + f3 = f3_r | f3_l + + assert(f1.shape == (nx + 1, ny, nz)) + assert(f2.shape == (nx, ny + 1, nz)) + assert(f3.shape == (nx, ny, nz + 1)) + + # Determine if vertex present + v = np.full(grid_shape, False) + + v[:, :-1, :-1] |= f1 + v[:, :-1, 1:] |= f1 + v[:, 1:, :-1] |= f1 + v[:, 1:, 1:] |= f1 + + v[:-1, :, :-1] |= f2 + v[:-1, :, 1:] |= f2 + v[1:, :, :-1] |= f2 + v[1:, :, 1:] |= f2 + + v[:-1, :-1, :] |= f3 + v[:-1, 1:, :] |= f3 + v[1:, :-1, :] |= f3 + v[1:, 1:, :] |= f3 + + # Calculate indices for vertices + n_vertices = v.sum() + v_idx = np.full(grid_shape, -1) + v_idx[v] = np.arange(n_vertices) + + # Vertices + v_x, v_y, v_z = np.where(v) + v_x = v_x / nx - 0.5 + v_y = v_y / ny - 0.5 + v_z = v_z / nz - 0.5 + vertices = np.stack([v_x, v_y, v_z], axis=1) + + # Face indices + f1_l_x, f1_l_y, f1_l_z = np.where(f1_l) + f2_l_x, f2_l_y, f2_l_z = np.where(f2_l) + f3_l_x, f3_l_y, f3_l_z = np.where(f3_l) + + f1_r_x, f1_r_y, f1_r_z = np.where(f1_r) + f2_r_x, f2_r_y, f2_r_z = np.where(f2_r) + f3_r_x, f3_r_y, f3_r_z = np.where(f3_r) + + faces_1_l = np.stack([ + v_idx[f1_l_x, f1_l_y, f1_l_z], + v_idx[f1_l_x, f1_l_y, f1_l_z + 1], + v_idx[f1_l_x, f1_l_y + 1, f1_l_z + 1], + v_idx[f1_l_x, f1_l_y + 1, f1_l_z], + ], axis=1) + + faces_1_r = np.stack([ + v_idx[f1_r_x, f1_r_y, f1_r_z], + v_idx[f1_r_x, f1_r_y + 1, f1_r_z], + v_idx[f1_r_x, f1_r_y + 1, f1_r_z + 1], + v_idx[f1_r_x, f1_r_y, f1_r_z + 1], + ], axis=1) + + faces_2_l = np.stack([ + v_idx[f2_l_x, f2_l_y, f2_l_z], + v_idx[f2_l_x + 1, f2_l_y, f2_l_z], + v_idx[f2_l_x + 1, f2_l_y, f2_l_z + 1], + v_idx[f2_l_x, f2_l_y, f2_l_z + 1], + ], axis=1) + + faces_2_r = np.stack([ + v_idx[f2_r_x, f2_r_y, f2_r_z], + v_idx[f2_r_x, f2_r_y, f2_r_z + 1], + v_idx[f2_r_x + 1, f2_r_y, f2_r_z + 1], + v_idx[f2_r_x + 1, f2_r_y, f2_r_z], + ], axis=1) + + faces_3_l = np.stack([ + v_idx[f3_l_x, f3_l_y, f3_l_z], + v_idx[f3_l_x, f3_l_y + 1, f3_l_z], + v_idx[f3_l_x + 1, f3_l_y + 1, f3_l_z], + v_idx[f3_l_x + 1, f3_l_y, f3_l_z], + ], axis=1) + + faces_3_r = np.stack([ + v_idx[f3_r_x, f3_r_y, f3_r_z], + v_idx[f3_r_x + 1, f3_r_y, f3_r_z], + v_idx[f3_r_x + 1, f3_r_y + 1, f3_r_z], + v_idx[f3_r_x, f3_r_y + 1, f3_r_z], + ], axis=1) + + faces = np.concatenate([ + faces_1_l, faces_1_r, + faces_2_l, faces_2_r, + faces_3_l, faces_3_r, + ], axis=0) + + vertices = self.loc + self.scale * vertices + mesh = trimesh.Trimesh(vertices, faces, process=False) + return mesh + + @property + def resolution(self): + assert(self.data.shape[0] == self.data.shape[1] == self.data.shape[2]) + return self.data.shape[0] + + def contains(self, points): + nx = self.resolution + 
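+        # Map world-space points into the grid's [0, nx) index space and look
+        # up occupancy; anything falling outside the bounding box is reported
+        # as unoccupied (False).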
+ # Rescale bounding box to [-0.5, 0.5]^3 + points = (points - self.loc) / self.scale + # Discretize points to [0, nx-1]^3 + points_i = ((points + 0.5) * nx).astype(np.int32) + # i1, i2, i3 have sizes (batch_size, T) + i1, i2, i3 = points_i[..., 0], points_i[..., 1], points_i[..., 2] + # Only use indices inside bounding box + mask = ( + (i1 >= 0) & (i2 >= 0) & (i3 >= 0) + & (nx > i1) & (nx > i2) & (nx > i3) + ) + # Prevent out of bounds error + i1 = i1[mask] + i2 = i2[mask] + i3 = i3[mask] + + # Compute values, default value outside box is 0 + occ = np.zeros(points.shape[:-1], dtype=np.bool) + occ[mask] = self.data[i1, i2, i3] + + return occ diff --git a/lib/utils/img_utils.py b/lib/utils/img_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..09b747e1c5bd8555aded627d9d94ebcf87ff6d33 --- /dev/null +++ b/lib/utils/img_utils.py @@ -0,0 +1,157 @@ +import torch +from matplotlib import cm +import matplotlib.pyplot as plt +import matplotlib.patches as patches +import numpy as np +import cv2 + + +def unnormalize_img(img, mean, std): + """ + img: [3, h, w] + """ + img = img.detach().cpu().clone() + # img = img / 255. + img *= torch.tensor(std).view(3, 1, 1) + img += torch.tensor(mean).view(3, 1, 1) + min_v = torch.min(img) + img = (img - min_v) / (torch.max(img) - min_v) + return img + + +def bgr_to_rgb(img): + return img[:, :, [2, 1, 0]] + + +def horizon_concate(inp0, inp1): + h0, w0 = inp0.shape[:2] + h1, w1 = inp1.shape[:2] + if inp0.ndim == 3: + inp = np.zeros((max(h0, h1), w0 + w1, 3), dtype=inp0.dtype) + inp[:h0, :w0, :] = inp0 + inp[:h1, w0:(w0 + w1), :] = inp1 + else: + inp = np.zeros((max(h0, h1), w0 + w1), dtype=inp0.dtype) + inp[:h0, :w0] = inp0 + inp[:h1, w0:(w0 + w1)] = inp1 + return inp + + +def vertical_concate(inp0, inp1): + h0, w0 = inp0.shape[:2] + h1, w1 = inp1.shape[:2] + if inp0.ndim == 3: + inp = np.zeros((h0 + h1, max(w0, w1), 3), dtype=inp0.dtype) + inp[:h0, :w0, :] = inp0 + inp[h0:(h0 + h1), :w1, :] = inp1 + else: + inp = np.zeros((h0 + h1, max(w0, w1)), dtype=inp0.dtype) + inp[:h0, :w0] = inp0 + inp[h0:(h0 + h1), :w1] = inp1 + return inp + + +def transparent_cmap(cmap): + """Copy colormap and set alpha values""" + mycmap = cmap + mycmap._init() + mycmap._lut[:,-1] = 0.3 + return mycmap + +cmap = transparent_cmap(plt.get_cmap('jet')) + + +def set_grid(ax, h, w, interval=8): + ax.set_xticks(np.arange(0, w, interval)) + ax.set_yticks(np.arange(0, h, interval)) + ax.grid() + ax.set_yticklabels([]) + ax.set_xticklabels([]) + + +color_list = np.array( + [ + 0.000, 0.447, 0.741, + 0.850, 0.325, 0.098, + 0.929, 0.694, 0.125, + 0.494, 0.184, 0.556, + 0.466, 0.674, 0.188, + 0.301, 0.745, 0.933, + 0.635, 0.078, 0.184, + 0.300, 0.300, 0.300, + 0.600, 0.600, 0.600, + 1.000, 0.000, 0.000, + 1.000, 0.500, 0.000, + 0.749, 0.749, 0.000, + 0.000, 1.000, 0.000, + 0.000, 0.000, 1.000, + 0.667, 0.000, 1.000, + 0.333, 0.333, 0.000, + 0.333, 0.667, 0.000, + 0.333, 1.000, 0.000, + 0.667, 0.333, 0.000, + 0.667, 0.667, 0.000, + 0.667, 1.000, 0.000, + 1.000, 0.333, 0.000, + 1.000, 0.667, 0.000, + 1.000, 1.000, 0.000, + 0.000, 0.333, 0.500, + 0.000, 0.667, 0.500, + 0.000, 1.000, 0.500, + 0.333, 0.000, 0.500, + 0.333, 0.333, 0.500, + 0.333, 0.667, 0.500, + 0.333, 1.000, 0.500, + 0.667, 0.000, 0.500, + 0.667, 0.333, 0.500, + 0.667, 0.667, 0.500, + 0.667, 1.000, 0.500, + 1.000, 0.000, 0.500, + 1.000, 0.333, 0.500, + 1.000, 0.667, 0.500, + 1.000, 1.000, 0.500, + 0.000, 0.333, 1.000, + 0.000, 0.667, 1.000, + 0.000, 1.000, 1.000, + 0.333, 0.000, 1.000, + 0.333, 0.333, 1.000, + 
0.333, 0.667, 1.000, + 0.333, 1.000, 1.000, + 0.667, 0.000, 1.000, + 0.667, 0.333, 1.000, + 0.667, 0.667, 1.000, + 0.667, 1.000, 1.000, + 1.000, 0.000, 1.000, + 1.000, 0.333, 1.000, + 1.000, 0.667, 1.000, + 0.167, 0.000, 0.000, + 0.333, 0.000, 0.000, + 0.500, 0.000, 0.000, + 0.667, 0.000, 0.000, + 0.833, 0.000, 0.000, + 1.000, 0.000, 0.000, + 0.000, 0.167, 0.000, + 0.000, 0.333, 0.000, + 0.000, 0.500, 0.000, + 0.000, 0.667, 0.000, + 0.000, 0.833, 0.000, + 0.000, 1.000, 0.000, + 0.000, 0.000, 0.167, + 0.000, 0.000, 0.333, + 0.000, 0.000, 0.500, + 0.000, 0.000, 0.667, + 0.000, 0.000, 0.833, + 0.000, 0.000, 1.000, + 0.000, 0.000, 0.000, + 0.143, 0.143, 0.143, + 0.286, 0.286, 0.286, + 0.429, 0.429, 0.429, + 0.571, 0.571, 0.571, + 0.714, 0.714, 0.714, + 0.857, 0.857, 0.857, + 1.000, 1.000, 1.000, + 0.50, 0.5, 0 + ] +).astype(np.float32) +colors = color_list.reshape((-1, 3)) * 255 +colors = np.array(colors, dtype=np.uint8).reshape(len(colors), 1, 1, 3) diff --git a/lib/utils/light_stage/ply_to_occupancy.py b/lib/utils/light_stage/ply_to_occupancy.py new file mode 100644 index 0000000000000000000000000000000000000000..4cc21b54b3c1c46a1c81bc077d4b5fc964fb11dd --- /dev/null +++ b/lib/utils/light_stage/ply_to_occupancy.py @@ -0,0 +1,79 @@ +import numpy as np +from scipy.spatial import cKDTree as KDTree +import os +import tqdm +from lib.utils import data_utils +import glob +from lib.utils.if_nerf.voxels import VoxelGrid +from lib.config import cfg + + +def get_scaled_model(model): + min_xyz = np.min(model, axis=0) + max_xyz = np.max(model, axis=0) + bounds = np.stack([min_xyz, max_xyz], axis=0) + center = (min_xyz + max_xyz) / 2 + scale = np.max(max_xyz - min_xyz) + model = (model - center) / scale + return model, bounds + + +def create_grid_points_from_bounds(minimun, maximum, res): + x = np.linspace(minimun, maximum, res) + X, Y, Z = np.meshgrid(x, x, x, indexing='ij') + X = X.reshape((np.prod(X.shape), )) + Y = Y.reshape((np.prod(Y.shape), )) + Z = Z.reshape((np.prod(Z.shape), )) + + points_list = np.column_stack((X, Y, Z)) + del X, Y, Z, x + return points_list + + +def voxelized_pointcloud(model, kdtree, res): + occupancies = np.zeros(res ** 3, dtype=np.int8) + _, idx = kdtree.query(model) + occupancies[idx] = 1 + compressed_occupancies = np.packbits(occupancies) + return compressed_occupancies + + +def ply_to_occupancy(): + data_root = 'data/light_stage' + point_cloud_dir = os.path.join(data_root, 'point_cloud') + voxel_dir = os.path.join(data_root, 'voxel') + os.system('mkdir -p {}'.format(voxel_dir)) + + bb_min = -0.5 + bb_max = 0.5 + res = 256 + grid_points = create_grid_points_from_bounds(bb_min, bb_max, res) + kdtree = KDTree(grid_points) + + humans = os.listdir(point_cloud_dir) + for human in humans: + current_pc_dir = os.path.join(point_cloud_dir, human) + current_voxel_dir = os.path.join(voxel_dir, human) + os.system('mkdir -p {}'.format(current_voxel_dir)) + paths = sorted(os.listdir(current_pc_dir)) + for path in tqdm.tqdm(paths): + model = data_utils.load_ply(os.path.join(current_pc_dir, path)) + model, bounds = get_scaled_model(model) + compressed_occupancies = voxelized_pointcloud(model, kdtree, res) + i = int(path.split('.')[0]) + np.savez(os.path.join(current_voxel_dir, '{}.npz'.format(i)), + compressed_occupancies=compressed_occupancies, + bounds=bounds) + + +def create_voxel_off(): + data_root = 'data/light_stage/voxel/CoreView_313' + voxel_paths = glob.glob(os.path.join(data_root, '*.npz')) + res = 256 + for voxel_path in voxel_paths: + voxel_data = np.load(voxel_path) + 
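+        # Unpack the bit-packed occupancies back into a dense res^3 float grid
+        # before extracting and exporting the surface mesh.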
occupancy = np.unpackbits(voxel_data['compressed_occupancies']) + occupancy = occupancy.reshape(res, res, res).astype(np.float32) + i = int(os.path.basename(voxel_path).split('.')[0]) + VoxelGrid(occupancy).to_mesh().export(f'/home/pengsida/{i}.off') + __import__('ipdb').set_trace() diff --git a/lib/utils/net_utils.py b/lib/utils/net_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..3b57e73a2d29c6781adf51eb4ef1ce373c2285b1 --- /dev/null +++ b/lib/utils/net_utils.py @@ -0,0 +1,415 @@ +import torch +import os +from torch import nn +import numpy as np +import torch.nn.functional +from collections import OrderedDict +from termcolor import colored + + +def sigmoid(x): + y = torch.clamp(x.sigmoid(), min=1e-4, max=1 - 1e-4) + return y + + +def _neg_loss(pred, gt): + ''' Modified focal loss. Exactly the same as CornerNet. + Runs faster and costs a little bit more memory + Arguments: + pred (batch x c x h x w) + gt_regr (batch x c x h x w) + ''' + pos_inds = gt.eq(1).float() + neg_inds = gt.lt(1).float() + + neg_weights = torch.pow(1 - gt, 4) + + loss = 0 + + pos_loss = torch.log(pred) * torch.pow(1 - pred, 2) * pos_inds + neg_loss = torch.log(1 - pred) * torch.pow(pred, + 2) * neg_weights * neg_inds + + num_pos = pos_inds.float().sum() + pos_loss = pos_loss.sum() + neg_loss = neg_loss.sum() + + if num_pos == 0: + loss = loss - neg_loss + else: + loss = loss - (pos_loss + neg_loss) / num_pos + return loss + + +class FocalLoss(nn.Module): + '''nn.Module warpper for focal loss''' + def __init__(self): + super(FocalLoss, self).__init__() + self.neg_loss = _neg_loss + + def forward(self, out, target): + return self.neg_loss(out, target) + + +def smooth_l1_loss(vertex_pred, + vertex_targets, + vertex_weights, + sigma=1.0, + normalize=True, + reduce=True): + """ + :param vertex_pred: [b, vn*2, h, w] + :param vertex_targets: [b, vn*2, h, w] + :param vertex_weights: [b, 1, h, w] + :param sigma: + :param normalize: + :param reduce: + :return: + """ + b, ver_dim, _, _ = vertex_pred.shape + sigma_2 = sigma**2 + vertex_diff = vertex_pred - vertex_targets + diff = vertex_weights * vertex_diff + abs_diff = torch.abs(diff) + smoothL1_sign = (abs_diff < 1. / sigma_2).detach().float() + in_loss = torch.pow(diff, 2) * (sigma_2 / 2.) * smoothL1_sign \ + + (abs_diff - (0.5 / sigma_2)) * (1. 
- smoothL1_sign) + + if normalize: + in_loss = torch.sum(in_loss.view(b, -1), 1) / ( + ver_dim * torch.sum(vertex_weights.view(b, -1), 1) + 1e-3) + + if reduce: + in_loss = torch.mean(in_loss) + + return in_loss + + +class SmoothL1Loss(nn.Module): + def __init__(self): + super(SmoothL1Loss, self).__init__() + self.smooth_l1_loss = smooth_l1_loss + + def forward(self, + preds, + targets, + weights, + sigma=1.0, + normalize=True, + reduce=True): + return self.smooth_l1_loss(preds, targets, weights, sigma, normalize, + reduce) + + +class AELoss(nn.Module): + def __init__(self): + super(AELoss, self).__init__() + + def forward(self, ae, ind, ind_mask): + """ + ae: [b, 1, h, w] + ind: [b, max_objs, max_parts] + ind_mask: [b, max_objs, max_parts] + obj_mask: [b, max_objs] + """ + # first index + b, _, h, w = ae.shape + b, max_objs, max_parts = ind.shape + obj_mask = torch.sum(ind_mask, dim=2) != 0 + + ae = ae.view(b, h * w, 1) + seed_ind = ind.view(b, max_objs * max_parts, 1) + tag = ae.gather(1, seed_ind).view(b, max_objs, max_parts) + + # compute the mean + tag_mean = tag * ind_mask + tag_mean = tag_mean.sum(2) / (ind_mask.sum(2) + 1e-4) + + # pull ae of the same object to their mean + pull_dist = (tag - tag_mean.unsqueeze(2)).pow(2) * ind_mask + obj_num = obj_mask.sum(dim=1).float() + pull = (pull_dist.sum(dim=(1, 2)) / (obj_num + 1e-4)).sum() + pull /= b + + # push away the mean of different objects + push_dist = torch.abs(tag_mean.unsqueeze(1) - tag_mean.unsqueeze(2)) + push_dist = 1 - push_dist + push_dist = nn.functional.relu(push_dist, inplace=True) + obj_mask = (obj_mask.unsqueeze(1) + obj_mask.unsqueeze(2)) == 2 + push_dist = push_dist * obj_mask.float() + push = ((push_dist.sum(dim=(1, 2)) - obj_num) / + (obj_num * (obj_num - 1) + 1e-4)).sum() + push /= b + return pull, push + + +class PolyMatchingLoss(nn.Module): + def __init__(self, pnum): + super(PolyMatchingLoss, self).__init__() + + self.pnum = pnum + batch_size = 1 + pidxall = np.zeros(shape=(batch_size, pnum, pnum), dtype=np.int32) + for b in range(batch_size): + for i in range(pnum): + pidx = (np.arange(pnum) + i) % pnum + pidxall[b, i] = pidx + + device = torch.device('cuda') + pidxall = torch.from_numpy( + np.reshape(pidxall, newshape=(batch_size, -1))).to(device) + + self.feature_id = pidxall.unsqueeze_(2).long().expand( + pidxall.size(0), pidxall.size(1), 2).detach() + + def forward(self, pred, gt, loss_type="L2"): + pnum = self.pnum + batch_size = pred.size()[0] + feature_id = self.feature_id.expand(batch_size, + self.feature_id.size(1), 2) + device = torch.device('cuda') + + gt_expand = torch.gather(gt, 1, + feature_id).view(batch_size, pnum, pnum, 2) + + pred_expand = pred.unsqueeze(1) + + dis = pred_expand - gt_expand + + if loss_type == "L2": + dis = (dis**2).sum(3).sqrt().sum(2) + elif loss_type == "L1": + dis = torch.abs(dis).sum(3).sum(2) + + min_dis, min_id = torch.min(dis, dim=1, keepdim=True) + # print(min_id) + + # min_id = torch.from_numpy(min_id.data.cpu().numpy()).to(device) + # min_gt_id_to_gather = min_id.unsqueeze_(2).unsqueeze_(3).long().\ + # expand(min_id.size(0), min_id.size(1), gt_expand.size(2), gt_expand.size(3)) + # gt_right_order = torch.gather(gt_expand, 1, min_gt_id_to_gather).view(batch_size, pnum, 2) + + return torch.mean(min_dis) + + +class AttentionLoss(nn.Module): + def __init__(self, beta=4, gamma=0.5): + super(AttentionLoss, self).__init__() + + self.beta = beta + self.gamma = gamma + + def forward(self, pred, gt): + num_pos = torch.sum(gt) + num_neg = torch.sum(1 - gt) + alpha = num_neg 
/ (num_pos + num_neg) + edge_beta = torch.pow(self.beta, torch.pow(1 - pred, self.gamma)) + bg_beta = torch.pow(self.beta, torch.pow(pred, self.gamma)) + + loss = 0 + loss = loss - alpha * edge_beta * torch.log(pred) * gt + loss = loss - (1 - alpha) * bg_beta * torch.log(1 - pred) * (1 - gt) + return torch.mean(loss) + + +def _gather_feat(feat, ind, mask=None): + dim = feat.size(2) + ind = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), dim) + feat = feat.gather(1, ind) + if mask is not None: + mask = mask.unsqueeze(2).expand_as(feat) + feat = feat[mask] + feat = feat.view(-1, dim) + return feat + + +def _tranpose_and_gather_feat(feat, ind): + feat = feat.permute(0, 2, 3, 1).contiguous() + feat = feat.view(feat.size(0), -1, feat.size(3)) + feat = _gather_feat(feat, ind) + return feat + + +class Ind2dRegL1Loss(nn.Module): + def __init__(self, type='l1'): + super(Ind2dRegL1Loss, self).__init__() + if type == 'l1': + self.loss = torch.nn.functional.l1_loss + elif type == 'smooth_l1': + self.loss = torch.nn.functional.smooth_l1_loss + + def forward(self, output, target, ind, ind_mask): + """ind: [b, max_objs, max_parts]""" + b, max_objs, max_parts = ind.shape + ind = ind.view(b, max_objs * max_parts) + pred = _tranpose_and_gather_feat(output, + ind).view(b, max_objs, max_parts, + output.size(1)) + mask = ind_mask.unsqueeze(3).expand_as(pred) + loss = self.loss(pred * mask, target * mask, reduction='sum') + loss = loss / (mask.sum() + 1e-4) + return loss + + +class IndL1Loss1d(nn.Module): + def __init__(self, type='l1'): + super(IndL1Loss1d, self).__init__() + if type == 'l1': + self.loss = torch.nn.functional.l1_loss + elif type == 'smooth_l1': + self.loss = torch.nn.functional.smooth_l1_loss + + def forward(self, output, target, ind, weight): + """ind: [b, n]""" + output = _tranpose_and_gather_feat(output, ind) + weight = weight.unsqueeze(2) + loss = self.loss(output * weight, target * weight, reduction='sum') + loss = loss / (weight.sum() * output.size(2) + 1e-4) + return loss + + +class GeoCrossEntropyLoss(nn.Module): + def __init__(self): + super(GeoCrossEntropyLoss, self).__init__() + + def forward(self, output, target, poly): + output = torch.nn.functional.softmax(output, dim=1) + output = torch.log(torch.clamp(output, min=1e-4)) + poly = poly.view(poly.size(0), 4, poly.size(1) // 4, 2) + target = target[..., None, None].expand(poly.size(0), poly.size(1), 1, + poly.size(3)) + target_poly = torch.gather(poly, 2, target) + sigma = (poly[:, :, 0] - poly[:, :, 1]).pow(2).sum(2, keepdim=True) + kernel = torch.exp(-(poly - target_poly).pow(2).sum(3) / (sigma / 3)) + loss = -(output * kernel.transpose(2, 1)).sum(1).mean() + return loss + + +def load_model(net, + optim, + scheduler, + recorder, + model_dir, + resume=True, + epoch=-1): + if not resume: + os.system('rm -rf {}'.format(model_dir)) + + if not os.path.exists(model_dir): + return 0 + + pths = [ + int(pth.split('.')[0]) for pth in os.listdir(model_dir) + if pth != 'latest.pth' + ] + if len(pths) == 0 and 'latest.pth' not in os.listdir(model_dir): + return 0 + if epoch == -1: + if 'latest.pth' in os.listdir(model_dir): + pth = 'latest' + else: + pth = max(pths) + else: + pth = epoch + print('load model: {}'.format(os.path.join(model_dir, + '{}.pth'.format(pth)))) + pretrained_model = torch.load( + os.path.join(model_dir, '{}.pth'.format(pth)), 'cpu') + net.load_state_dict(pretrained_model['net']) + optim.load_state_dict(pretrained_model['optim']) + scheduler.load_state_dict(pretrained_model['scheduler']) + 
recorder.load_state_dict(pretrained_model['recorder']) + return pretrained_model['epoch'] + 1 + + +def save_model(net, optim, scheduler, recorder, model_dir, epoch, last=False): + os.system('mkdir -p {}'.format(model_dir)) + model = { + 'net': net.state_dict(), + 'optim': optim.state_dict(), + 'scheduler': scheduler.state_dict(), + 'recorder': recorder.state_dict(), + 'epoch': epoch + } + if last: + torch.save(model, os.path.join(model_dir, 'latest.pth')) + else: + torch.save(model, os.path.join(model_dir, '{}.pth'.format(epoch))) + + # remove previous pretrained model if the number of models is too big + pths = [ + int(pth.split('.')[0]) for pth in os.listdir(model_dir) + if pth != 'latest.pth' + ] + if len(pths) <= 20: + return + os.system('rm {}'.format( + os.path.join(model_dir, '{}.pth'.format(min(pths))))) + + +def load_network(net, model_dir, resume=True, epoch=-1, strict=True): + if not resume: + return 0 + + if not os.path.exists(model_dir): + print(colored('pretrained model does not exist', 'red')) + return 0 + + if os.path.isdir(model_dir): + pths = [ + int(pth.split('.')[0]) for pth in os.listdir(model_dir) + if pth != 'latest.pth' + ] + if len(pths) == 0 and 'latest.pth' not in os.listdir(model_dir): + return 0 + if epoch == -1: + if 'latest.pth' in os.listdir(model_dir): + pth = 'latest' + else: + pth = max(pths) + else: + pth = epoch + model_path = os.path.join(model_dir, '{}.pth'.format(pth)) + else: + model_path = model_dir + + print('load model: {}'.format(model_path)) + pretrained_model = torch.load(model_path) + net.load_state_dict(pretrained_model['net'], strict=strict) + return pretrained_model['epoch'] + 1 + + +def remove_net_prefix(net, prefix): + net_ = OrderedDict() + for k in net.keys(): + if k.startswith(prefix): + net_[k[len(prefix):]] = net[k] + else: + net_[k] = net[k] + return net_ + + +def add_net_prefix(net, prefix): + net_ = OrderedDict() + for k in net.keys(): + net_[prefix + k] = net[k] + return net_ + + +def replace_net_prefix(net, orig_prefix, prefix): + net_ = OrderedDict() + for k in net.keys(): + if k.startswith(orig_prefix): + net_[prefix + k[len(orig_prefix):]] = net[k] + else: + net_[k] = net[k] + return net_ + + +def remove_net_layer(net, layers): + keys = list(net.keys()) + for k in keys: + for layer in layers: + if k.startswith(layer): + del net[k] + return net diff --git a/lib/utils/optimizer/lr_scheduler.py b/lib/utils/optimizer/lr_scheduler.py new file mode 100644 index 0000000000000000000000000000000000000000..4e6f0cfdaa89a18d3ff910b2177ea9adae27dc6d --- /dev/null +++ b/lib/utils/optimizer/lr_scheduler.py @@ -0,0 +1,75 @@ +from bisect import bisect_right +from collections import Counter + +import torch + + +class WarmupMultiStepLR(torch.optim.lr_scheduler._LRScheduler): + def __init__( + self, + optimizer, + milestones, + gamma=0.1, + warmup_factor=1.0 / 3, + warmup_iters=5, + warmup_method="linear", + last_epoch=-1, + ): + if not list(milestones) == sorted(milestones): + raise ValueError( + "Milestones should be a list of" " increasing integers. 
Got {}", + milestones, + ) + + if warmup_method not in ("constant", "linear"): + raise ValueError( + "Only 'constant' or 'linear' warmup_method accepted" + "got {}".format(warmup_method) + ) + self.milestones = milestones + self.gamma = gamma + self.warmup_factor = warmup_factor + self.warmup_iters = warmup_iters + self.warmup_method = warmup_method + super(WarmupMultiStepLR, self).__init__(optimizer, last_epoch) + + def get_lr(self): + warmup_factor = 1 + if self.last_epoch < self.warmup_iters: + if self.warmup_method == "constant": + warmup_factor = self.warmup_factor + elif self.warmup_method == "linear": + alpha = float(self.last_epoch) / self.warmup_iters + warmup_factor = self.warmup_factor * (1 - alpha) + alpha + return [ + base_lr + * warmup_factor + * self.gamma ** bisect_right(self.milestones, self.last_epoch) + for base_lr in self.base_lrs + ] + + +class MultiStepLR(torch.optim.lr_scheduler._LRScheduler): + + def __init__(self, optimizer, milestones, gamma=0.1, last_epoch=-1): + self.milestones = Counter(milestones) + self.gamma = gamma + super(MultiStepLR, self).__init__(optimizer, last_epoch) + + def get_lr(self): + if self.last_epoch not in self.milestones: + return [group['lr'] for group in self.optimizer.param_groups] + return [group['lr'] * self.gamma ** self.milestones[self.last_epoch] + for group in self.optimizer.param_groups] + + +class ExponentialLR(torch.optim.lr_scheduler._LRScheduler): + + def __init__(self, optimizer, decay_epochs, gamma=0.1, last_epoch=-1): + self.decay_epochs = decay_epochs + self.gamma = gamma + super(ExponentialLR, self).__init__(optimizer, last_epoch) + + def get_lr(self): + return [base_lr * self.gamma ** (self.last_epoch / self.decay_epochs) + for base_lr in self.base_lrs] diff --git a/lib/utils/optimizer/radam.py b/lib/utils/optimizer/radam.py new file mode 100644 index 0000000000000000000000000000000000000000..2934e8eb1af1e0e8b7c80cbe80ba92c79d402365 --- /dev/null +++ b/lib/utils/optimizer/radam.py @@ -0,0 +1,246 @@ +import math +import torch +from torch.optim.optimizer import Optimizer, required + + +class RAdam(Optimizer): + + def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0, degenerated_to_sgd=True): + if not 0.0 <= lr: + raise ValueError("Invalid learning rate: {}".format(lr)) + if not 0.0 <= eps: + raise ValueError("Invalid epsilon value: {}".format(eps)) + if not 0.0 <= betas[0] < 1.0: + raise ValueError("Invalid beta parameter at index 0: {}".format(betas[0])) + if not 0.0 <= betas[1] < 1.0: + raise ValueError("Invalid beta parameter at index 1: {}".format(betas[1])) + + self.degenerated_to_sgd = degenerated_to_sgd + if isinstance(params, (list, tuple)) and len(params) > 0 and isinstance(params[0], dict): + for param in params: + if 'betas' in param and (param['betas'][0] != betas[0] or param['betas'][1] != betas[1]): + param['buffer'] = [[None, None, None] for _ in range(10)] + defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay, buffer=[[None, None, None] for _ in range(10)]) + super(RAdam, self).__init__(params, defaults) + + def __setstate__(self, state): + super(RAdam, self).__setstate__(state) + + def step(self, closure=None): + + loss = None + if closure is not None: + loss = closure() + + for group in self.param_groups: + + for p in group['params']: + if p.grad is None: + continue + grad = p.grad.data.float() + if grad.is_sparse: + raise RuntimeError('RAdam does not support sparse gradients') + + p_data_fp32 = p.data.float() + + state = self.state[p] + + if len(state) == 0: 
+ state['step'] = 0 + state['exp_avg'] = torch.zeros_like(p_data_fp32) + state['exp_avg_sq'] = torch.zeros_like(p_data_fp32) + else: + state['exp_avg'] = state['exp_avg'].type_as(p_data_fp32) + state['exp_avg_sq'] = state['exp_avg_sq'].type_as(p_data_fp32) + + exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq'] + beta1, beta2 = group['betas'] + + exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad) + exp_avg.mul_(beta1).add_(1 - beta1, grad) + + state['step'] += 1 + buffered = group['buffer'][int(state['step'] % 10)] + if state['step'] == buffered[0]: + N_sma, step_size = buffered[1], buffered[2] + else: + buffered[0] = state['step'] + beta2_t = beta2 ** state['step'] + N_sma_max = 2 / (1 - beta2) - 1 + N_sma = N_sma_max - 2 * state['step'] * beta2_t / (1 - beta2_t) + buffered[1] = N_sma + + # more conservative since it's an approximated value + if N_sma >= 5: + step_size = math.sqrt((1 - beta2_t) * (N_sma - 4) / (N_sma_max - 4) * (N_sma - 2) / N_sma * N_sma_max / (N_sma_max - 2)) / (1 - beta1 ** state['step']) + elif self.degenerated_to_sgd: + step_size = 1.0 / (1 - beta1 ** state['step']) + else: + step_size = -1 + buffered[2] = step_size + + # more conservative since it's an approximated value + if N_sma >= 5: + if group['weight_decay'] != 0: + p_data_fp32.add_(-group['weight_decay'] * group['lr'], p_data_fp32) + denom = exp_avg_sq.sqrt().add_(group['eps']) + p_data_fp32.addcdiv_(-step_size * group['lr'], exp_avg, denom) + p.data.copy_(p_data_fp32) + elif step_size > 0: + if group['weight_decay'] != 0: + p_data_fp32.add_(-group['weight_decay'] * group['lr'], p_data_fp32) + p_data_fp32.add_(-step_size * group['lr'], exp_avg) + p.data.copy_(p_data_fp32) + + return loss + +class PlainRAdam(Optimizer): + + def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0, degenerated_to_sgd=True): + if not 0.0 <= lr: + raise ValueError("Invalid learning rate: {}".format(lr)) + if not 0.0 <= eps: + raise ValueError("Invalid epsilon value: {}".format(eps)) + if not 0.0 <= betas[0] < 1.0: + raise ValueError("Invalid beta parameter at index 0: {}".format(betas[0])) + if not 0.0 <= betas[1] < 1.0: + raise ValueError("Invalid beta parameter at index 1: {}".format(betas[1])) + + self.degenerated_to_sgd = degenerated_to_sgd + defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay) + + super(PlainRAdam, self).__init__(params, defaults) + + def __setstate__(self, state): + super(PlainRAdam, self).__setstate__(state) + + def step(self, closure=None): + + loss = None + if closure is not None: + loss = closure() + + for group in self.param_groups: + + for p in group['params']: + if p.grad is None: + continue + grad = p.grad.data.float() + if grad.is_sparse: + raise RuntimeError('RAdam does not support sparse gradients') + + p_data_fp32 = p.data.float() + + state = self.state[p] + + if len(state) == 0: + state['step'] = 0 + state['exp_avg'] = torch.zeros_like(p_data_fp32) + state['exp_avg_sq'] = torch.zeros_like(p_data_fp32) + else: + state['exp_avg'] = state['exp_avg'].type_as(p_data_fp32) + state['exp_avg_sq'] = state['exp_avg_sq'].type_as(p_data_fp32) + + exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq'] + beta1, beta2 = group['betas'] + + exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad) + exp_avg.mul_(beta1).add_(1 - beta1, grad) + + state['step'] += 1 + beta2_t = beta2 ** state['step'] + N_sma_max = 2 / (1 - beta2) - 1 + N_sma = N_sma_max - 2 * state['step'] * beta2_t / (1 - beta2_t) + + + # more conservative since it's an approximated value + 
if N_sma >= 5: + if group['weight_decay'] != 0: + p_data_fp32.add_(-group['weight_decay'] * group['lr'], p_data_fp32) + step_size = group['lr'] * math.sqrt((1 - beta2_t) * (N_sma - 4) / (N_sma_max - 4) * (N_sma - 2) / N_sma * N_sma_max / (N_sma_max - 2)) / (1 - beta1 ** state['step']) + denom = exp_avg_sq.sqrt().add_(group['eps']) + p_data_fp32.addcdiv_(-step_size, exp_avg, denom) + p.data.copy_(p_data_fp32) + elif self.degenerated_to_sgd: + if group['weight_decay'] != 0: + p_data_fp32.add_(-group['weight_decay'] * group['lr'], p_data_fp32) + step_size = group['lr'] / (1 - beta1 ** state['step']) + p_data_fp32.add_(-step_size, exp_avg) + p.data.copy_(p_data_fp32) + + return loss + + +class AdamW(Optimizer): + + def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0, warmup = 0): + if not 0.0 <= lr: + raise ValueError("Invalid learning rate: {}".format(lr)) + if not 0.0 <= eps: + raise ValueError("Invalid epsilon value: {}".format(eps)) + if not 0.0 <= betas[0] < 1.0: + raise ValueError("Invalid beta parameter at index 0: {}".format(betas[0])) + if not 0.0 <= betas[1] < 1.0: + raise ValueError("Invalid beta parameter at index 1: {}".format(betas[1])) + + defaults = dict(lr=lr, betas=betas, eps=eps, + weight_decay=weight_decay, warmup = warmup) + super(AdamW, self).__init__(params, defaults) + + def __setstate__(self, state): + super(AdamW, self).__setstate__(state) + + def step(self, closure=None): + loss = None + if closure is not None: + loss = closure() + + for group in self.param_groups: + + for p in group['params']: + if p.grad is None: + continue + grad = p.grad.data.float() + if grad.is_sparse: + raise RuntimeError('Adam does not support sparse gradients, please consider SparseAdam instead') + + p_data_fp32 = p.data.float() + + state = self.state[p] + + if len(state) == 0: + state['step'] = 0 + state['exp_avg'] = torch.zeros_like(p_data_fp32) + state['exp_avg_sq'] = torch.zeros_like(p_data_fp32) + else: + state['exp_avg'] = state['exp_avg'].type_as(p_data_fp32) + state['exp_avg_sq'] = state['exp_avg_sq'].type_as(p_data_fp32) + + exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq'] + beta1, beta2 = group['betas'] + + state['step'] += 1 + + exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad) + exp_avg.mul_(beta1).add_(1 - beta1, grad) + + denom = exp_avg_sq.sqrt().add_(group['eps']) + bias_correction1 = 1 - beta1 ** state['step'] + bias_correction2 = 1 - beta2 ** state['step'] + + if group['warmup'] > state['step']: + scheduled_lr = 1e-8 + state['step'] * group['lr'] / group['warmup'] + else: + scheduled_lr = group['lr'] + + step_size = scheduled_lr * math.sqrt(bias_correction2) / bias_correction1 + + if group['weight_decay'] != 0: + p_data_fp32.add_(-group['weight_decay'] * scheduled_lr, p_data_fp32) + + p_data_fp32.addcdiv_(-step_size, exp_avg, denom) + + p.data.copy_(p_data_fp32) + + return loss + diff --git a/lib/utils/render_utils.py b/lib/utils/render_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..976559868709929db3afef06cae25bb7bf0109da --- /dev/null +++ b/lib/utils/render_utils.py @@ -0,0 +1,170 @@ +import numpy as np +import json +import os +import cv2 + +from lib.config import cfg + +from lib.utils.if_nerf import if_nerf_data_utils as if_nerf_dutils + + +def normalize(x): + return x / np.linalg.norm(x) + + +def viewmatrix(z, up, pos): + vec2 = normalize(z) + vec0_avg = up + vec1 = normalize(np.cross(vec2, vec0_avg)) + vec0 = normalize(np.cross(vec1, vec2)) + m = np.stack([vec0, vec1, vec2, pos], 1) + return m 
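+
+# The two helpers above: `normalize` assumes a non-zero input vector, and
+# `viewmatrix` returns a 3x4 camera-to-world matrix whose columns are two
+# orthonormal axes perpendicular to the viewing direction `z`, the normalized
+# `z` itself, and the camera position `pos`.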
+ + +def ptstocam(pts, c2w): + tt = np.matmul(c2w[:3, :3].T, (pts-c2w[:3, 3])[..., np.newaxis])[..., 0] + return tt + + +def load_cam(ann_file): + if ann_file.endswith('.json'): + annots = json.load(open(ann_file, 'r')) + cams = annots['cams']['20190823'] + else: + annots = np.load(ann_file, allow_pickle=True).item() + cams = annots['cams'] + + K = [] + RT = [] + lower_row = np.array([[0., 0., 0., 1.]]) + + for i in range(len(cams['K'])): + K.append(np.array(cams['K'][i])) + K[i][:2] = K[i][:2] * cfg.ratio + + r = np.array(cams['R'][i]) + t = np.array(cams['T'][i]) / 1000. + r_t = np.concatenate([r, t], 1) + RT.append(np.concatenate([r_t, lower_row], 0)) + + return K, RT + + +def get_center_rayd(K, RT): + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + RT = np.array(RT) + ray_o, ray_d = if_nerf_dutils.get_rays(H, W, K, + RT[:3, :3], RT[:3, 3]) + return ray_d[H // 2, W // 2] + + +def gen_path(RT, center=None): + lower_row = np.array([[0., 0., 0., 1.]]) + + # transfer RT to camera_to_world matrix + RT = np.array(RT) + RT[:] = np.linalg.inv(RT[:]) + + RT = np.concatenate([RT[:, :, 1:2], RT[:, :, 0:1], + -RT[:, :, 2:3], RT[:, :, 3:4]], 2) + + up = normalize(RT[:, :3, 0].sum(0)) # average up vector + z = normalize(RT[0, :3, 2]) + vec1 = normalize(np.cross(z, up)) + vec2 = normalize(np.cross(up, vec1)) + z_off = 0 + + if center is None: + center = RT[:, :3, 3].mean(0) + z_off = 1.3 + + c2w = np.stack([up, vec1, vec2, center], 1) + + # get radii for spiral path + tt = ptstocam(RT[:, :3, 3], c2w).T + rads = np.percentile(np.abs(tt), 80, -1) + rads = rads * 1.3 + rads = np.array(list(rads) + [1.]) + + render_w2c = [] + for theta in np.linspace(0., 2 * np.pi, cfg.num_render_views + 1)[:-1]: + # camera position + cam_pos = np.array([0, np.sin(theta), np.cos(theta), 1] * rads) + cam_pos_world = np.dot(c2w[:3, :4], cam_pos) + # z axis + z = normalize(cam_pos_world - + np.dot(c2w[:3, :4], np.array([z_off, 0, 0, 1.]))) + # vector -> 3x4 matrix (camera_to_world) + mat = viewmatrix(z, up, cam_pos_world) + + mat = np.concatenate([mat[:, 1:2], mat[:, 0:1], + -mat[:, 2:3], mat[:, 3:4]], 1) + mat = np.concatenate([mat, lower_row], 0) + mat = np.linalg.inv(mat) + render_w2c.append(mat) + + return render_w2c + + +def read_voxel(frame, args): + voxel_path = os.path.join(args['data_root'], 'voxel', args['human'], + '{}.npz'.format(frame)) + voxel_data = np.load(voxel_path) + occupancy = np.unpackbits(voxel_data['compressed_occupancies']) + occupancy = occupancy.reshape(cfg.res, cfg.res, + cfg.res).astype(np.float32) + bounds = voxel_data['bounds'].astype(np.float32) + return occupancy, bounds + + +def image_rays(RT, K, bounds): + H = cfg.H * cfg.ratio + W = cfg.W * cfg.ratio + ray_o, ray_d = if_nerf_dutils.get_rays(H, W, K, + RT[:3, :3], RT[:3, 3]) + + ray_o = ray_o.reshape(-1, 3).astype(np.float32) + ray_d = ray_d.reshape(-1, 3).astype(np.float32) + near, far, mask_at_box = if_nerf_dutils.get_near_far(bounds, ray_o, ray_d) + near = near.astype(np.float32) + far = far.astype(np.float32) + ray_o = ray_o[mask_at_box] + ray_d = ray_d[mask_at_box] + + center = (bounds[0] + bounds[1]) / 2 + scale = np.max(bounds[1] - bounds[0]) + + return ray_o, ray_d, near, far, center, scale, mask_at_box + + +def get_image_rays0(RT0, RT, K, bounds): + """ + Use RT to get the mask_at_box and fill this region with rays emitted from view RT0 + """ + H = cfg.H * cfg.ratio + ray_o, ray_d = if_nerf_dutils.get_rays(H, H, K, + RT[:3, :3], RT[:3, 3]) + + ray_o = ray_o.reshape(-1, 3).astype(np.float32) + ray_d = ray_d.reshape(-1, 
3).astype(np.float32) + near, far, mask_at_box = if_nerf_dutils.get_near_far(bounds, ray_o, ray_d) + + ray_o, ray_d = if_nerf_dutils.get_rays(H, H, K, + RT0[:3, :3], RT0[:3, 3]) + ray_d = ray_d.reshape(-1, 3).astype(np.float32) + ray_d = ray_d[mask_at_box] + + return ray_d + + +def save_img(img, frame_root, index, mask_at_box): + H = int(cfg.H * cfg.ratio) + rgb_pred = img['rgb_map'][0].detach().cpu().numpy() + mask_at_box = mask_at_box.reshape(H, H) + + img_pred = np.zeros((H, H, 3)) + img_pred[mask_at_box] = rgb_pred + img_pred[:, :, [0, 1, 2]] = img_pred[:, :, [2, 1, 0]] + + print("saved frame %d" % index) + cv2.imwrite(os.path.join(frame_root, '%d.jpg' % index), img_pred * 255) diff --git a/lib/utils/snapshot_data_utils.py b/lib/utils/snapshot_data_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..0e4450e5494a8ae9a49a1d03b45bacfeb73ce8d2 --- /dev/null +++ b/lib/utils/snapshot_data_utils.py @@ -0,0 +1,23 @@ +import pickle +import numpy as np + + +def read_pickle(pkl_path): + with open(pkl_path, 'rb') as f: + u = pickle._Unpickler(f) + u.encoding = 'latin1' + return u.load() + + +def get_camera(camera_path): + camera = read_pickle(camera_path) + K = np.zeros([3, 3]) + K[0, 0] = camera['camera_f'][0] + K[1, 1] = camera['camera_f'][1] + K[:2, 2] = camera['camera_c'] + K[2, 2] = 1 + R = np.eye(3) + T = np.zeros([3]) + D = camera['camera_k'] + camera = {'K': K, 'R': R, 'T': T, 'D': D} + return camera diff --git a/lib/utils/vis_utils.py b/lib/utils/vis_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..f1ed967b3280d851e16581ff3fd7e0484f80a6cc --- /dev/null +++ b/lib/utils/vis_utils.py @@ -0,0 +1,242 @@ +import numpy as np +import torch +import cv2 +import matplotlib.pyplot as plt +from mpl_toolkits.mplot3d import Axes3D +import os +import matplotlib.patches as patches +from sklearn.manifold import TSNE +import open3d as o3d + +kintree = { + 'kintree': [[1, 0], [2, 1], [3, 2], [4, 3], [5, 1], [6, 5], [7, 6], [8, 1], + [9, 8], [10, 9], [11, 10], [12, 8], [13, 12], [14, 13], + [15, 0], [16, 0], [17, 15], [18, 16], [19, 14], [20, 19], + [21, 14], [22, 11], [23, 22], [24, 11]], + 'color': [ + 'k', 'r', 'r', 'r', 'b', 'b', 'b', 'k', 'r', 'r', 'r', 'b', 'b', 'b', + 'y', 'y', 'y', 'y', 'b', 'b', 'b', 'r', 'r', 'r' + ] +} + + +def plotSkel3D(pts, + config=kintree, + ax=None, + phi=0, + theta=0, + max_range=1, + linewidth=4, + color=None): + multi = False + if torch.is_tensor(pts): + if len(pts.shape) == 3: + print(">>> Visualize multiperson ...") + multi = True + if pts.shape[1] != 3: + pts = pts.transpose(1, 2) + elif len(pts.shape) == 2: + if pts.shape[0] != 3: + pts = pts.transpose(0, 1) + else: + raise RuntimeError('The dimension of the points is wrong!') + pts = pts.detach().cpu().numpy() + else: + if pts.shape[0] != 3: + pts = pts.T + # pts : bn, 3, NumOfPoints or (3, N) + if ax is None: + print('>>> create figure ...') + fig = plt.figure(figsize=[5, 5]) + ax = fig.add_subplot(111, projection='3d') + for idx, (i, j) in enumerate(config['kintree']): + if multi: + for b in range(pts.shape[0]): + ax.plot([pts[b][0][i], pts[b][0][j]], + [pts[b][1][i], pts[b][1][j]], + [pts[b][2][i], pts[b][2][j]], + lw=linewidth, + color=config['color'][idx] if color is None else color, + alpha=1) + else: + ax.plot([pts[0][i], pts[0][j]], [pts[1][i], pts[1][j]], + [pts[2][i], pts[2][j]], + lw=linewidth, + color=config['color'][idx], + alpha=1) + if multi: + for b in range(pts.shape[0]): + ax.scatter(pts[b][0], pts[b][1], pts[b][2], color='r', alpha=1) + else: + 
ax.scatter(pts[0], pts[1], pts[2], color='r', alpha=1, s=0.5) + ax.view_init(phi, theta) + ax.set_xlim(-max_range, max_range) + ax.set_ylim(-max_range, max_range) + ax.set_zlim(-0.05, 2) + + # ax.axis('equal') + plt.xlabel('x') + plt.ylabel('y') + # plt.zlabel('z') + return ax + + +def plotSkel2D(pts, + config=kintree, + ax=None, + linewidth=2, + alpha=1, + max_range=1, + imgshape=None, + thres=0.1): + if len(pts.shape) == 2: + pts = pts[None, :, :] #(nP, nJ, 2/3) + elif len(pts.shape) == 3: + pass + else: + raise RuntimeError('The dimension of the points is wrong!') + if torch.is_tensor(pts): + pts = pts.detach().cpu().numpy() + if pts.shape[2] == 3 or pts.shape[2] == 2: + pts = pts.transpose((0, 2, 1)) + # pts : bn, 2/3, NumOfPoints or (2/3, N) + if ax is None: + fig = plt.figure(figsize=[5, 5]) + ax = fig.add_subplot(111) + if 'color' in config.keys(): + colors = config['color'] + else: + colors = ['b' for _ in range(len(config['kintree']))] + + def inrange(imgshape, pts): + if pts[0] < 5 or \ + pts[0] > imgshape[1] - 5 or \ + pts[1] < 5 or \ + pts[1] > imgshape[0] - 5: + return False + else: + return True + + for nP in range(pts.shape[0]): + for idx, (i, j) in enumerate(config['kintree']): + if pts.shape[1] == 3: # with confidence + if np.min(pts[nP][2][[i, j]]) < thres: + continue + lw = linewidth * 2 * np.min(pts[nP][2][[i, j]]) + else: + lw = linewidth + if imgshape is not None: + if inrange(imgshape, pts[nP, :, i]) and \ + inrange(imgshape, pts[nP, :, j]): + pass + else: + continue + ax.plot([pts[nP][0][i], pts[nP][0][j]], + [pts[nP][1][i], pts[nP][1][j]], + lw=lw, + color=colors[idx], + alpha=1) + # if pts.shape[1] > 2: + # ax.scatter(pts[nP][0], pts[nP][1], s=10*(pts[nP][2]-thres), c='r') + if False: + ax.axis('equal') + plt.xlabel('x') + plt.ylabel('y') + else: + ax.axis('off') + return ax + + +def draw_skeleton(img, kpts2d): + cv_img = img.copy() + for kp in kpts2d: + if kp.shape[-1] == 2 or (kp.shape[-1] == 3 and kp[-1] > 0): + cv_img = cv2.circle(cv_img, tuple(kp[:2].astype(int)), 10, + (255, 0, 0)) + return cv_img + + +def vis_frame(data_root, im_data, camera): + from external.SMPL_CPP.build.python import pysmplceres + from .smpl_renderer import Renderer + + imgs = [ + cv2.imread(os.path.join(data_root, im_path)) + for im_path in im_data['ims'] + ] + imgs = [cv2.resize(img, (1024, 1024)) for img in imgs] + + Ks = np.array(camera['K']) + Rs = np.array(camera['R']) + Ts = np.array(camera['T']).transpose(0, 2, 1) / 1000 + + faces = np.loadtxt('data/smpl/faces.txt').astype(np.int32) + render = Renderer(height=1024, width=1024, faces=faces) + vertices = pysmplceres.getVertices(im_data['smpl_result']) + + imgsrender = render.render_multiview(vertices[0], Ks, Rs, Ts, imgs) + for img in imgsrender: + plt.imshow(img[..., ::-1]) + plt.show() + + +def vis_skeleton_frame(data_root, im_data, camera): + from external.SMPL_CPP.build.python import pysmplceres + from .smpl_renderer import Renderer + + imgs = [ + cv2.imread(os.path.join(data_root, im_path)) + for im_path in im_data['ims'] + ] + imgs = [cv2.resize(img, (1024, 1024)) for img in imgs] + kpts2d = np.array(im_data['kpts2d']) + + for img, kpts in zip(imgs, kpts2d): + _, ax = plt.subplots(1, 1) + ax.imshow(img[..., ::-1]) + plotSkel2D(kpts, ax=ax) + plt.show() + + +def vis_bbox(img, corners_2d, coord): + _, ax = plt.subplots(1) + ax.imshow(img) + ax.add_patch( + patches.Polygon(xy=corners_2d[[0, 1, 3, 2, 0, 4, 6, 2]], + fill=False, + linewidth=1, + edgecolor='g')) + ax.add_patch( + patches.Polygon(xy=corners_2d[[5, 4, 6, 7, 5, 1, 3, 
7]], + fill=False, + linewidth=1, + edgecolor='g')) + ax.plot(coord[:, 1], coord[:, 0], '.') + plt.show() + + +def tsne_colors(data): + """ + N x D np.array data + """ + tsne = TSNE(n_components=1, + verbose=1, + perplexity=40, + n_iter=300, + random_state=0) + tsne_results = tsne.fit_transform(data) + tsne_results = np.squeeze(tsne_results) + tsne_min = np.min(tsne_results) + tsne_max = np.max(tsne_results) + tsne_results = (tsne_results - tsne_min) / (tsne_max - tsne_min) + colors = plt.cm.Spectral(tsne_results)[:, :3] + return colors + + +def get_colored_pc(pts, rgb): + pc = o3d.geometry.PointCloud() + pc.points = o3d.utility.Vector3dVector(pts) + colors = np.zeros_like(pts) + colors += rgb + pc.colors = o3d.utility.Vector3dVector(colors) + return pc diff --git a/lib/visualizers/__init__.py b/lib/visualizers/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..a98978038d9103f47cf1a005f4be460aeec1d237 --- /dev/null +++ b/lib/visualizers/__init__.py @@ -0,0 +1 @@ +from .make_visualizer import make_visualizer diff --git a/lib/visualizers/if_nerf.py b/lib/visualizers/if_nerf.py new file mode 100644 index 0000000000000000000000000000000000000000..3c6b091d8dbc422beb13db582ca0ecafd0a7a9f5 --- /dev/null +++ b/lib/visualizers/if_nerf.py @@ -0,0 +1,29 @@ +import matplotlib.pyplot as plt +import numpy as np +from lib.config import cfg + + +class Visualizer: + def visualize(self, output, batch): + rgb_pred = output['rgb_map'][0].detach().cpu().numpy() + rgb_gt = batch['rgb'][0].detach().cpu().numpy() + print('mse: {}'.format(np.mean((rgb_pred - rgb_gt) ** 2))) + + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + + img_pred = np.zeros((H, W, 3)) + if cfg.white_bkgd: + img_pred = img_pred + 1 + img_pred[mask_at_box] = rgb_pred + + img_gt = np.zeros((H, W, 3)) + if cfg.white_bkgd: + img_gt = img_gt + 1 + img_gt[mask_at_box] = rgb_gt + + _, (ax1, ax2) = plt.subplots(1, 2) + ax1.imshow(img_pred) + ax2.imshow(img_gt) + plt.show() diff --git a/lib/visualizers/if_nerf_demo.py b/lib/visualizers/if_nerf_demo.py new file mode 100644 index 0000000000000000000000000000000000000000..90aea14f8861e44620fe4e51ae98668260145ea5 --- /dev/null +++ b/lib/visualizers/if_nerf_demo.py @@ -0,0 +1,52 @@ +import matplotlib.pyplot as plt +import numpy as np +from lib.config import cfg +import cv2 +import os +from termcolor import colored + + +class Visualizer: + def __init__(self): + data_dir = 'data/render/{}'.format(cfg.exp_name) + print(colored('the results are saved at {}'.format(data_dir), + 'yellow')) + + def visualize(self, output, batch): + rgb_pred = output['rgb_map'][0].detach().cpu().numpy() + + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + + img_pred = np.zeros((H, W, 3)) + if cfg.white_bkgd: + img_pred = img_pred + 1 + img_pred[mask_at_box] = rgb_pred + img_pred = img_pred[..., [2, 1, 0]] + + depth_pred = np.zeros((H, W)) + depth_pred[mask_at_box] = output['depth_map'][0].detach().cpu().numpy() + + img_root = 'data/render/{}/frame_{:04d}'.format( + cfg.exp_name, batch['frame_index'].item()) + os.system('mkdir -p {}'.format(img_root)) + index = batch['view_index'].item() + + # plt.imshow(depth_pred) + # depth_dir = os.path.join(img_root, 'depth') + # os.system('mkdir -p {}'.format(depth_dir)) + # plt.savefig(os.path.join(depth_dir, '{}.jpg'.format(index))) + # 
plt.close() + + # mask_pred = np.zeros((H, W, 3)) + # mask_pred[acc_pred > 0.5] = 255 + + # acc_dir = os.path.join(img_root, 'mask') + # os.system('mkdir -p {}'.format(acc_dir)) + # mask = cv2.resize(mask_pred, (H * 8, W * 8), interpolation=cv2.INTER_NEAREST) + # mask_path = os.path.join(acc_dir, 'img_{:04d}.jpg'.format(index)) + # cv2.imwrite(mask_path, mask) + + cv2.imwrite(os.path.join(img_root, '{:04d}.png'.format(index)), + img_pred * 255) diff --git a/lib/visualizers/if_nerf_mesh.py b/lib/visualizers/if_nerf_mesh.py new file mode 100644 index 0000000000000000000000000000000000000000..03d039853db959763e838631e7146849a25ad537 --- /dev/null +++ b/lib/visualizers/if_nerf_mesh.py @@ -0,0 +1,34 @@ +from lib.utils.if_nerf import voxels +import numpy as np +from lib.config import cfg +import os +from termcolor import colored + + +class Visualizer: + def __init__(self): + result_dir = os.path.join(cfg.result_dir, 'mesh') + print(colored('the results are saved at {}'.format(result_dir), 'yellow')) + + def visualize_voxel(self, output, batch): + cube = output['cube'] + cube = cube[10:-10, 10:-10, 10:-10] + cube[cube < cfg.mesh_th] = 0 + cube[cube > cfg.mesh_th] = 1 + + sh = cube.shape + square_cube = np.zeros((max(sh), ) * 3) + square_cube[:sh[0], :sh[1], :sh[2]] = cube + voxel_grid = voxels.VoxelGrid(square_cube) + mesh = voxel_grid.to_mesh() + mesh.show() + + def visualize(self, output, batch): + mesh = output['mesh'] + # mesh.show() + + result_dir = os.path.join(cfg.result_dir, 'mesh') + os.system('mkdir -p {}'.format(result_dir)) + i = batch['frame_index'].item() + result_path = os.path.join(result_dir, '{:04d}.ply'.format(i)) + mesh.export(result_path) diff --git a/lib/visualizers/if_nerf_perform.py b/lib/visualizers/if_nerf_perform.py new file mode 100644 index 0000000000000000000000000000000000000000..bbc67a2b50d2af2b3676303b47cb43a67215d620 --- /dev/null +++ b/lib/visualizers/if_nerf_perform.py @@ -0,0 +1,36 @@ +import matplotlib.pyplot as plt +import numpy as np +from lib.config import cfg +import cv2 +import os +from termcolor import colored + + +class Visualizer: + def __init__(self): + data_dir = 'data/perform/{}'.format(cfg.exp_name) + print(colored('the results are saved at {}'.format(data_dir), + 'yellow')) + + def visualize(self, output, batch): + rgb_pred = output['rgb_map'][0].detach().cpu().numpy() + + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + + img_pred = np.zeros((H, W, 3)) + if cfg.white_bkgd: + img_pred = img_pred + 1 + img_pred[mask_at_box] = rgb_pred + img_pred = img_pred[..., [2, 1, 0]] + + frame_root = 'data/perform/{}/{}'.format(cfg.exp_name, 0) + os.system('mkdir -p {}'.format(frame_root)) + frame_index = batch['frame_index'].item() + view_index = batch['view_index'].item() + cv2.imwrite( + os.path.join( + frame_root, + 'frame{:04d}_view{:04d}.png'.format(frame_index, view_index)), + img_pred * 255) diff --git a/lib/visualizers/if_nerf_test.py b/lib/visualizers/if_nerf_test.py new file mode 100644 index 0000000000000000000000000000000000000000..e000d79dfa516c1abd2224f3e0091e5b8d914396 --- /dev/null +++ b/lib/visualizers/if_nerf_test.py @@ -0,0 +1,38 @@ +import matplotlib.pyplot as plt +import numpy as np +from lib.config import cfg +import os +import cv2 + + +class Visualizer: + def visualize(self, output, batch): + rgb_pred = output['rgb_map'][0].detach().cpu().numpy() + + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = 
int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + + img_pred = np.zeros((H, W, 3)) + img_pred[mask_at_box] = rgb_pred + + result_dir = os.path.join('data/result/if-nerf', cfg.exp_name) + + if cfg.human in [302, 313, 315]: + i = batch['i'].item() + 1 + else: + i = batch['i'].item() + i = i + cfg.begin_i + cam_ind = batch['cam_ind'].item() + frame_dir = os.path.join(result_dir, 'frame_{}'.format(i)) + pred_img_path = os.path.join(frame_dir, + 'pred_{}.jpg'.format(cam_ind + 1)) + + os.system('mkdir -p {}'.format(os.path.dirname(pred_img_path))) + img_pred = (img_pred * 255)[..., [2, 1, 0]] + cv2.imwrite(pred_img_path, img_pred) + + # _, (ax1, ax2) = plt.subplots(1, 2) + # ax1.imshow(img_pred) + # ax2.imshow(img_gt) + # plt.show() diff --git a/lib/visualizers/make_visualizer.py b/lib/visualizers/make_visualizer.py new file mode 100644 index 0000000000000000000000000000000000000000..8df4acff9fdb814704b27510a363690b804a957b --- /dev/null +++ b/lib/visualizers/make_visualizer.py @@ -0,0 +1,9 @@ +import os +import imp + + +def make_visualizer(cfg): + module = cfg.visualizer_module + path = cfg.visualizer_path + visualizer = imp.load_source(module, path).Visualizer() + return visualizer diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..5ce0503e4678956d98a4f2a4524fcff60b82a781 --- /dev/null +++ b/requirements.txt @@ -0,0 +1,14 @@ +open3d>=0.9.0.0 +PyYAML==5.3.1 +tqdm==4.28.1 +tensorboardX==1.2 +termcolor==1.1.0 +scikit-image==0.14.2 +opencv-contrib-python>=3.4.2.17 +opencv-python>=3.4.2.17,<4 +imageio==2.3.0 +trimesh==3.8.15 +plyfile==0.6 +PyMCubes==0.1.0 +pyglet==1.4.0b1 +chumpy diff --git a/run.py b/run.py new file mode 100644 index 0000000000000000000000000000000000000000..ee3ba34d3ea1f35a2e56715ab065ba1217e02517 --- /dev/null +++ b/run.py @@ -0,0 +1,126 @@ +from lib.config import cfg, args + + +def run_dataset(): + from lib.datasets import make_data_loader + import tqdm + + cfg.train.num_workers = 0 + data_loader = make_data_loader(cfg, is_train=False) + for batch in tqdm.tqdm(data_loader): + pass + + +def run_network(): + from lib.networks import make_network + from lib.datasets import make_data_loader + from lib.utils.net_utils import load_network + import tqdm + import torch + import time + + network = make_network(cfg).cuda() + load_network(network, cfg.trained_model_dir, epoch=cfg.test.epoch) + network.eval() + + data_loader = make_data_loader(cfg, is_train=False) + total_time = 0 + for batch in tqdm.tqdm(data_loader): + for k in batch: + if k != 'meta': + batch[k] = batch[k].cuda() + with torch.no_grad(): + torch.cuda.synchronize() + start = time.time() + network(batch) + torch.cuda.synchronize() + total_time += time.time() - start + print(total_time / len(data_loader)) + + +def run_evaluate(): + from lib.datasets import make_data_loader + from lib.evaluators import make_evaluator + import tqdm + import torch + from lib.networks import make_network + from lib.utils import net_utils + from lib.networks.renderer import make_renderer + + cfg.perturb = 0 + + network = make_network(cfg).cuda() + net_utils.load_network(network, + cfg.trained_model_dir, + resume=cfg.resume, + epoch=cfg.test.epoch) + network.train() + + data_loader = make_data_loader(cfg, is_train=False) + renderer = make_renderer(cfg, network) + evaluator = make_evaluator(cfg) + for batch in tqdm.tqdm(data_loader): + for k in batch: + if k != 'meta': + batch[k] = batch[k].cuda() + with torch.no_grad(): + output = 
renderer.render(batch) + evaluator.evaluate(output, batch) + evaluator.summarize() + + +def run_visualize(): + from lib.networks import make_network + from lib.datasets import make_data_loader + from lib.utils.net_utils import load_network + from lib.utils import net_utils + import tqdm + import torch + from lib.visualizers import make_visualizer + from lib.networks.renderer import make_renderer + + cfg.perturb = 0 + + network = make_network(cfg).cuda() + load_network(network, + cfg.trained_model_dir, + resume=cfg.resume, + epoch=cfg.test.epoch) + network.train() + + data_loader = make_data_loader(cfg, is_train=False) + renderer = make_renderer(cfg, network) + visualizer = make_visualizer(cfg) + for batch in tqdm.tqdm(data_loader): + for k in batch: + if k != 'meta': + batch[k] = batch[k].cuda() + with torch.no_grad(): + output = renderer.render(batch) + visualizer.visualize(output, batch) + + +def run_light_stage(): + from lib.utils.light_stage import ply_to_occupancy + ply_to_occupancy.ply_to_occupancy() + # ply_to_occupancy.create_voxel_off() + + +def run_evaluate_nv(): + from lib.datasets import make_data_loader + from lib.evaluators import make_evaluator + import tqdm + from lib.utils import net_utils + + data_loader = make_data_loader(cfg, is_train=False) + evaluator = make_evaluator(cfg) + for batch in tqdm.tqdm(data_loader): + for k in batch: + if k != 'meta': + batch[k] = batch[k].cuda() + evaluator.evaluate(batch) + evaluator.summarize() + + +if __name__ == '__main__': + globals()['run_' + args.type]() diff --git a/run.sh b/run.sh new file mode 100644 index 0000000000000000000000000000000000000000..f3776a3363b30bf26981c4440045495a729ca5be --- /dev/null +++ b/run.sh @@ -0,0 +1,87 @@ +# training +# python train_net.py --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313 resume False +# python train_net.py --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 resume False +# python train_net.py --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 resume False +# python train_net.py --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 resume False +# python train_net.py --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 resume False +# python train_net.py --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 resume False +# python train_net.py --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 resume False +# python train_net.py --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 resume False +# python train_net.py --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 resume False + +# distributed training +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py 
--cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 resume False gpus "0, 1, 2, 3" distributed True + +# visualize novel views of single frame +# python run.py --type visualize --cfg_file configs/xyzc_demo_313.yaml exp_name xyzc_313_v1 +# python run.py --type visualize --cfg_file configs/xyzc_demo_315.yaml exp_name xyzc_315 +# python run.py --type visualize --cfg_file configs/xyzc_demo_392.yaml exp_name xyzc_392 +# python run.py --type visualize --cfg_file configs/xyzc_demo_393.yaml exp_name xyzc_393 +# python run.py --type visualize --cfg_file configs/xyzc_demo_394.yaml exp_name xyzc_394 +# python run.py --type visualize --cfg_file configs/xyzc_demo_377.yaml exp_name xyzc_377 +# python run.py --type visualize --cfg_file configs/xyzc_demo_386.yaml exp_name xyzc_386 +# python run.py --type visualize --cfg_file configs/xyzc_demo_390.yaml exp_name xyzc_390 +# python run.py --type visualize --cfg_file configs/xyzc_demo_387.yaml exp_name xyzc_387 + +# visualize novel views of dynamic humans +# python run.py --type visualize --cfg_file configs/xyzc_perform_313.yaml exp_name xyzc_313_v1 +# python run.py --type visualize --cfg_file configs/xyzc_perform_315.yaml exp_name xyzc_315 +# python run.py --type visualize --cfg_file configs/xyzc_perform_392.yaml exp_name xyzc_392 +# python run.py --type visualize --cfg_file configs/xyzc_perform_393.yaml exp_name xyzc_393 +# python run.py --type visualize --cfg_file configs/xyzc_perform_394.yaml exp_name xyzc_394 +# python run.py --type visualize --cfg_file configs/xyzc_perform_377.yaml exp_name xyzc_377 +# python run.py --type visualize --cfg_file configs/xyzc_perform_386.yaml exp_name xyzc_386 +# python run.py --type visualize --cfg_file configs/xyzc_perform_390.yaml exp_name xyzc_390 +# python run.py --type visualize --cfg_file configs/xyzc_perform_387.yaml exp_name xyzc_387 + +# visualize mesh +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_313.yaml exp_name xyzc_313_v1 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_315.yaml exp_name xyzc_315 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_392.yaml exp_name xyzc_392 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_393.yaml exp_name xyzc_393 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_394.yaml exp_name xyzc_394 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_377.yaml exp_name xyzc_377 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_386.yaml exp_name xyzc_386 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_390.yaml exp_name xyzc_390 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_387.yaml exp_name xyzc_387 train.num_workers 0 + +# visualize test views +# python run.py --type visualize --cfg_file configs/latent_xyzc_313.yaml 
exp_name xyzc_313_v1 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' + +# visualize test views for NeRF +# python run.py --type visualize --cfg_file configs/nerf_313.yaml exp_name nerf_313 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_315.yaml exp_name nerf_315 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_392.yaml exp_name nerf_392 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_393.yaml exp_name nerf_393 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_394.yaml exp_name nerf_394 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_377.yaml exp_name nerf_377 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_386.yaml exp_name nerf_386 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_390.yaml exp_name nerf_390 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_387.yaml exp_name nerf_387 
visualizer_path 'lib/visualizers/if_nerf_test.py' + +# evaluation +# python run.py --type evaluate --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313_v1 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 diff --git a/supplementary_material.md b/supplementary_material.md new file mode 100644 index 0000000000000000000000000000000000000000..41b9460b6f1f053f93bf0d2d42a1666dd697d1db --- /dev/null +++ b/supplementary_material.md @@ -0,0 +1,461 @@ +# Supplementary Material + +## Training and test data + +We provide a [website](https://zju3dv.github.io/zju_mocap/) for visualization. + +The multi-view videos are captured by 23 cameras. We train our model on the "0, 6, 12, 18" cameras and test it on the remaining cameras. + +The following table shows the detailed frame numbers for training and test of each video. Since the video length of each subject is different, we choose the appropriate number of frames for training and test. + +**Note that since rendering is very slow, we test our model every 30 frames. For example, although the frame range of video 313 is "0-59", we only test our model on the 0-th and 30-th frames.** + +| Video | 313 | 315 | 377 | 386 | 387 | 390 | 392 | 393 | 394 | +| :-----: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | +| Number of frames | 1470 | 2185 | 617 | 646 | 654 | 1171 | 556 | 658 | 859 | +| Frame Range (Training) | 0-59 | 0-399 | 0-299 | 0-299 | 0-299 | 700-999 | 0-299 | 0-299 | 0-299 | +| Frame Range (Unseen human poses) | 60-1060 | 400-1400 | 300-617 | 300-646 | 300-654 | 0-700 | 300-556 | 300-658 | 300-859 | + +## Evaluation metrics + +**We save our rendering results on novel views of training frames and unseen human poses at [here](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/Ea3VOUy204VAiVJ-V-OGd9YBxdhbtfpS-U6icD_rDq0mUQ?e=cAcylK).** + +As described in the paper, we evaluate our model in terms of the PSNR and SSIM metrics. + +A straightforward way for evaluation is calculating the metrics on the whole image. Since we already know the 3D bounding box of the target human, we can project the 3D box to obtain a `bound_mask` and make the colors of pixels outside the mask as zero, as shown in the following figure. + +![fig](https://zju3dv.github.io/neuralbody/images/bound_mask.png) + +As a result, the PSNR and SSIM metrics appear very high performances, as shown in the following table. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Video | PSNR (training frames) | SSIM (training frames) | PSNR (unseen human poses) | SSIM (unseen human poses) |
+| :---: | :---: | :---: | :---: | :---: |
+| 313 | 35.21 | 0.985 | 29.02 | 0.964 |
+| 315 | 33.07 | 0.988 | 25.70 | 0.957 |
+| 392 | 35.76 | 0.984 | 31.53 | 0.971 |
+| 393 | 33.24 | 0.979 | 28.40 | 0.960 |
+| 394 | 34.31 | 0.980 | 29.61 | 0.961 |
+| 377 | 33.86 | 0.985 | 30.60 | 0.977 |
+| 386 | 36.07 | 0.984 | 33.05 | 0.974 |
+| 390 | 34.48 | 0.980 | 30.25 | 0.964 |
+| 387 | 31.39 | 0.975 | 27.68 | 0.961 |
+| Average | 34.15 | 0.982 | 29.54 | 0.966 |
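+
+One way to obtain the `bound_mask` mentioned above is to project the eight corners of the 3D bounding box into the image and fill their convex hull. The sketch below only illustrates this idea; it assumes `K`, `R`, `T` are the intrinsics and world-to-camera extrinsics of the target view, and the helper name `get_bound_mask` is illustrative rather than part of the released code.
+
+```python
+import numpy as np
+import cv2
+
+
+def get_bound_mask(bounds, K, R, T, H, W):
+    """Rasterize the projection of the 3D box given by `bounds` (2x3 min/max corners)."""
+    # enumerate the 8 corners of the axis-aligned box
+    corners = np.array([[bounds[i, 0], bounds[j, 1], bounds[k, 2]]
+                        for i in (0, 1) for j in (0, 1) for k in (0, 1)])
+    # world -> camera -> pixel coordinates
+    cam = corners @ R.T + T.reshape(1, 3)
+    uv = cam @ K.T
+    uv = uv[:, :2] / uv[:, 2:]
+    # fill the convex hull of the projected corners
+    mask = np.zeros((H, W), dtype=np.uint8)
+    cv2.fillConvexPoly(mask, cv2.convexHull(uv.astype(np.int32)), 1)
+    return mask.astype(bool)
+```
+
+Because the prediction and the ground truth share this large zero background, the whole-image metrics above are dominated by identical pixels, which explains the inflated values.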
+ +To overcome this problem, a solution is only calculating the metrics on pixels inside the `bound_mask`. Since the SSIM metric requires the input to have the image format, we first compute the 2D box that bounds the `bound_mask` and then crop the corresponding image region. + +```python +def ssim_metric(rgb_pred, rgb_gt, batch): + mask_at_box = batch['mask_at_box'][0].detach().cpu().numpy() + H, W = int(cfg.H * cfg.ratio), int(cfg.W * cfg.ratio) + mask_at_box = mask_at_box.reshape(H, W) + # convert the pixels into an image + img_pred = np.zeros((H, W, 3)) + img_pred[mask_at_box] = rgb_pred + img_gt = np.zeros((H, W, 3)) + img_gt[mask_at_box] = rgb_gt + # crop the object region + x, y, w, h = cv2.boundingRect(mask_at_box.astype(np.uint8)) + img_pred = img_pred[y:y + h, x:x + w] + img_gt = img_gt[y:y + h, x:x + w] + # compute the ssim + ssim = compare_ssim(img_pred, img_gt, multichannel=True) + return ssim +``` + + +The following table lists corresponding results. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Video | PSNR (training frames) | SSIM (training frames) | PSNR (unseen human poses) | SSIM (unseen human poses) |
+| :---: | :---: | :---: | :---: | :---: |
+| 313 | 30.56 | 0.971 | 23.95 | 0.905 |
+| 315 | 27.24 | 0.962 | 19.56 | 0.852 |
+| 392 | 29.44 | 0.946 | 25.76 | 0.909 |
+| 394 | 28.44 | 0.940 | 23.80 | 0.878 |
+| 393 | 27.58 | 0.939 | 23.25 | 0.893 |
+| 377 | 27.64 | 0.951 | 23.91 | 0.909 |
+| 386 | 28.60 | 0.931 | 25.68 | 0.881 |
+| 387 | 25.79 | 0.928 | 21.60 | 0.870 |
+| 390 | 27.59 | 0.926 | 23.90 | 0.870 |
+| Average | 28.10 | 0.944 | 23.49 | 0.885 |
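+
+PSNR can be restricted to the same region in a similar way. The sketch below is only an illustration of this protocol, not the released evaluation code; it assumes `rgb_pred` and `rgb_gt` hold only the pixels inside `mask_at_box` (the same inputs as `ssim_metric` above) with colors normalized to `[0, 1]`.
+
+```python
+import numpy as np
+
+
+def psnr_metric(rgb_pred, rgb_gt):
+    # mean squared error over the masked pixels only, so the shared zero
+    # background outside the bound_mask no longer contributes
+    mse = np.mean((rgb_pred - rgb_gt) ** 2)
+    # peak value is 1 because the colors are normalized to [0, 1]
+    return -10. * np.log10(mse)
+```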
+ +## Results of other methods on ZJU-MoCap + +We save rendering results of other methods on novel views of training frames and unseen human poses at [here](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/EQaPRQww70NDqEXeSG-fOeAB5JXFSWiWDW223h5nmkHvwQ?e=mdofbl), including Neural Volumes, Multi-view Neural Human Rendering, and Deferred Neural Human Rendering. **Note that we only generate novel views of training frames for Neural Volumes.** + +The following table lists quantitative results of Neural Volumes. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Video | PSNR | SSIM |
+| :---: | :---: | :---: |
+| 313 | 20.09 | 0.831 |
+| 315 | 18.57 | 0.824 |
+| 392 | 22.88 | 0.726 |
+| 394 | 22.08 | 0.843 |
+| 393 | 21.29 | 0.842 |
+| 377 | 21.15 | 0.842 |
+| 386 | 23.21 | 0.820 |
+| 387 | 20.74 | 0.838 |
+| 390 | 22.49 | 0.825 |
+| Average | 21.39 | 0.821 |
+ +The following table lists quantitative results of Multi-view Neural Human Rendering. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Video | PSNR (training frames) | SSIM (training frames) | PSNR (unseen human poses) | SSIM (unseen human poses) |
+| :---: | :---: | :---: | :---: | :---: |
+| 313 | 26.68 | 0.935 | 23.05 | 0.893 |
+| 315 | 19.81 | 0.874 | 18.88 | 0.844 |
+| 392 | 24.73 | 0.902 | 23.66 | 0.893 |
+| 394 | 25.01 | 0.906 | 22.87 | 0.874 |
+| 393 | 23.47 | 0.894 | 22.27 | 0.885 |
+| 377 | 23.79 | 0.918 | 21.94 | 0.885 |
+| 386 | 25.02 | 0.879 | 23.70 | 0.853 |
+| 387 | 22.65 | 0.858 | 20.97 | 0.866 |
+| 390 | 23.72 | 0.873 | 22.65 | 0.858 |
+| Average | 23.87 | 0.893 | 22.22 | 0.872 |
+ +The following table lists quantitative results of Deferred Neural Human Rendering. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Video | PSNR (training frames) | SSIM (training frames) | PSNR (unseen human poses) | SSIM (unseen human poses) |
+| :---: | :---: | :---: | :---: | :---: |
+| 313 | 25.78 | 0.929 | 22.56 | 0.889 |
+| 315 | 19.44 | 0.869 | 18.38 | 0.841 |
+| 392 | 24.96 | 0.905 | 24.08 | 0.900 |
+| 394 | 24.84 | 0.903 | 22.67 | 0.871 |
+| 393 | 23.50 | 0.896 | 22.45 | 0.888 |
+| 377 | 23.74 | 0.917 | 22.07 | 0.886 |
+| 386 | 24.93 | 0.877 | 23.70 | 0.851 |
+| 387 | 22.44 | 0.888 | 20.64 | 0.862 |
+| 390 | 24.33 | 0.881 | 22.90 | 0.864 |
+| Average | 23.77 | 0.896 | 22.16 | 0.872 |
diff --git a/test.sh b/test.sh new file mode 100644 index 0000000000000000000000000000000000000000..4494a778185923aacad40bb98aad68d23b49f01c --- /dev/null +++ b/test.sh @@ -0,0 +1,10 @@ +# evaluation +# python run.py --type evaluate --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 +# python run.py --type evaluate --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 diff --git a/tools/custom/README.md b/tools/custom/README.md new file mode 100644 index 0000000000000000000000000000000000000000..7a9ef04a8458f643f193be503d092c1ef63539fa --- /dev/null +++ b/tools/custom/README.md @@ -0,0 +1,80 @@ +## Run the code on the custom dataset + +Please inform me if there is any problem to run the code on your own data. + +1. If your data already have SMPL parameters, just export the SMPL parameters and SMPL vertices to two directories `params` and `vertices`. If you do not have SMPL parameters, you could take the following ways: + * For a multi-view video, you could estimate SMPL parameters using [https://github.com/zju3dv/EasyMocap](https://github.com/zju3dv/EasyMocap). The output parameter files can be processed using the [script](https://github.com/zju3dv/neuralbody/blob/master/zju_smpl/easymocap_to_neuralbody.py). + * For a monocular video, you could estimate SMPL parameters using [https://github.com/thmoa/videoavatars](https://github.com/thmoa/videoavatars). The output `reconstructed_poses.hdf5` file can be processed following [the instruction](https://github.com/zju3dv/neuralbody#process-people-snapshot). +2. Organize the dataset as the following structure. Please refer to `CoreView_392` of ZJU-MoCap dataset as an example. + The `annots.npy` is generated by [get_annots.py](get_annots.py). This code is used here to show the format of `annots.npy`. Please revise it according to your input camera parameters and image paths. +Example camera files can be found in [camera_params](camera_params). + + ![file](file_structure.png) + + ``` + ├── /path/to/dataset + │ ├── annots.npy // Store the camera parameters and image paths. + │ ├── params + │ │ ├── 0.npy + │ │ ├── ... + │ │ ├── 1234.npy + │ ├── vertices + │ │ ├── 0.npy + │ │ ├── ... + │ │ ├── 1234.npy + │ ├── Camera_B1 // Store the images. No restrictions on the directory name. + │ │ ├── 00000.jpg + │ │ ├── ... + │ ├── Camera_B2 + │ │ ├── 00000.jpg + │ │ ├── ... + │ ├── ... + │ ├── Camera_B23 + │ │ ├── 00000.jpg + │ │ ├── ... + │ ├── mask_cihp // Store the foreground segmentation. The directory name must be "mask_cihp". + │ │ ├── Camera_B1 + │ │ │ ├── 00000.png + │ │ │ ├── ... + │ │ ├── Camera_B2 + │ │ │ ├── 00000.png + │ │ │ ├── ... + │ │ ├── ... + │ │ ├── Camera_B23 + │ │ │ ├── 00000.png + │ │ │ ├── ... + ``` +4. Use `configs/multi_view_custom.yaml` or `configs/monocular_custom.yaml` for training. 
**Note that you need to revise the `train_dataset` and `test_dataset` in the yaml file.** + ``` + # train from scratch + python train_net.py --cfg_file configs/multi_view_custom.yaml exp_name resume False + # resume training + python train_net.py --cfg_file configs/multi_view_custom.yaml exp_name resume True + ``` + Revise the `num_train_frame` and `training_view` in `custom.yaml` according to your data. Or you could specify it in the command line: + ``` + python train_net.py --cfg_file configs/custom.yaml exp_name num_train_frame 1000 training_view "0, 1, 2, 3" resume False + ``` +6. Visualization. Please refer to [Visualization on ZJU-MoCap](https://github.com/zju3dv/neuralbody#visualization-on-zju-mocap) as an example. + * Visualize novel views of single frame. + ``` + python run.py --type visualize --cfg_file configs/multi_view_custom.yaml exp_name vis_novel_view True num_render_views 100 + ``` + + * Visualize views of dynamic humans with fixed camera + ``` + python run.py --type visualize --cfg_file configs/multi_view_custom.yaml exp_name vis_novel_pose True num_render_frame 1000 num_render_views 1 + ``` + + * Visualize views of dynamic humans with rotated camera + ``` + python run.py --type visualize --cfg_file configs/multi_view_custom.yaml exp_name vis_novel_pose True num_render_frame 1000 + ``` + + * Visualize mesh. `mesh_th` is the iso-surface threshold of Marching Cube Algorithm. + ``` + # generate meshes + python run.py --type visualize --cfg_file configs/multi_view_custom.yaml exp_name vis_mesh True mesh_th 10 + # visualize a specific mesh + python tools/render_mesh.py --exp_name --dataset zju_mocap --mesh_ind 0 + ``` diff --git a/tools/custom/camera_params/extri.yml b/tools/custom/camera_params/extri.yml new file mode 100755 index 0000000000000000000000000000000000000000..aedb253dc2c83c6091b2251dd68ee08cfa1c646d --- /dev/null +++ b/tools/custom/camera_params/extri.yml @@ -0,0 +1,533 @@ +%YAML:1.0 +--- +names: + - none + - Camera_B1 + - none + - Camera_B2 + - none + - Camera_B3 + - none + - Camera_B4 + - none + - Camera_B5 + - none + - Camera_B6 + - none + - Camera_B7 + - none + - Camera_B8 + - none + - Camera_B9 + - none + - Camera_B10 + - none + - Camera_B11 + - none + - Camera_B12 + - none + - Camera_B13 + - none + - Camera_B14 + - none + - Camera_B15 + - none + - Camera_B16 + - none + - Camera_B17 + - none + - Camera_B18 + - none + - Camera_B19 + - none + - Camera_B20 + - none + - Camera_B21 + - none + - Camera_B22 + - none + - Camera_B23 + - none +R_Camera_B1: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.0157431818222639e-02, -1.8119834898518847e+00, + -2.4277695932170422e+00 ] +Rot_Camera_B1: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -9.9369406631902846e-01, 9.3695684833436516e-02, + -6.1589132206482239e-02, -8.5699010692500077e-02, + -2.8045343375395226e-01, 9.5603428341348951e-01, + 7.2303403299436547e-02, 9.5528372232505399e-01, + 2.8671454049648093e-01 ] +T_Camera_B1: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 5.8737218396584051e-02, 9.9344827412482817e-01, + 2.9892599478380979e+00 ] +R_Camera_B2: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -2.6498412581214376e-01, -1.4698683735150728e+00, + -2.1171712563636311e+00 ] +Rot_Camera_B2: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -8.3282752186965725e-01, 5.3499983052592159e-01, + -1.4203344730619150e-01, -3.2007311304094377e-01, + -2.5610091249846068e-01, 9.1212144198332201e-01, + 4.5160992141971890e-01, 8.0510092780637799e-01, + 3.8452694953746003e-01 ] 
+T_Camera_B2: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 3.8256624895114300e-01, 8.9877512109900848e-01, + 2.9481667918000847e+00 ] +R_Camera_B3: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -5.0657466315640687e-01, -1.4054542023563528e+00, + -2.0415178387042778e+00 ] +Rot_Camera_B3: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -7.4567721530982689e-01, 6.6583006176776016e-01, + -2.5215459823741182e-02, -2.6119197624637885e-01, + -2.5728072075439767e-01, 9.3036841212103116e-01, + 6.1297980563171339e-01, 7.0034060254596020e-01, + 3.6575784108241449e-01 ] +T_Camera_B3: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 3.4011583164100778e-01, 9.4202946367036455e-01, + 2.9252678463177957e+00 ] +R_Camera_B4: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -4.7634230554915596e-01, -1.3025563482169678e+00, + -1.9296780514301761e+00 ] +Rot_Camera_B4: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -6.5208102873737805e-01, 7.5158971589844725e-01, + -9.9514978352477801e-02, -3.7336243150549264e-01, + -2.0410893221671755e-01, 9.0495305874374377e-01, + 6.5984151635154720e-01, 6.2725787579355496e-01, + 4.1371092631673190e-01 ] +T_Camera_B4: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.3294999336966856e-01, 7.2448740025075209e-01, + 2.8317461044134982e+00 ] +R_Camera_B5: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -7.9575443773600174e-01, -1.1741270426142121e+00, + -1.6718796254506934e+00 ] +Rot_Camera_B5: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -3.7395596596422342e-01, 9.2744273247127684e-01, + 2.6293546690936975e-03, -3.1230844521755147e-01, + -1.2859494995581272e-01, 9.4123683198845209e-01, + 8.7328138109406650e-01, 3.5115995903871688e-01, + 3.3773704653526881e-01 ] +T_Camera_B5: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.8207543635763396e-01, 9.5599955877385367e-01, + 2.8428920344618533e+00 ] +R_Camera_B6: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -8.4722866982167666e-01, -1.1085271808577302e+00, + -1.5424950150707974e+00 ] +Rot_Camera_B6: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -2.4058131540377747e-01, 9.7049619028429257e-01, + -1.6055382968423237e-02, -3.2466228434979799e-01, + -6.4871740164537395e-02, 9.4360270159045001e-01, + 9.1472128640332806e-01, 2.3222575647782778e-01, + 3.3069043867179115e-01 ] +T_Camera_B6: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -6.0855968422898266e-01, 9.1186960607290601e-01, + 2.9081423591600344e+00 ] +R_Camera_B7: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -9.7413214900474809e-01, -9.5696427944553175e-01, + -1.1896289760028611e+00 ] +Rot_Camera_B7: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.2020271786741699e-01, 9.8984892933133972e-01, + -7.5831416437295962e-02, -2.8615409291393507e-01, + 1.0768972691812445e-01, 9.5211278629419871e-01, + 9.5061408664393832e-01, -9.2747074443888688e-02, + 2.9619392035659786e-01 ] +T_Camera_B7: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -7.3084441482120399e-01, 1.0100073577002882e+00, + 2.8560978726881094e+00 ] +R_Camera_B8: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.0727015623342666e+00, -7.0155027183519114e-01, + -8.5650612676276017e-01 ] +Rot_Camera_B8: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 4.9927034373235069e-01, 8.6278474536354599e-01, + -7.9571395848895610e-02, -2.4794763501407857e-01, + 2.3026642632838457e-01, 9.4100974660036341e-01, + 8.3021147556529584e-01, -4.5008872022515278e-01, + 3.2889063496210780e-01 ] +T_Camera_B8: 
!!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -3.0312123428845061e-01, 9.2551016253656249e-01, + 3.1281740229175838e+00 ] +R_Camera_B9: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.1544352147885666e+00, -5.1731833986646780e-01, + -7.7616653865289875e-01 ] +Rot_Camera_B9: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 6.3919064339817011e-01, 7.6866323256559910e-01, + 2.4334261731909224e-02, -2.7333892760121486e-01, + 1.9749454259865443e-01, 9.4142537479163257e-01, + 7.1883318791639772e-01, -6.0840179203019451e-01, + 3.3634224742722535e-01 ] +T_Camera_B9: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -2.9604117367652522e-01, 8.7774878701214942e-01, + 3.2213795462834329e+00 ] +R_Camera_B10: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.1768638110173730e+00, -4.5423705460187203e-01, + -7.9420530341535855e-01 ] +Rot_Camera_B10: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 6.5343926770196104e-01, 7.5239095460791616e-01, + 8.3216433769501941e-02, -3.0975751981183591e-01, + 1.6546119344399968e-01, 9.3630810761421235e-01, + 6.9070066044933331e-01, -6.3759740031486656e-01, + 3.4117759710534740e-01 ] +T_Camera_B10: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -8.3634546096416273e-01, 8.7658233253083906e-01, + 3.2414047208577066e+00 ] +R_Camera_B11: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.1889448299539405e+00, -2.5957502246213721e-01, + -2.7524110809037550e-01 ] +Rot_Camera_B11: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 9.3724733304333130e-01, 3.4448777505400957e-01, + -5.3810868247049864e-02, -7.3883096441049512e-02, + 3.4705479604747469e-01, 9.3493008112945453e-01, + 3.4074730338396786e-01, -8.7228501155229310e-01, + 3.5072800552247085e-01 ] +T_Camera_B11: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -2.7986327503384889e-01, 6.5934210718566144e-01, + 3.4694335515258805e+00 ] +R_Camera_B12: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.1768323286193190e+00, -1.5776198383979881e-01, + -1.2925054625372004e-01 ] +Rot_Camera_B12: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 9.8156043861537634e-01, 1.8294540523896968e-01, + -5.5408339147997449e-02, -1.8333542342302570e-02, + 3.7862943240389962e-01, 9.2536675655800604e-01, + 1.9027082427522443e-01, -9.0728756831527801e-01, + 3.7500170907634101e-01 ] +T_Camera_B12: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -7.8432384675468036e-01, 6.2082150683638848e-01, + 3.5802845361042865e+00 ] +R_Camera_B13: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.2106459846792941e+00, 3.1911648158890707e-02, + -5.9680288819572054e-02 ] +Rot_Camera_B13: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 9.9797713658999676e-01, 2.9031228617416371e-02, + 5.6558134769407166e-02, -6.3157506912109154e-02, + 3.5109522460315651e-01, 9.3420729636494815e-01, + 7.2638945668286534e-03, -9.3588959339540811e-01, + 3.5221854694195137e-01 ] +T_Camera_B13: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -7.3275234879487838e-01, 7.8537944288726591e-01, + 3.5530318637343004e+00 ] +R_Camera_B14: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.2444296784703539e+00, 2.3384248419494394e-01, + 3.3181167157812608e-01 ] +Rot_Camera_B14: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 9.2872157122203436e-01, -3.7072845462039916e-01, + -6.0544266158408799e-03, 1.1897584345548054e-01, + 2.8250453747581455e-01, 9.5185919913589800e-01, + -3.5117088692107290e-01, -8.8473250151689742e-01, + 3.0647578850330143e-01 ] +T_Camera_B14: !!opencv-matrix 
+ rows: 3 + cols: 1 + dt: d + data: [ -3.7802560390590317e-01, 8.3635121916847432e-01, + 3.6112877508294594e+00 ] +R_Camera_B15: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.1402697638079613e+00, 4.5720228673217972e-01, + 6.3488191956056694e-01 ] +Rot_Camera_B15: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 7.3971341208594021e-01, -6.7270830699534245e-01, + 1.6958823061634987e-02, 2.2933551836960134e-01, + 2.7571134679377118e-01, 9.3348190837493528e-01, + -6.3263677414004471e-01, -6.8661982708674008e-01, + 3.5822328938394421e-01 ] +T_Camera_B15: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.4930663014699680e-01, 5.0441783577076338e-01, + 3.8078782070104418e+00 ] +R_Camera_B16: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.1977668558980064e+00, 6.6607892295711491e-01, + 8.5754253275916859e-01 ] +Rot_Camera_B16: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 5.2820206861940344e-01, -8.4911852327227799e-01, + 5.5510547527348741e-04, 2.1062668376909111e-01, + 1.3165554133154933e-01, 9.6866052800820357e-01, + -8.2258067980631422e-01, -5.1153157465849797e-01, + 2.4838774796027949e-01 ] +T_Camera_B16: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 1.6239930612689555e-02, 9.8207511550770243e-01, + 3.6416671727469030e+00 ] +R_Camera_B17: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.0581940663614438e+00, 7.1448613814934570e-01, + 1.1455742331098344e+00 ] +Rot_Camera_B17: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 2.9127246927433770e-01, -9.5480904477047202e-01, + -5.9161107725679785e-02, 3.6688384073485869e-01, + 5.4379103465369949e-02, 9.2867602559447060e-01, + -8.8349114090095404e-01, -2.9220301355530531e-01, + 3.6614314525705538e-01 ] +T_Camera_B17: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 1.1318658371574535e-01, 6.1524749467268891e-01, + 3.7159004652897081e+00 ] +R_Camera_B18: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -8.1909761246035173e-01, 8.9167296585547051e-01, + 1.3454520632879332e+00 ] +Rot_Camera_B18: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.6343289219694884e-02, -9.9791074978181671e-01, + 6.2506258625626421e-02, 4.4640071481686355e-01, + 6.3220643832371498e-02, 8.9259708267751670e-01, + -8.9468390994164781e-01, 1.3314866252152779e-02, + 4.4650130529284304e-01 ] +T_Camera_B18: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 2.2693416806505856e-01, 2.8728149672331377e-01, + 3.7170973700617598e+00 ] +R_Camera_B19: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -7.3692336458169161e-01, 1.1166617176074543e+00, + 1.5546146520403106e+00 ] +Rot_Camera_B19: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -2.7327658560715307e-01, -9.5819918821622774e-01, + 8.4700787839657621e-02, 3.8623390015883913e-01, + -2.8656163474944929e-02, 9.2195563812093984e-01, + -8.8098994439606870e-01, 2.8466320450079186e-01, + 3.7792006810482059e-01 ] +T_Camera_B19: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 3.6607238428894961e-01, 4.9325895616100979e-01, + 3.6523076119301265e+00 ] +R_Camera_B20: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -7.5694183853042840e-01, 1.2485279861834504e+00, + 1.7563653696958588e+00 ] +Rot_Camera_B20: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -4.7255787283520956e-01, -8.8126171097205241e-01, + -8.1764048378546450e-03, 2.8187862776533645e-01, + -1.5992872876953002e-01, 9.4602708255276324e-01, + -8.3500508742797042e-01, 4.4474779199988457e-01, + 3.2398442166350283e-01 ] +T_Camera_B20: !!opencv-matrix + rows: 3 + cols: 1 + 
dt: d + data: [ 2.9985587470155822e-01, 7.1567244275047370e-01, + 3.4199299204272928e+00 ] +R_Camera_B21: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -4.9651782071086553e-01, 1.4429338664279105e+00, + 1.9919294324456940e+00 ] +Rot_Camera_B21: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -7.3591122421574962e-01, -6.7475144278663068e-01, + 5.6081730809926622e-02, 2.6360663238251536e-01, + -2.0923391231599864e-01, 9.4166486251791848e-01, + -6.2365552465862606e-01, 7.0776525797350454e-01, + 3.3184654008814474e-01 ] +T_Camera_B21: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 4.8657367803091828e-01, 8.4313184086029924e-01, + 3.3877875462866647e+00 ] +R_Camera_B22: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -3.8620799660187960e-01, 1.6147993271228649e+00, + 2.2832937337389305e+00 ] +Rot_Camera_B22: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -9.1323562908109401e-01, -4.0578130108388955e-01, + -3.6636340804237999e-02, 1.0065784517117823e-01, + -3.1183773633328271e-01, 9.4478845484268548e-01, + -3.9480208203903083e-01, 8.5912674378650311e-01, + 3.2562640268941123e-01 ] +T_Camera_B22: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 5.9410264596521756e-01, 9.6407221131777809e-01, + 3.1816813707748723e+00 ] +R_Camera_B23: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ -1.5693483301206787e-01, 1.6652171099122448e+00, + 2.3493509219822375e+00 ] +Rot_Camera_B23: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ -9.6116239793089009e-01, -2.6939507990653033e-01, + 5.9942770422257333e-02, 1.4578507005629729e-01, + -3.1118141911887276e-01, 9.3910214446770646e-01, + -2.3433642088335699e-01, 9.1136843006399915e-01, + 3.3836965958882848e-01 ] +T_Camera_B23: !!opencv-matrix + rows: 3 + cols: 1 + dt: d + data: [ 5.1699977554246146e-01, 8.6970786303031611e-01, + 3.1529895995849588e+00 ] diff --git a/tools/custom/camera_params/intri.yml b/tools/custom/camera_params/intri.yml new file mode 100755 index 0000000000000000000000000000000000000000..65fac4638f6aa6597317ec2e7241b0aee406a72a --- /dev/null +++ b/tools/custom/camera_params/intri.yml @@ -0,0 +1,349 @@ +%YAML:1.0 +--- +names: + - none + - Camera_B12 + - none + - Camera_B11 + - none + - Camera_B10 + - none + - Camera_B9 + - none + - Camera_B8 + - none + - Camera_B7 + - none + - Camera_B6 + - none + - Camera_B5 + - none + - Camera_B4 + - none + - Camera_B3 + - none + - Camera_B2 + - none + - Camera_B1 + - none + - Camera_B23 + - none + - Camera_B22 + - none + - Camera_B21 + - none + - Camera_B20 + - none + - Camera_B19 + - none + - Camera_B18 + - none + - Camera_B17 + - none + - Camera_B16 + - none + - Camera_B15 + - none + - Camera_B14 + - none + - Camera_B13 + - none +K_Camera_B12: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.1091917255833230e+03, 0., 5.6440896220196316e+02, 0., + 1.1106791422331817e+03, 5.1030668905721933e+02, 0., 0., 1. ] +dist_Camera_B12: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.7596060192696187e-01, 1.0354242729541484e-01, + -9.1925361738261335e-03, 1.4000854069313937e-02, + 1.4758992736542245e-01 ] +K_Camera_B11: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0879891048955728e+03, 0., 5.3498293829391309e+02, 0., + 1.0944882794156008e+03, 4.8178458669392478e+02, 0., 0., 1. 
] +dist_Camera_B11: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.7471367432460081e-01, 2.3267594307165820e-01, + -7.3418368300264903e-03, 9.9444716593946672e-03, + -1.2501374407508456e-01 ] +K_Camera_B10: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0923847397901995e+03, 0., 5.3667369287475469e+02, 0., + 1.0970820622714029e+03, 4.9056644177497316e+02, 0., 0., 1. ] +dist_Camera_B10: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.9133192351381282e-01, 3.5989844103847662e-01, + -3.3297429306282851e-03, 5.8250557924471869e-03, + -6.2357224776620912e-01 ] +K_Camera_B9: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0719991414386100e+03, 0., 5.0378496152963129e+02, 0., + 1.0757069508277768e+03, 4.7828759612230118e+02, 0., 0., 1. ] +dist_Camera_B9: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.3715928341449133e-01, 6.7687972228621482e-02, + -3.2693033642682734e-03, 7.8986947616669191e-04, + 1.4430452495425825e-01 ] +K_Camera_B8: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0683282866376944e+03, 0., 4.9098907844069333e+02, 0., + 1.0735170932863271e+03, 4.8531224424358606e+02, 0., 0., 1. ] +dist_Camera_B8: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.3230762402360033e-01, -5.6021504615374397e-02, + -2.9043233422317754e-03, -4.5792847218769066e-04, + 5.5691309725599503e-01 ] +K_Camera_B7: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0451707619285985e+03, 0., 5.2135450507813403e+02, 0., + 1.0476294390372880e+03, 4.9714630078796682e+02, 0., 0., 1. ] +dist_Camera_B7: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.3991901452065789e-01, 1.0028162232178148e-01, + -1.4004328305969374e-03, 1.5674643166853814e-03, + 1.7609633100139910e-02 ] +K_Camera_B6: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0921328430152896e+03, 0., 5.0316989661352198e+02, 0., + 1.0968298827658252e+03, 5.1132692738666515e+02, 0., 0., 1. ] +dist_Camera_B6: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -3.4209828581703700e-01, 4.7399716881876530e-01, + -1.0286751038968161e-02, 3.6411136656284363e-04, + -4.2691971916953059e-01 ] +K_Camera_B5: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0477129462732446e+03, 0., 5.3203075489997184e+02, 0., + 1.0519329274287652e+03, 5.0943473445495744e+02, 0., 0., 1. ] +dist_Camera_B5: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.9951973657515224e-01, 4.3701818545972110e-01, + -2.5291764103498836e-03, 7.6954620504772050e-03, + -5.6733311470890424e-01 ] +K_Camera_B4: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0266556123297910e+03, 0., 4.9849893971541900e+02, 0., + 1.0278705021764456e+03, 4.8197086070716034e+02, 0., 0., 1. ] +dist_Camera_B4: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.6265465356611856e-01, 3.2583690539586657e-01, + -3.5981526997848449e-03, 1.1778846072444931e-03, + -3.9125625702785977e-01 ] +K_Camera_B3: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0752489748712026e+03, 0., 5.5416891733694786e+02, 0., + 1.0785259638295427e+03, 4.6778419622319768e+02, 0., 0., 1. ] +dist_Camera_B3: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.9431203532646960e-01, 3.3886423815673955e-01, + -8.0502938335475366e-03, 8.5393524714125445e-03, + -3.0444232716674463e-01 ] +K_Camera_B2: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0762495955418644e+03, 0., 5.5719033350802124e+02, 0., + 1.0782845876384815e+03, 4.8160733204314971e+02, 0., 0., 1. 
] +dist_Camera_B2: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.9837767696860923e-01, 2.9003113739834530e-01, + -8.0755830867455282e-03, 6.8978603473662473e-03, + -1.8321637588194126e-01 ] +K_Camera_B1: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0742813883666900e+03, 0., 5.4283420574219178e+02, 0., + 1.0754229146070927e+03, 4.8488356114136775e+02, 0., 0., 1. ] +dist_Camera_B1: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.2035998058960343e-01, 3.0851358278917364e-02, + -5.3232147935248367e-03, 2.2231941160571482e-03, + 6.4312270930721421e-02 ] +K_Camera_B23: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0695518031500828e+03, 0., 5.3413942770226390e+02, 0., + 1.0765773881701659e+03, 4.7876395699978480e+02, 0., 0., 1. ] +dist_Camera_B23: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.2267021840260626e-01, 4.6187474476358417e-02, + -5.2573433191461577e-03, 2.6515918061338823e-03, + 7.5658711730721959e-02 ] +K_Camera_B22: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0690944073686794e+03, 0., 5.5181239448770657e+02, 0., + 1.0696844247548354e+03, 4.7738836153380157e+02, 0., 0., 1. ] +dist_Camera_B22: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.9297724932343910e-01, 4.2753425706840437e-01, + -3.6577694885466805e-03, 7.5197902366176839e-03, + -4.8095986960655795e-01 ] +K_Camera_B21: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0917212734247041e+03, 0., 5.1905190319068072e+02, 0., + 1.0992158091453805e+03, 4.8665145090203868e+02, 0., 0., 1. ] +dist_Camera_B21: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.5644116785054966e-01, 1.0707815072954464e-01, + -1.1240724434681666e-02, 3.0547252231559216e-03, + 3.7937373644412867e-02 ] +K_Camera_B20: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0505857699662017e+03, 0., 5.0742956290045515e+02, 0., + 1.0576821247898235e+03, 4.6227355626949571e+02, 0., 0., 1. ] +dist_Camera_B20: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.6728066042101878e-01, 2.2349690566460717e-01, + -4.1809893915359013e-03, 7.2555675619196142e-03, + -1.1382752990157738e-01 ] +K_Camera_B19: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0986233565779598e+03, 0., 5.3912718348159649e+02, 0., + 1.1087198856669340e+03, 4.8828012803333479e+02, 0., 0., 1. ] +dist_Camera_B19: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -3.0792280632994773e-01, 3.0907496810626744e-01, + -5.8244120671945418e-03, 8.8773563659444842e-03, + -1.3805158232527251e-01 ] +K_Camera_B18: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0755799525343564e+03, 0., 5.3127308787449522e+02, 0., + 1.0813842563213816e+03, 4.9138554444306124e+02, 0., 0., 1. ] +dist_Camera_B18: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.9779699363839646e-01, 3.9061745119950020e-01, + -7.1197248865818227e-03, 8.8580120286107338e-03, + -4.1368206417192671e-01 ] +K_Camera_B17: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0702849599120098e+03, 0., 5.2317211095319965e+02, 0., + 1.0760313155283059e+03, 4.7828715478756283e+02, 0., 0., 1. ] +dist_Camera_B17: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -3.0782940349778748e-01, 4.1894296916086871e-01, + -7.3161506980120158e-03, 4.9119359268566193e-03, + -3.8879901910338321e-01 ] +K_Camera_B16: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0570864073900682e+03, 0., 5.1043643239360500e+02, 0., + 1.0602368015090767e+03, 4.7724569058951175e+02, 0., 0., 1. 
] +dist_Camera_B16: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.3718170788761658e-01, 1.6068174960880816e-01, + -2.5401237201511241e-03, -1.1691277567362087e-03, + -1.3774692051662846e-01 ] +K_Camera_B15: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0915518073309697e+03, 0., 5.1538077479030119e+02, 0., + 1.0932638635228452e+03, 4.8001732857878619e+02, 0., 0., 1. ] +dist_Camera_B15: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.2427900327432443e-01, -9.0369946329388467e-03, + -6.3399725198581194e-03, 2.4611003775847091e-03, + 2.5947880262802958e-01 ] +K_Camera_B14: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0677936480828080e+03, 0., 5.5662323847553887e+02, 0., + 1.0712424274514806e+03, 5.0264563394686542e+02, 0., 0., 1. ] +dist_Camera_B14: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -2.7989863588778285e-01, 2.6862609961536077e-01, + -3.1226769517469855e-03, 4.3734192605385039e-03, + -2.1388252753882769e-01 ] +K_Camera_B13: !!opencv-matrix + rows: 3 + cols: 3 + dt: d + data: [ 1.0844058116048157e+03, 0., 4.8055535387290013e+02, 0., + 1.0875885252383673e+03, 4.8013293996705187e+02, 0., 0., 1. ] +dist_Camera_B13: !!opencv-matrix + rows: 1 + cols: 5 + dt: d + data: [ -3.2942169636683050e-01, 6.3412856205651347e-01, + -5.4488189827586889e-03, 3.5748771064801435e-03, + -9.2619987130355874e-01 ] diff --git a/tools/custom/file_structure.png b/tools/custom/file_structure.png new file mode 100644 index 0000000000000000000000000000000000000000..db2595c8f7d2f224a14dee56c22db42bde5e548a Binary files /dev/null and b/tools/custom/file_structure.png differ diff --git a/tools/custom/get_annots.py b/tools/custom/get_annots.py new file mode 100644 index 0000000000000000000000000000000000000000..ab3f80f4d9040a0b6e1bdb0c9d5b768620feef93 --- /dev/null +++ b/tools/custom/get_annots.py @@ -0,0 +1,49 @@ +import cv2 +import numpy as np +import glob +import os +import json + + +def get_cams(): + intri = cv2.FileStorage('intri.yml', cv2.FILE_STORAGE_READ) + extri = cv2.FileStorage('extri.yml', cv2.FILE_STORAGE_READ) + cams = {'K': [], 'D': [], 'R': [], 'T': []} + for i in range(23): + cams['K'].append(intri.getNode('K_Camera_B{}'.format(i + 1)).mat()) + cams['D'].append( + intri.getNode('dist_Camera_B{}'.format(i + 1)).mat().T) + cams['R'].append(extri.getNode('Rot_Camera_B{}'.format(i + 1)).mat()) + cams['T'].append(extri.getNode('T_Camera_B{}'.format(i + 1)).mat() * 1000) + return cams + + +def get_img_paths(): + all_ims = [] + for i in range(23): + i = i + 1 + data_root = 'Camera_B{}'.format(i) + ims = glob.glob(os.path.join(data_root, '*.jpg')) + ims = np.array(sorted(ims)) + all_ims.append(ims) + num_img = min([len(ims) for ims in all_ims]) + all_ims = [ims[:num_img] for ims in all_ims] + all_ims = np.stack(all_ims, axis=1) + return all_ims + + +cams = get_cams() +img_paths = get_img_paths() + +annot = {} +annot['cams'] = cams + +ims = [] +for img_path, kpt in zip(img_paths, kpts2d): + data = {} + data['ims'] = img_path.tolist() + ims.append(data) +annot['ims'] = ims + +np.save('annots.npy', annot) +np.save('annots_python2.npy', annot, fix_imports=True) diff --git a/tools/prepare_warping.py b/tools/prepare_warping.py new file mode 100644 index 0000000000000000000000000000000000000000..c34b3b0188844ede50827f025a464fa9d1586b58 --- /dev/null +++ b/tools/prepare_warping.py @@ -0,0 +1,224 @@ +""" +Prepare blend weights of grid points +""" + +import os +import json +import numpy as np +import cv2 +import sys 
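# Note on tools/custom/get_annots.py above: its final loop,
#     for img_path, kpt in zip(img_paths, kpts2d):
# reads kpts2d, which is never defined in that script as shown, so it fails at
# runtime. A minimal sketch of the intended loop, assuming annots.npy only
# needs the per-frame image paths (2D keypoints, if required, would have to be
# detected separately and zipped back in):
#
#     ims = []
#     for img_path in img_paths:
#         ims.append({'ims': img_path.tolist()})
#     annot['ims'] = ims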
+sys.path.append('/mnt/data/home/pengsida/Codes/SMPL_CPP/build/python') +import pysmplceres +import open3d as o3d +import pyskeleton +from psbody.mesh import Mesh +import pickle + +# initialize a smpl model +pysmplceres.loadSMPL('/mnt/data/home/pengsida/Codes/SMPL_CPP/model/smpl/', + 'smpl') + + +def read_pickle(pkl_path): + with open(pkl_path, 'rb') as f: + u = pickle._Unpickler(f) + u.encoding = 'latin1' + return u.load() + + +def get_o3d_mesh(vertices, faces): + mesh = o3d.geometry.TriangleMesh() + mesh.vertices = o3d.utility.Vector3dVector(vertices) + mesh.triangles = o3d.utility.Vector3iVector(faces) + mesh.compute_vertex_normals() + return mesh + + +def barycentric_interpolation(val, coords): + """ + :param val: verts x 3 x d input matrix + :param coords: verts x 3 barycentric weights array + :return: verts x d weighted matrix + """ + t = val * coords[..., np.newaxis] + ret = t.sum(axis=1) + return ret + + +def process_shapedirs(shapedirs, vert_ids, bary_coords): + arr = [] + for i in range(3): + t = barycentric_interpolation(shapedirs[:, i, :][vert_ids], + bary_coords) + arr.append(t[:, np.newaxis, :]) + arr = np.concatenate(arr, axis=1) + return arr + + +def batch_rodrigues(poses): + """ poses: N x 3 + """ + batch_size = poses.shape[0] + angle = np.linalg.norm(poses + 1e-8, axis=1, keepdims=True) + rot_dir = poses / angle + + cos = np.cos(angle)[:, None] + sin = np.sin(angle)[:, None] + + rx, ry, rz = np.split(rot_dir, 3, axis=1) + zeros = np.zeros([batch_size, 1]) + K = np.concatenate([zeros, -rz, ry, rz, zeros, -rx, -ry, rx, zeros], + axis=1) + K = K.reshape([batch_size, 3, 3]) + + ident = np.eye(3)[None] + rot_mat = ident + sin * K + (1 - cos) * np.matmul(K, K) + + return rot_mat + + +def get_rigid_transformation(rot_mats, joints, parents): + """ + rot_mats: 24 x 3 x 3 + joints: 24 x 3 + parents: 24 + """ + # obtain the relative joints + rel_joints = joints.copy() + rel_joints[1:] -= joints[parents[1:]] + + # create the transformation matrix + transforms_mat = np.concatenate([rot_mats, rel_joints[..., None]], axis=2) + padding = np.zeros([24, 1, 4]) + padding[..., 3] = 1 + transforms_mat = np.concatenate([transforms_mat, padding], axis=1) + + # rotate each part + transform_chain = [transforms_mat[0]] + for i in range(1, parents.shape[0]): + curr_res = np.dot(transform_chain[parents[i]], transforms_mat[i]) + transform_chain.append(curr_res) + transforms = np.stack(transform_chain, axis=0) + + # obtain the rigid transformation + padding = np.zeros([24, 1]) + joints_homogen = np.concatenate([joints, padding], axis=1) + transformed_joints = np.sum(transforms * joints_homogen[:, None], axis=2) + transforms[..., 3] = transforms[..., 3] - transformed_joints + + return transforms + + +def get_transform_params(smpl, params): + """ obtain the transformation parameters for linear blend skinning + """ + v_template = np.array(smpl['v_template']) + + # add shape blend shapes + shapedirs = np.array(smpl['shapedirs']) + betas = params['shapes'] + v_shaped = v_template + np.sum(shapedirs * betas[None], axis=2) + + # add pose blend shapes + poses = params['poses'].reshape(-1, 3) + # 24 x 3 x 3 + rot_mats = batch_rodrigues(poses) + # 23 x 3 x 3 + pose_feature = rot_mats[1:].reshape(23, 3, 3) - np.eye(3)[None] + pose_feature = pose_feature.reshape(1, 1, 207) + posedirs = np.array(smpl['posedirs']) + # v_posed = v_shaped + np.sum(posedirs * pose_feature, axis=2) + v_posed = v_shaped + + # obtain the joints + joints = smpl['J_regressor'].dot(v_shaped) + + # obtain the rigid transformation + parents = 
smpl['kintree_table'][0] + A = get_rigid_transformation(rot_mats, joints, parents) + + # apply global transformation + R = cv2.Rodrigues(params['Rh'][0])[0] + Th = params['Th'] + + return A, R, Th + + +def get_colored_pc(pts, rgb): + pc = o3d.geometry.PointCloud() + pc.points = o3d.utility.Vector3dVector(pts) + colors = np.zeros_like(pts) + colors += rgb + pc.colors = o3d.utility.Vector3dVector(colors) + return pc + + +def get_grid_points(xyz): + min_xyz = np.min(xyz, axis=0) + max_xyz = np.max(xyz, axis=0) + min_xyz -= 0.05 + max_xyz += 0.05 + bounds = np.stack([min_xyz, max_xyz], axis=0) + vsize = 0.025 + voxel_size = [vsize, vsize, vsize] + x = np.arange(bounds[0, 0], bounds[1, 0] + voxel_size[0], voxel_size[0]) + y = np.arange(bounds[0, 1], bounds[1, 1] + voxel_size[1], voxel_size[1]) + z = np.arange(bounds[0, 2], bounds[1, 2] + voxel_size[2], voxel_size[2]) + pts = np.stack(np.meshgrid(x, y, z, indexing='ij'), axis=-1) + return pts + + +def get_canpts(param_path): + params = np.load(param_path, allow_pickle=True).item() + vertices = pysmplceres.getVertices(params)[0] + faces = pysmplceres.getFaces() + mesh = get_o3d_mesh(vertices, faces) + + smpl = read_pickle( + '/mnt/data/home/pengsida/Codes/EasyMocap/data/smplx/smpl/SMPL_NEUTRAL.pkl' + ) + # obtain the transformation parameters for linear blend skinning + A, R, Th = get_transform_params(smpl, params) + + # transform points from the world space to the pose space + pxyz = np.dot(vertices - Th, R) + smpl_mesh = Mesh(pxyz, faces) + + # create grid points in the pose space + pts = get_grid_points(pxyz) + sh = pts.shape + pts = pts.reshape(-1, 3) + + # obtain the blending weights for grid points + closest_face, closest_points = smpl_mesh.closest_faces_and_points(pts) + vert_ids, bary_coords = smpl_mesh.barycentric_coordinates_for_points( + closest_points, closest_face.astype('int32')) + bweights = barycentric_interpolation(smpl['weights'][vert_ids], + bary_coords) + + A = np.dot(bweights, A.reshape(24, -1)).reshape(-1, 4, 4) + can_pts = pts - A[:, :3, 3] + R_inv = np.linalg.inv(A[:, :3, :3]) + can_pts = np.sum(R_inv * can_pts[:, None], axis=2) + + can_pts = can_pts.reshape(*sh).astype(np.float32) + + return can_pts + + +def prepare_tpose(): + data_root = '/home/pengsida/Datasets/light_stage' + human = 'CoreView_315' + param_dir = os.path.join(data_root, human, 'params') + canpts_dir = os.path.join(data_root, human, 'canpts') + os.system('mkdir -p {}'.format(canpts_dir)) + + for i in range(len(os.listdir(param_dir))): + i = i + 1 + param_path = os.path.join(param_dir, '{}.npy'.format(i)) + canpts = get_canpts(param_path) + canpts_path = os.path.join(canpts_dir, '{}.npy'.format(i)) + np.save(canpts_path, canpts) + + +prepare_tpose() diff --git a/tools/process_snapshot.py b/tools/process_snapshot.py new file mode 100644 index 0000000000000000000000000000000000000000..a16acd3b457351187f8cce439c8c19328bc49381 --- /dev/null +++ b/tools/process_snapshot.py @@ -0,0 +1,146 @@ +import pickle +import os +import h5py +import sys +import numpy as np +import open3d as o3d +from snapshot_smpl.smpl import Smpl +import cv2 +import tqdm + + +def read_pickle(pkl_path): + with open(pkl_path, 'rb') as f: + u = pickle._Unpickler(f) + u.encoding = 'latin1' + return u.load() + + +def get_KRTD(camera): + K = np.zeros([3, 3]) + K[0, 0] = camera['camera_f'][0] + K[1, 1] = camera['camera_f'][1] + K[:2, 2] = camera['camera_c'] + K[2, 2] = 1 + R = np.eye(3) + T = np.zeros([3]) + D = camera['camera_k'] + return K, R, T, D + + +def get_o3d_mesh(vertices, faces): + mesh 
= o3d.geometry.TriangleMesh() + mesh.vertices = o3d.utility.Vector3dVector(vertices) + mesh.triangles = o3d.utility.Vector3iVector(faces) + mesh.compute_vertex_normals() + return mesh + + +def get_smpl(base_smpl, betas, poses, trans): + base_smpl.betas = betas + base_smpl.pose = poses + base_smpl.trans = trans + vertices = np.array(base_smpl) + + faces = base_smpl.f + mesh = get_o3d_mesh(vertices, faces) + + return vertices, mesh + + +def render_smpl(mesh, img, K, R, T): + vertices = np.array(mesh.vertices) + rendered_img = renderer.render_multiview(vertices, K[None], R[None], + T[None, None], [img])[0] + return rendered_img + + +def extract_image(data_path): + data_root = os.path.dirname(data_path) + img_dir = os.path.join(data_root, 'image') + os.system('mkdir -p {}'.format(img_dir)) + + if len(os.listdir(img_dir)) >= 200: + return + + cap = cv2.VideoCapture(data_path) + + ret, frame = cap.read() + i = 0 + + while ret: + cv2.imwrite(os.path.join(img_dir, '{}.jpg'.format(i)), frame) + ret, frame = cap.read() + i = i + 1 + + cap.release() + + +def extract_mask(masks, mask_dir): + if len(os.listdir(mask_dir)) >= len(masks): + return + + for i in tqdm.tqdm(range(len(masks))): + mask = masks[i].astype(np.uint8) + + # erode the mask + border = 4 + kernel = np.ones((border, border), np.uint8) + mask = cv2.erode(mask.copy(), kernel) + + cv2.imwrite(os.path.join(mask_dir, '{}.png'.format(i)), mask) + + +data_root = 'data/people_snapshot' +videos = ['female-3-casual'] + +model_paths = [ + 'basicModel_f_lbs_10_207_0_v1.0.0.pkl', + 'basicmodel_m_lbs_10_207_0_v1.0.0.pkl' +] + +for video in videos: + camera_path = os.path.join(data_root, video, 'camera.pkl') + camera = read_pickle(camera_path) + K, R, T, D = get_KRTD(camera) + + # process video + video_path = os.path.join(data_root, video, video + '.mp4') + extract_image(video_path) + + # process mask + mask_path = os.path.join(data_root, video, 'masks.hdf5') + masks = h5py.File(mask_path)['masks'] + mask_dir = os.path.join(data_root, video, 'mask') + os.system('mkdir -p {}'.format(mask_dir)) + extract_mask(masks, mask_dir) + + smpl_path = os.path.join(data_root, video, 'reconstructed_poses.hdf5') + smpl = h5py.File(smpl_path) + betas = smpl['betas'] + pose = smpl['pose'] + trans = smpl['trans'] + + pose = pose[len(pose) - len(masks):] + trans = trans[len(trans) - len(masks):] + + # process smpl parameters + params = {'beta': np.array(betas), 'pose': pose, 'trans': trans} + params_path = os.path.join(data_root, video, 'params.npy') + np.save(params_path, params) + + if 'female' in video: + model_path = model_paths[0] + else: + model_path = model_paths[1] + model_data = read_pickle(model_path) + + img_dir = os.path.join(data_root, video, 'image') + vertices_dir = os.path.join(data_root, video, 'vertices') + os.system('mkdir -p {}'.format(vertices_dir)) + + num_img = len(os.listdir(img_dir)) + for i in tqdm.tqdm(range(num_img)): + base_smpl = Smpl(model_data) + vertices, mesh = get_smpl(base_smpl, betas, pose[i], trans[i]) + np.save(os.path.join(vertices_dir, '{}.npy'.format(i)), vertices) diff --git a/tools/render/cam_render.py b/tools/render/cam_render.py new file mode 100644 index 0000000000000000000000000000000000000000..2b3575efbf4ad8c9466cc3fd2ba361b282c40e1a --- /dev/null +++ b/tools/render/cam_render.py @@ -0,0 +1,72 @@ +''' +MIT License + +Copyright (c) 2019 Shunsuke Saito, Zeng Huang, and Ryota Natsume + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the 
"Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +''' +from OpenGL.GLUT import * + +from .render import Render + + +class CamRender(Render): + def __init__(self, width=1600, height=1200, name='Cam Renderer', + program_files=['simple.fs', 'simple.vs'], color_size=1, ms_rate=1): + Render.__init__(self, width, height, name, program_files, color_size, ms_rate) + self.camera = None + + glutDisplayFunc(self.display) + glutKeyboardFunc(self.keyboard) + + def set_camera(self, camera): + self.camera = camera + self.projection_matrix, self.model_view_matrix = camera.get_gl_matrix() + + def set_matrices(self, projection, modelview): + self.projection_matrix = projection + self.model_view_matrix = modelview + + def keyboard(self, key, x, y): + # up + eps = 1 + # print(key) + if key == b'w': + self.camera.center += eps * self.camera.direction + elif key == b's': + self.camera.center -= eps * self.camera.direction + if key == b'a': + self.camera.center -= eps * self.camera.right + elif key == b'd': + self.camera.center += eps * self.camera.right + if key == b' ': + self.camera.center += eps * self.camera.up + elif key == b'x': + self.camera.center -= eps * self.camera.up + elif key == b'i': + self.camera.near += 0.1 * eps + self.camera.far += 0.1 * eps + elif key == b'o': + self.camera.near -= 0.1 * eps + self.camera.far -= 0.1 * eps + + self.projection_matrix, self.model_view_matrix = self.camera.get_gl_matrix() + + def show(self): + glutMainLoop() diff --git a/tools/render/camera.py b/tools/render/camera.py new file mode 100644 index 0000000000000000000000000000000000000000..df5a7ece63f05e7b7c103c4598fce60cffeb6809 --- /dev/null +++ b/tools/render/camera.py @@ -0,0 +1,240 @@ +''' +MIT License + +Copyright (c) 2019 Shunsuke Saito, Zeng Huang, and Ryota Natsume + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +''' +import cv2 +import numpy as np + +from .glm import ortho + + +class Camera: + def __init__(self, width=1600, height=1200): + # Focal Length + # equivalent 50mm + focal = np.sqrt(width * width + height * height) + self.focal_x = focal + self.focal_y = focal + # Principal Point Offset + self.principal_x = width / 2 + self.principal_y = height / 2 + # Axis Skew + self.skew = 0 + # Image Size + self.width = width + self.height = height + + self.near = 1 + self.far = 10 + + # Camera Center + self.eye = np.array([0, 0, -3.6]) + self.center = np.array([0, 0, 0]) + self.direction = np.array([0, 0, -1]) + self.right = np.array([1, 0, 0]) + self.up = np.array([0, 1, 0]) + + self.ortho_ratio = None + + def sanity_check(self): + self.center = self.center.reshape([-1]) + self.direction = self.direction.reshape([-1]) + self.right = self.right.reshape([-1]) + self.up = self.up.reshape([-1]) + + assert len(self.center) == 3 + assert len(self.direction) == 3 + assert len(self.right) == 3 + assert len(self.up) == 3 + + @staticmethod + def normalize_vector(v): + v_norm = np.linalg.norm(v) + return v if v_norm == 0 else v / v_norm + + def get_real_z_value(self, z): + z_near = self.near + z_far = self.far + z_n = 2.0 * z - 1.0 + z_e = 2.0 * z_near * z_far / (z_far + z_near - z_n * (z_far - z_near)) + return z_e + + def get_rotation_matrix(self): + rot_mat = np.eye(3) + d = self.eye - self.center + d = -self.normalize_vector(d) + u = self.up + self.right = -np.cross(u, d) + u = np.cross(d, self.right) + rot_mat[0, :] = self.right + rot_mat[1, :] = u + rot_mat[2, :] = d + + # s = self.right + # s = self.normalize_vector(s) + # rot_mat[0, :] = s + # u = self.up + # u = self.normalize_vector(u) + # rot_mat[1, :] = -u + # rot_mat[2, :] = self.normalize_vector(self.direction) + + return rot_mat + + def get_translation_vector(self): + rot_mat = self.get_rotation_matrix() + trans = -np.dot(rot_mat.T, self.eye) + return trans + + def get_intrinsic_matrix(self): + int_mat = np.eye(3) + + int_mat[0, 0] = self.focal_x + int_mat[1, 1] = self.focal_y + int_mat[0, 1] = self.skew + int_mat[0, 2] = self.principal_x + int_mat[1, 2] = self.principal_y + + return int_mat + + def get_projection_matrix(self): + ext_mat = self.get_extrinsic_matrix() + int_mat = self.get_intrinsic_matrix() + + return np.matmul(int_mat, ext_mat) + + def get_extrinsic_matrix(self): + rot_mat = self.get_rotation_matrix() + int_mat = self.get_intrinsic_matrix() + trans = self.get_translation_vector() + + extrinsic = np.eye(4) + extrinsic[:3, :3] = rot_mat + extrinsic[:3, 3] = trans + + return extrinsic[:3, :] + + def set_rotation_matrix(self, rot_mat): + self.direction = rot_mat[2, :] + self.up = -rot_mat[1, :] + self.right = rot_mat[0, :] + + def set_intrinsic_matrix(self, int_mat): + self.focal_x = int_mat[0, 0] + self.focal_y = int_mat[1, 1] + self.skew = int_mat[0, 1] + self.principal_x = int_mat[0, 2] + self.principal_y = int_mat[1, 2] + + def set_projection_matrix(self, proj_mat): + res = cv2.decomposeProjectionMatrix(proj_mat) + int_mat, rot_mat, camera_center_homo = res[0], res[1], res[2] + camera_center = camera_center_homo[0:3] / camera_center_homo[3] + camera_center = camera_center.reshape(-1) + int_mat = int_mat / int_mat[2][2] + + self.set_intrinsic_matrix(int_mat) + 
self.set_rotation_matrix(rot_mat) + self.center = camera_center + + self.sanity_check() + + def get_gl_matrix(self): + z_near = self.near + z_far = self.far + rot_mat = self.get_rotation_matrix() + int_mat = self.get_intrinsic_matrix() + trans = self.get_translation_vector() + + extrinsic = np.eye(4) + extrinsic[:3, :3] = rot_mat + extrinsic[:3, 3] = trans + axis_adj = np.eye(4) + axis_adj[2, 2] = -1 + axis_adj[1, 1] = -1 + model_view = np.matmul(axis_adj, extrinsic) + + projective = np.zeros([4, 4]) + projective[:2, :2] = int_mat[:2, :2] + projective[:2, 2:3] = -int_mat[:2, 2:3] + projective[3, 2] = -1 + projective[2, 2] = (z_near + z_far) + projective[2, 3] = (z_near * z_far) + + if self.ortho_ratio is None: + ndc = ortho(0, self.width, 0, self.height, z_near, z_far) + perspective = np.matmul(ndc, projective) + else: + perspective = ortho(-self.width * self.ortho_ratio / 2, self.width * self.ortho_ratio / 2, + -self.height * self.ortho_ratio / 2, self.height * self.ortho_ratio / 2, + z_near, z_far) + + return perspective, model_view + + +def KRT_from_P(proj_mat, normalize_K=True): + res = cv2.decomposeProjectionMatrix(proj_mat) + K, Rot, camera_center_homog = res[0], res[1], res[2] + camera_center = camera_center_homog[0:3] / camera_center_homog[3] + trans = -Rot.dot(camera_center) + if normalize_K: + K = K / K[2][2] + return K, Rot, trans + + +def MVP_from_P(proj_mat, width, height, near=0.1, far=10000): + ''' + Convert OpenCV camera calibration matrix to OpenGL projection and model view matrix + :param proj_mat: OpenCV camera projeciton matrix + :param width: Image width + :param height: Image height + :param near: Z near value + :param far: Z far value + :return: OpenGL projection matrix and model view matrix + ''' + res = cv2.decomposeProjectionMatrix(proj_mat) + K, Rot, camera_center_homog = res[0], res[1], res[2] + camera_center = camera_center_homog[0:3] / camera_center_homog[3] + trans = -Rot.dot(camera_center) + K = K / K[2][2] + + extrinsic = np.eye(4) + extrinsic[:3, :3] = Rot + extrinsic[:3, 3:4] = trans + axis_adj = np.eye(4) + axis_adj[2, 2] = -1 + axis_adj[1, 1] = -1 + model_view = np.matmul(axis_adj, extrinsic) + + zFar = far + zNear = near + projective = np.zeros([4, 4]) + projective[:2, :2] = K[:2, :2] + projective[:2, 2:3] = -K[:2, 2:3] + projective[3, 2] = -1 + projective[2, 2] = (zNear + zFar) + projective[2, 3] = (zNear * zFar) + + ndc = ortho(0, width, 0, height, zNear, zFar) + + perspective = np.matmul(ndc, projective) + + return perspective, model_view diff --git a/tools/render/color.fs b/tools/render/color.fs new file mode 100644 index 0000000000000000000000000000000000000000..dba204279565f0464c2f15cb0a976afb012dc013 --- /dev/null +++ b/tools/render/color.fs @@ -0,0 +1,10 @@ +#version 330 core + +out vec4 FragColor; + +in vec3 Color; + +void main() +{ + FragColor = vec4(Color,1.0); +} diff --git a/tools/render/color.vs b/tools/render/color.vs new file mode 100644 index 0000000000000000000000000000000000000000..d3d4eafee05de84db079b99dd53628793188de0c --- /dev/null +++ b/tools/render/color.vs @@ -0,0 +1,17 @@ +#version 330 core + +layout (location = 0) in vec3 a_Position; +layout (location = 1) in vec3 a_Color; + +out vec3 CamNormal; +out vec3 CamPos; +out vec3 Color; + +uniform mat4 ModelMat; +uniform mat4 PerspMat; + +void main() +{ + gl_Position = PerspMat * ModelMat * vec4(a_Position, 1.0); + Color = a_Color; +} diff --git a/tools/render/color_render.py b/tools/render/color_render.py new file mode 100644 index 
0000000000000000000000000000000000000000..c44bb04dbe13ceec9f00518a496cecfb0126bdd3 --- /dev/null +++ b/tools/render/color_render.py @@ -0,0 +1,113 @@ +''' +MIT License + +Copyright (c) 2019 Shunsuke Saito, Zeng Huang, and Ryota Natsume + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +''' +import numpy as np +import random + +from .framework import * +from .cam_render import CamRender + + +class ColorRender(CamRender): + def __init__(self, width=1600, height=1200, name='Color Renderer'): + program_files = ['color.vs', 'color.fs'] + CamRender.__init__(self, width, height, name, program_files=program_files) + + # WARNING: this differs from vertex_buffer and vertex_data in Render + self.vert_buffer = {} + self.vert_data = {} + + self.color_buffer = {} + self.color_data = {} + + self.vertex_dim = {} + self.n_vertices = {} + + def set_mesh(self, vertices, faces, color, faces_clr, mat_name='all'): + self.vert_data[mat_name] = vertices[faces.reshape([-1])] + self.n_vertices[mat_name] = self.vert_data[mat_name].shape[0] + self.vertex_dim[mat_name] = self.vert_data[mat_name].shape[1] + + if mat_name not in self.vert_buffer.keys(): + self.vert_buffer[mat_name] = glGenBuffers(1) + glBindBuffer(GL_ARRAY_BUFFER, self.vert_buffer[mat_name]) + glBufferData(GL_ARRAY_BUFFER, self.vert_data[mat_name], GL_STATIC_DRAW) + + self.color_data[mat_name] = color[faces_clr.reshape([-1])] + if mat_name not in self.color_buffer.keys(): + self.color_buffer[mat_name] = glGenBuffers(1) + glBindBuffer(GL_ARRAY_BUFFER, self.color_buffer[mat_name]) + glBufferData(GL_ARRAY_BUFFER, self.color_data[mat_name], GL_STATIC_DRAW) + + glBindBuffer(GL_ARRAY_BUFFER, 0) + + def cleanup(self): + + glBindBuffer(GL_ARRAY_BUFFER, 0) + for key in self.vert_data: + glDeleteBuffers(1, [self.vert_buffer[key]]) + glDeleteBuffers(1, [self.color_buffer[key]]) + + self.vert_buffer = {} + self.vert_data = {} + + self.color_buffer = {} + self.color_data = {} + + self.render_texture_mat = {} + + self.vertex_dim = {} + self.n_vertices = {} + + def draw(self): + self.draw_init() + + glEnable(GL_MULTISAMPLE) + + glUseProgram(self.program) + glUniformMatrix4fv(self.model_mat_unif, 1, GL_FALSE, self.model_view_matrix.transpose()) + glUniformMatrix4fv(self.persp_mat_unif, 1, GL_FALSE, self.projection_matrix.transpose()) + + for mat in self.vert_buffer: + # Handle vertex buffer + glBindBuffer(GL_ARRAY_BUFFER, self.vert_buffer[mat]) + glEnableVertexAttribArray(0) + glVertexAttribPointer(0, self.vertex_dim[mat], GL_DOUBLE, GL_FALSE, 0, None) + + # Handle normal buffer + 
glBindBuffer(GL_ARRAY_BUFFER, self.color_buffer[mat]) + glEnableVertexAttribArray(1) + glVertexAttribPointer(1, 3, GL_DOUBLE, GL_FALSE, 0, None) + + glDrawArrays(GL_TRIANGLES, 0, self.n_vertices[mat]) + + glDisableVertexAttribArray(1) + glDisableVertexAttribArray(0) + + glBindBuffer(GL_ARRAY_BUFFER, 0) + + glUseProgram(0) + + glDisable(GL_MULTISAMPLE) + + self.draw_end() diff --git a/tools/render/framework.py b/tools/render/framework.py new file mode 100644 index 0000000000000000000000000000000000000000..8f97bcca2abfc8387e77138acb22a87aff90a20b --- /dev/null +++ b/tools/render/framework.py @@ -0,0 +1,93 @@ +# Mario Rosasco, 2016 +# adapted from framework.cpp, Copyright (C) 2010-2012 by Jason L. McKesson +# This file is licensed under the MIT License. +# +# NB: Unlike in the framework.cpp organization, the main loop is contained +# in the tutorial files, not in this framework file. Additionally, a copy of +# this module file must exist in the same directory as the tutorial files +# to be imported properly. + + +import os + +from OpenGL.GL import * + + +# Function that creates and compiles shaders according to the given type (a GL enum value) and +# shader program (a file containing a GLSL program). +def loadShader(shaderType, shaderFile): + # check if file exists, get full path name + strFilename = findFileOrThrow(shaderFile) + shaderData = None + with open(strFilename, 'r') as f: + shaderData = f.read() + + shader = glCreateShader(shaderType) + glShaderSource(shader, shaderData) # note that this is a simpler function call than in C + + # This shader compilation is more explicit than the one used in + # framework.cpp, which relies on a glutil wrapper function. + # This is made explicit here mainly to decrease dependence on pyOpenGL + # utilities and wrappers, which docs caution may change in future versions. + glCompileShader(shader) + + status = glGetShaderiv(shader, GL_COMPILE_STATUS) + if status == GL_FALSE: + # Note that getting the error log is much simpler in Python than in C/C++ + # and does not require explicit handling of the string buffer + strInfoLog = glGetShaderInfoLog(shader) + strShaderType = "" + if shaderType is GL_VERTEX_SHADER: + strShaderType = "vertex" + elif shaderType is GL_GEOMETRY_SHADER: + strShaderType = "geometry" + elif shaderType is GL_FRAGMENT_SHADER: + strShaderType = "fragment" + + print("Compilation failure for " + strShaderType + " shader:\n" + str(strInfoLog)) + + return shader + + +# Function that accepts a list of shaders, compiles them, and returns a handle to the compiled program +def createProgram(shaderList): + program = glCreateProgram() + + for shader in shaderList: + glAttachShader(program, shader) + + glLinkProgram(program) + + status = glGetProgramiv(program, GL_LINK_STATUS) + if status == GL_FALSE: + # Note that getting the error log is much simpler in Python than in C/C++ + # and does not require explicit handling of the string buffer + strInfoLog = glGetProgramInfoLog(program) + print("Linker failure: \n" + str(strInfoLog)) + + for shader in shaderList: + glDetachShader(program, shader) + + return program + + +# Helper function to locate and open the target file (passed in as a string). +# Returns the full path to the file as a string. +def findFileOrThrow(strBasename): + # Keep constant names in C-style convention, for readability + # when comparing to C(/C++) code. + if os.path.isfile(strBasename): + return strBasename + + LOCAL_FILE_DIR = "." 
+ os.sep + GLOBAL_FILE_DIR = os.path.dirname(os.path.abspath(__file__)) + os.sep + + strFilename = LOCAL_FILE_DIR + strBasename + if os.path.isfile(strFilename): + return strFilename + + strFilename = GLOBAL_FILE_DIR + strBasename + if os.path.isfile(strFilename): + return strFilename + + raise IOError('Could not find target file ' + strBasename) diff --git a/tools/render/glm.py b/tools/render/glm.py new file mode 100644 index 0000000000000000000000000000000000000000..685805aaf2c264f4b9dacbcddc464f6bf0279832 --- /dev/null +++ b/tools/render/glm.py @@ -0,0 +1,148 @@ +''' +MIT License + +Copyright (c) 2019 Shunsuke Saito, Zeng Huang, and Ryota Natsume + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +''' +import numpy as np + + +def vec3(x, y, z): + return np.array([x, y, z], dtype=np.float32) + + +def radians(v): + return np.radians(v) + + +def identity(): + return np.identity(4, dtype=np.float32) + + +def empty(): + return np.zeros([4, 4], dtype=np.float32) + + +def magnitude(v): + return np.linalg.norm(v) + + +def normalize(v): + m = magnitude(v) + return v if m == 0 else v / m + + +def dot(u, v): + return np.sum(u * v) + + +def cross(u, v): + res = vec3(0, 0, 0) + res[0] = u[1] * v[2] - u[2] * v[1] + res[1] = u[2] * v[0] - u[0] * v[2] + res[2] = u[0] * v[1] - u[1] * v[0] + return res + + +# below functions can be optimized + +def translate(m, v): + res = np.copy(m) + res[:, 3] = m[:, 0] * v[0] + m[:, 1] * v[1] + m[:, 2] * v[2] + m[:, 3] + return res + + +def rotate(m, angle, v): + a = angle + c = np.cos(a) + s = np.sin(a) + + axis = normalize(v) + temp = (1 - c) * axis + + rot = empty() + rot[0][0] = c + temp[0] * axis[0] + rot[0][1] = temp[0] * axis[1] + s * axis[2] + rot[0][2] = temp[0] * axis[2] - s * axis[1] + + rot[1][0] = temp[1] * axis[0] - s * axis[2] + rot[1][1] = c + temp[1] * axis[1] + rot[1][2] = temp[1] * axis[2] + s * axis[0] + + rot[2][0] = temp[2] * axis[0] + s * axis[1] + rot[2][1] = temp[2] * axis[1] - s * axis[0] + rot[2][2] = c + temp[2] * axis[2] + + res = empty() + res[:, 0] = m[:, 0] * rot[0][0] + m[:, 1] * rot[0][1] + m[:, 2] * rot[0][2] + res[:, 1] = m[:, 0] * rot[1][0] + m[:, 1] * rot[1][1] + m[:, 2] * rot[1][2] + res[:, 2] = m[:, 0] * rot[2][0] + m[:, 1] * rot[2][1] + m[:, 2] * rot[2][2] + res[:, 3] = m[:, 3] + return res + + +def perspective(fovy, aspect, zNear, zFar): + tanHalfFovy = np.tan(fovy / 2) + + res = empty() + res[0][0] = 1 / (aspect * tanHalfFovy) + res[1][1] = 1 / (tanHalfFovy) + res[2][3] = -1 + res[2][2] = - (zFar + zNear) / (zFar - zNear) + res[3][2] = -(2 * zFar * 
zNear) / (zFar - zNear) + + return res.T + + +def ortho(left, right, bottom, top, zNear, zFar): + # res = np.ones([4, 4], dtype=np.float32) + res = identity() + res[0][0] = 2 / (right - left) + res[1][1] = 2 / (top - bottom) + res[2][2] = - 2 / (zFar - zNear) + res[3][0] = - (right + left) / (right - left) + res[3][1] = - (top + bottom) / (top - bottom) + res[3][2] = - (zFar + zNear) / (zFar - zNear) + return res.T + + +def lookat(eye, center, up): + f = normalize(center - eye) + s = normalize(cross(f, up)) + u = cross(s, f) + + res = identity() + res[0][0] = s[0] + res[1][0] = s[1] + res[2][0] = s[2] + res[0][1] = u[0] + res[1][1] = u[1] + res[2][1] = u[2] + res[0][2] = -f[0] + res[1][2] = -f[1] + res[2][2] = -f[2] + res[3][0] = -dot(s, eye) + res[3][1] = -dot(u, eye) + res[3][2] = -dot(f, eye) + return res.T + + +def transform(d, m): + return np.dot(m, d.T).T diff --git a/tools/render/quad.fs b/tools/render/quad.fs new file mode 100644 index 0000000000000000000000000000000000000000..de64baccbcc3ecff27517585f9e9acef6dd93001 --- /dev/null +++ b/tools/render/quad.fs @@ -0,0 +1,12 @@ +#version 330 core + +out vec4 FragColor; + +in vec2 TexCoord; + +uniform sampler2D screenTexture; + +void main() +{ + FragColor = texture(screenTexture, TexCoord); +} diff --git a/tools/render/quad.vs b/tools/render/quad.vs new file mode 100644 index 0000000000000000000000000000000000000000..2b6e3bece882fdedf220ee2fef45cdd3a3d4c947 --- /dev/null +++ b/tools/render/quad.vs @@ -0,0 +1,12 @@ +#version 330 core + +layout (location = 0) in vec2 aPos; +layout (location = 1) in vec2 aTexCoord; + +out vec2 TexCoord; + +void main() +{ + gl_Position = vec4(aPos.x, aPos.y, 0.0, 1.0); + TexCoord = aTexCoord; +} diff --git a/tools/render/render.py b/tools/render/render.py new file mode 100644 index 0000000000000000000000000000000000000000..8f26dc5ed7a7e3d429e62792d85e5b0d55c4c1f4 --- /dev/null +++ b/tools/render/render.py @@ -0,0 +1,337 @@ +''' +MIT License + +Copyright (c) 2019 Shunsuke Saito, Zeng Huang, and Ryota Natsume + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+''' +import numpy as np +from OpenGL.GLUT import * +from .framework import * + +_glut_window = None + +class Render: + def __init__(self, width=1600, height=1200, name='GL Renderer', + program_files=['simple.fs', 'simple.vs'], color_size=1, ms_rate=1): + self.width = width + self.height = height + self.name = name + self.display_mode = GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH + self.use_inverse_depth = False + + global _glut_window + if _glut_window is None: + glutInit() + glutInitDisplayMode(self.display_mode) + glutInitWindowSize(self.width, self.height) + glutInitWindowPosition(0, 0) + _glut_window = glutCreateWindow("My Render.") + + # glEnable(GL_DEPTH_CLAMP) + glEnable(GL_DEPTH_TEST) + + glClampColor(GL_CLAMP_READ_COLOR, GL_FALSE) + glClampColor(GL_CLAMP_FRAGMENT_COLOR, GL_FALSE) + glClampColor(GL_CLAMP_VERTEX_COLOR, GL_FALSE) + + # init program + shader_list = [] + + for program_file in program_files: + _, ext = os.path.splitext(program_file) + if ext == '.vs': + shader_list.append(loadShader(GL_VERTEX_SHADER, program_file)) + elif ext == '.fs': + shader_list.append(loadShader(GL_FRAGMENT_SHADER, program_file)) + elif ext == '.gs': + shader_list.append(loadShader(GL_GEOMETRY_SHADER, program_file)) + + self.program = createProgram(shader_list) + + for shader in shader_list: + glDeleteShader(shader) + + # Init uniform variables + self.model_mat_unif = glGetUniformLocation(self.program, 'ModelMat') + self.persp_mat_unif = glGetUniformLocation(self.program, 'PerspMat') + + self.vertex_buffer = glGenBuffers(1) + + # Init screen quad program and buffer + self.quad_program, self.quad_buffer = self.init_quad_program() + + # Configure frame buffer + self.frame_buffer = glGenFramebuffers(1) + glBindFramebuffer(GL_FRAMEBUFFER, self.frame_buffer) + + self.intermediate_fbo = None + if ms_rate > 1: + # Configure texture buffer to render to + self.color_buffer = [] + for i in range(color_size): + color_buffer = glGenTextures(1) + multi_sample_rate = ms_rate + glBindTexture(GL_TEXTURE_2D_MULTISAMPLE, color_buffer) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR) + glTexImage2DMultisample(GL_TEXTURE_2D_MULTISAMPLE, multi_sample_rate, GL_RGBA32F, self.width, self.height, GL_TRUE) + glBindTexture(GL_TEXTURE_2D_MULTISAMPLE, 0) + glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i, GL_TEXTURE_2D_MULTISAMPLE, color_buffer, 0) + self.color_buffer.append(color_buffer) + + self.render_buffer = glGenRenderbuffers(1) + glBindRenderbuffer(GL_RENDERBUFFER, self.render_buffer) + glRenderbufferStorageMultisample(GL_RENDERBUFFER, multi_sample_rate, GL_DEPTH24_STENCIL8, self.width, self.height) + glBindRenderbuffer(GL_RENDERBUFFER, 0) + glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_RENDERBUFFER, self.render_buffer) + + attachments = [] + for i in range(color_size): + attachments.append(GL_COLOR_ATTACHMENT0 + i) + glDrawBuffers(color_size, attachments) + glBindFramebuffer(GL_FRAMEBUFFER, 0) + + self.intermediate_fbo = glGenFramebuffers(1) + glBindFramebuffer(GL_FRAMEBUFFER, self.intermediate_fbo) + + self.screen_texture = [] + for i in range(color_size): + screen_texture = glGenTextures(1) + glBindTexture(GL_TEXTURE_2D, screen_texture) + glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, self.width, self.height, 0, GL_RGBA, GL_FLOAT, None) + 
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR) + glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i, GL_TEXTURE_2D, screen_texture, 0) + self.screen_texture.append(screen_texture) + + glDrawBuffers(color_size, attachments) + glBindFramebuffer(GL_FRAMEBUFFER, 0) + else: + self.color_buffer = [] + for i in range(color_size): + color_buffer = glGenTextures(1) + glBindTexture(GL_TEXTURE_2D, color_buffer) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST) + glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, self.width, self.height, 0, GL_RGBA, GL_FLOAT, None) + glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i, GL_TEXTURE_2D, color_buffer, 0) + self.color_buffer.append(color_buffer) + + # Configure depth texture map to render to + self.depth_buffer = glGenTextures(1) + glBindTexture(GL_TEXTURE_2D, self.depth_buffer) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST) + glTexParameteri(GL_TEXTURE_2D, GL_DEPTH_TEXTURE_MODE, GL_INTENSITY) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE, GL_COMPARE_R_TO_TEXTURE) + glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL) + glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, self.width, self.height, 0, GL_DEPTH_COMPONENT, GL_FLOAT, None) + glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, self.depth_buffer, 0) + + attachments = [] + for i in range(color_size): + attachments.append(GL_COLOR_ATTACHMENT0 + i) + glDrawBuffers(color_size, attachments) + self.screen_texture = self.color_buffer + + glBindFramebuffer(GL_FRAMEBUFFER, 0) + + + # Configure texture buffer if needed + self.render_texture = None + + # NOTE: original render_texture only support one input + # this is tentative member of this issue + self.render_texture_v2 = {} + + # Inner storage for buffer data + self.vertex_data = None + self.vertex_dim = None + self.n_vertices = None + + self.model_view_matrix = None + self.projection_matrix = None + + glutDisplayFunc(self.display) + + + def init_quad_program(self): + shader_list = [] + + shader_list.append(loadShader(GL_VERTEX_SHADER, "quad.vs")) + shader_list.append(loadShader(GL_FRAGMENT_SHADER, "quad.fs")) + + the_program = createProgram(shader_list) + + for shader in shader_list: + glDeleteShader(shader) + + # vertex attributes for a quad that fills the entire screen in Normalized Device Coordinates. 
+ # positions # texCoords + quad_vertices = np.array( + [-1.0, 1.0, 0.0, 1.0, + -1.0, -1.0, 0.0, 0.0, + 1.0, -1.0, 1.0, 0.0, + + -1.0, 1.0, 0.0, 1.0, + 1.0, -1.0, 1.0, 0.0, + 1.0, 1.0, 1.0, 1.0] + ) + + quad_buffer = glGenBuffers(1) + glBindBuffer(GL_ARRAY_BUFFER, quad_buffer) + glBufferData(GL_ARRAY_BUFFER, quad_vertices, GL_STATIC_DRAW) + + glBindBuffer(GL_ARRAY_BUFFER, 0) + + return the_program, quad_buffer + + def set_mesh(self, vertices, faces): + self.vertex_data = vertices[faces.reshape([-1])] + self.vertex_dim = self.vertex_data.shape[1] + self.n_vertices = self.vertex_data.shape[0] + + glBindBuffer(GL_ARRAY_BUFFER, self.vertex_buffer) + glBufferData(GL_ARRAY_BUFFER, self.vertex_data, GL_STATIC_DRAW) + + glBindBuffer(GL_ARRAY_BUFFER, 0) + + def set_viewpoint(self, projection, model_view): + self.projection_matrix = projection + self.model_view_matrix = model_view + + def draw_init(self): + glBindFramebuffer(GL_FRAMEBUFFER, self.frame_buffer) + glEnable(GL_DEPTH_TEST) + + # glClearColor(0.0, 0.0, 0.0, 0.0) + glClearColor(1.0, 1.0, 1.0, 0.0) #Black background + + if self.use_inverse_depth: + glDepthFunc(GL_GREATER) + glClearDepth(0.0) + else: + glDepthFunc(GL_LESS) + glClearDepth(1.0) + glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT) + + def draw_end(self): + if self.intermediate_fbo is not None: + for i in range(len(self.color_buffer)): + glBindFramebuffer(GL_READ_FRAMEBUFFER, self.frame_buffer) + glReadBuffer(GL_COLOR_ATTACHMENT0 + i) + glBindFramebuffer(GL_DRAW_FRAMEBUFFER, self.intermediate_fbo) + glDrawBuffer(GL_COLOR_ATTACHMENT0 + i) + glBlitFramebuffer(0, 0, self.width, self.height, 0, 0, self.width, self.height, GL_COLOR_BUFFER_BIT, GL_NEAREST) + + glBindFramebuffer(GL_FRAMEBUFFER, 0) + glDepthFunc(GL_LESS) + glClearDepth(1.0) + + def draw(self): + self.draw_init() + + glUseProgram(self.program) + glUniformMatrix4fv(self.model_mat_unif, 1, GL_FALSE, self.model_view_matrix.transpose()) + glUniformMatrix4fv(self.persp_mat_unif, 1, GL_FALSE, self.projection_matrix.transpose()) + + glBindBuffer(GL_ARRAY_BUFFER, self.vertex_buffer) + + glEnableVertexAttribArray(0) + glVertexAttribPointer(0, self.vertex_dim, GL_DOUBLE, GL_FALSE, 0, None) + + glDrawArrays(GL_TRIANGLES, 0, self.n_vertices) + + glDisableVertexAttribArray(0) + + glBindBuffer(GL_ARRAY_BUFFER, 0) + + glUseProgram(0) + + self.draw_end() + + def get_color(self, color_id=0): + glBindFramebuffer(GL_FRAMEBUFFER, self.intermediate_fbo if self.intermediate_fbo is not None else self.frame_buffer) + glReadBuffer(GL_COLOR_ATTACHMENT0 + color_id) + data = glReadPixels(0, 0, self.width, self.height, GL_RGBA, GL_FLOAT, outputType=None) + glBindFramebuffer(GL_FRAMEBUFFER, 0) + rgb = data.reshape(self.height, self.width, -1) + rgb = np.flip(rgb, 0) + return rgb + + def get_z_value(self): + glBindFramebuffer(GL_FRAMEBUFFER, self.frame_buffer) + data = glReadPixels(0, 0, self.width, self.height, GL_DEPTH_COMPONENT, GL_FLOAT, outputType=None) + glBindFramebuffer(GL_FRAMEBUFFER, 0) + z = data.reshape(self.height, self.width) + z = np.flip(z, 0) + return z + + def display(self): + # First we draw a scene. + # Notice the result is stored in the texture buffer. + self.draw() + + # Then we return to the default frame buffer since we will display on the screen. + glBindFramebuffer(GL_FRAMEBUFFER, 0) + + # Do the clean-up. + # glClearColor(0.0, 0.0, 0.0, 0.0) #Black background + glClearColor(1.0, 1.0, 1.0, 0.0) #Black background + glClear(GL_COLOR_BUFFER_BIT) + + # We draw a rectangle which covers the whole screen. 
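+        # The attribute pointers below assume the interleaved quad_vertices layout:
+        # 2 position doubles followed by 2 texcoord doubles per vertex, i.e. a stride
+        # of 4 * sizeof(double) = 32 bytes with the texcoords at byte offset 16.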
+ glUseProgram(self.quad_program) + glBindBuffer(GL_ARRAY_BUFFER, self.quad_buffer) + + size_of_double = 8 + glEnableVertexAttribArray(0) + glVertexAttribPointer(0, 2, GL_DOUBLE, GL_FALSE, 4 * size_of_double, None) + glEnableVertexAttribArray(1) + glVertexAttribPointer(1, 2, GL_DOUBLE, GL_FALSE, 4 * size_of_double, c_void_p(2 * size_of_double)) + + glDisable(GL_DEPTH_TEST) + + # The stored texture is then mapped to this rectangle. + # properly assing color buffer texture + glActiveTexture(GL_TEXTURE0) + glBindTexture(GL_TEXTURE_2D, self.screen_texture[0]) + glUniform1i(glGetUniformLocation(self.quad_program, 'screenTexture'), 0) + + glDrawArrays(GL_TRIANGLES, 0, 6) + + glDisableVertexAttribArray(1) + glDisableVertexAttribArray(0) + + glEnable(GL_DEPTH_TEST) + glBindBuffer(GL_ARRAY_BUFFER, 0) + glUseProgram(0) + + glutSwapBuffers() + glutPostRedisplay() + + def show(self): + glutMainLoop() diff --git a/tools/render_mesh.py b/tools/render_mesh.py new file mode 100644 index 0000000000000000000000000000000000000000..f76e8194e3e0c840dc6914c2f81c9c94833cd061 --- /dev/null +++ b/tools/render_mesh.py @@ -0,0 +1,170 @@ +# Copyright (c) Facebook, Inc. and its affiliates. All rights reserved. + +import math +import numpy as np +import sys +import os + +from render.camera import Camera +from render.color_render import ColorRender +import trimesh + +import cv2 +import os +import argparse +from termcolor import colored + +width = 512 +height = 512 + + +def normalize_v3(arr): + ''' Normalize a numpy array of 3 component vectors shape=(n,3) ''' + lens = np.sqrt(arr[:, 0]**2 + arr[:, 1]**2 + arr[:, 2]**2) + eps = 0.00000001 + lens[lens < eps] = eps + arr[:, 0] /= lens + arr[:, 1] /= lens + arr[:, 2] /= lens + return arr + + +def compute_normal(vertices, faces): + # Create a zeroed array with the same type and shape as our vertices i.e., per vertex normal + norm = np.zeros(vertices.shape, dtype=vertices.dtype) + # Create an indexed view into the vertex array using the array of three indices for triangles + tris = vertices[faces] + # Calculate the normal for all the triangles, by taking the cross product of the vectors v1-v0, and v2-v0 in each triangle + n = np.cross(tris[::, 1] - tris[::, 0], tris[::, 2] - tris[::, 0]) + # n is now an array of normals per triangle. The length of each normal is dependent the vertices, + # we need to normalize these, so that our next step weights each normal equally. + normalize_v3(n) + # now we have a normalized array of normals, one per triangle, i.e., per triangle normals. + # But instead of one per triangle (i.e., flat shading), we add to each vertex in that triangle, + # the triangles' normal. Multiple triangles would then contribute to every vertex, so we need to normalize again afterwards. 
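+    # Note: the fancy-indexed "+=" below keeps only one contribution per duplicated
+    # vertex index (buffered in-place adds do not accumulate repeats in NumPy);
+    # np.add.at(norm, faces[:, 0], n) would accumulate every incident face if exact
+    # area-weighted vertex normals are needed.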
+ # The cool part, we can actually add the normals through an indexed view of our (zeroed) per vertex normal array + norm[faces[:, 0]] += n + norm[faces[:, 1]] += n + norm[faces[:, 2]] += n + normalize_v3(norm) + + return norm + + +def make_rotate(rx, ry, rz): + + sinX = np.sin(rx) + sinY = np.sin(ry) + sinZ = np.sin(rz) + + cosX = np.cos(rx) + cosY = np.cos(ry) + cosZ = np.cos(rz) + + Rx = np.zeros((3, 3)) + Rx[0, 0] = 1.0 + Rx[1, 1] = cosX + Rx[1, 2] = -sinX + Rx[2, 1] = sinX + Rx[2, 2] = cosX + + Ry = np.zeros((3, 3)) + Ry[0, 0] = cosY + Ry[0, 2] = sinY + Ry[1, 1] = 1.0 + Ry[2, 0] = -sinY + Ry[2, 2] = cosY + + Rz = np.zeros((3, 3)) + Rz[0, 0] = cosZ + Rz[0, 1] = -sinZ + Rz[1, 0] = sinZ + Rz[1, 1] = cosZ + Rz[2, 2] = 1.0 + + R = np.matmul(np.matmul(Rz, Ry), Rx) + return R + + +parser = argparse.ArgumentParser() +parser.add_argument('-ww', '--width', type=int, default=512) +parser.add_argument('-hh', '--height', type=int, default=512) +parser.add_argument('--exp_name', type=str) +parser.add_argument('--dataset', type=str) +parser.add_argument('--mesh_ind', type=int, default=0) + +args = parser.parse_args() + +renderer = ColorRender(width=args.width, height=args.height) +cam = Camera(width=1.0, height=args.height / args.width) +cam.ortho_ratio = 1.2 +cam.near = -100 +cam.far = 10 + +data_root = 'data/result/if_nerf/{}/mesh'.format( + args.exp_name) +obj_path = os.path.join(data_root, '{:04d}.ply'.format(args.mesh_ind)) + +mesh_render_dir = os.path.join(data_root, 'mesh{}_render'.format(args.mesh_ind)) + +os.system('mkdir -p {}'.format(mesh_render_dir)) +obj_files = [obj_path] + +if args.dataset == 'zju_mocap': + R = make_rotate(0, math.radians(0), 0) # zju-mocap +else: + R = make_rotate(0, math.radians(90), math.radians(90)) # people-snapshot + +print(colored('the results are saved at {}'.format(mesh_render_dir), 'yellow')) + +for i, obj_path in enumerate(obj_files): + + print(obj_path) + obj_file = obj_path.split('/')[-1] + file_name = obj_file[:-4] + + if not os.path.exists(obj_path): + continue + mesh = trimesh.load(obj_path) + vertices = mesh.vertices + faces = mesh.faces + + rot = np.array([[1, 0, 0], [0, 0, 1], [0, -1, 0]]) + vertices = np.dot(vertices, rot.T) + mesh.vertices = vertices + + vertices = np.matmul(vertices, R.T) + bbox_max = vertices.max(0) + bbox_min = vertices.min(0) + + # notice that original scale is discarded to render with the same size + vertices -= 0.5 * (bbox_max + bbox_min)[None, :] + vertices /= bbox_max[1] - bbox_min[1] + + normals = compute_normal(vertices, faces) + + renderer.set_mesh(vertices, faces, 0.5 * normals + 0.5, faces) + + self_rot = make_rotate(i, math.radians(-90), 0) + vertices = np.matmul(vertices, self_rot.T) + cnt = 0 + for j in range(0, 361, 4): + cam.center = np.array([0, 0, 0]) + cam.eye = np.array([ + 2.0 * math.sin(math.radians(0)), 0, 2.0 * math.cos(math.radians(0)) + ]) + cam.center + + self_rot = make_rotate(i, math.radians(-4), 0) + vertices = np.matmul(vertices, self_rot.T) + normals = compute_normal(vertices, faces) + + renderer.set_mesh(vertices, faces, 0.5 * normals + 0.5, faces) + renderer.set_camera(cam) + renderer.display() + + img = renderer.get_color(0) + img = cv2.cvtColor(img, cv2.COLOR_RGBA2BGRA) + img = img[..., :3] + + cv2.imwrite(os.path.join(mesh_render_dir, '%d.jpg' % cnt), 255 * img) + cnt += 1 diff --git a/tools/snapshot_smpl/renderer.py b/tools/snapshot_smpl/renderer.py new file mode 100644 index 0000000000000000000000000000000000000000..22f098b88fac028dd9b81b0a0642d61045a00acb --- /dev/null +++ 
b/tools/snapshot_smpl/renderer.py @@ -0,0 +1,186 @@ +import os + +# os.environ['PYOPENGL_PLATFORM'] = 'osmesa' + +import numpy as np + +import pyrender +import trimesh + +colors = [ + (0.5, 0.2, 0.2, 1.0), # Defalut + (.7, .5, .5, 1.), # Pink + (.7, .7, .6, 1.), # Neutral + (.5, .5, .7, 1.), # Blue + (.5, .55, .3, 1.), # capsule + (.3, .5, .55, 1.), # Yellow +] + + +class Renderer(object): + + def __init__(self, focal_length=1000, height=512, width=512, faces=None): + self.renderer = pyrender.OffscreenRenderer(height, width) + self.faces = faces + self.focal_length = focal_length + + def render_multiview(self, vertices, K, R, T, imglist, return_depth=False): + # List to store rendered scenes + output_images, output_depths = [], [] + # Need to flip x-axis + rot = trimesh.transformations.rotation_matrix( + np.radians(180), [1, 0, 0]) + nViews = len(imglist) + for nv in range(nViews): + img = imglist[nv] + self.renderer.viewport_height = img.shape[0] + self.renderer.viewport_width = img.shape[1] + # Create a scene for each image and render all meshes + scene = pyrender.Scene(bg_color=[0.0, 0.0, 0.0, 0.0], + ambient_light=(0.5, 0.5, 0.5)) + camera_pose = np.eye(4) + + if K is None: + camera_center = np.array([img.shape[1] / 2., img.shape[0] / 2.]) + camera = pyrender.camera.IntrinsicsCamera(fx=self.focal_length, fy=self.focal_length, cx=camera_center[0], cy=camera_center[1]) + else: + camera = pyrender.camera.IntrinsicsCamera(fx=K[nv][0, 0], fy=K[nv][1, 1], cx=K[nv][0, 2], cy=K[nv][1, 2]) + scene.add(camera, pose=camera_pose) + # Create light source + light = pyrender.DirectionalLight(color=[1.0, 1.0, 1.0], intensity=1) + # for every person in the scene + if isinstance(vertices, dict): + for trackId, vert in vertices.items(): + vert = vert @ R[nv].T + T[nv] + mesh = trimesh.Trimesh(vert, self.faces) + mesh.apply_transform(rot) + trans = [0, 0, 0] + + material = pyrender.MetallicRoughnessMaterial( + metallicFactor=0.2, + alphaMode='OPAQUE', + baseColorFactor=colors[trackId % len(colors)]) + mesh = pyrender.Mesh.from_trimesh( + mesh, + material=material) + scene.add(mesh, 'mesh') + + # Use 3 directional lights + light_pose = np.eye(4) + light_pose[:3, 3] = np.array([0, -1, 1]) + trans + scene.add(light, pose=light_pose) + light_pose[:3, 3] = np.array([0, 1, 1]) + trans + scene.add(light, pose=light_pose) + light_pose[:3, 3] = np.array([1, 1, 2]) + trans + scene.add(light, pose=light_pose) + else: + n = 0 + verts = vertices @ R[nv].T + T[nv] + mesh = trimesh.Trimesh(verts, self.faces) + mesh.apply_transform(rot) + trans = [0, 0, 0] + + material = pyrender.MetallicRoughnessMaterial( + metallicFactor=0.2, + alphaMode='OPAQUE', + baseColorFactor=colors[n % len(colors)]) + mesh = pyrender.Mesh.from_trimesh( + mesh, + material=material) + scene.add(mesh, 'mesh') + + # Use 3 directional lights + light_pose = np.eye(4) + light_pose[:3, 3] = np.array([0, -1, 1]) + trans + scene.add(light, pose=light_pose) + light_pose[:3, 3] = np.array([0, 1, 1]) + trans + scene.add(light, pose=light_pose) + light_pose[:3, 3] = np.array([1, 1, 2]) + trans + scene.add(light, pose=light_pose) + # Alpha channel was not working previously need to check again + # Until this is fixed use hack with depth image to get the opacity + color, rend_depth = self.renderer.render(scene, flags=pyrender.RenderFlags.RGBA) + # color = color[::-1,::-1] + # rend_depth = rend_depth[::-1,::-1] + output_depths.append(rend_depth) + color = color.astype(np.uint8) + valid_mask = (rend_depth > 0)[:, :, None] + output_img = (color[:, :, :3] * 
valid_mask + + (1 - valid_mask) * img) + + output_img = output_img.astype(np.uint8) + output_images.append(output_img) + if return_depth: + return output_images, output_depths + else: + return output_images + + def __call__(self, images, vertices, translation, K=None): + # List to store rendered scenes + output_images = [] + # Need to flip x-axis + rot = trimesh.transformations.rotation_matrix( + np.radians(180), [1, 0, 0]) + # For all iamges + for i in range(len(images)): + img = images[i].cpu().numpy().transpose(1, 2, 0) + self.renderer.viewport_height = img.shape[0] + self.renderer.viewport_width = img.shape[1] + verts = vertices[i].detach().cpu().numpy() + mesh_trans = translation[i].cpu().numpy() + verts = verts + mesh_trans[:, None, ] + num_people = verts.shape[0] + + # Create a scene for each image and render all meshes + scene = pyrender.Scene(bg_color=[0.0, 0.0, 0.0, 0.0], + ambient_light=(0.5, 0.5, 0.5)) + + # Create camera. Camera will always be at [0,0,0] + # CHECK If I need to swap x and y + camera_pose = np.eye(4) + + if K is None: + camera_center = np.array([img.shape[1] / 2., img.shape[0] / 2.]) + camera = pyrender.camera.IntrinsicsCamera(fx=self.focal_length, fy=self.focal_length, cx=camera_center[0], cy=camera_center[1]) + else: + camera = pyrender.camera.IntrinsicsCamera(fx=K[i][0, 0], fy=K[i][1, 1], cx=K[i][0, 2], cy=K[i][1, 2]) + scene.add(camera, pose=camera_pose) + # Create light source + light = pyrender.DirectionalLight(color=[1.0, 1.0, 1.0], intensity=1) + # for every person in the scene + for n in range(num_people): + mesh = trimesh.Trimesh(verts[n], self.faces) + mesh.apply_transform(rot) + trans = 0 * mesh_trans[n] + trans[0] *= -1 + trans[2] *= -1 + material = pyrender.MetallicRoughnessMaterial( + metallicFactor=0.2, + alphaMode='OPAQUE', + baseColorFactor=colors[n % len(colors)]) + mesh = pyrender.Mesh.from_trimesh( + mesh, + material=material) + scene.add(mesh, 'mesh') + + # Use 3 directional lights + light_pose = np.eye(4) + light_pose[:3, 3] = np.array([0, -1, 1]) + trans + scene.add(light, pose=light_pose) + light_pose[:3, 3] = np.array([0, 1, 1]) + trans + scene.add(light, pose=light_pose) + light_pose[:3, 3] = np.array([1, 1, 2]) + trans + scene.add(light, pose=light_pose) + # Alpha channel was not working previously need to check again + # Until this is fixed use hack with depth image to get the opacity + color, rend_depth = self.renderer.render(scene, flags=pyrender.RenderFlags.RGBA) + # color = color[::-1,::-1] + # rend_depth = rend_depth[::-1,::-1] + color = color.astype(np.float32) / 255.0 + valid_mask = (rend_depth > 0)[:, :, None] + output_img = (color[:, :, :] * valid_mask + + (1 - valid_mask) * img) + output_img = np.transpose(output_img, (2, 0, 1)) + output_images.append(output_img) + + return output_images diff --git a/tools/snapshot_smpl/smpl.py b/tools/snapshot_smpl/smpl.py new file mode 100644 index 0000000000000000000000000000000000000000..438e4de86e607a04cc4e95dfab32ab27f211946d --- /dev/null +++ b/tools/snapshot_smpl/smpl.py @@ -0,0 +1,207 @@ +import chumpy as ch +import numpy as np +import sys +import pickle as pkl +import scipy.sparse as sp +from chumpy.ch import Ch +from .vendor.smpl.posemapper import posemap, Rodrigues +from .vendor.smpl.serialization import backwards_compatibility_replacements + + +VERT_NOSE = 331 +VERT_EAR_L = 3485 +VERT_EAR_R = 6880 +VERT_EYE_L = 2802 +VERT_EYE_R = 6262 + + +class Smpl(Ch): + """ + Class to store SMPL object with slightly improved code and access to more matrices + """ + terms = 'model', + dterms 
= 'trans', 'betas', 'pose', 'v_personal' + + def __init__(self, *args, **kwargs): + self.on_changed(self._dirty_vars) + + def on_changed(self, which): + if not hasattr(self, 'trans'): + self.trans = ch.zeros(3) + + if not hasattr(self, 'betas'): + self.betas = ch.zeros(10) + + if not hasattr(self, 'pose'): + self.pose = ch.zeros(72) + + if 'model' in which: + if not isinstance(self.model, dict): + dd = pkl.load(open(self.model)) + else: + dd = self.model + + backwards_compatibility_replacements(dd) + + for s in ['posedirs', 'shapedirs']: + if (s in dd) and not hasattr(dd[s], 'dterms'): + dd[s] = ch.array(dd[s]) + + self.f = dd['f'] + self.v_template = dd['v_template'] + if not hasattr(self, 'v_personal'): + self.v_personal = ch.zeros_like(self.v_template) + self.shapedirs = dd['shapedirs'] + self.J_regressor = dd['J_regressor'] + if 'J_regressor_prior' in dd: + self.J_regressor_prior = dd['J_regressor_prior'] + if sp.issparse(self.J_regressor): + self.J_regressor = self.J_regressor.toarray() + self.bs_type = dd['bs_type'] + self.weights = dd['weights'] + if 'vert_sym_idxs' in dd: + self.vert_sym_idxs = dd['vert_sym_idxs'] + if 'weights_prior' in dd: + self.weights_prior = dd['weights_prior'] + self.kintree_table = dd['kintree_table'] + self.posedirs = dd['posedirs'] + + self._set_up() + + def _set_up(self): + self.v_shaped = self.shapedirs.dot(self.betas) + self.v_template + self.v_shaped_personal = self.v_shaped + self.v_personal + self.J = ch.sum(self.J_regressor.T.reshape(-1, 1, 24) * self.v_shaped.reshape(-1, 3, 1), axis=0).T + self.v_posevariation = self.posedirs.dot(posemap(self.bs_type)(self.pose)) + self.v_poseshaped = self.v_shaped_personal + self.v_posevariation + + self.A, A_global = self._global_rigid_transformation() + self.Jtr = ch.vstack([g[:3, 3] for g in A_global]) + self.J_transformed = self.Jtr + self.trans.reshape((1, 3)) + + self.V = self.A.dot(self.weights.T) + + rest_shape_h = ch.hstack((self.v_poseshaped, ch.ones((self.v_poseshaped.shape[0], 1)))) + self.v_posed = ch.sum(self.V.T * rest_shape_h.reshape(-1, 4, 1), axis=1)[:, :3] + self.v = self.v_posed + self.trans + + def _global_rigid_transformation(self): + results = {} + pose = self.pose.reshape((-1, 3)) + parent = {i: self.kintree_table[0, i] for i in range(1, self.kintree_table.shape[1])} + + with_zeros = lambda x: ch.vstack((x, ch.array([[0.0, 0.0, 0.0, 1.0]]))) + pack = lambda x: ch.hstack([ch.zeros((4, 3)), x.reshape((4, 1))]) + + results[0] = with_zeros(ch.hstack((Rodrigues(pose[0, :]), self.J[0, :].reshape((3, 1))))) + + for i in range(1, self.kintree_table.shape[1]): + results[i] = results[parent[i]].dot(with_zeros(ch.hstack(( + Rodrigues(pose[i, :]), # rotation around bone endpoint + (self.J[i, :] - self.J[parent[i], :]).reshape((3, 1)) # bone + )))) + + results = [results[i] for i in sorted(results.keys())] + results_global = results + + # subtract rotated J position + results2 = [results[i] - (pack( + results[i].dot(ch.concatenate((self.J[i, :], [0])))) + ) for i in range(len(results))] + result = ch.dstack(results2) + + return result, results_global + + def compute_r(self): + return self.v.r + + def compute_dr_wrt(self, wrt): + if wrt is not self.trans and wrt is not self.betas and wrt is not self.pose and wrt is not self.v_personal: + return None + + return self.v.dr_wrt(wrt) + + +def copy_smpl(smpl, model): + new = Smpl(model, betas=smpl.betas) + new.pose[:] = smpl.pose.r + new.trans[:] = smpl.trans.r + + return new + + +def joints_coco(smpl): + J = smpl.J_transformed + nose = smpl[VERT_NOSE] + 
ear_l = smpl[VERT_EAR_L] + ear_r = smpl[VERT_EAR_R] + eye_l = smpl[VERT_EYE_L] + eye_r = smpl[VERT_EYE_R] + + shoulders_m = ch.sum(J[[14, 13]], axis=0) / 2. + neck = J[12] - 0.55 * (J[12] - shoulders_m) + + return ch.vstack(( + nose, + neck, + 2.1 * (J[14] - shoulders_m) + neck, + J[[19, 21]], + 2.1 * (J[13] - shoulders_m) + neck, + J[[18, 20]], + J[2] + 0.38 * (J[2] - J[1]), + J[[5, 8]], + J[1] + 0.38 * (J[1] - J[2]), + J[[4, 7]], + eye_r, + eye_l, + ear_r, + ear_l, + )) + + +def model_params_in_camera_coords(trans, pose, J0, camera_t, camera_rt): + root = Rodrigues(np.matmul(Rodrigues(camera_rt).r, Rodrigues(pose[:3]).r)).r.reshape(-1) + pose[:3] = root + + trans = (Rodrigues(camera_rt).dot(J0 + trans) - J0 + camera_t).r + + return trans, pose + + +if __name__ == '__main__': + smpl = Smpl(model='../vendor/smpl/models/basicModel_f_lbs_10_207_0_v1.0.0.pkl') + smpl.pose[:] = np.random.randn(72) * .2 + smpl.pose[0] = np.pi + # smpl.v_personal[:] = np.random.randn(*smpl.shape) / 500. + + # render test + from opendr.renderer import ColoredRenderer + from opendr.camera import ProjectPoints + from opendr.lighting import LambertianPointLight + + rn = ColoredRenderer() + + # Assign attributes to renderer + w, h = (640, 480) + + rn.camera = ProjectPoints(v=smpl, rt=np.zeros(3), t=np.array([0, 0, 3.]), f=np.array([w, w]), + c=np.array([w, h]) / 2., k=np.zeros(5)) + rn.frustum = {'near': 1., 'far': 10., 'width': w, 'height': h} + rn.set(v=smpl, f=smpl.f, bgcolor=np.zeros(3)) + + # Construct point light source + rn.vc = LambertianPointLight( + f=smpl.f, + v=rn.v, + num_verts=len(smpl), + light_pos=np.array([-1000, -1000, -2000]), + vc=np.ones_like(smpl) * .9, + light_color=np.array([1., 1., 1.])) + + # Show it using OpenCV + import cv2 + + cv2.imshow('render_SMPL', rn.r) + print ('..Print any key while on the display window') + cv2.waitKey(0) + cv2.destroyAllWindows() diff --git a/tools/snapshot_smpl/vendor/__init__.py b/tools/snapshot_smpl/vendor/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..aec8850cc3f3c2d24d67964df3b0992b5d3a12aa --- /dev/null +++ b/tools/snapshot_smpl/vendor/__init__.py @@ -0,0 +1,2 @@ +#!/usr/bin/env python2 +# -*- coding: utf-8 -*- diff --git a/tools/snapshot_smpl/vendor/smpl/__init__.py b/tools/snapshot_smpl/vendor/smpl/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..56e77d4e7ffc4bc85375e68ec73fcb0b5c407ea7 --- /dev/null +++ b/tools/snapshot_smpl/vendor/smpl/__init__.py @@ -0,0 +1,13 @@ +''' +Copyright 2015 Matthew Loper, Naureen Mahmood and the Max Planck Gesellschaft. All rights reserved. +This software is provided for research purposes only. +By using this software you agree to the terms of the SMPL Model license here http://smpl.is.tue.mpg.de/license + +More information about SMPL is available here http://smpl.is.tue.mpg. +For comments or questions, please email us at: smpl@tuebingen.mpg.de + + +About this file: +================ +This is an initialization file to help python look for submodules in this directory. +''' \ No newline at end of file diff --git a/tools/snapshot_smpl/vendor/smpl/lbs.py b/tools/snapshot_smpl/vendor/smpl/lbs.py new file mode 100755 index 0000000000000000000000000000000000000000..dd339b9d0758be7438f61197fad3c14a9e5eb855 --- /dev/null +++ b/tools/snapshot_smpl/vendor/smpl/lbs.py @@ -0,0 +1,80 @@ +''' +Copyright 2015 Matthew Loper, Naureen Mahmood and the Max Planck Gesellschaft. All rights reserved. +This software is provided for research purposes only. 
+By using this software you agree to the terms of the SMPL Model license here http://smpl.is.tue.mpg.de/license + +More information about SMPL is available here http://smpl.is.tue.mpg. +For comments or questions, please email us at: smpl@tuebingen.mpg.de + + +About this file: +================ +This file defines linear blend skinning for the SMPL loader which +defines the effect of bones and blendshapes on the vertices of the template mesh. + +Modules included: +- global_rigid_transformation: + computes global rotation & translation of the model +- verts_core: [overloaded function inherited from verts.verts_core] + computes the blending of joint-influences for each vertex based on type of skinning + +''' + +from .posemapper import posemap +import chumpy +import numpy as np + +def global_rigid_transformation(pose, J, kintree_table, xp): + results = {} + pose = pose.reshape((-1,3)) + id_to_col = {kintree_table[1,i] : i for i in range(kintree_table.shape[1])} + parent = {i : id_to_col[kintree_table[0,i]] for i in range(1, kintree_table.shape[1])} + + if xp == chumpy: + from posemapper import Rodrigues + rodrigues = lambda x : Rodrigues(x) + else: + import cv2 + rodrigues = lambda x : cv2.Rodrigues(x)[0] + + with_zeros = lambda x : xp.vstack((x, xp.array([[0.0, 0.0, 0.0, 1.0]]))) + results[0] = with_zeros(xp.hstack((rodrigues(pose[0,:]), J[0,:].reshape((3,1))))) + + for i in range(1, kintree_table.shape[1]): + results[i] = results[parent[i]].dot(with_zeros(xp.hstack(( + rodrigues(pose[i,:]), + ((J[i,:] - J[parent[i],:]).reshape((3,1))) + )))) + + pack = lambda x : xp.hstack([np.zeros((4, 3)), x.reshape((4,1))]) + + results = [results[i] for i in sorted(results.keys())] + results_global = results + + if True: + results2 = [results[i] - (pack( + results[i].dot(xp.concatenate( ( (J[i,:]), 0 ) ))) + ) for i in range(len(results))] + results = results2 + result = xp.dstack(results) + return result, results_global + + +def verts_core(pose, v, J, weights, kintree_table, want_Jtr=False, xp=chumpy): + A, A_global = global_rigid_transformation(pose, J, kintree_table, xp) + T = A.dot(weights.T) + + rest_shape_h = xp.vstack((v.T, np.ones((1, v.shape[0])))) + + v =(T[:,0,:] * rest_shape_h[0, :].reshape((1, -1)) + + T[:,1,:] * rest_shape_h[1, :].reshape((1, -1)) + + T[:,2,:] * rest_shape_h[2, :].reshape((1, -1)) + + T[:,3,:] * rest_shape_h[3, :].reshape((1, -1))).T + + v = v[:,:3] + + if not want_Jtr: + return v + Jtr = xp.vstack([g[:3,3] for g in A_global]) + return (v, Jtr) + diff --git a/tools/snapshot_smpl/vendor/smpl/posemapper.py b/tools/snapshot_smpl/vendor/smpl/posemapper.py new file mode 100755 index 0000000000000000000000000000000000000000..47f08f57f19261d3029d2ac417af3ffb19c4bdc4 --- /dev/null +++ b/tools/snapshot_smpl/vendor/smpl/posemapper.py @@ -0,0 +1,49 @@ +''' +Copyright 2015 Matthew Loper, Naureen Mahmood and the Max Planck Gesellschaft. All rights reserved. +This software is provided for research purposes only. +By using this software you agree to the terms of the SMPL Model license here http://smpl.is.tue.mpg.de/license + +More information about SMPL is available here http://smpl.is.tue.mpg. +For comments or questions, please email us at: smpl@tuebingen.mpg.de + + +About this file: +================ +This module defines the mapping of joint-angles to pose-blendshapes. 
+ +Modules included: +- posemap: + computes the joint-to-pose blend shape mapping given a mapping type as input + +''' + +import chumpy as ch +import numpy as np +import cv2 + + +class Rodrigues(ch.Ch): + dterms = 'rt' + + def compute_r(self): + return cv2.Rodrigues(self.rt.r)[0] + + def compute_dr_wrt(self, wrt): + if wrt is self.rt: + return cv2.Rodrigues(self.rt.r)[1].T + + +def lrotmin(p): + if isinstance(p, np.ndarray): + p = p.ravel()[3:] + return np.concatenate([(cv2.Rodrigues(np.array(pp))[0]-np.eye(3)).ravel() for pp in p.reshape((-1,3))]).ravel() + if p.ndim != 2 or p.shape[1] != 3: + p = p.reshape((-1,3)) + p = p[1:] + return ch.concatenate([(Rodrigues(pp)-ch.eye(3)).ravel() for pp in p]).ravel() + +def posemap(s): + if s == 'lrotmin': + return lrotmin + else: + raise Exception('Unknown posemapping: %s' % (str(s),)) diff --git a/tools/snapshot_smpl/vendor/smpl/serialization.py b/tools/snapshot_smpl/vendor/smpl/serialization.py new file mode 100755 index 0000000000000000000000000000000000000000..74d22cdd3b4e8d8a664ede9aa82746941dde16ce --- /dev/null +++ b/tools/snapshot_smpl/vendor/smpl/serialization.py @@ -0,0 +1,137 @@ +''' +Copyright 2015 Matthew Loper, Naureen Mahmood and the Max Planck Gesellschaft. All rights reserved. +This software is provided for research purposes only. +By using this software you agree to the terms of the SMPL Model license here http://smpl.is.tue.mpg.de/license + +More information about SMPL is available here http://smpl.is.tue.mpg. +For comments or questions, please email us at: smpl@tuebingen.mpg.de + + +About this file: +================ +This file defines the serialization functions of the SMPL model. + +Modules included: +- save_model: + saves the SMPL model to a given file location as a .pkl file +- load_model: + loads the SMPL model from a given file location (i.e. a .pkl file location), + or a dictionary object. 
+ +''' + +__all__ = ['load_model', 'save_model'] + +import numpy as np +import pickle +import chumpy as ch +from chumpy.ch import MatVecMult +from .posemapper import posemap +from .verts import verts_core + +def save_model(model, fname): + m0 = model + trainer_dict = {'v_template': np.asarray(m0.v_template),'J': np.asarray(m0.J),'weights': np.asarray(m0.weights),'kintree_table': m0.kintree_table,'f': m0.f, 'bs_type': m0.bs_type, 'posedirs': np.asarray(m0.posedirs)} + if hasattr(model, 'J_regressor'): + trainer_dict['J_regressor'] = m0.J_regressor + if hasattr(model, 'J_regressor_prior'): + trainer_dict['J_regressor_prior'] = m0.J_regressor_prior + if hasattr(model, 'weights_prior'): + trainer_dict['weights_prior'] = m0.weights_prior + if hasattr(model, 'shapedirs'): + trainer_dict['shapedirs'] = m0.shapedirs + if hasattr(model, 'vert_sym_idxs'): + trainer_dict['vert_sym_idxs'] = m0.vert_sym_idxs + if hasattr(model, 'bs_style'): + trainer_dict['bs_style'] = model.bs_style + else: + trainer_dict['bs_style'] = 'lbs' + pickle.dump(trainer_dict, open(fname, 'w'), -1) + + +def backwards_compatibility_replacements(dd): + + # replacements + if 'default_v' in dd: + dd['v_template'] = dd['default_v'] + del dd['default_v'] + if 'template_v' in dd: + dd['v_template'] = dd['template_v'] + del dd['template_v'] + if 'joint_regressor' in dd: + dd['J_regressor'] = dd['joint_regressor'] + del dd['joint_regressor'] + if 'blendshapes' in dd: + dd['posedirs'] = dd['blendshapes'] + del dd['blendshapes'] + if 'J' not in dd: + dd['J'] = dd['joints'] + del dd['joints'] + + # defaults + if 'bs_style' not in dd: + dd['bs_style'] = 'lbs' + + + +def ready_arguments(fname_or_dict): + + if not isinstance(fname_or_dict, dict): + dd = pickle.load(open(fname_or_dict)) + else: + dd = fname_or_dict + + backwards_compatibility_replacements(dd) + + want_shapemodel = 'shapedirs' in dd + nposeparms = dd['kintree_table'].shape[1]*3 + + if 'trans' not in dd: + dd['trans'] = np.zeros(3) + if 'pose' not in dd: + dd['pose'] = np.zeros(nposeparms) + if 'shapedirs' in dd and 'betas' not in dd: + dd['betas'] = np.zeros(dd['shapedirs'].shape[-1]) + + for s in ['v_template', 'weights', 'posedirs', 'pose', 'trans', 'shapedirs', 'betas', 'J']: + if (s in dd) and not hasattr(dd[s], 'dterms'): + dd[s] = ch.array(dd[s]) + + if want_shapemodel: + dd['v_shaped'] = dd['shapedirs'].dot(dd['betas'])+dd['v_template'] + v_shaped = dd['v_shaped'] + J_tmpx = MatVecMult(dd['J_regressor'], v_shaped[:,0]) + J_tmpy = MatVecMult(dd['J_regressor'], v_shaped[:,1]) + J_tmpz = MatVecMult(dd['J_regressor'], v_shaped[:,2]) + dd['J'] = ch.vstack((J_tmpx, J_tmpy, J_tmpz)).T + dd['v_posed'] = v_shaped + dd['posedirs'].dot(posemap(dd['bs_type'])(dd['pose'])) + else: + dd['v_posed'] = dd['v_template'] + dd['posedirs'].dot(posemap(dd['bs_type'])(dd['pose'])) + + return dd + + + +def load_model(fname_or_dict): + dd = ready_arguments(fname_or_dict) + + args = { + 'pose': dd['pose'], + 'v': dd['v_posed'], + 'J': dd['J'], + 'weights': dd['weights'], + 'kintree_table': dd['kintree_table'], + 'xp': ch, + 'want_Jtr': True, + 'bs_style': dd['bs_style'] + } + + result, Jtr = verts_core(**args) + result = result + dd['trans'].reshape((1,3)) + result.J_transformed = Jtr + dd['trans'].reshape((1,3)) + + for k, v in dd.items(): + setattr(result, k, v) + + return result + diff --git a/tools/snapshot_smpl/vendor/smpl/verts.py b/tools/snapshot_smpl/vendor/smpl/verts.py new file mode 100755 index 0000000000000000000000000000000000000000..4f8526151d9a7e86a40fd83ff9f11287554f33b4 --- 
/dev/null +++ b/tools/snapshot_smpl/vendor/smpl/verts.py @@ -0,0 +1,103 @@ +''' +Copyright 2015 Matthew Loper, Naureen Mahmood and the Max Planck Gesellschaft. All rights reserved. +This software is provided for research purposes only. +By using this software you agree to the terms of the SMPL Model license here http://smpl.is.tue.mpg.de/license + +More information about SMPL is available here http://smpl.is.tue.mpg. +For comments or questions, please email us at: smpl@tuebingen.mpg.de + + +About this file: +================ +This file defines the basic skinning modules for the SMPL loader which +defines the effect of bones and blendshapes on the vertices of the template mesh. + +Modules included: +- verts_decorated: + creates an instance of the SMPL model which inherits model attributes from another + SMPL model. +- verts_core: [overloaded function inherited by lbs.verts_core] + computes the blending of joint-influences for each vertex based on type of skinning + +''' + +import chumpy +from . import lbs +from .posemapper import posemap +import scipy.sparse as sp +from chumpy.ch import MatVecMult + +def ischumpy(x): return hasattr(x, 'dterms') + +def verts_decorated(trans, pose, + v_template, J, weights, kintree_table, bs_style, f, + bs_type=None, posedirs=None, betas=None, shapedirs=None, want_Jtr=False): + + for which in [trans, pose, v_template, weights, posedirs, betas, shapedirs]: + if which is not None: + assert ischumpy(which) + + v = v_template + + if shapedirs is not None: + if betas is None: + betas = chumpy.zeros(shapedirs.shape[-1]) + v_shaped = v + shapedirs.dot(betas) + else: + v_shaped = v + + if posedirs is not None: + v_posed = v_shaped + posedirs.dot(posemap(bs_type)(pose)) + else: + v_posed = v_shaped + + v = v_posed + + if sp.issparse(J): + regressor = J + J_tmpx = MatVecMult(regressor, v_shaped[:,0]) + J_tmpy = MatVecMult(regressor, v_shaped[:,1]) + J_tmpz = MatVecMult(regressor, v_shaped[:,2]) + J = chumpy.vstack((J_tmpx, J_tmpy, J_tmpz)).T + else: + assert(ischumpy(J)) + + assert(bs_style=='lbs') + result, Jtr = lbs.verts_core(pose, v, J, weights, kintree_table, want_Jtr=True, xp=chumpy) + + tr = trans.reshape((1,3)) + result = result + tr + Jtr = Jtr + tr + + result.trans = trans + result.f = f + result.pose = pose + result.v_template = v_template + result.J = J + result.weights = weights + result.kintree_table = kintree_table + result.bs_style = bs_style + result.bs_type =bs_type + if posedirs is not None: + result.posedirs = posedirs + result.v_posed = v_posed + if shapedirs is not None: + result.shapedirs = shapedirs + result.betas = betas + result.v_shaped = v_shaped + if want_Jtr: + result.J_transformed = Jtr + return result + +def verts_core(pose, v, J, weights, kintree_table, bs_style, want_Jtr=False, xp=chumpy): + + if xp == chumpy: + assert(hasattr(pose, 'dterms')) + assert(hasattr(v, 'dterms')) + assert(hasattr(J, 'dterms')) + assert(hasattr(weights, 'dterms')) + + assert(bs_style=='lbs') + result = lbs.verts_core(pose, v, J, weights, kintree_table, want_Jtr, xp) + + return result diff --git a/tools/vis_snapshot.py b/tools/vis_snapshot.py new file mode 100644 index 0000000000000000000000000000000000000000..f9e77ba17c8041ff42af78235b6b15a3a9f0fa36 --- /dev/null +++ b/tools/vis_snapshot.py @@ -0,0 +1,96 @@ +import pickle +import os +import h5py +import numpy as np +import open3d as o3d +from snapshot_smpl.renderer import Renderer +import cv2 +import tqdm + + +def read_pickle(pkl_path): + with open(pkl_path, 'rb') as f: + u = pickle._Unpickler(f) + u.encoding 
= 'latin1' + return u.load() + + +def get_KRTD(camera): + K = np.zeros([3, 3]) + K[0, 0] = camera['camera_f'][0] + K[1, 1] = camera['camera_f'][1] + K[:2, 2] = camera['camera_c'] + K[2, 2] = 1 + R = np.eye(3) + T = np.zeros([3]) + D = camera['camera_k'] + return K, R, T, D + + +def get_o3d_mesh(vertices, faces): + mesh = o3d.geometry.TriangleMesh() + mesh.vertices = o3d.utility.Vector3dVector(vertices) + mesh.triangles = o3d.utility.Vector3iVector(faces) + mesh.compute_vertex_normals() + return mesh + + +def get_smpl(base_smpl, betas, poses, trans): + base_smpl.betas = betas + base_smpl.pose = poses + base_smpl.trans = trans + vertices = np.array(base_smpl) + + faces = base_smpl.f + mesh = get_o3d_mesh(vertices, faces) + + return mesh + + +def render_smpl(vertices, img, K, R, T): + rendered_img = renderer.render_multiview(vertices, K[None], R[None], + T[None, None], [img])[0] + return rendered_img + + +data_root = 'data/people_snapshot' +video = 'female-3-casual' + +# if you do not have these smpl models, you could download them from https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/Eb_JIyA74O1Cnfhvn1ddrG4BC9TMK31022TykVxGdRenUQ?e=JU8pPt +model_paths = [ + 'basicModel_f_lbs_10_207_0_v1.0.0.pkl', + 'basicmodel_m_lbs_10_207_0_v1.0.0.pkl' +] + +camera_path = os.path.join(data_root, video, 'camera.pkl') +camera = read_pickle(camera_path) +K, R, T, D = get_KRTD(camera) + +mask_path = os.path.join(data_root, video, 'masks.hdf5') +masks = h5py.File(mask_path)['masks'] + +smpl_path = os.path.join(data_root, video, 'reconstructed_poses.hdf5') +smpl = h5py.File(smpl_path) +betas = smpl['betas'] +pose = smpl['pose'] +trans = smpl['trans'] + +if 'female' in video: + model_path = model_paths[0] +else: + model_path = model_paths[1] +model_data = read_pickle(model_path) +faces = model_data['f'] +renderer = Renderer(height=1080, width=1080, faces=faces) + +img_dir = os.path.join(data_root, video, 'image') +vertices_dir = os.path.join(data_root, video, 'vertices') + +num_img = len(os.listdir(img_dir)) +for i in tqdm.tqdm(range(num_img)): + img = cv2.imread(os.path.join(img_dir, '{}.jpg'.format(i))) + img = cv2.undistort(img, K, D) + vertices = np.load(os.path.join(vertices_dir, '{}.npy'.format(i))) + rendered_img = render_smpl(vertices, img, K, R, T) + cv2.imshow('main', rendered_img) + cv2.waitKey(50) & 0xFF diff --git a/train.sh b/train.sh new file mode 100644 index 0000000000000000000000000000000000000000..3b685d63279830b228a6bbbaf51914350eb74ae0 --- /dev/null +++ b/train.sh @@ -0,0 +1,36 @@ +# People-Snapshot dataset + +# training +# python train_net.py --cfg_file configs/snapshot_f3c.yaml exp_name female3c resume False +# python train_net.py --cfg_file configs/snapshot_f4c.yaml exp_name female4c resume False +# python train_net.py --cfg_file configs/snapshot_f6p.yaml exp_name female6p resume False +# python train_net.py --cfg_file configs/snapshot_f7p.yaml exp_name female7p resume False +# python train_net.py --cfg_file configs/snapshot_f8p.yaml exp_name female8p resume False +# python train_net.py --cfg_file configs/snapshot_m2c.yaml exp_name male2c resume False +# python train_net.py --cfg_file configs/snapshot_m2o.yaml exp_name male2o resume False +# python train_net.py --cfg_file configs/snapshot_m3c.yaml exp_name male3c resume False +# python train_net.py --cfg_file configs/snapshot_m5o.yaml exp_name male5o resume False + +# ZJU-Mocap dataset + +# training +# python train_net.py --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313 resume False +# python train_net.py 
--cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 resume False +# python train_net.py --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 resume False +# python train_net.py --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 resume False +# python train_net.py --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 resume False +# python train_net.py --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 resume False +# python train_net.py --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 resume False +# python train_net.py --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 resume False +# python train_net.py --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 resume False + +# distributed training +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 resume False gpus "0, 1, 2, 3" distributed True +# python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 resume False gpus "0, 1, 2, 3" distributed True diff --git a/train_net.py b/train_net.py new file mode 100644 index 0000000000000000000000000000000000000000..cb47c58266a0cfad8d16cdae0f55fd633dcdfdbe --- /dev/null +++ b/train_net.py @@ -0,0 +1,108 @@ +from lib.config import cfg, args +from lib.networks import make_network +from lib.train import make_trainer, make_optimizer, make_lr_scheduler, make_recorder, set_lr_scheduler +from lib.datasets import make_data_loader +from lib.utils.net_utils import load_model, save_model, load_network +from lib.evaluators import make_evaluator +import torch.multiprocessing +import torch +import torch.distributed as dist +import os + +if cfg.fix_random: + torch.manual_seed(0) + torch.backends.cudnn.deterministic = True + torch.backends.cudnn.benchmark = False + + +def train(cfg, network): + trainer = make_trainer(cfg, network) + optimizer = make_optimizer(cfg, network) + scheduler = make_lr_scheduler(cfg, optimizer) + recorder = make_recorder(cfg) + evaluator = make_evaluator(cfg) + + begin_epoch = load_model(network, + optimizer, + scheduler, + recorder, + cfg.trained_model_dir, + resume=cfg.resume) + set_lr_scheduler(cfg, scheduler) + + train_loader = make_data_loader(cfg, + is_train=True, + 
is_distributed=cfg.distributed, + max_iter=cfg.ep_iter) + val_loader = make_data_loader(cfg, is_train=False) + + for epoch in range(begin_epoch, cfg.train.epoch): + recorder.epoch = epoch + if cfg.distributed: + train_loader.batch_sampler.sampler.set_epoch(epoch) + + trainer.train(epoch, train_loader, optimizer, recorder) + scheduler.step() + + if (epoch + 1) % cfg.save_ep == 0 and cfg.local_rank == 0: + save_model(network, optimizer, scheduler, recorder, + cfg.trained_model_dir, epoch) + + if (epoch + 1) % cfg.save_latest_ep == 0 and cfg.local_rank == 0: + save_model(network, + optimizer, + scheduler, + recorder, + cfg.trained_model_dir, + epoch, + last=True) + + if (epoch + 1) % cfg.eval_ep == 0: + trainer.val(epoch, val_loader, evaluator, recorder) + + return network + + +def test(cfg, network): + trainer = make_trainer(cfg, network) + val_loader = make_data_loader(cfg, is_train=False) + evaluator = make_evaluator(cfg) + epoch = load_network(network, + cfg.trained_model_dir, + resume=cfg.resume, + epoch=cfg.test.epoch) + trainer.val(epoch, val_loader, evaluator) + + +def synchronize(): + """ + Helper function to synchronize (barrier) among all processes when + using distributed training + """ + if not dist.is_available(): + return + if not dist.is_initialized(): + return + world_size = dist.get_world_size() + if world_size == 1: + return + dist.barrier() + + +def main(): + if cfg.distributed: + cfg.local_rank = int(os.environ['RANK']) % torch.cuda.device_count() + torch.cuda.set_device(cfg.local_rank) + torch.distributed.init_process_group(backend="nccl", + init_method="env://") + synchronize() + + network = make_network(cfg) + if args.test: + test(cfg, network) + else: + train(cfg, network) + + +if __name__ == "__main__": + main() diff --git a/visualize.sh b/visualize.sh new file mode 100644 index 0000000000000000000000000000000000000000..6228f6c44d47174d00bc6208a04cb7587da6f15f --- /dev/null +++ b/visualize.sh @@ -0,0 +1,82 @@ +# People-Snapshot dataset + +# visualize novel views of single frame +# python run.py --type visualize --cfg_file configs/snapshot_f3c_demo.yaml exp_name female3c +# python run.py --type visualize --cfg_file configs/snapshot_f4c_demo.yaml exp_name female4c +# python run.py --type visualize --cfg_file configs/snapshot_f6p_demo.yaml exp_name female6p +# python run.py --type visualize --cfg_file configs/snapshot_f7p_demo.yaml exp_name female7p +# python run.py --type visualize --cfg_file configs/snapshot_f8p_demo.yaml exp_name female8p +# python run.py --type visualize --cfg_file configs/snapshot_m2c_demo.yaml exp_name male2c +# python run.py --type visualize --cfg_file configs/snapshot_m2o_demo.yaml exp_name male2o +# python run.py --type visualize --cfg_file configs/snapshot_m3c_demo.yaml exp_name male3c +# python run.py --type visualize --cfg_file configs/snapshot_m5o_demo.yaml exp_name male5o + +# visualize views of dynamic humans +# python run.py --type visualize --cfg_file configs/snapshot_f3c_perform.yaml exp_name female3c +# python run.py --type visualize --cfg_file configs/snapshot_f4c_perform.yaml exp_name female4c +# python run.py --type visualize --cfg_file configs/snapshot_f6p_perform.yaml exp_name female6p +# python run.py --type visualize --cfg_file configs/snapshot_f7p_perform.yaml exp_name female7p +# python run.py --type visualize --cfg_file configs/snapshot_f8p_perform.yaml exp_name female8p +# python run.py --type visualize --cfg_file configs/snapshot_m2c_perform.yaml exp_name male2c +# python run.py --type visualize --cfg_file 
configs/snapshot_m2o_perform.yaml exp_name male2o +# python run.py --type visualize --cfg_file configs/snapshot_m3c_perform.yaml exp_name male3c +# python run.py --type visualize --cfg_file configs/snapshot_m5o_perform.yaml exp_name male5o + +# visualize mesh + +# ZJU-Mocap dataset + +# visualize novel views of single frame +# python run.py --type visualize --cfg_file configs/xyzc_demo_313.yaml exp_name xyzc_313 +# python run.py --type visualize --cfg_file configs/xyzc_demo_315.yaml exp_name xyzc_315 +# python run.py --type visualize --cfg_file configs/xyzc_demo_392.yaml exp_name xyzc_392 +# python run.py --type visualize --cfg_file configs/xyzc_demo_393.yaml exp_name xyzc_393 +# python run.py --type visualize --cfg_file configs/xyzc_demo_394.yaml exp_name xyzc_394 +# python run.py --type visualize --cfg_file configs/xyzc_demo_377.yaml exp_name xyzc_377 +# python run.py --type visualize --cfg_file configs/xyzc_demo_386.yaml exp_name xyzc_386 +# python run.py --type visualize --cfg_file configs/xyzc_demo_390.yaml exp_name xyzc_390 +# python run.py --type visualize --cfg_file configs/xyzc_demo_387.yaml exp_name xyzc_387 + +# visualize novel views of dynamic humans +# python run.py --type visualize --cfg_file configs/xyzc_perform_313.yaml exp_name xyzc_313 +# python run.py --type visualize --cfg_file configs/xyzc_perform_315.yaml exp_name xyzc_315 +# python run.py --type visualize --cfg_file configs/xyzc_perform_392.yaml exp_name xyzc_392 +# python run.py --type visualize --cfg_file configs/xyzc_perform_393.yaml exp_name xyzc_393 +# python run.py --type visualize --cfg_file configs/xyzc_perform_394.yaml exp_name xyzc_394 +# python run.py --type visualize --cfg_file configs/xyzc_perform_377.yaml exp_name xyzc_377 +# python run.py --type visualize --cfg_file configs/xyzc_perform_386.yaml exp_name xyzc_386 +# python run.py --type visualize --cfg_file configs/xyzc_perform_390.yaml exp_name xyzc_390 +# python run.py --type visualize --cfg_file configs/xyzc_perform_387.yaml exp_name xyzc_387 + +# visualize mesh +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_313.yaml exp_name xyzc_313 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_315.yaml exp_name xyzc_315 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_392.yaml exp_name xyzc_392 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_393.yaml exp_name xyzc_393 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_394.yaml exp_name xyzc_394 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_377.yaml exp_name xyzc_377 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_386.yaml exp_name xyzc_386 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_390.yaml exp_name xyzc_390 train.num_workers 0 +# python run.py --type visualize --cfg_file configs/latent_xyzc_mesh_387.yaml exp_name xyzc_387 train.num_workers 0 + +# visualize test views +# python run.py --type visualize --cfg_file configs/latent_xyzc_313.yaml exp_name xyzc_313 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_315.yaml exp_name xyzc_315 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' 
visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_392.yaml exp_name xyzc_392 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_393.yaml exp_name xyzc_393 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_394.yaml exp_name xyzc_394 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_377.yaml exp_name xyzc_377 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_386.yaml exp_name xyzc_386 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_390.yaml exp_name xyzc_390 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' +# python run.py --type visualize --cfg_file configs/latent_xyzc_387.yaml exp_name xyzc_387 test_dataset_path 'lib/datasets/light_stage/can_smpl_test.py' visualizer_path 'lib/visualizers/if_nerf_test.py' renderer_path 'lib/networks/renderer/if_clight_renderer_mmsk.py' + +# visualize test views for NeRF +# python run.py --type visualize --cfg_file configs/nerf_313.yaml exp_name nerf_313 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_315.yaml exp_name nerf_315 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_392.yaml exp_name nerf_392 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_393.yaml exp_name nerf_393 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_394.yaml exp_name nerf_394 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_377.yaml exp_name nerf_377 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_386.yaml exp_name nerf_386 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_390.yaml exp_name nerf_390 visualizer_path 'lib/visualizers/if_nerf_test.py' +# python run.py --type visualize --cfg_file configs/nerf_387.yaml exp_name nerf_387 visualizer_path 'lib/visualizers/if_nerf_test.py' diff --git a/zju_smpl/easymocap_to_neuralbody.py b/zju_smpl/easymocap_to_neuralbody.py new file mode 100644 index 0000000000000000000000000000000000000000..762e5520dd4dd510637c77c63e9141555415c724 --- /dev/null +++ b/zju_smpl/easymocap_to_neuralbody.py @@ -0,0 +1,38 @@ +import os +import sys +import json 
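+
+# This script converts SMPL parameters estimated by EasyMocap (see zju_smpl/example.json)
+# into the {poses, Rh, Th, shapes} dictionary format expected by Neural Body; uncomment
+# the np.save calls below to write the converted parameters and vertices to disk.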
+ +import numpy as np +import torch +sys.path.append("../") +from smplmodel.body_model import SMPLlayer + +easymocap_params = json.load(open('zju_smpl/example.json'))[0] +poses = np.array(easymocap_params['poses']) +Rh = np.array(easymocap_params['Rh']) +Th = np.array(easymocap_params['Th']) +shapes = np.array(easymocap_params['shapes']) + +# the params of neural body +params = {'poses': poses, 'Rh': Rh, 'Th': Th, 'shapes': shapes} +# np.save('params_0.npy', params) + +# The newlly fitted SMPL parameters consider pose blend shapes. +new_params = True + +## create smpl model +model_folder = 'data/zju_mocap/smplx' +device = torch.device('cpu') +body_model = SMPLlayer(os.path.join(model_folder, 'smpl'), + gender='neutral', + device=device, + regressor_path=os.path.join(model_folder, + 'J_regressor_body25.npy')) +body_model.to(device) + +## load SMPL zju +vertices = body_model(return_verts=True, + return_tensor=False, + new_params=new_params, + **params) +# np.save('vertices_0.npy', vertices) diff --git a/zju_smpl/example.json b/zju_smpl/example.json new file mode 100644 index 0000000000000000000000000000000000000000..4948f2240597221976a056d2f57c7b13503ef0d0 --- /dev/null +++ b/zju_smpl/example.json @@ -0,0 +1,9 @@ +[ + { + "id": 0, + "Rh": [[-1.210, -0.104, 0.541]], + "Th": [[0.267, 0.188, -0.861]], + "poses": [[0.000, 0.000, 0.000, -1.382, -0.390, 0.088, 0.319, -0.340, -0.549, -0.000, 0.014, -0.000, 0.626, -0.182, -0.036, 0.238, -0.388, 0.520, 0.000, 0.121, 0.000, 0.128, -0.360, -0.035, -0.268, -0.207, -0.098, 0.000, 0.033, -0.000, 0.000, 0.001, -0.000, -0.000, -0.000, -0.000, 0.595, -0.294, 0.183, 0.000, -0.000, -0.000, -0.000, 0.013, 0.000, -0.309, -0.380, 0.262, 0.020, 0.171, 0.027, -0.029, 1.803, 0.233, 0.020, 0.209, 0.252, -0.048, 0.737, -0.012, -0.000, 0.000, 0.000, -0.000, 0.000, -0.000, 0.000, 0.000, -0.000, 0.000, -0.000, -0.000]], + "shapes": [[-0.940, -0.223, -0.009, 0.196, 0.039, 0.018, -0.005, -0.011, -0.001, -0.004]] + } +] diff --git a/zju_smpl/extract_vertices.py b/zju_smpl/extract_vertices.py new file mode 100644 index 0000000000000000000000000000000000000000..50ff0a4780fdf476dbb4ed6a01e2ba66d87f810f --- /dev/null +++ b/zju_smpl/extract_vertices.py @@ -0,0 +1,40 @@ +import os +import sys + +import numpy as np +import torch +sys.path.append("../") +from smplmodel.body_model import SMPLlayer + +smpl_dir = 'data/zju_mocap/CoreView_313/params' +verts_dir = 'data/zju_mocap/CoreView_313/vertices' + +# Previously, EasyMocap estimated SMPL parameters without pose blend shapes. +# The newly fitted SMPL parameters consider pose blend shapes. 
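+# The flag below is inferred from the folder name: if the parameter directory name
+# contains 'new' (e.g. a hypothetical 'params_new' folder), the newly fitted parameters
+# are assumed and the pose-blend-shape-aware SMPL forward pass is used.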
+new_params = False +if 'new' in os.path.basename(smpl_dir): + new_params = True + +smpl_path = os.path.join(smpl_dir, "1.npy") +verts_path = os.path.join(verts_dir, "1.npy") + +## load precomputed vertices +verts_load = np.load(verts_path) + +## create smpl model +model_folder = 'data/zju_mocap/smplx' +device = torch.device('cpu') +body_model = SMPLlayer(os.path.join(model_folder, 'smpl'), + gender='neutral', + device=device, + regressor_path=os.path.join(model_folder, + 'J_regressor_body25.npy')) +body_model.to(device) + +## load SMPL zju +params = np.load(smpl_path, allow_pickle=True).item() + +vertices = body_model(return_verts=True, + return_tensor=False, + new_params=new_params, + **params) diff --git a/zju_smpl/smplmodel/body_model.py b/zju_smpl/smplmodel/body_model.py new file mode 100644 index 0000000000000000000000000000000000000000..97d5b562ec2f12bb936fee112756739ac625b77c --- /dev/null +++ b/zju_smpl/smplmodel/body_model.py @@ -0,0 +1,153 @@ +import torch +import torch.nn as nn +from .lbs import lbs, batch_rodrigues +import os.path as osp +import pickle +import numpy as np + + +def to_tensor(array, dtype=torch.float32, device=torch.device('cpu')): + if 'torch.tensor' not in str(type(array)): + return torch.tensor(array, dtype=dtype).to(device) + else: + return array.to(device) + + +def to_np(array, dtype=np.float32): + if 'scipy.sparse' in str(type(array)): + array = array.todense() + return np.array(array, dtype=dtype) + + +class SMPLlayer(nn.Module): + def __init__(self, + model_path, + gender='neutral', + device=None, + regressor_path=None) -> None: + super(SMPLlayer, self).__init__() + dtype = torch.float32 + self.dtype = dtype + self.device = device + # create the SMPL model + if osp.isdir(model_path): + model_fn = 'SMPL_{}.{ext}'.format(gender.upper(), ext='pkl') + smpl_path = osp.join(model_path, model_fn) + else: + smpl_path = model_path + assert osp.exists(smpl_path), 'Path {} does not exist!'.format( + smpl_path) + + with open(smpl_path, 'rb') as smpl_file: + data = pickle.load(smpl_file, encoding='latin1') + self.faces = data['f'] + self.register_buffer( + 'faces_tensor', + to_tensor(to_np(self.faces, dtype=np.int64), dtype=torch.long)) + # Pose blend shape basis: 6890 x 3 x 207, reshaped to 6890*3 x 207 + num_pose_basis = data['posedirs'].shape[-1] + # 207 x 20670 + posedirs = data['posedirs'] + data['posedirs'] = np.reshape(data['posedirs'], [-1, num_pose_basis]).T + + for key in [ + 'J_regressor', 'v_template', 'weights', 'posedirs', 'shapedirs' + ]: + val = to_tensor(to_np(data[key]), dtype=dtype) + self.register_buffer(key, val) + # indices of parents for each joints + parents = to_tensor(to_np(data['kintree_table'][0])).long() + parents[0] = -1 + self.register_buffer('parents', parents) + # joints regressor + if regressor_path is not None: + X_regressor = to_tensor(np.load(regressor_path)) + X_regressor = torch.cat((self.J_regressor, X_regressor), dim=0) + + j_J_regressor = torch.zeros(24, + X_regressor.shape[0], + device=device) + for i in range(24): + j_J_regressor[i, i] = 1 + j_v_template = X_regressor @ self.v_template + # + j_shapedirs = torch.einsum('vij,kv->kij', + [self.shapedirs, X_regressor]) + # (25, 24) + j_weights = X_regressor @ self.weights + j_posedirs = torch.einsum( + 'ab, bde->ade', + [X_regressor, torch.Tensor(posedirs)]).numpy() + j_posedirs = np.reshape(j_posedirs, [-1, num_pose_basis]).T + j_posedirs = to_tensor(j_posedirs) + self.register_buffer('j_posedirs', j_posedirs) + self.register_buffer('j_shapedirs', j_shapedirs) + 
self.register_buffer('j_weights', j_weights) + self.register_buffer('j_v_template', j_v_template) + self.register_buffer('j_J_regressor', j_J_regressor) + + def forward(self, + poses, + shapes, + Rh=None, + Th=None, + return_verts=True, + return_tensor=True, + scale=1, + new_params=False, + **kwargs): + """ Forward pass for SMPL model + + Args: + poses (n, 72) + shapes (n, 10) + Rh (n, 3): global orientation + Th (n, 3): global translation + return_verts (bool, optional): if True return (6890, 3). Defaults to False. + """ + if 'torch' not in str(type(poses)): + dtype, device = self.dtype, self.device + poses = to_tensor(poses, dtype, device) + shapes = to_tensor(shapes, dtype, device) + Rh = to_tensor(Rh, dtype, device) + Th = to_tensor(Th, dtype, device) + bn = poses.shape[0] + if Rh is None: + Rh = torch.zeros(bn, 3, device=poses.device) + rot = batch_rodrigues(Rh) + transl = Th.unsqueeze(dim=1) + if shapes.shape[0] < bn: + shapes = shapes.expand(bn, -1) + if return_verts: + vertices, joints = lbs(shapes, + poses, + self.v_template, + self.shapedirs, + self.posedirs, + self.J_regressor, + self.parents, + self.weights, + pose2rot=True, + new_params=new_params, + dtype=self.dtype) + else: + vertices, joints = lbs(shapes, + poses, + self.j_v_template, + self.j_shapedirs, + self.j_posedirs, + self.j_J_regressor, + self.parents, + self.j_weights, + pose2rot=True, + new_params=new_params, + dtype=self.dtype) + vertices = vertices[:, 24:, :] + # transl = transl + joints[:, :1] * scale - torch.matmul(joints[:, :1], + # rot.permute(0, 2, 1)) * scale + vertices = torch.matmul(vertices, rot.transpose(1, 2)) * scale + transl + # vertices = vertices * scale + transl + if not return_tensor: + vertices = vertices.detach().cpu().numpy() + transl = transl.detach().cpu().numpy() + return vertices[0] diff --git a/zju_smpl/smplmodel/lbs.py b/zju_smpl/smplmodel/lbs.py new file mode 100644 index 0000000000000000000000000000000000000000..bbe11d33aced1fee3bd49156e76953ea98c02d03 --- /dev/null +++ b/zju_smpl/smplmodel/lbs.py @@ -0,0 +1,378 @@ +# -*- coding: utf-8 -*- + +# Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is +# holder of all proprietary rights on this computer program. +# You can only use this computer program if you have closed +# a license agreement with MPG or you get the right to use the computer +# program from someone who is authorized to grant you that right. +# Any use of the computer program without a valid license is prohibited and +# liable to prosecution. +# +# Copyright©2019 Max-Planck-Gesellschaft zur Förderung +# der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute +# for Intelligent Systems. All rights reserved. 
+# +# Contact: ps-license@tuebingen.mpg.de + +from __future__ import absolute_import +from __future__ import print_function +from __future__ import division + +import numpy as np + +import torch +import torch.nn.functional as F + + +def rot_mat_to_euler(rot_mats): + # Calculates rotation matrix to euler angles + # Careful for extreme cases of eular angles like [0.0, pi, 0.0] + + sy = torch.sqrt(rot_mats[:, 0, 0] * rot_mats[:, 0, 0] + + rot_mats[:, 1, 0] * rot_mats[:, 1, 0]) + return torch.atan2(-rot_mats[:, 2, 0], sy) + + +def find_dynamic_lmk_idx_and_bcoords(vertices, pose, dynamic_lmk_faces_idx, + dynamic_lmk_b_coords, + neck_kin_chain, dtype=torch.float32): + ''' Compute the faces, barycentric coordinates for the dynamic landmarks + + + To do so, we first compute the rotation of the neck around the y-axis + and then use a pre-computed look-up table to find the faces and the + barycentric coordinates that will be used. + + Special thanks to Soubhik Sanyal (soubhik.sanyal@tuebingen.mpg.de) + for providing the original TensorFlow implementation and for the LUT. + + Parameters + ---------- + vertices: torch.tensor BxVx3, dtype = torch.float32 + The tensor of input vertices + pose: torch.tensor Bx(Jx3), dtype = torch.float32 + The current pose of the body model + dynamic_lmk_faces_idx: torch.tensor L, dtype = torch.long + The look-up table from neck rotation to faces + dynamic_lmk_b_coords: torch.tensor Lx3, dtype = torch.float32 + The look-up table from neck rotation to barycentric coordinates + neck_kin_chain: list + A python list that contains the indices of the joints that form the + kinematic chain of the neck. + dtype: torch.dtype, optional + + Returns + ------- + dyn_lmk_faces_idx: torch.tensor, dtype = torch.long + A tensor of size BxL that contains the indices of the faces that + will be used to compute the current dynamic landmarks. + dyn_lmk_b_coords: torch.tensor, dtype = torch.float32 + A tensor of size BxL that contains the indices of the faces that + will be used to compute the current dynamic landmarks. + ''' + + batch_size = vertices.shape[0] + + aa_pose = torch.index_select(pose.view(batch_size, -1, 3), 1, + neck_kin_chain) + rot_mats = batch_rodrigues( + aa_pose.view(-1, 3), dtype=dtype).view(batch_size, -1, 3, 3) + + rel_rot_mat = torch.eye(3, device=vertices.device, + dtype=dtype).unsqueeze_(dim=0) + for idx in range(len(neck_kin_chain)): + rel_rot_mat = torch.bmm(rot_mats[:, idx], rel_rot_mat) + + y_rot_angle = torch.round( + torch.clamp(-rot_mat_to_euler(rel_rot_mat) * 180.0 / np.pi, + max=39)).to(dtype=torch.long) + neg_mask = y_rot_angle.lt(0).to(dtype=torch.long) + mask = y_rot_angle.lt(-39).to(dtype=torch.long) + neg_vals = mask * 78 + (1 - mask) * (39 - y_rot_angle) + y_rot_angle = (neg_mask * neg_vals + + (1 - neg_mask) * y_rot_angle) + + dyn_lmk_faces_idx = torch.index_select(dynamic_lmk_faces_idx, + 0, y_rot_angle) + dyn_lmk_b_coords = torch.index_select(dynamic_lmk_b_coords, + 0, y_rot_angle) + + return dyn_lmk_faces_idx, dyn_lmk_b_coords + + +def vertices2landmarks(vertices, faces, lmk_faces_idx, lmk_bary_coords): + ''' Calculates landmarks by barycentric interpolation + + Parameters + ---------- + vertices: torch.tensor BxVx3, dtype = torch.float32 + The tensor of input vertices + faces: torch.tensor Fx3, dtype = torch.long + The faces of the mesh + lmk_faces_idx: torch.tensor L, dtype = torch.long + The tensor with the indices of the faces used to calculate the + landmarks. 
+ lmk_bary_coords: torch.tensor Lx3, dtype = torch.float32 + The tensor of barycentric coordinates that are used to interpolate + the landmarks + + Returns + ------- + landmarks: torch.tensor BxLx3, dtype = torch.float32 + The coordinates of the landmarks for each mesh in the batch + ''' + # Extract the indices of the vertices for each face + # BxLx3 + batch_size, num_verts = vertices.shape[:2] + device = vertices.device + + lmk_faces = torch.index_select(faces, 0, lmk_faces_idx.view(-1)).view( + batch_size, -1, 3) + + lmk_faces += torch.arange( + batch_size, dtype=torch.long, device=device).view(-1, 1, 1) * num_verts + + lmk_vertices = vertices.view(-1, 3)[lmk_faces].view( + batch_size, -1, 3, 3) + + landmarks = torch.einsum('blfi,blf->bli', [lmk_vertices, lmk_bary_coords]) + return landmarks + + +def lbs(betas, pose, v_template, shapedirs, posedirs, J_regressor, parents, + lbs_weights, pose2rot=True, new_params=False, dtype=torch.float32): + ''' Performs Linear Blend Skinning with the given shape and pose parameters + + Parameters + ---------- + betas : torch.tensor BxNB + The tensor of shape parameters + pose : torch.tensor Bx(J + 1) * 3 + The pose parameters in axis-angle format + v_template torch.tensor BxVx3 + The template mesh that will be deformed + shapedirs : torch.tensor 1xNB + The tensor of PCA shape displacements + posedirs : torch.tensor Px(V * 3) + The pose PCA coefficients + J_regressor : torch.tensor JxV + The regressor array that is used to calculate the joints from + the position of the vertices + parents: torch.tensor J + The array that describes the kinematic tree for the model + lbs_weights: torch.tensor N x V x (J + 1) + The linear blend skinning weights that represent how much the + rotation matrix of each part affects each vertex + pose2rot: bool, optional + Flag on whether to convert the input pose tensor to rotation + matrices. The default value is True. If False, then the pose tensor + should already contain rotation matrices and have a size of + Bx(J + 1)x9 + dtype: torch.dtype, optional + + Returns + ------- + verts: torch.tensor BxVx3 + The vertices of the mesh after applying the shape and pose + displacements. + joints: torch.tensor BxJx3 + The joints of the model + ''' + + batch_size = max(betas.shape[0], pose.shape[0]) + device = betas.device + + # Add shape contribution + v_shaped = v_template + blend_shapes(betas, shapedirs) + + # Get the joints + # NxJx3 array + J = vertices2joints(J_regressor, v_shaped) + + # 3. Add pose blend shapes + # N x J x 3 x 3 + ident = torch.eye(3, dtype=dtype, device=device) + if pose2rot: + rot_mats = batch_rodrigues( + pose.view(-1, 3), dtype=dtype).view([batch_size, -1, 3, 3]) + + pose_feature = (rot_mats[:, 1:, :, :] - ident).view([batch_size, -1]) + # (N x P) x (P, V * 3) -> N x V x 3 + pose_offsets = torch.matmul(pose_feature, posedirs) \ + .view(batch_size, -1, 3) + else: + pose_feature = pose[:, 1:].view(batch_size, -1, 3, 3) - ident + rot_mats = pose.view(batch_size, -1, 3, 3) + + pose_offsets = torch.matmul(pose_feature.view(batch_size, -1), + posedirs).view(batch_size, -1, 3) + + if new_params: + v_posed = pose_offsets + v_shaped + else: + v_posed = v_shaped + + # 4. Get the global joint location + J_transformed, A = batch_rigid_transform(rot_mats, J, parents, dtype=dtype) + + # 5. 
Do skinning: + # W is N x V x (J + 1) + W = lbs_weights.unsqueeze(dim=0).expand([batch_size, -1, -1]) + # (N x V x (J + 1)) x (N x (J + 1) x 16) + num_joints = J_regressor.shape[0] + T = torch.matmul(W, A.view(batch_size, num_joints, 16)) \ + .view(batch_size, -1, 4, 4) + + homogen_coord = torch.ones([batch_size, v_posed.shape[1], 1], + dtype=dtype, device=device) + v_posed_homo = torch.cat([v_posed, homogen_coord], dim=2) + v_homo = torch.matmul(T, torch.unsqueeze(v_posed_homo, dim=-1)) + + verts = v_homo[:, :, :3, 0] + + return verts, J_transformed + + +def vertices2joints(J_regressor, vertices): + ''' Calculates the 3D joint locations from the vertices + + Parameters + ---------- + J_regressor : torch.tensor JxV + The regressor array that is used to calculate the joints from the + position of the vertices + vertices : torch.tensor BxVx3 + The tensor of mesh vertices + + Returns + ------- + torch.tensor BxJx3 + The location of the joints + ''' + + return torch.einsum('bik,ji->bjk', [vertices, J_regressor]) + + +def blend_shapes(betas, shape_disps): + ''' Calculates the per vertex displacement due to the blend shapes + + + Parameters + ---------- + betas : torch.tensor Bx(num_betas) + Blend shape coefficients + shape_disps: torch.tensor Vx3x(num_betas) + Blend shapes + + Returns + ------- + torch.tensor BxVx3 + The per-vertex displacement due to shape deformation + ''' + + # Displacement[b, m, k] = sum_{l} betas[b, l] * shape_disps[m, k, l] + # i.e. Multiply each shape displacement by its corresponding beta and + # then sum them. + blend_shape = torch.einsum('bl,mkl->bmk', [betas, shape_disps]) + return blend_shape + + +def batch_rodrigues(rot_vecs, epsilon=1e-8, dtype=torch.float32): + ''' Calculates the rotation matrices for a batch of rotation vectors + Parameters + ---------- + rot_vecs: torch.tensor Nx3 + array of N axis-angle vectors + Returns + ------- + R: torch.tensor Nx3x3 + The rotation matrices for the given axis-angle parameters + ''' + + batch_size = rot_vecs.shape[0] + device = rot_vecs.device + + angle = torch.norm(rot_vecs + 1e-8, dim=1, keepdim=True) + rot_dir = rot_vecs / angle + + cos = torch.unsqueeze(torch.cos(angle), dim=1) + sin = torch.unsqueeze(torch.sin(angle), dim=1) + + # Bx1 arrays + rx, ry, rz = torch.split(rot_dir, 1, dim=1) + K = torch.zeros((batch_size, 3, 3), dtype=dtype, device=device) + + zeros = torch.zeros((batch_size, 1), dtype=dtype, device=device) + K = torch.cat([zeros, -rz, ry, rz, zeros, -rx, -ry, rx, zeros], dim=1) \ + .view((batch_size, 3, 3)) + + ident = torch.eye(3, dtype=dtype, device=device).unsqueeze(dim=0) + rot_mat = ident + sin * K + (1 - cos) * torch.bmm(K, K) + return rot_mat + + +def transform_mat(R, t): + ''' Creates a batch of transformation matrices + Args: + - R: Bx3x3 array of a batch of rotation matrices + - t: Bx3x1 array of a batch of translation vectors + Returns: + - T: Bx4x4 Transformation matrix + ''' + # No padding left or right, only add an extra row + return torch.cat([F.pad(R, [0, 0, 0, 1]), + F.pad(t, [0, 0, 0, 1], value=1)], dim=2) + + +def batch_rigid_transform(rot_mats, joints, parents, dtype=torch.float32): + """ + Applies a batch of rigid transformations to the joints + + Parameters + ---------- + rot_mats : torch.tensor BxNx3x3 + Tensor of rotation matrices + joints : torch.tensor BxNx3 + Locations of joints + parents : torch.tensor BxN + The kinematic tree of each object + dtype : torch.dtype, optional: + The data type of the created tensors, the default is torch.float32 + + Returns + ------- + posed_joints 
: torch.tensor BxNx3 + The locations of the joints after applying the pose rotations + rel_transforms : torch.tensor BxNx4x4 + The relative (with respect to the root joint) rigid transformations + for all the joints + """ + + joints = torch.unsqueeze(joints, dim=-1) + + rel_joints = joints.clone() + rel_joints[:, 1:] -= joints[:, parents[1:]] + + transforms_mat = transform_mat( + rot_mats.view(-1, 3, 3), + rel_joints.contiguous().view(-1, 3, 1)).view(-1, joints.shape[1], 4, 4) + + transform_chain = [transforms_mat[:, 0]] + for i in range(1, parents.shape[0]): + # Subtract the joint location at the rest pose + # No need for rotation, since it's identity when at rest + curr_res = torch.matmul(transform_chain[parents[i]], + transforms_mat[:, i]) + transform_chain.append(curr_res) + + transforms = torch.stack(transform_chain, dim=1) + + # The last column of the transformations contains the posed joints + posed_joints = transforms[:, :, :3, 3] + + joints_homogen = F.pad(joints, [0, 0, 0, 1]) + + rel_transforms = transforms - F.pad( + torch.matmul(transforms, joints_homogen), [3, 0, 0, 0, 0, 0, 0, 0]) + + return posed_joints, rel_transforms
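
`Rh` and `Th` in the parameter files above are an axis-angle world rotation and a world translation: `SMPLlayer.forward` converts `Rh` with `batch_rodrigues` and then applies `vertices @ R^T + Th` to the skinned mesh. The snippet below is a minimal sketch of that transform, assuming it is run from the `zju_smpl` directory (so that `smplmodel` is importable) and reusing the `Rh`/`Th` values from `example.json`:

```
import torch

from smplmodel.lbs import batch_rodrigues, transform_mat

# Rh/Th copied from zju_smpl/example.json
Rh = torch.tensor([[-1.210, -0.104, 0.541]])   # axis-angle global rotation
Th = torch.tensor([[0.267, 0.188, -0.861]])    # global translation

R = batch_rodrigues(Rh)                  # (1, 3, 3) rotation matrix
T = transform_mat(R, Th.unsqueeze(-1))   # (1, 4, 4) homogeneous world transform

# SMPLlayer.forward applies the same rotation and translation to the
# skinned vertices: vertices = vertices @ R.transpose(1, 2) + Th
print(T[0])
```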