Spaces: Running on L40S
Migrated from GitHub
This view is limited to 50 files because it contains too many changes. See the raw diff for the full change set.
- .gitattributes +1 -0
- LICENSE.txt +21 -0
- ORIGINAL_README.md +79 -0
- assets/result_clr_scale4_pexels-barbara-olsen-7869640.mp4 +0 -0
- assets/result_clr_scale4_pexels-zdmit-6780091.mp4 +0 -0
- blender/blender_render_human_ortho.py +837 -0
- blender/check_render.py +46 -0
- blender/count.py +44 -0
- blender/distribute.py +149 -0
- blender/rename_smpl_files.py +25 -0
- blender/render.sh +4 -0
- blender/render_human.py +88 -0
- blender/render_single.sh +7 -0
- blender/utils.py +128 -0
- configs/inference-768-6view.yaml +72 -0
- configs/remesh.yaml +18 -0
- configs/train-768-6view-onlyscan_face.yaml +145 -0
- configs/train-768-6view-onlyscan_face_smplx.yaml +154 -0
- core/opt.py +197 -0
- core/remesh.py +359 -0
- econdataset.py +370 -0
- examples/02986d0998ce01aa0aa67a99fbd1e09a.png +0 -0
- examples/16171.png +0 -0
- examples/26d2e846349647ff04c536816e0e8ca1.png +0 -0
- examples/30755.png +0 -0
- examples/3930.png +0 -0
- examples/4656716-3016170581.png +0 -0
- examples/663dcd6db19490de0b790da430bd5681.png +3 -0
- examples/7332.png +0 -0
- examples/85891251f52a2399e660a63c2a7fdf40.png +0 -0
- examples/a689a48d23d6b8d58d67ff5146c6e088.png +0 -0
- examples/b0d178743c7e3e09700aaee8d2b1ec47.png +0 -0
- examples/case5.png +0 -0
- examples/d40776a1e1582179d97907d36f84d776.png +0 -0
- examples/durant.png +0 -0
- examples/eedb9018-e0eb-45be-33bd-5a0108ca0d8b.png +0 -0
- examples/f14f7d40b72062928461b21c6cc877407e69ee0c_high.png +0 -0
- examples/f6317ac1b0498f4e6ef9d12bd991a9bd1ff4ae04f898-IQTEBw_fw1200.png +0 -0
- examples/pexels-barbara-olsen-7869640.png +0 -0
- examples/pexels-julia-m-cameron-4145040.png +0 -0
- examples/pexels-marta-wave-6437749.png +0 -0
- examples/pexels-photo-6311555-removebg.png +0 -0
- examples/pexels-zdmit-6780091.png +0 -0
- inference.py +221 -0
- lib/__init__.py +0 -0
- lib/common/__init__.py +0 -0
- lib/common/cloth_extraction.py +182 -0
- lib/common/config.py +218 -0
- lib/common/imutils.py +364 -0
- lib/common/render.py +398 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+examples/663dcd6db19490de0b790da430bd5681.png filter=lfs diff=lfs merge=lfs -text
LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Fusion Lab: Generative Vision Lab of Fudan University

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
ORIGINAL_README.md
ADDED
@@ -0,0 +1,79 @@
# PSHuman

This is the official implementation of *PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion*.

### [Project Page](https://penghtyx.github.io/PSHuman/) | [Arxiv](https://arxiv.org/pdf/2409.10141) | [Weights](https://huggingface.co/pengHTYX/PSHuman_Unclip_768_6views)

https://github.com/user-attachments/assets/b62e3305-38a7-4b51-aed8-1fde967cca70

https://github.com/user-attachments/assets/76100d2e-4a1a-41ad-815c-816340ac6500

Given a single image of a clothed person, **PSHuman** reconstructs detailed geometry and realistic 3D human appearance across various poses within one minute.

### 📝 Update
- __[2024.11.30]__: Released the SMPL-free [version](https://huggingface.co/pengHTYX/PSHuman_Unclip_768_6views), which does not require an SMPL condition for multiview generation and performs well on generally posed humans.

### Installation
```
conda create -n pshuman python=3.10
conda activate pshuman

# torch
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

# other dependencies
pip install -r requirement.txt
```

This project also builds on SMPL-X. We borrowed the related models from [ECON](https://github.com/YuliangXiu/ECON) and [SIFU](https://github.com/River-Zhang/SIFU) and re-organized them; they can be downloaded from [OneDrive](https://hkustconnect-my.sharepoint.com/:u:/g/personal/plibp_connect_ust_hk/EZQphP-2y5BGhEIe8jb03i4BIcqiJ2mUW2JmGC5s0VKOdw?e=qVzBBD).

### Inference
1. Given a human image, we use [Clipdrop](https://github.com/xxlong0/Wonder3D?tab=readme-ov-file) or ```rembg``` to remove the background. For the latter, we provide a simple script.
```
python utils/remove_bg.py --path $DATA_PATH$
```
Then put the RGBA images in ```$DATA_PATH$```.

2. By running [inference.py](inference.py), the textured mesh and rendered video will be saved in ```out```.
```
CUDA_VISIBLE_DEVICES=$GPU python inference.py --config configs/inference-768-6view.yaml \
    pretrained_model_name_or_path='pengHTYX/PSHuman_Unclip_768_6views' \
    validation_dataset.crop_size=740 \
    with_smpl=false \
    validation_dataset.root_dir=$DATA_PATH$ \
    seed=600 \
    num_views=7 \
    save_mode='rgb'
```
You can adjust ```crop_size``` (720 or 740) and ```seed``` (42 or 600) to obtain the best results for some cases.

### Training
For data preparation and preprocessing, please refer to our [paper](https://arxiv.org/pdf/2409.10141). Once the data is ready, start training by running
```
bash scripts/train_768.sh
```
You may need to modify some parameters, such as ```data_common.root_dir``` and ```data_common.object_list```.

### Related projects
We collect code from the following projects. We thank the open-source community for its contributions!

[ECON](https://github.com/YuliangXiu/ECON) and [SIFU](https://github.com/River-Zhang/SIFU) recover a human mesh from a single human image.
[Era3D](https://github.com/pengHTYX/Era3D) and [Unique3D](https://github.com/AiuniAI/Unique3D) generate consistent multiview images from a single color image.
[Continuous-Remeshing](https://github.com/Profactor/continuous-remeshing) for inverse rendering.

### Citation
If you find this codebase useful, please consider citing our work.
```
@article{li2024pshuman,
  title={PSHuman: Photorealistic Single-view Human Reconstruction using Cross-Scale Diffusion},
  author={Li, Peng and Zheng, Wangguandong and Liu, Yuan and Yu, Tao and Li, Yangguang and Qi, Xingqun and Li, Mengfei and Chi, Xiaowei and Xia, Siyu and Xue, Wei and others},
  journal={arXiv preprint arXiv:2409.10141},
  year={2024}
}
```
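Note: `utils/remove_bg.py` itself is not among the files shown in this truncated diff. Below is a minimal sketch of what such a helper could look like, assuming the `rembg` package and the `--path` argument used in the README above; the script actually shipped with the repo may differ.
```
import argparse
import os

from PIL import Image
from rembg import remove  # assumed dependency: pip install rembg

parser = argparse.ArgumentParser()
parser.add_argument("--path", type=str, required=True, help="folder of input RGB images")
args = parser.parse_args()

for name in os.listdir(args.path):
    if not name.lower().endswith((".png", ".jpg", ".jpeg")):
        continue
    img = Image.open(os.path.join(args.path, name))
    rgba = remove(img)  # returns an RGBA image with the background removed
    out_name = os.path.splitext(name)[0] + ".png"
    rgba.save(os.path.join(args.path, out_name))
```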
assets/result_clr_scale4_pexels-barbara-olsen-7869640.mp4
ADDED
Binary file (320 kB).
assets/result_clr_scale4_pexels-zdmit-6780091.mp4
ADDED
Binary file (629 kB).
blender/blender_render_human_ortho.py
ADDED
@@ -0,0 +1,837 @@
"""Blender script to render images of 3D models.

This script is used to render images of 3D models. It takes in a list of paths
to .glb files and renders images of each model. The images are from rotating the
object around the origin. The images are saved to the output directory.

Example usage:
    blender -b -P blender_script.py -- \
        --object_path my_object.glb \
        --output_dir ./views \
        --engine CYCLES \
        --scale 0.8 \
        --num_images 12 \
        --camera_dist 1.2

Here, input_model_paths.json is a json file containing a list of paths to .glb.
"""
import argparse
import json
import math
import os
import random
import sys
import time
import glob
import urllib.request
import uuid
from typing import Tuple
from mathutils import Vector, Matrix
os.environ["OPENCV_IO_ENABLE_OPENEXR"] = "1"
# os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import cv2
import numpy as np
from typing import Any, Callable, Dict, Generator, List, Literal, Optional, Set, Tuple

import bpy
from mathutils import Vector

import OpenEXR
import Imath
from PIL import Image

# import blenderproc as bproc

bpy.app.debug_value = 256

parser = argparse.ArgumentParser()
parser.add_argument(
    "--object_path",
    type=str,
    required=True,
    help="Path to the object file",
)
parser.add_argument("--smpl_path", type=str, required=True, help="Path to the object file")
parser.add_argument("--output_dir", type=str, default="/views_whole_sphere-test2")
parser.add_argument(
    "--engine", type=str, default="BLENDER_EEVEE", choices=["CYCLES", "BLENDER_EEVEE"]
)
parser.add_argument("--scale", type=float, default=1.0)
parser.add_argument("--num_images", type=int, default=8)
parser.add_argument("--random_images", type=int, default=3)
parser.add_argument("--random_ortho", type=int, default=1)
parser.add_argument("--device", type=str, default="CUDA")
parser.add_argument("--resolution", type=int, default=512)

argv = sys.argv[sys.argv.index("--") + 1 :]
args = parser.parse_args(argv)

print('===================', args.engine, '===================')

context = bpy.context
scene = context.scene
render = scene.render

cam = scene.objects["Camera"]
cam.data.type = 'ORTHO'
cam.data.ortho_scale = 1.
cam.data.lens = 35
cam.data.sensor_height = 32
cam.data.sensor_width = 32

cam_constraint = cam.constraints.new(type="TRACK_TO")
cam_constraint.track_axis = "TRACK_NEGATIVE_Z"
cam_constraint.up_axis = "UP_Y"

# setup lighting
# bpy.ops.object.light_add(type="AREA")
# light2 = bpy.data.lights["Area"]
# light2.energy = 3000
# bpy.data.objects["Area"].location[2] = 0.5
# bpy.data.objects["Area"].scale[0] = 100
# bpy.data.objects["Area"].scale[1] = 100
# bpy.data.objects["Area"].scale[2] = 100

render.engine = args.engine
render.image_settings.file_format = "PNG"
render.image_settings.color_mode = "RGBA"
render.resolution_x = args.resolution
render.resolution_y = args.resolution
render.resolution_percentage = 100
render.threads_mode = 'FIXED'  # use a fixed number of render threads
render.threads = 32  # thread count

scene.cycles.device = "GPU"
scene.cycles.samples = 128  # 128
scene.cycles.diffuse_bounces = 1
scene.cycles.glossy_bounces = 1
scene.cycles.transparent_max_bounces = 3  # 3
scene.cycles.transmission_bounces = 3  # 3
# scene.cycles.filter_width = 0.01
bpy.context.scene.cycles.adaptive_threshold = 0
scene.cycles.use_denoising = True
scene.render.film_transparent = True

bpy.context.preferences.addons["cycles"].preferences.get_devices()
# Set the device_type
bpy.context.preferences.addons["cycles"].preferences.compute_device_type = 'CUDA'  # or "OPENCL"
bpy.context.scene.cycles.tile_size = 8192

# eevee = scene.eevee
# eevee.use_soft_shadows = True
# eevee.use_ssr = True
# eevee.use_ssr_refraction = True
# eevee.taa_render_samples = 64
# eevee.use_gtao = True
# eevee.gtao_distance = 1
# eevee.use_volumetric_shadows = True
# eevee.volumetric_tile_size = '2'
# eevee.gi_diffuse_bounces = 1
# eevee.gi_cubemap_resolution = '128'
# eevee.gi_visibility_resolution = '16'
# eevee.gi_irradiance_smoothing = 0

# for depth & normal
context.view_layer.use_pass_normal = True
context.view_layer.use_pass_z = True
context.scene.use_nodes = True

tree = bpy.context.scene.node_tree
nodes = bpy.context.scene.node_tree.nodes
links = bpy.context.scene.node_tree.links

# Clear default nodes
for n in nodes:
    nodes.remove(n)

# Create input render layer node.
render_layers = nodes.new('CompositorNodeRLayers')

scale_normal = nodes.new(type="CompositorNodeMixRGB")
scale_normal.blend_type = 'MULTIPLY'
scale_normal.inputs[2].default_value = (0.5, 0.5, 0.5, 1)
links.new(render_layers.outputs['Normal'], scale_normal.inputs[1])
bias_normal = nodes.new(type="CompositorNodeMixRGB")
bias_normal.blend_type = 'ADD'
bias_normal.inputs[2].default_value = (0.5, 0.5, 0.5, 0)
links.new(scale_normal.outputs[0], bias_normal.inputs[1])
normal_file_output = nodes.new(type="CompositorNodeOutputFile")
normal_file_output.label = 'Normal Output'
links.new(bias_normal.outputs[0], normal_file_output.inputs[0])

normal_file_output.format.file_format = "OPEN_EXR"  # default is "PNG"
normal_file_output.format.color_mode = "RGB"  # default is "BW"

depth_file_output = nodes.new(type="CompositorNodeOutputFile")
depth_file_output.label = 'Depth Output'
links.new(render_layers.outputs['Depth'], depth_file_output.inputs[0])
depth_file_output.format.file_format = "OPEN_EXR"  # default is "PNG"
depth_file_output.format.color_mode = "RGB"  # default is "BW"

def prepare_depth_outputs():
    tree = bpy.context.scene.node_tree
    links = tree.links
    render_node = tree.nodes['Render Layers']
    depth_out_node = tree.nodes.new(type="CompositorNodeOutputFile")
    depth_map_node = tree.nodes.new(type="CompositorNodeMapRange")
    depth_out_node.base_path = ''
    depth_out_node.format.file_format = 'OPEN_EXR'
    depth_out_node.format.color_depth = '32'

    depth_map_node.inputs[1].default_value = 0.54
    depth_map_node.inputs[2].default_value = 1.96
    depth_map_node.inputs[3].default_value = 0
    depth_map_node.inputs[4].default_value = 1
    depth_map_node.use_clamp = True
    links.new(render_node.outputs[2], depth_map_node.inputs[0])
    links.new(depth_map_node.outputs[0], depth_out_node.inputs[0])
    return depth_out_node, depth_map_node

depth_file_output, depth_map_node = prepare_depth_outputs()


def exr_to_png(exr_path):
    depth_path = exr_path.replace('.exr', '.png')
    exr_image = OpenEXR.InputFile(exr_path)
    dw = exr_image.header()['dataWindow']
    (width, height) = (dw.max.x - dw.min.x + 1, dw.max.y - dw.min.y + 1)

    def read_exr(s, width, height):
        mat = np.fromstring(s, dtype=np.float32)
        mat = mat.reshape(height, width)
        return mat

    dmap, _, _ = [read_exr(s, width, height) for s in exr_image.channels('BGR', Imath.PixelType(Imath.PixelType.FLOAT))]
    dmap = np.clip(np.asarray(dmap, np.float64), a_max=1.0, a_min=0.0) * 65535
    dmap = Image.fromarray(dmap.astype(np.uint16))
    dmap.save(depth_path)
    exr_image.close()
    # os.system('rm {}'.format(exr_path))

def extract_depth(directory):
    fns = glob.glob(f'{directory}/*.exr')
    for fn in fns: exr_to_png(fn)
    os.system(f'rm {directory}/*.exr')

def sample_point_on_sphere(radius: float) -> Tuple[float, float, float]:
    theta = random.random() * 2 * math.pi
    phi = math.acos(2 * random.random() - 1)
    return (
        radius * math.sin(phi) * math.cos(theta),
        radius * math.sin(phi) * math.sin(theta),
        radius * math.cos(phi),
    )

def sample_spherical(radius=3.0, maxz=3.0, minz=0.):
    correct = False
    while not correct:
        vec = np.random.uniform(-1, 1, 3)
        vec[2] = np.abs(vec[2])
        vec = vec / np.linalg.norm(vec, axis=0) * radius
        if maxz > vec[2] > minz:
            correct = True
    return vec

def sample_spherical(radius_min=1.5, radius_max=2.0, maxz=1.6, minz=-0.75):
    correct = False
    while not correct:
        vec = np.random.uniform(-1, 1, 3)
        # vec[2] = np.abs(vec[2])
        radius = np.random.uniform(radius_min, radius_max, 1)
        vec = vec / np.linalg.norm(vec, axis=0) * radius[0]
        if maxz > vec[2] > minz:
            correct = True
    return vec

def randomize_camera():
    elevation = random.uniform(0., 90.)
    azimuth = random.uniform(0., 360)
    distance = random.uniform(0.8, 1.6)
    return set_camera_location(elevation, azimuth, distance)

def set_camera_location(elevation, azimuth, distance):
    # from https://blender.stackexchange.com/questions/18530/
    x, y, z = sample_spherical(radius_min=1.5, radius_max=2.2, maxz=2.2, minz=-2.2)
    camera = bpy.data.objects["Camera"]
    camera.location = x, y, z

    direction = - camera.location
    rot_quat = direction.to_track_quat('-Z', 'Y')
    camera.rotation_euler = rot_quat.to_euler()
    return camera

def set_camera_mvdream(azimuth, elevation, distance):
    # theta, phi = np.deg2rad(azimuth), np.deg2rad(elevation)
    azimuth, elevation = np.deg2rad(azimuth), np.deg2rad(elevation)
    point = (
        distance * math.cos(azimuth) * math.cos(elevation),
        distance * math.sin(azimuth) * math.cos(elevation),
        distance * math.sin(elevation),
    )
    camera = bpy.data.objects["Camera"]
    camera.location = point

    direction = -camera.location
    rot_quat = direction.to_track_quat('-Z', 'Y')
    camera.rotation_euler = rot_quat.to_euler()
    return camera

def reset_scene() -> None:
    """Resets the scene to a clean state.

    Returns:
        None
    """
    # delete everything that isn't part of a camera or a light
    for obj in bpy.data.objects:
        if obj.type not in {"CAMERA", "LIGHT"}:
            bpy.data.objects.remove(obj, do_unlink=True)

    # delete all the materials
    for material in bpy.data.materials:
        bpy.data.materials.remove(material, do_unlink=True)

    # delete all the textures
    for texture in bpy.data.textures:
        bpy.data.textures.remove(texture, do_unlink=True)

    # delete all the images
    for image in bpy.data.images:
        bpy.data.images.remove(image, do_unlink=True)

def process_ply(obj):
    # obj = bpy.context.selected_objects[0]

    # create a new material
    material = bpy.data.materials.new(name="VertexColors")
    material.use_nodes = True
    obj.data.materials.append(material)

    # get the material's node tree
    nodes = material.node_tree.nodes
    links = material.node_tree.links

    # remove the existing 'Principled BSDF' node
    principled_bsdf_node = nodes.get("Principled BSDF")
    if principled_bsdf_node:
        nodes.remove(principled_bsdf_node)

    # create a new 'Emission' node
    emission_node = nodes.new(type="ShaderNodeEmission")
    emission_node.location = 0, 0

    # create an 'Attribute' node
    attribute_node = nodes.new(type="ShaderNodeAttribute")
    attribute_node.location = -300, 0
    attribute_node.attribute_name = "Col"  # name of the vertex color attribute

    # get the 'Material Output' node
    output_node = nodes.get("Material Output")

    # connect the nodes
    links.new(attribute_node.outputs["Color"], emission_node.inputs["Color"])
    links.new(emission_node.outputs["Emission"], output_node.inputs["Surface"])

# # load the glb model
# def load_object(object_path: str) -> None:
#     if object_path.endswith(".glb"):
#         bpy.ops.import_scene.gltf(filepath=object_path, merge_vertices=False)
#     elif object_path.endswith(".fbx"):
#         bpy.ops.import_scene.fbx(filepath=object_path)
#     elif object_path.endswith(".obj"):
#         bpy.ops.import_scene.obj(filepath=object_path)
#     elif object_path.endswith(".ply"):
#         bpy.ops.import_mesh.ply(filepath=object_path)
#         obj = bpy.context.selected_objects[0]
#         obj.rotation_euler[0] = 1.5708
#         # bpy.ops.wm.ply_import(filepath=object_path, directory=os.path.dirname(object_path), forward_axis='X', up_axis='Y')
#         process_ply(obj)
#     else:
#         raise ValueError(f"Unsupported file type: {object_path}")


def scene_bbox(
    single_obj: Optional[bpy.types.Object] = None, ignore_matrix: bool = False
) -> Tuple[Vector, Vector]:
    """Returns the bounding box of the scene.

    Taken from Shap-E rendering script
    (https://github.com/openai/shap-e/blob/main/shap_e/rendering/blender/blender_script.py#L68-L82)

    Args:
        single_obj (Optional[bpy.types.Object], optional): If not None, only computes
            the bounding box for the given object. Defaults to None.
        ignore_matrix (bool, optional): Whether to ignore the object's matrix. Defaults
            to False.

    Raises:
        RuntimeError: If there are no objects in the scene.

    Returns:
        Tuple[Vector, Vector]: The minimum and maximum coordinates of the bounding box.
    """
    bbox_min = (math.inf,) * 3
    bbox_max = (-math.inf,) * 3
    found = False
    for obj in get_scene_meshes() if single_obj is None else [single_obj]:
        found = True
        for coord in obj.bound_box:
            coord = Vector(coord)
            if not ignore_matrix:
                coord = obj.matrix_world @ coord
            bbox_min = tuple(min(x, y) for x, y in zip(bbox_min, coord))
            bbox_max = tuple(max(x, y) for x, y in zip(bbox_max, coord))

    if not found:
        raise RuntimeError("no objects in scene to compute bounding box for")

    return Vector(bbox_min), Vector(bbox_max)


def get_scene_root_objects() -> Generator[bpy.types.Object, None, None]:
    """Returns all root objects in the scene.

    Yields:
        Generator[bpy.types.Object, None, None]: Generator of all root objects in the
            scene.
    """
    for obj in bpy.context.scene.objects.values():
        if not obj.parent:
            yield obj


def get_scene_meshes() -> Generator[bpy.types.Object, None, None]:
    """Returns all meshes in the scene.

    Yields:
        Generator[bpy.types.Object, None, None]: Generator of all meshes in the scene.
    """
    for obj in bpy.context.scene.objects.values():
        if isinstance(obj.data, (bpy.types.Mesh)):
            yield obj


# Build intrinsic camera parameters from Blender camera data
#
# See notes on this in
# blender.stackexchange.com/questions/15102/what-is-blenders-camera-projection-matrix-model
def get_calibration_matrix_K_from_blender(camd):
    f_in_mm = camd.lens
    scene = bpy.context.scene
    resolution_x_in_px = scene.render.resolution_x
    resolution_y_in_px = scene.render.resolution_y
    scale = scene.render.resolution_percentage / 100
    sensor_width_in_mm = camd.sensor_width
    sensor_height_in_mm = camd.sensor_height
    pixel_aspect_ratio = scene.render.pixel_aspect_x / scene.render.pixel_aspect_y
    if (camd.sensor_fit == 'VERTICAL'):
        # the sensor height is fixed (sensor fit is vertical),
        # the sensor width is effectively changed with the pixel aspect ratio
        s_u = resolution_x_in_px * scale / sensor_width_in_mm / pixel_aspect_ratio
        s_v = resolution_y_in_px * scale / sensor_height_in_mm
    else:  # 'HORIZONTAL' and 'AUTO'
        # the sensor width is fixed (sensor fit is horizontal),
        # the sensor height is effectively changed with the pixel aspect ratio
        pixel_aspect_ratio = scene.render.pixel_aspect_x / scene.render.pixel_aspect_y
        s_u = resolution_x_in_px * scale / sensor_width_in_mm
        s_v = resolution_y_in_px * scale * pixel_aspect_ratio / sensor_height_in_mm

    # Parameters of intrinsic calibration matrix K
    alpha_u = f_in_mm * s_u
    alpha_v = f_in_mm * s_v
    u_0 = resolution_x_in_px * scale / 2
    v_0 = resolution_y_in_px * scale / 2
    skew = 0  # only use rectangular pixels

    K = Matrix(
        ((alpha_u, skew, u_0),
         (0, alpha_v, v_0),
         (0, 0, 1)))
    return K


def get_calibration_matrix_K_from_blender_for_ortho(camd, ortho_scale):
    scene = bpy.context.scene
    resolution_x_in_px = scene.render.resolution_x
    resolution_y_in_px = scene.render.resolution_y
    scale = scene.render.resolution_percentage / 100
    pixel_aspect_ratio = scene.render.pixel_aspect_x / scene.render.pixel_aspect_y

    fx = resolution_x_in_px / ortho_scale
    fy = resolution_y_in_px / ortho_scale / pixel_aspect_ratio

    cx = resolution_x_in_px / 2
    cy = resolution_y_in_px / 2

    K = Matrix(
        ((fx, 0, cx),
         (0, fy, cy),
         (0, 0, 1)))
    return K


def get_3x4_RT_matrix_from_blender(cam):
    bpy.context.view_layer.update()
    location, rotation = cam.matrix_world.decompose()[0:2]
    R = np.asarray(rotation.to_matrix())
    t = np.asarray(location)

    cam_rec = np.asarray([[1, 0, 0], [0, -1, 0], [0, 0, -1]], np.float32)
    R = R.T
    t = -R @ t
    R_world2cv = cam_rec @ R
    t_world2cv = cam_rec @ t

    RT = np.concatenate([R_world2cv, t_world2cv[:, None]], 1)
    return RT

def delete_invisible_objects() -> None:
    """Deletes all invisible objects in the scene.

    Returns:
        None
    """
    bpy.ops.object.select_all(action="DESELECT")
    for obj in scene.objects:
        if obj.hide_viewport or obj.hide_render:
            obj.hide_viewport = False
            obj.hide_render = False
            obj.hide_select = False
            obj.select_set(True)
    bpy.ops.object.delete()

    # Delete invisible collections
    invisible_collections = [col for col in bpy.data.collections if col.hide_viewport]
    for col in invisible_collections:
        bpy.data.collections.remove(col)


def normalize_scene():
    """Normalizes the scene by scaling and translating it to fit in a unit cube centered
    at the origin.

    Mostly taken from the Point-E / Shap-E rendering script
    (https://github.com/openai/point-e/blob/main/point_e/evals/scripts/blender_script.py#L97-L112),
    but fix for multiple root objects: (see bug report here:
    https://github.com/openai/shap-e/pull/60).

    Returns:
        None
    """
    if len(list(get_scene_root_objects())) > 1:
        print('we have more than one root object!!')
        # create an empty object to be used as a parent for all root objects
        parent_empty = bpy.data.objects.new("ParentEmpty", None)
        bpy.context.scene.collection.objects.link(parent_empty)

        # parent all root objects to the empty object
        for obj in get_scene_root_objects():
            if obj != parent_empty:
                obj.parent = parent_empty

    bbox_min, bbox_max = scene_bbox()
    dxyz = bbox_max - bbox_min
    dist = np.sqrt(dxyz[0]**2 + dxyz[1]**2 + dxyz[2]**2)
    scale = 1 / dist
    for obj in get_scene_root_objects():
        obj.scale = obj.scale * scale

    # Apply scale to matrix_world.
    bpy.context.view_layer.update()
    bbox_min, bbox_max = scene_bbox()
    offset = -(bbox_min + bbox_max) / 2
    for obj in get_scene_root_objects():
        obj.matrix_world.translation += offset
    bpy.ops.object.select_all(action="DESELECT")

    # unparent the camera
    bpy.data.objects["Camera"].parent = None
    return scale, offset

def download_object(object_url: str) -> str:
    """Download the object and return the path."""
    # uid = uuid.uuid4()
    uid = object_url.split("/")[-1].split(".")[0]
    tmp_local_path = os.path.join("tmp-objects", f"{uid}.glb" + ".tmp")
    local_path = os.path.join("tmp-objects", f"{uid}.glb")
    # wget the file and put it in local_path
    os.makedirs(os.path.dirname(tmp_local_path), exist_ok=True)
    urllib.request.urlretrieve(object_url, tmp_local_path)
    os.rename(tmp_local_path, local_path)
    # get the absolute path
    local_path = os.path.abspath(local_path)
    return local_path


def render_and_save(view_id, object_uid, len_val, azimuth, elevation, distance, ortho=False):
    # render the image
    render_path = os.path.join(args.output_dir, 'image', f"{view_id:03d}.png")
    scene.render.filepath = render_path

    if not ortho:
        cam.data.lens = len_val

    depth_map_node.inputs[1].default_value = distance - 1
    depth_map_node.inputs[2].default_value = distance + 1
    depth_file_output.base_path = os.path.join(args.output_dir, object_uid, 'depth')

    depth_file_output.file_slots[0].path = f"{view_id:03d}"
    normal_file_output.file_slots[0].path = f"{view_id:03d}"

    if not os.path.exists(os.path.join(args.output_dir, 'normal', f"{view_id+1:03d}.png")):
        bpy.ops.render.render(write_still=True)

    if os.path.exists(os.path.join(args.output_dir, object_uid, 'depth', f"{view_id:03d}0001.exr")):
        os.rename(os.path.join(args.output_dir, object_uid, 'depth', f"{view_id:03d}0001.exr"),
                  os.path.join(args.output_dir, object_uid, 'depth', f"{view_id:03d}.exr"))

    if os.path.exists(os.path.join(args.output_dir, 'normal', f"{view_id:03d}0001.exr")):
        normal = cv2.imread(os.path.join(args.output_dir, 'normal', f"{view_id:03d}0001.exr"), cv2.IMREAD_UNCHANGED)
        normal_unit16 = (normal * 65535).astype(np.uint16)
        cv2.imwrite(os.path.join(args.output_dir, 'normal', f"{view_id:03d}.png"), normal_unit16)
        os.remove(os.path.join(args.output_dir, 'normal', f"{view_id:03d}0001.exr"))

    # save camera KRT matrix
    if ortho:
        K = get_calibration_matrix_K_from_blender_for_ortho(cam.data, ortho_scale=cam.data.ortho_scale)
    else:
        K = get_calibration_matrix_K_from_blender(cam.data)

    RT = get_3x4_RT_matrix_from_blender(cam)
    para_path = os.path.join(args.output_dir, 'camera', f"{view_id:03d}.npy")
    # np.save(RT_path, RT)
    paras = {}
    paras['intrinsic'] = np.array(K, np.float32)
    paras['extrinsic'] = np.array(RT, np.float32)
    paras['fov'] = cam.data.angle
    paras['azimuth'] = azimuth
    paras['elevation'] = elevation
    paras['distance'] = distance
    paras['focal'] = cam.data.lens
    paras['sensor_width'] = cam.data.sensor_width
    paras['near'] = distance - 1
    paras['far'] = distance + 1
    paras['camera'] = 'persp' if not ortho else 'ortho'
    np.save(para_path, paras)

def render_and_save_smpl(view_id, object_uid, len_val, azimuth, elevation, distance, ortho=False):
    if not ortho:
        cam.data.lens = len_val

    render_path = os.path.join(args.output_dir, 'smpl_image', f"{view_id:03d}.png")
    scene.render.filepath = render_path

    normal_file_output.file_slots[0].path = f"{view_id:03d}"
    if not os.path.exists(os.path.join(args.output_dir, 'smpl_normal', f"{view_id:03d}.png")):
        bpy.ops.render.render(write_still=True)

    if os.path.exists(os.path.join(args.output_dir, 'smpl_normal', f"{view_id:03d}0001.exr")):
        normal = cv2.imread(os.path.join(args.output_dir, 'smpl_normal', f"{view_id:03d}0001.exr"), cv2.IMREAD_UNCHANGED)
        normal_unit16 = (normal * 65535).astype(np.uint16)
        cv2.imwrite(os.path.join(args.output_dir, 'smpl_normal', f"{view_id:03d}.png"), normal_unit16)
        os.remove(os.path.join(args.output_dir, 'smpl_normal', f"{view_id:03d}0001.exr"))


def scene_meshes():
    for obj in bpy.context.scene.objects.values():
        if isinstance(obj.data, (bpy.types.Mesh)):
            yield obj

def load_object(object_path: str) -> None:
    """Loads a glb model into the scene."""
    if object_path.endswith(".glb"):
        bpy.ops.import_scene.gltf(filepath=object_path, merge_vertices=False)
    elif object_path.endswith(".fbx"):
        bpy.ops.import_scene.fbx(filepath=object_path)
    elif object_path.endswith(".obj"):
        bpy.ops.import_scene.obj(filepath=object_path)
        obj = bpy.context.selected_objects[0]
        obj.rotation_euler[0] = 6.28319
        # obj.rotation_euler[2] = 1.5708
    elif object_path.endswith(".ply"):
        bpy.ops.import_mesh.ply(filepath=object_path)
        obj = bpy.context.selected_objects[0]
        obj.rotation_euler[0] = 1.5708
        obj.rotation_euler[2] = 1.5708
        # bpy.ops.wm.ply_import(filepath=object_path, directory=os.path.dirname(object_path), forward_axis='X', up_axis='Y')
        process_ply(obj)
    else:
        raise ValueError(f"Unsupported file type: {object_path}")

def save_images(object_file: str, smpl_file: str) -> None:
    """Saves rendered images of the object in the scene."""
    object_uid = ''  # os.path.basename(object_file).split(".")[0]
    # if we already rendered this object, we skip it
    if os.path.exists(os.path.join(args.output_dir, 'meta.npy')): return
    os.makedirs(args.output_dir, exist_ok=True)
    os.makedirs(os.path.join(args.output_dir, 'camera'), exist_ok=True)

    reset_scene()
    load_object(object_file)

    lights = [obj for obj in bpy.context.scene.objects if obj.type == 'LIGHT']
    for light in lights:
        bpy.data.objects.remove(light, do_unlink=True)

    # bproc.init()

    world_tree = bpy.context.scene.world.node_tree
    back_node = world_tree.nodes['Background']
    env_light = 0.5
    back_node.inputs['Color'].default_value = Vector([env_light, env_light, env_light, 1.0])
    back_node.inputs['Strength'].default_value = 1.0

    # Make light just directional, disable shadows.
    light_data = bpy.data.lights.new(name=f'Light', type='SUN')
    light = bpy.data.objects.new(name=f'Light', object_data=light_data)
    bpy.context.collection.objects.link(light)
    light = bpy.data.lights['Light']
    light.use_shadow = False
    # Possibly disable specular shading:
    light.specular_factor = 1.0
    light.energy = 5.0

    # Add another light source so stuff facing away from light is not completely dark
    light_data = bpy.data.lights.new(name=f'Light2', type='SUN')
    light = bpy.data.objects.new(name=f'Light2', object_data=light_data)
    bpy.context.collection.objects.link(light)
    light2 = bpy.data.lights['Light2']
    light2.use_shadow = False
    light2.specular_factor = 1.0
    light2.energy = 3  # 0.015
    bpy.data.objects['Light2'].rotation_euler = bpy.data.objects['Light2'].rotation_euler
    bpy.data.objects['Light2'].rotation_euler[0] += 180

    # Add another light source so stuff facing away from light is not completely dark
    light_data = bpy.data.lights.new(name=f'Light3', type='SUN')
    light = bpy.data.objects.new(name=f'Light3', object_data=light_data)
    bpy.context.collection.objects.link(light)
    light3 = bpy.data.lights['Light3']
    light3.use_shadow = False
    light3.specular_factor = 1.0
    light3.energy = 3  # 0.015
    bpy.data.objects['Light3'].rotation_euler = bpy.data.objects['Light3'].rotation_euler
    bpy.data.objects['Light3'].rotation_euler[0] += 90

    # Add another light source so stuff facing away from light is not completely dark
    light_data = bpy.data.lights.new(name=f'Light4', type='SUN')
    light = bpy.data.objects.new(name=f'Light4', object_data=light_data)
    bpy.context.collection.objects.link(light)
    light4 = bpy.data.lights['Light4']
    light4.use_shadow = False
    light4.specular_factor = 1.0
    light4.energy = 3  # 0.015
    bpy.data.objects['Light4'].rotation_euler = bpy.data.objects['Light4'].rotation_euler
    bpy.data.objects['Light4'].rotation_euler[0] += -90

    scale, offset = normalize_scene()

    try:
        # some objects' normals are affected by textures
        mesh_objects = [obj for obj in scene_meshes()]
        main_bsdf_name = 'BsdfPrincipled'
        normal_name = 'Normal'
        for obj in mesh_objects:
            for mat in obj.data.materials:
                for node in mat.node_tree.nodes:
                    if main_bsdf_name in node.bl_idname:
                        principled_bsdf = node
                        # remove links, we don't want to add normal textures
                        if principled_bsdf.inputs[normal_name].links:
                            mat.node_tree.links.remove(principled_bsdf.inputs[normal_name].links[0])
    except:
        print("don't know why")
    # create an empty object to track
    empty = bpy.data.objects.new("Empty", None)
    scene.collection.objects.link(empty)
    cam_constraint.target = empty

    subject_width = 1.0

    normal_file_output.base_path = os.path.join(args.output_dir, object_uid, 'normal')
    for i in range(args.num_images):
        # change the camera to orthographic
        cam.data.type = 'ORTHO'
        cam.data.ortho_scale = subject_width
        distance = 1.5
        azimuth = i * 360 / args.num_images
        bpy.context.view_layer.update()
        set_camera_mvdream(azimuth, 0, distance)
        render_and_save(i * (args.random_images + 1), object_uid, -1, azimuth, 0, distance, ortho=True)
    extract_depth(os.path.join(args.output_dir, object_uid, 'depth'))
    # #### smpl
    reset_scene()
    load_object(smpl_file)

    lights = [obj for obj in bpy.context.scene.objects if obj.type == 'LIGHT']
    for light in lights:
        bpy.data.objects.remove(light, do_unlink=True)

    scale, offset = normalize_scene()

    try:
        # some objects' normals are affected by textures
        mesh_objects = [obj for obj in scene_meshes()]
        main_bsdf_name = 'BsdfPrincipled'
        normal_name = 'Normal'
        for obj in mesh_objects:
            for mat in obj.data.materials:
                for node in mat.node_tree.nodes:
                    if main_bsdf_name in node.bl_idname:
                        principled_bsdf = node
                        # remove links, we don't want to add normal textures
                        if principled_bsdf.inputs[normal_name].links:
                            mat.node_tree.links.remove(principled_bsdf.inputs[normal_name].links[0])
    except:
        print("don't know why")
    # create an empty object to track
    empty = bpy.data.objects.new("Empty", None)
    scene.collection.objects.link(empty)
    cam_constraint.target = empty

    subject_width = 1.0

    normal_file_output.base_path = os.path.join(args.output_dir, object_uid, 'smpl_normal')
    for i in range(args.num_images):
        # change the camera to orthographic
        cam.data.type = 'ORTHO'
        cam.data.ortho_scale = subject_width
        distance = 1.5
        azimuth = i * 360 / args.num_images
        bpy.context.view_layer.update()
        set_camera_mvdream(azimuth, 0, distance)
        render_and_save_smpl(i * (args.random_images + 1), object_uid, -1, azimuth, 0, distance, ortho=True)

    np.save(os.path.join(args.output_dir, object_uid, 'meta.npy'), np.asarray([scale, offset[0], offset[1], offset[1]], np.float32))


if __name__ == "__main__":
    try:
        start_i = time.time()
        if args.object_path.startswith("http"):
            local_path = download_object(args.object_path)
        else:
            local_path = args.object_path
        save_images(local_path, args.smpl_path)
        end_i = time.time()
        print("Finished", local_path, "in", end_i - start_i, "seconds")
        # delete the object if it was downloaded
        if args.object_path.startswith("http"):
            os.remove(local_path)
    except Exception as e:
        print("Failed to render", args.object_path)
        print(e)
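The script above writes one `.npy` per view under `camera/`, holding a dict with `intrinsic`, `extrinsic`, `azimuth`, `elevation`, `distance`, and related fields (see `render_and_save`). A minimal sketch of reading such a file back with NumPy; the path `camera/000.npy` is illustrative.
```
import numpy as np

# np.save was called on a Python dict, so allow_pickle=True is required on load.
paras = np.load("camera/000.npy", allow_pickle=True).item()

K = paras["intrinsic"]    # 3x3 intrinsics (orthographic or perspective, see paras["camera"])
RT = paras["extrinsic"]   # 3x4 world-to-camera [R | t] in OpenCV convention
R, t = RT[:, :3], RT[:, 3]

# Recover the camera center in world coordinates from the world-to-camera extrinsics.
cam_center_world = -R.T @ t
print(paras["camera"], paras["azimuth"], paras["elevation"], cam_center_world)
```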
blender/check_render.py
ADDED
@@ -0,0 +1,46 @@
import os
from tqdm import tqdm
import json
from icecream import ic


def check_render(dataset, st=None, end=None):
    total_lists = []
    with open(dataset + '.json', 'r') as f:
        glb_list = json.load(f)
    for x in glb_list:
        total_lists.append(x.split('/')[-2])

    if st is not None:
        end = min(end, len(total_lists))
        total_lists = total_lists[st:end]
        glb_list = glb_list[st:end]

    save_dir = '/data/lipeng/human_8view_with_smplx/' + dataset
    unrendered = set(total_lists) - set(os.listdir(save_dir))

    num_finish = 0
    num_failed = len(unrendered)
    failed_case = []
    for case in os.listdir(save_dir):
        if not os.path.exists(os.path.join(save_dir, case, 'smpl_normal', '007.png')):
            failed_case.append(case)
            num_failed += 1
        else:
            num_finish += 1
    ic(num_failed)
    ic(num_finish)

    need_render = []
    for full_path in glb_list:
        for case in failed_case:
            if case in full_path:
                need_render.append(full_path)

    with open('need_render.json', 'w') as f:
        json.dump(need_render, f, indent=4)

if __name__ == '__main__':
    dataset = 'THuman2.1'
    check_render(dataset)
blender/count.py
ADDED
@@ -0,0 +1,44 @@
import os
import json

def find_files(directory, extensions):
    results = []
    for foldername, subfolders, filenames in os.walk(directory):
        for filename in filenames:
            if filename.endswith(extensions):
                file_path = os.path.abspath(os.path.join(foldername, filename))
                results.append(file_path)
    return results

def count_customhumans(root):
    directory_path = ['CustomHumans/mesh']

    extensions = ('.ply', '.obj')

    lists = []
    for dataset_path in directory_path:
        dir = os.path.join(root, dataset_path)
        file_paths = find_files(dir, extensions)
        # import pdb;pdb.set_trace()
        dataset_name = dataset_path.split('/')[0]
        for file_path in file_paths:
            lists.append(file_path.replace(root, ""))
        with open(f'{dataset_name}.json', 'w') as f:
            json.dump(lists, f, indent=4)

def count_thuman21(root):
    directory_path = ['THuman2.1/mesh']
    extensions = ('.ply', '.obj')
    lists = []
    for dataset_path in directory_path:
        dir = os.path.join(root, dataset_path)
        file_paths = find_files(dir, extensions)
        dataset_name = dataset_path.split('/')[0]
        for file_path in file_paths:
            lists.append(file_path.replace(root, ""))
        with open(f'{dataset_name}.json', 'w') as f:
            json.dump(lists, f, indent=4)

if __name__ == '__main__':
    root = '/data/lipeng/human_scan/'
    # count_customhumans(root)
    count_thuman21(root)
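count.py writes, per dataset, a JSON list of mesh paths relative to the scan root; distribute.py (next file) reads such a list via `--json_path` and joins each entry with `--data_dir`. A small sketch of that data flow, with hypothetical scan IDs:
```
import json
import os

# Hypothetical contents of THuman2.1.json as produced by count_thuman21()
# (paths are relative to the scan root passed in as `root`).
example_list = [
    "THuman2.1/mesh/0001/0001.obj",
    "THuman2.1/mesh/0002/0002.obj",
]

with open("THuman2.1.json", "w") as f:
    json.dump(example_list, f, indent=4)

# distribute.py joins each entry with its --data_dir to get the absolute mesh path
# and derives the SMPL-X path by swapping the first 'mesh' for 'smplx'.
data_dir = "/data/lipeng/human_scan/"
for case in example_list:
    src_path = os.path.join(data_dir, case)
    smpl_path = src_path.replace("mesh", "smplx", 1)
    print(src_path, smpl_path)
```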
blender/distribute.py
ADDED
@@ -0,0 +1,149 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
import glob
|
2 |
+
import json
|
3 |
+
import multiprocessing
|
4 |
+
import shutil
|
5 |
+
import subprocess
|
6 |
+
import time
|
7 |
+
from dataclasses import dataclass
|
8 |
+
from typing import Optional
|
9 |
+
import os
|
10 |
+
|
11 |
+
import boto3
|
12 |
+
|
13 |
+
|
14 |
+
from glob import glob
|
15 |
+
|
16 |
+
import argparse
|
17 |
+
|
18 |
+
parser = argparse.ArgumentParser(description='distributed rendering')
|
19 |
+
|
20 |
+
parser.add_argument('--workers_per_gpu', type=int, default=10,
|
21 |
+
help='number of workers per gpu.')
|
22 |
+
parser.add_argument('--input_models_path', type=str, default='/data/lipeng/human_scan/',
|
23 |
+
help='Path to a json file containing a list of 3D object files.')
|
24 |
+
parser.add_argument('--num_gpus', type=int, default=-1,
|
25 |
+
help='number of gpus to use. -1 means all available gpus.')
|
26 |
+
parser.add_argument('--gpu_list',nargs='+', type=int,
|
27 |
+
help='the avalaible gpus')
|
28 |
+
parser.add_argument('--resolution', type=int, default=512,
|
29 |
+
help='')
|
30 |
+
parser.add_argument('--random_images', type=int, default=0)
|
31 |
+
parser.add_argument('--start_i', type=int, default=0,
|
32 |
+
help='the index of first object to be rendered.')
|
33 |
+
parser.add_argument('--end_i', type=int, default=-1,
|
34 |
+
help='the index of the last object to be rendered.')
|
35 |
+
|
36 |
+
parser.add_argument('--data_dir', type=str, default='/data/lipeng/human_scan/',
|
37 |
+
help='Path to a json file containing a list of 3D object files.')
|
38 |
+
|
39 |
+
parser.add_argument('--json_path', type=str, default='2K2K.json')
|
40 |
+
|
41 |
+
parser.add_argument('--save_dir', type=str, default='/data/lipeng/human_8view',
|
42 |
+
help='Path to a json file containing a list of 3D object files.')
|
43 |
+
|
44 |
+
parser.add_argument('--ortho_scale', type=float, default=1.,
|
45 |
+
help='ortho rendering usage; how large the object is')
|
46 |
+
|
47 |
+
|
48 |
+
args = parser.parse_args()
|
49 |
+
|
50 |
+
def parse_obj_list(xs):
|
51 |
+
cases = []
|
52 |
+
# print(xs[:2])
|
53 |
+
|
54 |
+
for x in xs:
|
55 |
+
if 'THuman3.0' in x:
|
56 |
+
# print(apath)
|
57 |
+
splits = x.split('/')
|
58 |
+
x = os.path.join('THuman3.0', splits[-2])
|
59 |
+
elif 'THuman2.1' in x:
|
60 |
+
splits = x.split('/')
|
61 |
+
x = os.path.join('THuman2.1', splits[-2])
|
62 |
+
elif 'CustomHumans' in x:
|
63 |
+
splits = x.split('/')
|
64 |
+
x = os.path.join('CustomHumans', splits[-2])
|
65 |
+
elif '1M' in x:
|
66 |
+
splits = x.split('/')
|
67 |
+
x = os.path.join('2K2K', splits[-2])
|
68 |
+
elif 'realistic_8k_model' in x:
|
69 |
+
splits = x.split('/')
|
70 |
+
x = os.path.join('realistic_8k_model', splits[-1].split('.')[0])
|
71 |
+
cases.append(f'{args.save_dir}/{x}')
|
72 |
+
return cases
|
73 |
+
|
74 |
+
|
75 |
+
with open(args.json_path, 'r') as f:
|
76 |
+
glb_list = json.load(f)
|
77 |
+
|
78 |
+
# glb_list = ['THuman2.1/mesh/1618/1618.obj']
|
79 |
+
# glb_list = ['THuman3.0/00024_1/00024_0006/mesh.obj']
|
80 |
+
# glb_list = ['CustomHumans/mesh/0383_00070_02_00061/mesh-f00061.obj']
|
81 |
+
# glb_list = ['1M/01968/01968.ply', '1M/00103/00103.ply']
|
82 |
+
# glb_list = ['realistic_8k_model/01aab099a2fe4af7be120110a385105d.glb']
|
83 |
+
|
84 |
+
total_num_glbs = len(glb_list)
|
85 |
+
|
86 |
+
|
87 |
+
|
88 |
+
def worker(
|
89 |
+
queue: multiprocessing.JoinableQueue,
|
90 |
+
count: multiprocessing.Value,
|
91 |
+
gpu: int,
|
92 |
+
s3: Optional[boto3.client],
|
93 |
+
) -> None:
|
94 |
+
print("Worker started")
|
95 |
+
while True:
|
96 |
+
case, save_p = queue.get()
|
97 |
+
src_path = os.path.join(args.data_dir, case)
|
98 |
+
smpl_path = src_path.replace('mesh', 'smplx', 1)
|
99 |
+
|
100 |
+
command = ('blender -b -P blender_render_human_ortho.py'
|
101 |
+
f' -- --object_path {src_path}'
|
102 |
+
f' --smpl_path {smpl_path}'
|
103 |
+
f' --output_dir {save_p} --engine CYCLES'
|
104 |
+
f' --resolution {args.resolution}'
|
105 |
+
f' --random_images {args.random_images}'
|
106 |
+
)
|
107 |
+
|
108 |
+
print(command)
|
109 |
+
subprocess.run(command, shell=True)
|
110 |
+
|
111 |
+
with count.get_lock():
|
112 |
+
count.value += 1
|
113 |
+
|
114 |
+
queue.task_done()
|
115 |
+
|
116 |
+
|
117 |
+
if __name__ == "__main__":
|
118 |
+
# args = tyro.cli(Args)
|
119 |
+
|
120 |
+
s3 = None
|
121 |
+
queue = multiprocessing.JoinableQueue()
|
122 |
+
count = multiprocessing.Value("i", 0)
|
123 |
+
|
124 |
+
# Start worker processes on each of the GPUs
|
125 |
+
for gpu_i in range(args.num_gpus):
|
126 |
+
for worker_i in range(args.workers_per_gpu):
|
127 |
+
worker_i = gpu_i * args.workers_per_gpu + worker_i
|
128 |
+
process = multiprocessing.Process(
|
129 |
+
target=worker, args=(queue, count, args.gpu_list[gpu_i], s3)
|
130 |
+
)
|
131 |
+
process.daemon = True
|
132 |
+
process.start()
|
133 |
+
|
134 |
+
# Add items to the queue
|
135 |
+
|
136 |
+
save_dirs = parse_obj_list(glb_list)
|
137 |
+
args.end_i = len(save_dirs) if args.end_i > len(save_dirs) or args.end_i==-1 else args.end_i
|
138 |
+
|
139 |
+
for case_sub, save_dir in zip(glb_list[args.start_i:args.end_i], save_dirs[args.start_i:args.end_i]):
|
140 |
+
queue.put([case_sub, save_dir])
|
141 |
+
|
142 |
+
|
143 |
+
|
144 |
+
# Wait for all tasks to be completed
|
145 |
+
queue.join()
|
146 |
+
|
147 |
+
# Add sentinels to the queue to stop the worker processes
|
148 |
+
for i in range(args.num_gpus * args.workers_per_gpu):
|
149 |
+
queue.put(None)
|
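The script above fans the Blender jobs out to num_gpus * workers_per_gpu daemon workers through a multiprocessing.JoinableQueue. A minimal, self-contained sketch of the same producer/worker/sentinel pattern (render_one is a placeholder job function, not part of this repo), shown mainly because it also exits cleanly when a worker receives the None sentinel:

import multiprocessing

def render_one(case: str, save_dir: str) -> None:
    # stand-in for the blender subprocess call made by worker() above
    print(f"rendering {case} -> {save_dir}")

def worker(queue: multiprocessing.JoinableQueue) -> None:
    while True:
        item = queue.get()
        if item is None:          # sentinel: stop this worker
            queue.task_done()
            break
        case, save_dir = item
        render_one(case, save_dir)
        queue.task_done()

if __name__ == "__main__":
    q = multiprocessing.JoinableQueue()
    procs = [multiprocessing.Process(target=worker, args=(q,), daemon=True) for _ in range(4)]
    for p in procs:
        p.start()
    for i in range(8):
        q.put((f"case_{i}", f"out/case_{i}"))
    for _ in procs:
        q.put(None)               # one sentinel per worker
    q.join()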
blender/rename_smpl_files.py
ADDED
@@ -0,0 +1,25 @@
1 |
+
import os
|
2 |
+
from tqdm import tqdm
|
3 |
+
from glob import glob
|
4 |
+
|
5 |
+
def rename_customhumans():
|
6 |
+
root = '/data/lipeng/human_scan/CustomHumans/smplx/'
|
7 |
+
file_paths = glob(os.path.join(root, '*/*_smpl.obj'))
|
8 |
+
for file_path in tqdm(file_paths):
|
9 |
+
new_path = file_path.replace('_smpl', '')
|
10 |
+
os.rename(file_path, new_path)
|
11 |
+
|
12 |
+
def rename_thuman21():
|
13 |
+
root = '/data/lipeng/human_scan/THuman2.1/smplx/'
|
14 |
+
file_paths = glob(os.path.join(root, '*/*.obj'))
|
15 |
+
for file_path in tqdm(file_paths):
|
16 |
+
obj_name = file_path.split('/')[-2]
|
17 |
+
folder_name = os.path.dirname(file_path)
|
18 |
+
new_path = os.path.join(folder_name, obj_name+'.obj')
|
19 |
+
# print(new_path)
|
20 |
+
# print(file_path)
|
21 |
+
os.rename(file_path, new_path)
|
22 |
+
|
23 |
+
if __name__ == '__main__':
|
24 |
+
rename_thuman21()
|
25 |
+
rename_customhumans()
|
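Both rename helpers overwrite files in place with os.rename. Since the mapping is easy to get wrong, a dry-run variant (a hypothetical helper, assuming the same THuman2.1 directory layout) can print the old-to-new mapping before committing to it:

import os
from glob import glob

def preview_thuman21_renames(root: str = '/data/lipeng/human_scan/THuman2.1/smplx/') -> None:
    # print old -> new paths without touching the filesystem
    for file_path in glob(os.path.join(root, '*/*.obj')):
        obj_name = os.path.basename(os.path.dirname(file_path))
        new_path = os.path.join(os.path.dirname(file_path), obj_name + '.obj')
        print(file_path, '->', new_path)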
blender/render.sh
ADDED
@@ -0,0 +1,4 @@
1 |
+
#### install environment
|
2 |
+
# ~/pkgs/blender-3.6.4/3.6/python/bin/python3.10 -m pip install openexr opencv-python
|
3 |
+
|
4 |
+
python render_human.py
|
blender/render_human.py
ADDED
@@ -0,0 +1,88 @@
1 |
+
import os
|
2 |
+
import json
|
3 |
+
import math
|
4 |
+
from concurrent.futures import ProcessPoolExecutor
|
5 |
+
import threading
|
6 |
+
from tqdm import tqdm
|
7 |
+
|
8 |
+
# from glcontext import egl
|
9 |
+
# egl.create_context()
|
10 |
+
# exit(0)
|
11 |
+
|
12 |
+
LOCAL_RANK = 0
|
13 |
+
|
14 |
+
num_processes = 4
|
15 |
+
NODE_RANK = int(os.getenv("SLURM_PROCID"))
|
16 |
+
WORLD_SIZE = 1
|
17 |
+
NODE_NUM=1
|
18 |
+
# NODE_RANK = int(os.getenv("SLURM_NODEID"))
|
19 |
+
IS_MAIN = False
|
20 |
+
if NODE_RANK == 0 and LOCAL_RANK == 0:
|
21 |
+
IS_MAIN = True
|
22 |
+
|
23 |
+
GLOBAL_RANK = NODE_RANK * (WORLD_SIZE//NODE_NUM) + LOCAL_RANK
|
24 |
+
|
25 |
+
|
26 |
+
# json_path = "object_lists/Thuman2.0.json"
|
27 |
+
# json_path = "object_lists/THuman3.0.json"
|
28 |
+
json_path = "object_lists/CustomHumans.json"
|
29 |
+
data_dir = '/aifs4su/mmcode/lipeng'
|
30 |
+
save_dir = '/aifs4su/mmcode/lipeng/human_8view_new'
|
31 |
+
def parse_obj_list(x):
|
32 |
+
if 'THuman3.0' in x:
|
33 |
+
# print(apath)
|
34 |
+
splits = x.split('/')
|
35 |
+
x = os.path.join('THuman3.0', splits[-2])
|
36 |
+
elif 'Thuman2.0' in x:
|
37 |
+
splits = x.split('/')
|
38 |
+
x = os.path.join('Thuman2.0', splits[-2])
|
39 |
+
elif 'CustomHumans' in x:
|
40 |
+
splits = x.split('/')
|
41 |
+
x = os.path.join('CustomHumans', splits[-2])
|
42 |
+
# print(splits[-2])
|
43 |
+
elif '1M' in x:
|
44 |
+
splits = x.split('/')
|
45 |
+
x = os.path.join('2K2K', splits[-2])
|
46 |
+
elif 'realistic_8k_model' in x:
|
47 |
+
splits = x.split('/')
|
48 |
+
x = os.path.join('realistic_8k_model', splits[-1].split('.')[0])
|
49 |
+
return f'{save_dir}/{x}'
|
50 |
+
|
51 |
+
with open(json_path, 'r') as f:
|
52 |
+
glb_list = json.load(f)
|
53 |
+
|
54 |
+
# glb_list = ['Thuman2.0/0011/0011.obj']
|
55 |
+
# glb_list = ['THuman3.0/00024_1/00024_0006/mesh.obj']
|
56 |
+
# glb_list = ['CustomHumans/mesh/0383_00070_02_00061/mesh-f00061.obj']
|
57 |
+
# glb_list = ['realistic_8k_model/1d41f2a72f994306b80e632f1cc8233f.glb']
|
58 |
+
|
59 |
+
total_num_glbs = len(glb_list)
|
60 |
+
|
61 |
+
num_glbs_local = int(math.ceil(total_num_glbs / WORLD_SIZE))
|
62 |
+
start_idx = GLOBAL_RANK * num_glbs_local
|
63 |
+
end_idx = start_idx + num_glbs_local
|
64 |
+
# print(start_idx, end_idx)
|
65 |
+
local_glbs = glb_list[start_idx:end_idx]
|
66 |
+
if IS_MAIN:
|
67 |
+
pbar = tqdm(total=len(local_glbs))
|
68 |
+
lock = threading.Lock()
|
69 |
+
|
70 |
+
def process_human(glb_path):
|
71 |
+
src_path = os.path.join(data_dir, glb_path)
|
72 |
+
save_path = parse_obj_list(glb_path)
|
73 |
+
# print(save_path)
|
74 |
+
command = ('blender -b -P blender_render_human_script.py'
|
75 |
+
f' -- --object_path {src_path}'
|
76 |
+
f' --output_dir {save_path} ')
|
77 |
+
# 1>/dev/null
|
78 |
+
# print(command)
|
79 |
+
os.system(command)
|
80 |
+
|
81 |
+
if IS_MAIN:
|
82 |
+
with lock:
|
83 |
+
pbar.update(1)
|
84 |
+
|
85 |
+
with ProcessPoolExecutor(max_workers=num_processes) as executor:
|
86 |
+
executor.map(process_human, local_glbs)
|
87 |
+
|
88 |
+
|
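render_human.py splits the object list evenly across ranks with num_glbs_local = ceil(total / WORLD_SIZE) and start_idx = GLOBAL_RANK * num_glbs_local. A quick standalone check of that arithmetic (values are illustrative only):

import math

def shard(total: int, world_size: int, rank: int) -> range:
    # contiguous slice handled by one rank, same formula as above
    per_rank = int(math.ceil(total / world_size))
    start = rank * per_rank
    return range(start, min(start + per_rank, total))

print([len(shard(10, 4, r)) for r in range(4)])   # [3, 3, 3, 1]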
blender/render_single.sh
ADDED
@@ -0,0 +1,7 @@
1 |
+
# debug single sample
|
2 |
+
blender -b -P blender_render_human_ortho.py \
|
3 |
+
-- --object_path /data/lipeng/human_scan/THuman2.1/mesh/0011/0011.obj \
|
4 |
+
--smpl_path /data/lipeng/human_scan/THuman2.1/smplx/0011/0011.obj \
|
5 |
+
--output_dir debug --engine CYCLES \
|
6 |
+
--resolution 768 \
|
7 |
+
--random_images 0
|
blender/utils.py
ADDED
@@ -0,0 +1,128 @@
1 |
+
import datetime
|
2 |
+
import pytz
|
3 |
+
import traceback
|
4 |
+
from torchvision.utils import make_grid
|
5 |
+
from PIL import Image, ImageDraw, ImageFont
|
6 |
+
import numpy as np
|
7 |
+
import torch
|
8 |
+
import json
|
9 |
+
import os
|
10 |
+
from tqdm import tqdm
|
11 |
+
import cv2
|
12 |
+
import imageio
|
13 |
+
def get_time_for_log():
|
14 |
+
return datetime.datetime.now(pytz.timezone('Asia/Shanghai')).strftime(
|
15 |
+
"%Y%m%d %H:%M:%S")
|
16 |
+
|
17 |
+
|
18 |
+
def get_trace_for_log():
|
19 |
+
return str(traceback.format_exc())
|
20 |
+
|
21 |
+
def make_grid_(imgs, save_file, nrow=10, pad_value=1):
|
22 |
+
if isinstance(imgs, list):
|
23 |
+
if isinstance(imgs[0], Image.Image):
|
24 |
+
imgs = [torch.from_numpy(np.array(img)/255.) for img in imgs]
|
25 |
+
elif isinstance(imgs[0], np.ndarray):
|
26 |
+
imgs = [torch.from_numpy(img/255.) for img in imgs]
|
27 |
+
imgs = torch.stack(imgs, 0).permute(0, 3, 1, 2)
|
28 |
+
if isinstance(imgs, np.ndarray):
|
29 |
+
imgs = torch.from_numpy(imgs)
|
30 |
+
|
31 |
+
img_grid = make_grid(imgs, nrow=nrow, padding=2, pad_value=pad_value)
|
32 |
+
img_grid = img_grid.permute(1, 2, 0).numpy()
|
33 |
+
img_grid = (img_grid * 255).astype(np.uint8)
|
34 |
+
img_grid = Image.fromarray(img_grid)
|
35 |
+
img_grid.save(save_file)
|
36 |
+
|
37 |
+
def draw_caption(img, text, pos, size=100, color=(128, 128, 128)):
|
38 |
+
draw = ImageDraw.Draw(img)
|
39 |
+
# font = ImageFont.truetype(size= size)
|
40 |
+
font = ImageFont.load_default()
|
41 |
+
font = font.font_variant(size=size)
|
42 |
+
draw.text(pos, text, color, font=font)
|
43 |
+
return img
|
44 |
+
|
45 |
+
|
46 |
+
def txt2json(txt_file, json_file):
|
47 |
+
with open(txt_file, 'r') as f:
|
48 |
+
items = f.readlines()
|
49 |
+
items = [x.strip() for x in items]
|
50 |
+
|
51 |
+
with open(json_file, 'w') as f:
|
52 |
+
json.dump(items, f)  # items is already a plain Python list
|
53 |
+
|
54 |
+
def process_thuman_texture():
|
55 |
+
path = '/aifs4su/mmcode/lipeng/Thuman2.0'
|
56 |
+
cases = os.listdir(path)
|
57 |
+
for case in tqdm(cases):
|
58 |
+
mtl = os.path.join(path, case, 'material0.mtl')
|
59 |
+
with open(mtl, 'r') as f:
|
60 |
+
lines = f.read()
|
61 |
+
lines = lines.replace('png', 'jpeg')
|
62 |
+
with open(mtl, 'w') as f:
|
63 |
+
f.write(lines)
|
64 |
+
|
65 |
+
|
66 |
+
#### for debug
|
67 |
+
os.environ["OPENCV_IO_ENABLE_OPENEXR"] = "1"
|
68 |
+
|
69 |
+
|
70 |
+
def get_intrinsic_from_fov(fov, H, W, bs=-1):
|
71 |
+
focal_length = 0.5 * H / np.tan(0.5 * fov)
|
72 |
+
intrinsic = np.identity(3, dtype=np.float32)
|
73 |
+
intrinsic[0, 0] = focal_length
|
74 |
+
intrinsic[1, 1] = focal_length
|
75 |
+
intrinsic[0, 2] = W / 2.0
|
76 |
+
intrinsic[1, 2] = H / 2.0
|
77 |
+
|
78 |
+
if bs > 0:
|
79 |
+
intrinsic = intrinsic[None].repeat(bs, axis=0)
|
80 |
+
|
81 |
+
return torch.from_numpy(intrinsic)
|
82 |
+
|
83 |
+
def read_data(data_dir, i):
|
84 |
+
"""
|
85 |
+
Return:
|
86 |
+
rgb: (H, W, 3) torch.float32
|
87 |
+
depth: (H, W, 1) torch.float32
|
88 |
+
mask: (H, W, 1) torch.float32
|
89 |
+
c2w: (4, 4) torch.float32
|
90 |
+
intrinsic: (3, 3) torch.float32
|
91 |
+
"""
|
92 |
+
background_color = torch.tensor([0.0, 0.0, 0.0])
|
93 |
+
|
94 |
+
rgb_name = os.path.join(data_dir, f'render_%04d.webp' % i)
|
95 |
+
depth_name = os.path.join(data_dir, f'depth_%04d.exr' % i)
|
96 |
+
|
97 |
+
img = torch.from_numpy(
|
98 |
+
np.asarray(
|
99 |
+
Image.fromarray(imageio.v2.imread(rgb_name))
|
100 |
+
.convert("RGBA")
|
101 |
+
)
|
102 |
+
/ 255.0
|
103 |
+
).float()
|
104 |
+
mask = img[:, :, -1:]
|
105 |
+
rgb = img[:, :, :3] * mask + background_color[
|
106 |
+
None, None, :
|
107 |
+
] * (1 - mask)
|
108 |
+
|
109 |
+
depth = torch.from_numpy(
|
110 |
+
cv2.imread(depth_name, cv2.IMREAD_UNCHANGED)[..., 0, None]
|
111 |
+
)
|
112 |
+
mask[depth > 100.0] = 0.0
|
113 |
+
depth[~(mask > 0.5)] = 0.0 # set invalid depth to 0
|
114 |
+
|
115 |
+
meta_path = os.path.join(data_dir, 'meta.json')
|
116 |
+
with open(meta_path, 'r') as f:
|
117 |
+
meta = json.load(f)
|
118 |
+
|
119 |
+
c2w = torch.as_tensor(
|
120 |
+
meta['locations'][i]["transform_matrix"],
|
121 |
+
dtype=torch.float32,
|
122 |
+
)
|
123 |
+
|
124 |
+
H, W = rgb.shape[:2]
|
125 |
+
fovy = meta["camera_angle_x"]
|
126 |
+
intrinsic = get_intrinsic_from_fov(fovy, H=H, W=W)
|
127 |
+
|
128 |
+
return rgb, depth, mask, c2w, intrinsic
|
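get_intrinsic_from_fov() builds a pinhole intrinsic matrix with focal_length = 0.5 * H / tan(0.5 * fov) and the principal point at the image center. A small numeric check with illustrative values:

import numpy as np

H = W = 768
fov = np.deg2rad(60.0)
focal = 0.5 * H / np.tan(0.5 * fov)           # about 665.1 px for a 60-degree FOV
K = np.array([[focal, 0.0,   W / 2.0],
              [0.0,   focal, H / 2.0],
              [0.0,   0.0,   1.0]], dtype=np.float32)
print(K)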
configs/inference-768-6view.yaml
ADDED
@@ -0,0 +1,72 @@
1 |
+
pretrained_model_name_or_path: 'stabilityai/stable-diffusion-2-1-unclip'
|
2 |
+
revision: null
|
3 |
+
|
4 |
+
num_views: 7
|
5 |
+
with_smpl: false
|
6 |
+
validation_dataset:
|
7 |
+
prompt_embeds_path: mvdiffusion/data/fixed_prompt_embeds_7view
|
8 |
+
root_dir: 'examples/shhq'
|
9 |
+
num_views: ${num_views}
|
10 |
+
bg_color: 'white'
|
11 |
+
img_wh: [768, 768]
|
12 |
+
num_validation_samples: 1000
|
13 |
+
crop_size: 740
|
14 |
+
margin_size: 50
|
15 |
+
smpl_folder: 'smpl_image_pymaf'
|
16 |
+
|
17 |
+
|
18 |
+
save_dir: 'mv_results'
|
19 |
+
save_mode: 'rgba' # 'concat', 'rgba', 'rgb'
|
20 |
+
seed: 42
|
21 |
+
validation_batch_size: 1
|
22 |
+
dataloader_num_workers: 1
|
23 |
+
local_rank: -1
|
24 |
+
|
25 |
+
pipe_kwargs:
|
26 |
+
num_views: ${num_views}
|
27 |
+
|
28 |
+
validation_guidance_scales: 3.0
|
29 |
+
pipe_validation_kwargs:
|
30 |
+
num_inference_steps: 40
|
31 |
+
eta: 1.0
|
32 |
+
|
33 |
+
validation_grid_nrow: ${num_views}
|
34 |
+
|
35 |
+
unet_from_pretrained_kwargs:
|
36 |
+
unclip: true
|
37 |
+
sdxl: false
|
38 |
+
num_views: ${num_views}
|
39 |
+
sample_size: 96
|
40 |
+
zero_init_conv_in: false # modify
|
41 |
+
|
42 |
+
projection_camera_embeddings_input_dim: 2 # 2 for elevation and 6 for focal_length
|
43 |
+
zero_init_camera_projection: false
|
44 |
+
num_regress_blocks: 3
|
45 |
+
|
46 |
+
cd_attention_last: false
|
47 |
+
cd_attention_mid: false
|
48 |
+
multiview_attention: true
|
49 |
+
sparse_mv_attention: true
|
50 |
+
selfattn_block: self_rowwise
|
51 |
+
mvcd_attention: true
|
52 |
+
|
53 |
+
recon_opt:
|
54 |
+
res_path: out
|
55 |
+
save_glb: False
|
56 |
+
# camera setting
|
57 |
+
num_view: 6
|
58 |
+
scale: 4
|
59 |
+
mode: ortho
|
60 |
+
resolution: 1024
|
61 |
+
cam_path: 'mvdiffusion/data/six_human_pose'
|
62 |
+
# optimization
|
63 |
+
iters: 700
|
64 |
+
clr_iters: 200
|
65 |
+
debug: false
|
66 |
+
snapshot_step: 50
|
67 |
+
lr_clr: 2e-3
|
68 |
+
gpu_id: 0
|
69 |
+
|
70 |
+
replace_hand: false
|
71 |
+
|
72 |
+
enable_xformers_memory_efficient_attention: true
|
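The ${num_views}-style interpolations in this config suggest it is loaded and resolved with OmegaConf; assuming that, a short sketch of reading it looks like:

from omegaconf import OmegaConf

cfg = OmegaConf.load('configs/inference-768-6view.yaml')
print(cfg.num_views)                                   # 7
resolved = OmegaConf.to_container(cfg, resolve=True)   # ${num_views} -> 7 everywhere
print(resolved['pipe_kwargs'])                         # {'num_views': 7}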
configs/remesh.yaml
ADDED
@@ -0,0 +1,18 @@
1 |
+
res_path: out
|
2 |
+
save_glb: False
|
3 |
+
imgs_path: examples/debug
|
4 |
+
mv_path: ./
|
5 |
+
# camera setting
|
6 |
+
num_view: 6
|
7 |
+
scale: 4
|
8 |
+
mode: ortho
|
9 |
+
resolution: 1024
|
10 |
+
cam_path: 'mvdiffusion/data/six_human_pose'
|
11 |
+
# optimization
|
12 |
+
iters: 700
|
13 |
+
clr_iters: 200
|
14 |
+
debug: false
|
15 |
+
snapshot_step: 50
|
16 |
+
lr_clr: 2e-3
|
17 |
+
gpu_id: 0
|
18 |
+
replace_hand: false
|
configs/train-768-6view-onlyscan_face.yaml
ADDED
@@ -0,0 +1,145 @@
1 |
+
pretrained_model_name_or_path: stabilityai/stable-diffusion-2-1-unclip
|
2 |
+
pretrained_unet_path: null
|
3 |
+
revision: null
|
4 |
+
with_smpl: false
|
5 |
+
data_common:
|
6 |
+
root_dir: /aifs4su/mmcode/lipeng/human_8view_new/
|
7 |
+
predict_relative_views: [0, 1, 2, 4, 6, 7]
|
8 |
+
num_validation_samples: 8
|
9 |
+
img_wh: [768, 768]
|
10 |
+
read_normal: true
|
11 |
+
read_color: true
|
12 |
+
read_depth: false
|
13 |
+
exten: .png
|
14 |
+
prompt_embeds_path: mvdiffusion/data/fixed_prompt_embeds_7view
|
15 |
+
object_list:
|
16 |
+
- data_lists/human_only_scan.json
|
17 |
+
invalid_list:
|
18 |
+
-
|
19 |
+
train_dataset:
|
20 |
+
root_dir: ${data_common.root_dir}
|
21 |
+
azi_interval: 45.0
|
22 |
+
random_views: 3
|
23 |
+
predict_relative_views: ${data_common.predict_relative_views}
|
24 |
+
bg_color: three_choices
|
25 |
+
object_list: ${data_common.object_list}
|
26 |
+
invalid_list: ${data_common.invalid_list}
|
27 |
+
img_wh: ${data_common.img_wh}
|
28 |
+
validation: false
|
29 |
+
num_validation_samples: ${data_common.num_validation_samples}
|
30 |
+
read_normal: ${data_common.read_normal}
|
31 |
+
read_color: ${data_common.read_color}
|
32 |
+
read_depth: ${data_common.read_depth}
|
33 |
+
load_cache: false
|
34 |
+
exten: ${data_common.exten}
|
35 |
+
prompt_embeds_path: ${data_common.prompt_embeds_path}
|
36 |
+
side_views_rate: 0.3
|
37 |
+
elevation_list: null
|
38 |
+
validation_dataset:
|
39 |
+
prompt_embeds_path: ${data_common.prompt_embeds_path}
|
40 |
+
root_dir: examples/debug
|
41 |
+
num_views: ${num_views}
|
42 |
+
bg_color: white
|
43 |
+
img_wh: ${data_common.img_wh}
|
44 |
+
num_validation_samples: 1000
|
45 |
+
crop_size: 740
|
46 |
+
validation_train_dataset:
|
47 |
+
root_dir: ${data_common.root_dir}
|
48 |
+
azi_interval: 45.0
|
49 |
+
random_views: 3
|
50 |
+
predict_relative_views: ${data_common.predict_relative_views}
|
51 |
+
bg_color: white
|
52 |
+
object_list: ${data_common.object_list}
|
53 |
+
invalid_list: ${data_common.invalid_list}
|
54 |
+
img_wh: ${data_common.img_wh}
|
55 |
+
validation: false
|
56 |
+
num_validation_samples: ${data_common.num_validation_samples}
|
57 |
+
read_normal: ${data_common.read_normal}
|
58 |
+
read_color: ${data_common.read_color}
|
59 |
+
read_depth: ${data_common.read_depth}
|
60 |
+
num_samples: ${data_common.num_validation_samples}
|
61 |
+
load_cache: false
|
62 |
+
exten: ${data_common.exten}
|
63 |
+
prompt_embeds_path: ${data_common.prompt_embeds_path}
|
64 |
+
elevation_list: null
|
65 |
+
output_dir: output/unit-unclip-768-6view-onlyscan-onlyortho-faceinself-scale0.5
|
66 |
+
checkpoint_prefix: ../human_checkpoint_backup/
|
67 |
+
seed: 42
|
68 |
+
train_batch_size: 2
|
69 |
+
validation_batch_size: 1
|
70 |
+
validation_train_batch_size: 1
|
71 |
+
max_train_steps: 30000
|
72 |
+
gradient_accumulation_steps: 2
|
73 |
+
gradient_checkpointing: true
|
74 |
+
learning_rate: 0.0001
|
75 |
+
scale_lr: false
|
76 |
+
lr_scheduler: piecewise_constant
|
77 |
+
step_rules: 1:2000,0.5
|
78 |
+
lr_warmup_steps: 10
|
79 |
+
snr_gamma: 5.0
|
80 |
+
use_8bit_adam: false
|
81 |
+
allow_tf32: true
|
82 |
+
use_ema: true
|
83 |
+
dataloader_num_workers: 32
|
84 |
+
adam_beta1: 0.9
|
85 |
+
adam_beta2: 0.999
|
86 |
+
adam_weight_decay: 0.01
|
87 |
+
adam_epsilon: 1.0e-08
|
88 |
+
max_grad_norm: 1.0
|
89 |
+
prediction_type: null
|
90 |
+
logging_dir: logs
|
91 |
+
vis_dir: vis
|
92 |
+
mixed_precision: fp16
|
93 |
+
report_to: wandb
|
94 |
+
local_rank: 0
|
95 |
+
checkpointing_steps: 2500
|
96 |
+
checkpoints_total_limit: 2
|
97 |
+
resume_from_checkpoint: latest
|
98 |
+
enable_xformers_memory_efficient_attention: true
|
99 |
+
validation_steps: 2500 #
|
100 |
+
validation_sanity_check: true
|
101 |
+
tracker_project_name: PSHuman
|
102 |
+
trainable_modules: null
|
103 |
+
|
104 |
+
|
105 |
+
use_classifier_free_guidance: true
|
106 |
+
condition_drop_rate: 0.05
|
107 |
+
scale_input_latents: true
|
108 |
+
regress_elevation: false
|
109 |
+
regress_focal_length: false
|
110 |
+
elevation_loss_weight: 1.0
|
111 |
+
focal_loss_weight: 0.0
|
112 |
+
pipe_kwargs:
|
113 |
+
num_views: ${num_views}
|
114 |
+
pipe_validation_kwargs:
|
115 |
+
eta: 1.0
|
116 |
+
|
117 |
+
unet_from_pretrained_kwargs:
|
118 |
+
unclip: true
|
119 |
+
num_views: ${num_views}
|
120 |
+
sample_size: 96
|
121 |
+
zero_init_conv_in: true
|
122 |
+
regress_elevation: ${regress_elevation}
|
123 |
+
regress_focal_length: ${regress_focal_length}
|
124 |
+
num_regress_blocks: 2
|
125 |
+
camera_embedding_type: e_de_da_sincos
|
126 |
+
projection_camera_embeddings_input_dim: 2
|
127 |
+
zero_init_camera_projection: true # modified
|
128 |
+
init_mvattn_with_selfattn: false
|
129 |
+
cd_attention_last: false
|
130 |
+
cd_attention_mid: false
|
131 |
+
multiview_attention: true
|
132 |
+
sparse_mv_attention: true
|
133 |
+
selfattn_block: self_rowwise
|
134 |
+
mvcd_attention: true
|
135 |
+
addition_downsample: false
|
136 |
+
use_face_adapter: false
|
137 |
+
|
138 |
+
validation_guidance_scales:
|
139 |
+
- 3.0
|
140 |
+
validation_grid_nrow: ${num_views}
|
141 |
+
camera_embedding_lr_mult: 1.0
|
142 |
+
plot_pose_acc: false
|
143 |
+
num_views: 7
|
144 |
+
pred_type: joint
|
145 |
+
drop_type: drop_as_a_whole
|
configs/train-768-6view-onlyscan_face_smplx.yaml
ADDED
@@ -0,0 +1,154 @@
1 |
+
pretrained_model_name_or_path: stabilityai/stable-diffusion-2-1-unclip
|
2 |
+
pretrained_unet_path: null
|
3 |
+
revision: null
|
4 |
+
with_smpl: true
|
5 |
+
data_common:
|
6 |
+
root_dir: /aifs4su/mmcode/lipeng/human_8view_with_smplx/
|
7 |
+
predict_relative_views: [0, 1, 2, 4, 6, 7]
|
8 |
+
num_validation_samples: 8
|
9 |
+
img_wh: [768, 768]
|
10 |
+
read_normal: true
|
11 |
+
read_color: true
|
12 |
+
read_depth: false
|
13 |
+
exten: .png
|
14 |
+
prompt_embeds_path: mvdiffusion/data/fixed_prompt_embeds_7view
|
15 |
+
object_list:
|
16 |
+
- data_lists/human_only_scan_with_smplx.json # modified
|
17 |
+
invalid_list:
|
18 |
+
-
|
19 |
+
with_smpl: ${with_smpl}
|
20 |
+
|
21 |
+
train_dataset:
|
22 |
+
root_dir: ${data_common.root_dir}
|
23 |
+
azi_interval: 45.0
|
24 |
+
random_views: 0
|
25 |
+
predict_relative_views: ${data_common.predict_relative_views}
|
26 |
+
bg_color: three_choices
|
27 |
+
object_list: ${data_common.object_list}
|
28 |
+
invalid_list: ${data_common.invalid_list}
|
29 |
+
img_wh: ${data_common.img_wh}
|
30 |
+
validation: false
|
31 |
+
num_validation_samples: ${data_common.num_validation_samples}
|
32 |
+
read_normal: ${data_common.read_normal}
|
33 |
+
read_color: ${data_common.read_color}
|
34 |
+
read_depth: ${data_common.read_depth}
|
35 |
+
load_cache: false
|
36 |
+
exten: ${data_common.exten}
|
37 |
+
prompt_embeds_path: ${data_common.prompt_embeds_path}
|
38 |
+
side_views_rate: 0.3
|
39 |
+
elevation_list: null
|
40 |
+
with_smpl: ${with_smpl}
|
41 |
+
|
42 |
+
validation_dataset:
|
43 |
+
prompt_embeds_path: ${data_common.prompt_embeds_path}
|
44 |
+
root_dir: examples/debug
|
45 |
+
num_views: ${num_views}
|
46 |
+
bg_color: white
|
47 |
+
img_wh: ${data_common.img_wh}
|
48 |
+
num_validation_samples: 1000
|
49 |
+
margin_size: 10
|
50 |
+
# crop_size: 720
|
51 |
+
|
52 |
+
validation_train_dataset:
|
53 |
+
root_dir: ${data_common.root_dir}
|
54 |
+
azi_interval: 45.0
|
55 |
+
random_views: 0
|
56 |
+
predict_relative_views: ${data_common.predict_relative_views}
|
57 |
+
bg_color: white
|
58 |
+
object_list: ${data_common.object_list}
|
59 |
+
invalid_list: ${data_common.invalid_list}
|
60 |
+
img_wh: ${data_common.img_wh}
|
61 |
+
validation: false
|
62 |
+
num_validation_samples: ${data_common.num_validation_samples}
|
63 |
+
read_normal: ${data_common.read_normal}
|
64 |
+
read_color: ${data_common.read_color}
|
65 |
+
read_depth: ${data_common.read_depth}
|
66 |
+
num_samples: ${data_common.num_validation_samples}
|
67 |
+
load_cache: false
|
68 |
+
exten: ${data_common.exten}
|
69 |
+
prompt_embeds_path: ${data_common.prompt_embeds_path}
|
70 |
+
elevation_list: null
|
71 |
+
with_smpl: ${with_smpl}
|
72 |
+
|
73 |
+
output_dir: output/unit-unclip-768-6view-onlyscan-onlyortho-faceinself-scale0.5-smplx
|
74 |
+
checkpoint_prefix: ../human_checkpoint_backup/
|
75 |
+
seed: 42
|
76 |
+
train_batch_size: 2
|
77 |
+
validation_batch_size: 1
|
78 |
+
validation_train_batch_size: 1
|
79 |
+
max_train_steps: 30000
|
80 |
+
gradient_accumulation_steps: 2
|
81 |
+
gradient_checkpointing: true
|
82 |
+
learning_rate: 0.0001
|
83 |
+
scale_lr: false
|
84 |
+
lr_scheduler: piecewise_constant
|
85 |
+
step_rules: 1:2000,0.5
|
86 |
+
lr_warmup_steps: 10
|
87 |
+
snr_gamma: 5.0
|
88 |
+
use_8bit_adam: false
|
89 |
+
allow_tf32: true
|
90 |
+
use_ema: true
|
91 |
+
dataloader_num_workers: 32
|
92 |
+
adam_beta1: 0.9
|
93 |
+
adam_beta2: 0.999
|
94 |
+
adam_weight_decay: 0.01
|
95 |
+
adam_epsilon: 1.0e-08
|
96 |
+
max_grad_norm: 1.0
|
97 |
+
prediction_type: null
|
98 |
+
logging_dir: logs
|
99 |
+
vis_dir: vis
|
100 |
+
mixed_precision: fp16
|
101 |
+
report_to: wandb
|
102 |
+
local_rank: 0
|
103 |
+
checkpointing_steps: 5000
|
104 |
+
checkpoints_total_limit: 2
|
105 |
+
resume_from_checkpoint: latest
|
106 |
+
enable_xformers_memory_efficient_attention: true
|
107 |
+
validation_steps: 2500 #
|
108 |
+
validation_sanity_check: true
|
109 |
+
tracker_project_name: PSHuman
|
110 |
+
trainable_modules: null
|
111 |
+
|
112 |
+
use_classifier_free_guidance: true
|
113 |
+
condition_drop_rate: 0.05
|
114 |
+
scale_input_latents: true
|
115 |
+
regress_elevation: false
|
116 |
+
regress_focal_length: false
|
117 |
+
elevation_loss_weight: 1.0
|
118 |
+
focal_loss_weight: 0.0
|
119 |
+
pipe_kwargs:
|
120 |
+
num_views: ${num_views}
|
121 |
+
pipe_validation_kwargs:
|
122 |
+
eta: 1.0
|
123 |
+
|
124 |
+
unet_from_pretrained_kwargs:
|
125 |
+
unclip: true
|
126 |
+
num_views: ${num_views}
|
127 |
+
sample_size: 96
|
128 |
+
zero_init_conv_in: true
|
129 |
+
regress_elevation: ${regress_elevation}
|
130 |
+
regress_focal_length: ${regress_focal_length}
|
131 |
+
num_regress_blocks: 2
|
132 |
+
camera_embedding_type: e_de_da_sincos
|
133 |
+
projection_camera_embeddings_input_dim: 2
|
134 |
+
zero_init_camera_projection: true # modified
|
135 |
+
init_mvattn_with_selfattn: false
|
136 |
+
cd_attention_last: false
|
137 |
+
cd_attention_mid: false
|
138 |
+
multiview_attention: true
|
139 |
+
sparse_mv_attention: true
|
140 |
+
selfattn_block: self_rowwise
|
141 |
+
mvcd_attention: true
|
142 |
+
addition_downsample: false
|
143 |
+
use_face_adapter: false
|
144 |
+
in_channels: 12
|
145 |
+
|
146 |
+
|
147 |
+
validation_guidance_scales:
|
148 |
+
- 3.0
|
149 |
+
validation_grid_nrow: ${num_views}
|
150 |
+
camera_embedding_lr_mult: 1.0
|
151 |
+
plot_pose_acc: false
|
152 |
+
num_views: 7
|
153 |
+
pred_type: joint
|
154 |
+
drop_type: drop_as_a_whole
|
core/opt.py
ADDED
@@ -0,0 +1,197 @@
1 |
+
from copy import deepcopy
|
2 |
+
import time
|
3 |
+
import torch
|
4 |
+
import torch_scatter
|
5 |
+
from core.remesh import calc_edge_length, calc_edges, calc_face_collapses, calc_face_normals, calc_vertex_normals, collapse_edges, flip_edges, pack, prepend_dummies, remove_dummies, split_edges
|
6 |
+
|
7 |
+
@torch.no_grad()
|
8 |
+
def remesh(
|
9 |
+
vertices_etc:torch.Tensor, #V,D
|
10 |
+
faces:torch.Tensor, #F,3 long
|
11 |
+
min_edgelen:torch.Tensor, #V
|
12 |
+
max_edgelen:torch.Tensor, #V
|
13 |
+
flip:bool,
|
14 |
+
max_vertices=1e6
|
15 |
+
):
|
16 |
+
|
17 |
+
# dummies
|
18 |
+
vertices_etc,faces = prepend_dummies(vertices_etc,faces)
|
19 |
+
vertices = vertices_etc[:,:3] #V,3
|
20 |
+
nan_tensor = torch.tensor([torch.nan],device=min_edgelen.device)
|
21 |
+
min_edgelen = torch.concat((nan_tensor,min_edgelen))
|
22 |
+
max_edgelen = torch.concat((nan_tensor,max_edgelen))
|
23 |
+
|
24 |
+
# collapse
|
25 |
+
edges,face_to_edge = calc_edges(faces) #E,2 F,3
|
26 |
+
edge_length = calc_edge_length(vertices,edges) #E
|
27 |
+
face_normals = calc_face_normals(vertices,faces,normalize=False) #F,3
|
28 |
+
vertex_normals = calc_vertex_normals(vertices,faces,face_normals) #V,3
|
29 |
+
face_collapse = calc_face_collapses(vertices,faces,edges,face_to_edge,edge_length,face_normals,vertex_normals,min_edgelen,area_ratio=0.5)
|
30 |
+
shortness = (1 - edge_length / min_edgelen[edges].mean(dim=-1)).clamp_min_(0) #e[0,1] 0...ok, 1...edgelen=0
|
31 |
+
priority = face_collapse.float() + shortness
|
32 |
+
vertices_etc,faces = collapse_edges(vertices_etc,faces,edges,priority)
|
33 |
+
|
34 |
+
# split
|
35 |
+
if vertices.shape[0]<max_vertices:
|
36 |
+
edges,face_to_edge = calc_edges(faces) #E,2 F,3
|
37 |
+
vertices = vertices_etc[:,:3] #V,3
|
38 |
+
edge_length = calc_edge_length(vertices,edges) #E
|
39 |
+
splits = edge_length > max_edgelen[edges].mean(dim=-1)
|
40 |
+
vertices_etc,faces = split_edges(vertices_etc,faces,edges,face_to_edge,splits,pack_faces=False)
|
41 |
+
|
42 |
+
vertices_etc,faces = pack(vertices_etc,faces)
|
43 |
+
vertices = vertices_etc[:,:3]
|
44 |
+
|
45 |
+
if flip:
|
46 |
+
edges,_,edge_to_face = calc_edges(faces,with_edge_to_face=True) #E,2 F,3
|
47 |
+
flip_edges(vertices,faces,edges,edge_to_face,with_border=False)
|
48 |
+
|
49 |
+
return remove_dummies(vertices_etc,faces)
|
50 |
+
|
51 |
+
def lerp_unbiased(a:torch.Tensor,b:torch.Tensor,weight:float,step:int):
|
52 |
+
"""lerp with adam's bias correction"""
|
53 |
+
c_prev = 1-weight**(step-1)
|
54 |
+
c = 1-weight**step
|
55 |
+
a_weight = weight*c_prev/c
|
56 |
+
b_weight = (1-weight)/c
|
57 |
+
a.mul_(a_weight).add_(b, alpha=b_weight)
|
58 |
+
|
59 |
+
|
60 |
+
class MeshOptimizer:
|
61 |
+
"""Use this like a pytorch Optimizer, but after calling opt.step(), do vertices,faces = opt.remesh()."""
|
62 |
+
|
63 |
+
def __init__(self,
|
64 |
+
vertices:torch.Tensor, #V,3
|
65 |
+
faces:torch.Tensor, #F,3
|
66 |
+
lr=0.3, #learning rate
|
67 |
+
betas=(0.8,0.8,0), #betas[0:2] are the same as in Adam, betas[2] may be used to time-smooth the relative velocity nu
|
68 |
+
gammas=(0,0,0), #optional spatial smoothing for m1,m2,nu, values between 0 (no smoothing) and 1 (max. smoothing)
|
69 |
+
nu_ref=0.3, #reference velocity for edge length controller
|
70 |
+
edge_len_lims=(.01,.15), #smallest and largest allowed reference edge length
|
71 |
+
edge_len_tol=.5, #edge length tolerance for split and collapse
|
72 |
+
gain=.2, #gain value for edge length controller
|
73 |
+
laplacian_weight=.02, #for laplacian smoothing/regularization
|
74 |
+
ramp=1, #learning rate ramp, actual ramp width is ramp/(1-betas[0])
|
75 |
+
grad_lim=10., #gradients are clipped to m1.abs()*grad_lim
|
76 |
+
remesh_interval=1, #larger intervals are faster but with worse mesh quality
|
77 |
+
local_edgelen=True, #set to False to use a global scalar reference edge length instead
|
78 |
+
remesh_milestones= [500], #list of steps at which to remesh
|
79 |
+
# total_steps=1000, #total number of steps
|
80 |
+
):
|
81 |
+
self._vertices = vertices
|
82 |
+
self._faces = faces
|
83 |
+
self._lr = lr
|
84 |
+
self._betas = betas
|
85 |
+
self._gammas = gammas
|
86 |
+
self._nu_ref = nu_ref
|
87 |
+
self._edge_len_lims = edge_len_lims
|
88 |
+
self._edge_len_tol = edge_len_tol
|
89 |
+
self._gain = gain
|
90 |
+
self._laplacian_weight = laplacian_weight
|
91 |
+
self._ramp = ramp
|
92 |
+
self._grad_lim = grad_lim
|
93 |
+
# self._remesh_interval = remesh_interval
|
94 |
+
# self._remseh_milestones = [ for remesh_milestones]
|
95 |
+
self._local_edgelen = local_edgelen
|
96 |
+
self._step = 0
|
97 |
+
self._start = time.time()
|
98 |
+
|
99 |
+
V = self._vertices.shape[0]
|
100 |
+
# prepare continuous tensor for all vertex-based data
|
101 |
+
self._vertices_etc = torch.zeros([V,9],device=vertices.device)
|
102 |
+
self._split_vertices_etc()
|
103 |
+
self.vertices.copy_(vertices) #initialize vertices
|
104 |
+
self._vertices.requires_grad_()
|
105 |
+
self._ref_len.fill_(edge_len_lims[1])
|
106 |
+
|
107 |
+
@property
|
108 |
+
def vertices(self):
|
109 |
+
return self._vertices
|
110 |
+
|
111 |
+
@property
|
112 |
+
def faces(self):
|
113 |
+
return self._faces
|
114 |
+
|
115 |
+
def _split_vertices_etc(self):
|
116 |
+
self._vertices = self._vertices_etc[:,:3]
|
117 |
+
self._m2 = self._vertices_etc[:,3]
|
118 |
+
self._nu = self._vertices_etc[:,4]
|
119 |
+
self._m1 = self._vertices_etc[:,5:8]
|
120 |
+
self._ref_len = self._vertices_etc[:,8]
|
121 |
+
|
122 |
+
with_gammas = any(g!=0 for g in self._gammas)
|
123 |
+
self._smooth = self._vertices_etc[:,:8] if with_gammas else self._vertices_etc[:,:3]
|
124 |
+
|
125 |
+
def zero_grad(self):
|
126 |
+
self._vertices.grad = None
|
127 |
+
|
128 |
+
@torch.no_grad()
|
129 |
+
def step(self):
|
130 |
+
|
131 |
+
eps = 1e-8
|
132 |
+
|
133 |
+
self._step += 1
|
134 |
+
# spatial smoothing
|
135 |
+
edges,_ = calc_edges(self._faces) #E,2
|
136 |
+
E = edges.shape[0]
|
137 |
+
edge_smooth = self._smooth[edges] #E,2,S
|
138 |
+
neighbor_smooth = torch.zeros_like(self._smooth) #V,S
|
139 |
+
torch_scatter.scatter_mean(src=edge_smooth.flip(dims=[1]).reshape(E*2,-1),index=edges.reshape(E*2,1),dim=0,out=neighbor_smooth)
|
140 |
+
#apply optional smoothing of m1,m2,nu
|
141 |
+
if self._gammas[0]:
|
142 |
+
self._m1.lerp_(neighbor_smooth[:,5:8],self._gammas[0])
|
143 |
+
if self._gammas[1]:
|
144 |
+
self._m2.lerp_(neighbor_smooth[:,3],self._gammas[1])
|
145 |
+
if self._gammas[2]:
|
146 |
+
self._nu.lerp_(neighbor_smooth[:,4],self._gammas[2])
|
147 |
+
|
148 |
+
#add laplace smoothing to gradients
|
149 |
+
laplace = self._vertices - neighbor_smooth[:,:3]
|
150 |
+
grad = torch.addcmul(self._vertices.grad, laplace, self._nu[:,None], value=self._laplacian_weight)
|
151 |
+
|
152 |
+
#gradient clipping
|
153 |
+
if self._step>1:
|
154 |
+
grad_lim = self._m1.abs().mul_(self._grad_lim)
|
155 |
+
grad.clamp_(min=-grad_lim,max=grad_lim)
|
156 |
+
|
157 |
+
# moment updates
|
158 |
+
lerp_unbiased(self._m1, grad, self._betas[0], self._step)
|
159 |
+
lerp_unbiased(self._m2, (grad**2).sum(dim=-1), self._betas[1], self._step)
|
160 |
+
|
161 |
+
velocity = self._m1 / self._m2[:,None].sqrt().add_(eps) #V,3
|
162 |
+
speed = velocity.norm(dim=-1) #V
|
163 |
+
|
164 |
+
if self._betas[2]:
|
165 |
+
lerp_unbiased(self._nu,speed,self._betas[2],self._step) #V
|
166 |
+
else:
|
167 |
+
self._nu.copy_(speed) #V
|
168 |
+
# update vertices
|
169 |
+
ramped_lr = self._lr * min(1,self._step * (1-self._betas[0]) / self._ramp)
|
170 |
+
self._vertices.add_(velocity * self._ref_len[:,None], alpha=-ramped_lr)
|
171 |
+
|
172 |
+
# update target edge length
|
173 |
+
if self._step < 500:
|
174 |
+
self._remesh_interval = 4
|
175 |
+
elif self._step < 800:
|
176 |
+
self._remesh_interval = 2
|
177 |
+
else:
|
178 |
+
self._remesh_interval = 1
|
179 |
+
|
180 |
+
if self._step % self._remesh_interval == 0:
|
181 |
+
if self._local_edgelen:
|
182 |
+
len_change = (1 + (self._nu - self._nu_ref) * self._gain)
|
183 |
+
else:
|
184 |
+
len_change = (1 + (self._nu.mean() - self._nu_ref) * self._gain)
|
185 |
+
self._ref_len *= len_change
|
186 |
+
self._ref_len.clamp_(*self._edge_len_lims)
|
187 |
+
|
188 |
+
def remesh(self, flip:bool=True)->tuple[torch.Tensor,torch.Tensor]:
|
189 |
+
min_edge_len = self._ref_len * (1 - self._edge_len_tol)
|
190 |
+
max_edge_len = self._ref_len * (1 + self._edge_len_tol)
|
191 |
+
|
192 |
+
self._vertices_etc,self._faces = remesh(self._vertices_etc,self._faces,min_edge_len,max_edge_len,flip)
|
193 |
+
|
194 |
+
self._split_vertices_etc()
|
195 |
+
self._vertices.requires_grad_()
|
196 |
+
|
197 |
+
return self._vertices, self._faces
|
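As the class docstring says, MeshOptimizer is driven like a PyTorch optimizer with an extra remesh step after each update. A schematic usage sketch; load_initial_mesh and compute_geometry_loss are hypothetical placeholders for the mesh loading and the multi-view geometry loss used elsewhere in this repo:

import torch
from core.opt import MeshOptimizer

# vertices: (V,3) float tensor, faces: (F,3) long tensor of a closed triangle mesh
vertices, faces = load_initial_mesh()                       # hypothetical loader
opt = MeshOptimizer(vertices, faces, lr=0.3, edge_len_lims=(0.01, 0.15))

for step in range(700):
    opt.zero_grad()
    loss = compute_geometry_loss(opt.vertices, opt.faces)   # hypothetical objective
    loss.backward()
    opt.step()                       # Adam-like update plus edge-length controller
    vertices, faces = opt.remesh()   # collapse/split/flip edges, then keep optimizing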
core/remesh.py
ADDED
@@ -0,0 +1,359 @@
1 |
+
import torch
|
2 |
+
import torch.nn.functional as tfunc
|
3 |
+
import torch_scatter
|
4 |
+
|
5 |
+
def prepend_dummies(
|
6 |
+
vertices:torch.Tensor, #V,D
|
7 |
+
faces:torch.Tensor, #F,3 long
|
8 |
+
)->tuple[torch.Tensor,torch.Tensor]:
|
9 |
+
"""prepend dummy elements to vertices and faces to enable "masked" scatter operations"""
|
10 |
+
V,D = vertices.shape
|
11 |
+
vertices = torch.concat((torch.full((1,D),fill_value=torch.nan,device=vertices.device),vertices),dim=0)
|
12 |
+
faces = torch.concat((torch.zeros((1,3),dtype=torch.long,device=faces.device),faces+1),dim=0)
|
13 |
+
return vertices,faces
|
14 |
+
|
15 |
+
def remove_dummies(
|
16 |
+
vertices:torch.Tensor, #V,D - first vertex all nan and unreferenced
|
17 |
+
faces:torch.Tensor, #F,3 long - first face all zeros
|
18 |
+
)->tuple[torch.Tensor,torch.Tensor]:
|
19 |
+
"""remove dummy elements added with prepend_dummies()"""
|
20 |
+
return vertices[1:],faces[1:]-1
|
21 |
+
|
22 |
+
|
23 |
+
def calc_edges(
|
24 |
+
faces: torch.Tensor, # F,3 long - first face may be dummy with all zeros
|
25 |
+
with_edge_to_face: bool = False
|
26 |
+
) -> tuple[torch.Tensor, ...]:
|
27 |
+
"""
|
28 |
+
returns tuple of
|
29 |
+
- edges E,2 long, 0 for unused, lower vertex index first
|
30 |
+
- face_to_edge F,3 long
|
31 |
+
- (optional) edge_to_face shape=E,[left,right],[face,side]
|
32 |
+
|
33 |
+
o-<-----e1 e0,e1...edge, e0<e1
|
34 |
+
| /A L,R....left and right face
|
35 |
+
| L / | both triangles ordered counter clockwise
|
36 |
+
| / R | normals pointing out of screen
|
37 |
+
V/ |
|
38 |
+
e0---->-o
|
39 |
+
"""
|
40 |
+
|
41 |
+
F = faces.shape[0]
|
42 |
+
|
43 |
+
# make full edges, lower vertex index first
|
44 |
+
face_edges = torch.stack((faces,faces.roll(-1,1)),dim=-1) #F*3,3,2
|
45 |
+
full_edges = face_edges.reshape(F*3,2)
|
46 |
+
sorted_edges,_ = full_edges.sort(dim=-1) #F*3,2 TODO min/max faster?
|
47 |
+
|
48 |
+
# make unique edges
|
49 |
+
edges,full_to_unique = torch.unique(input=sorted_edges,sorted=True,return_inverse=True,dim=0) #(E,2),(F*3)
|
50 |
+
E = edges.shape[0]
|
51 |
+
face_to_edge = full_to_unique.reshape(F,3) #F,3
|
52 |
+
|
53 |
+
if not with_edge_to_face:
|
54 |
+
return edges, face_to_edge
|
55 |
+
|
56 |
+
is_right = full_edges[:,0]!=sorted_edges[:,0] #F*3
|
57 |
+
edge_to_face = torch.zeros((E,2,2),dtype=torch.long,device=faces.device) #E,LR=2,S=2
|
58 |
+
scatter_src = torch.cartesian_prod(torch.arange(0,F,device=faces.device),torch.arange(0,3,device=faces.device)) #F*3,2
|
59 |
+
edge_to_face.reshape(2*E,2).scatter_(dim=0,index=(2*full_to_unique+is_right)[:,None].expand(F*3,2),src=scatter_src) #E,LR=2,S=2
|
60 |
+
edge_to_face[0] = 0
|
61 |
+
return edges, face_to_edge, edge_to_face
|
62 |
+
|
63 |
+
def calc_edge_length(
|
64 |
+
vertices:torch.Tensor, #V,3 first may be dummy
|
65 |
+
edges:torch.Tensor, #E,2 long, lower vertex index first, (0,0) for unused
|
66 |
+
)->torch.Tensor: #E
|
67 |
+
|
68 |
+
full_vertices = vertices[edges] #E,2,3
|
69 |
+
a,b = full_vertices.unbind(dim=1) #E,3
|
70 |
+
return torch.norm(a-b,p=2,dim=-1)
|
71 |
+
|
72 |
+
def calc_face_normals(
|
73 |
+
vertices:torch.Tensor, #V,3 first vertex may be unreferenced
|
74 |
+
faces:torch.Tensor, #F,3 long, first face may be all zero
|
75 |
+
normalize:bool=False,
|
76 |
+
)->torch.Tensor: #F,3
|
77 |
+
"""
|
78 |
+
n
|
79 |
+
|
|
80 |
+
c0 corners ordered counterclockwise when
|
81 |
+
/ \ looking onto surface (in neg normal direction)
|
82 |
+
c1---c2
|
83 |
+
"""
|
84 |
+
full_vertices = vertices[faces] #F,C=3,3
|
85 |
+
v0,v1,v2 = full_vertices.unbind(dim=1) #F,3
|
86 |
+
face_normals = torch.cross(v1-v0,v2-v0, dim=1) #F,3
|
87 |
+
if normalize:
|
88 |
+
face_normals = tfunc.normalize(face_normals, eps=1e-6, dim=1) #TODO inplace?
|
89 |
+
return face_normals #F,3
|
90 |
+
|
91 |
+
def calc_vertex_normals(
|
92 |
+
vertices:torch.Tensor, #V,3 first vertex may be unreferenced
|
93 |
+
faces:torch.Tensor, #F,3 long, first face may be all zero
|
94 |
+
face_normals:torch.Tensor=None, #F,3, not normalized
|
95 |
+
)->torch.Tensor: #V,3
|
96 |
+
|
97 |
+
F = faces.shape[0]
|
98 |
+
|
99 |
+
if face_normals is None:
|
100 |
+
face_normals = calc_face_normals(vertices,faces)
|
101 |
+
|
102 |
+
vertex_normals = torch.zeros((vertices.shape[0],3,3),dtype=vertices.dtype,device=vertices.device) #V,C=3,3
|
103 |
+
vertex_normals.scatter_add_(dim=0,index=faces[:,:,None].expand(F,3,3),src=face_normals[:,None,:].expand(F,3,3))
|
104 |
+
vertex_normals = vertex_normals.sum(dim=1) #V,3
|
105 |
+
return tfunc.normalize(vertex_normals, eps=1e-6, dim=1)
|
106 |
+
|
107 |
+
def calc_face_ref_normals(
|
108 |
+
faces:torch.Tensor, #F,3 long, 0 for unused
|
109 |
+
vertex_normals:torch.Tensor, #V,3 first unused
|
110 |
+
normalize:bool=False,
|
111 |
+
)->torch.Tensor: #F,3
|
112 |
+
"""calculate reference normals for face flip detection"""
|
113 |
+
full_normals = vertex_normals[faces] #F,C=3,3
|
114 |
+
ref_normals = full_normals.sum(dim=1) #F,3
|
115 |
+
if normalize:
|
116 |
+
ref_normals = tfunc.normalize(ref_normals, eps=1e-6, dim=1)
|
117 |
+
return ref_normals
|
118 |
+
|
119 |
+
def pack(
|
120 |
+
vertices:torch.Tensor, #V,3 first unused and nan
|
121 |
+
faces:torch.Tensor, #F,3 long, 0 for unused
|
122 |
+
)->tuple[torch.Tensor,torch.Tensor]: #(vertices,faces), keeps first vertex unused
|
123 |
+
"""removes unused elements in vertices and faces"""
|
124 |
+
V = vertices.shape[0]
|
125 |
+
|
126 |
+
# remove unused faces
|
127 |
+
used_faces = faces[:,0]!=0
|
128 |
+
used_faces[0] = True
|
129 |
+
faces = faces[used_faces] #sync
|
130 |
+
|
131 |
+
# remove unused vertices
|
132 |
+
used_vertices = torch.zeros(V,3,dtype=torch.bool,device=vertices.device)
|
133 |
+
used_vertices.scatter_(dim=0,index=faces,value=True,reduce='add') #TODO int faster?
|
134 |
+
used_vertices = used_vertices.any(dim=1)
|
135 |
+
used_vertices[0] = True
|
136 |
+
vertices = vertices[used_vertices] #sync
|
137 |
+
|
138 |
+
# update used faces
|
139 |
+
ind = torch.zeros(V,dtype=torch.long,device=vertices.device)
|
140 |
+
V1 = used_vertices.sum()
|
141 |
+
ind[used_vertices] = torch.arange(0,V1,device=vertices.device) #sync
|
142 |
+
faces = ind[faces]
|
143 |
+
|
144 |
+
return vertices,faces
|
145 |
+
|
146 |
+
def split_edges(
|
147 |
+
vertices:torch.Tensor, #V,3 first unused
|
148 |
+
faces:torch.Tensor, #F,3 long, 0 for unused
|
149 |
+
edges:torch.Tensor, #E,2 long 0 for unused, lower vertex index first
|
150 |
+
face_to_edge:torch.Tensor, #F,3 long 0 for unused
|
151 |
+
splits, #E bool
|
152 |
+
pack_faces:bool=True,
|
153 |
+
)->tuple[torch.Tensor,torch.Tensor]: #(vertices,faces)
|
154 |
+
|
155 |
+
# c2 c2 c...corners = faces
|
156 |
+
# . . . . s...side_vert, 0 means no split
|
157 |
+
# . . .N2 . S...shrunk_face
|
158 |
+
# . . . . Ni...new_faces
|
159 |
+
# s2 s1 s2|c2...s1|c1
|
160 |
+
# . . . . .
|
161 |
+
# . . . S . .
|
162 |
+
# . . . . N1 .
|
163 |
+
# c0...(s0=0)....c1 s0|c0...........c1
|
164 |
+
#
|
165 |
+
# pseudo-code:
|
166 |
+
# S = [s0|c0,s1|c1,s2|c2] example:[c0,s1,s2]
|
167 |
+
# split = side_vert!=0 example:[False,True,True]
|
168 |
+
# N0 = split[0]*[c0,s0,s2|c2] example:[0,0,0]
|
169 |
+
# N1 = split[1]*[c1,s1,s0|c0] example:[c1,s1,c0]
|
170 |
+
# N2 = split[2]*[c2,s2,s1|c1] example:[c2,s2,s1]
|
171 |
+
|
172 |
+
V = vertices.shape[0]
|
173 |
+
F = faces.shape[0]
|
174 |
+
S = splits.sum().item() #sync
|
175 |
+
|
176 |
+
if S==0:
|
177 |
+
return vertices,faces
|
178 |
+
|
179 |
+
edge_vert = torch.zeros_like(splits, dtype=torch.long) #E
|
180 |
+
edge_vert[splits] = torch.arange(V,V+S,dtype=torch.long,device=vertices.device) #E 0 for no split, sync
|
181 |
+
side_vert = edge_vert[face_to_edge] #F,3 long, 0 for no split
|
182 |
+
split_edges = edges[splits] #S sync
|
183 |
+
|
184 |
+
#vertices
|
185 |
+
split_vertices = vertices[split_edges].mean(dim=1) #S,3
|
186 |
+
vertices = torch.concat((vertices,split_vertices),dim=0)
|
187 |
+
|
188 |
+
#faces
|
189 |
+
side_split = side_vert!=0 #F,3
|
190 |
+
shrunk_faces = torch.where(side_split,side_vert,faces) #F,3 long, 0 for no split
|
191 |
+
new_faces = side_split[:,:,None] * torch.stack((faces,side_vert,shrunk_faces.roll(1,dims=-1)),dim=-1) #F,N=3,C=3
|
192 |
+
faces = torch.concat((shrunk_faces,new_faces.reshape(F*3,3))) #4F,3
|
193 |
+
if pack_faces:
|
194 |
+
mask = faces[:,0]!=0
|
195 |
+
mask[0] = True
|
196 |
+
faces = faces[mask] #F',3 sync
|
197 |
+
|
198 |
+
return vertices,faces
|
199 |
+
|
200 |
+
def collapse_edges(
|
201 |
+
vertices:torch.Tensor, #V,3 first unused
|
202 |
+
faces:torch.Tensor, #F,3 long 0 for unused
|
203 |
+
edges:torch.Tensor, #E,2 long 0 for unused, lower vertex index first
|
204 |
+
priorities:torch.Tensor, #E float
|
205 |
+
stable:bool=False, #only for unit testing
|
206 |
+
)->tuple[torch.Tensor,torch.Tensor]: #(vertices,faces)
|
207 |
+
|
208 |
+
V = vertices.shape[0]
|
209 |
+
|
210 |
+
# check spacing
|
211 |
+
_,order = priorities.sort(stable=stable) #E
|
212 |
+
rank = torch.zeros_like(order)
|
213 |
+
rank[order] = torch.arange(0,len(rank),device=rank.device)
|
214 |
+
vert_rank = torch.zeros(V,dtype=torch.long,device=vertices.device) #V
|
215 |
+
edge_rank = rank #E
|
216 |
+
for i in range(3):
|
217 |
+
torch_scatter.scatter_max(src=edge_rank[:,None].expand(-1,2).reshape(-1),index=edges.reshape(-1),dim=0,out=vert_rank)
|
218 |
+
edge_rank,_ = vert_rank[edges].max(dim=-1) #E
|
219 |
+
candidates = edges[(edge_rank==rank).logical_and_(priorities>0)] #E',2
|
220 |
+
|
221 |
+
# check connectivity
|
222 |
+
vert_connections = torch.zeros(V,dtype=torch.long,device=vertices.device) #V
|
223 |
+
vert_connections[candidates[:,0]] = 1 #start
|
224 |
+
edge_connections = vert_connections[edges].sum(dim=-1) #E, edge connected to start
|
225 |
+
vert_connections.scatter_add_(dim=0,index=edges.reshape(-1),src=edge_connections[:,None].expand(-1,2).reshape(-1))# one edge from start
|
226 |
+
vert_connections[candidates] = 0 #clear start and end
|
227 |
+
edge_connections = vert_connections[edges].sum(dim=-1) #E, one or two edges from start
|
228 |
+
vert_connections.scatter_add_(dim=0,index=edges.reshape(-1),src=edge_connections[:,None].expand(-1,2).reshape(-1)) #one or two edges from start
|
229 |
+
collapses = candidates[vert_connections[candidates[:,1]] <= 2] # E" not more than two connections between start and end
|
230 |
+
|
231 |
+
# mean vertices
|
232 |
+
vertices[collapses[:,0]] = vertices[collapses].mean(dim=1) #TODO dim?
|
233 |
+
|
234 |
+
# update faces
|
235 |
+
dest = torch.arange(0,V,dtype=torch.long,device=vertices.device) #V
|
236 |
+
dest[collapses[:,1]] = dest[collapses[:,0]]
|
237 |
+
faces = dest[faces] #F,3 TODO optimize?
|
238 |
+
c0,c1,c2 = faces.unbind(dim=-1)
|
239 |
+
collapsed = (c0==c1).logical_or_(c1==c2).logical_or_(c0==c2)
|
240 |
+
faces[collapsed] = 0
|
241 |
+
|
242 |
+
return vertices,faces
|
243 |
+
|
244 |
+
def calc_face_collapses(
|
245 |
+
vertices:torch.Tensor, #V,3 first unused
|
246 |
+
faces:torch.Tensor, #F,3 long, 0 for unused
|
247 |
+
edges:torch.Tensor, #E,2 long 0 for unused, lower vertex index first
|
248 |
+
face_to_edge:torch.Tensor, #F,3 long 0 for unused
|
249 |
+
edge_length:torch.Tensor, #E
|
250 |
+
face_normals:torch.Tensor, #F,3
|
251 |
+
vertex_normals:torch.Tensor, #V,3 first unused
|
252 |
+
min_edge_length:torch.Tensor=None, #V
|
253 |
+
area_ratio = 0.5, #collapse if area < min_edge_length**2 * area_ratio
|
254 |
+
shortest_probability = 0.8
|
255 |
+
)->torch.Tensor: #E edges to collapse
|
256 |
+
|
257 |
+
E = edges.shape[0]
|
258 |
+
F = faces.shape[0]
|
259 |
+
|
260 |
+
# face flips
|
261 |
+
ref_normals = calc_face_ref_normals(faces,vertex_normals,normalize=False) #F,3
|
262 |
+
face_collapses = (face_normals*ref_normals).sum(dim=-1)<0 #F
|
263 |
+
|
264 |
+
# small faces
|
265 |
+
if min_edge_length is not None:
|
266 |
+
min_face_length = min_edge_length[faces].mean(dim=-1) #F
|
267 |
+
min_area = min_face_length**2 * area_ratio #F
|
268 |
+
face_collapses.logical_or_(face_normals.norm(dim=-1) < min_area*2) #F
|
269 |
+
face_collapses[0] = False
|
270 |
+
|
271 |
+
# faces to edges
|
272 |
+
face_length = edge_length[face_to_edge] #F,3
|
273 |
+
|
274 |
+
if shortest_probability<1:
|
275 |
+
#select shortest edge with shortest_probability chance
|
276 |
+
randlim = round(2/(1-shortest_probability))
|
277 |
+
rand_ind = torch.randint(0,randlim,size=(F,),device=faces.device).clamp_max_(2) #selected edge local index in face
|
278 |
+
sort_ind = torch.argsort(face_length,dim=-1,descending=True) #F,3
|
279 |
+
local_ind = sort_ind.gather(dim=-1,index=rand_ind[:,None])
|
280 |
+
else:
|
281 |
+
local_ind = torch.argmin(face_length,dim=-1)[:,None] #F,1 0...2 shortest edge local index in face
|
282 |
+
|
283 |
+
edge_ind = face_to_edge.gather(dim=1,index=local_ind)[:,0] #F 0...E selected edge global index
|
284 |
+
edge_collapses = torch.zeros(E,dtype=torch.long,device=vertices.device)
|
285 |
+
edge_collapses.scatter_add_(dim=0,index=edge_ind,src=face_collapses.long()) #TODO legal for bool?
|
286 |
+
|
287 |
+
return edge_collapses.bool()
|
288 |
+
|
289 |
+
def flip_edges(
|
290 |
+
vertices:torch.Tensor, #V,3 first unused
|
291 |
+
faces:torch.Tensor, #F,3 long, first must be 0, 0 for unused
|
292 |
+
edges:torch.Tensor, #E,2 long, first must be 0, 0 for unused, lower vertex index first
|
293 |
+
edge_to_face:torch.Tensor, #E,[left,right],[face,side]
|
294 |
+
with_border:bool=True, #handle border edges (D=4 instead of D=6)
|
295 |
+
with_normal_check:bool=True, #check face normal flips
|
296 |
+
stable:bool=False, #only for unit testing
|
297 |
+
):
|
298 |
+
V = vertices.shape[0]
|
299 |
+
E = edges.shape[0]
|
300 |
+
device=vertices.device
|
301 |
+
vertex_degree = torch.zeros(V,dtype=torch.long,device=device) #V long
|
302 |
+
vertex_degree.scatter_(dim=0,index=edges.reshape(E*2),value=1,reduce='add')
|
303 |
+
neighbor_corner = (edge_to_face[:,:,1] + 2) % 3 #go from side to corner
|
304 |
+
neighbors = faces[edge_to_face[:,:,0],neighbor_corner] #E,LR=2
|
305 |
+
edge_is_inside = neighbors.all(dim=-1) #E
|
306 |
+
|
307 |
+
if with_border:
|
308 |
+
# inside vertices should have D=6, border edges D=4, so we subtract 2 for all inside vertices
|
309 |
+
# need to use float for masks in order to use scatter(reduce='multiply')
|
310 |
+
vertex_is_inside = torch.ones(V,2,dtype=torch.float32,device=vertices.device) #V,2 float
|
311 |
+
src = edge_is_inside.type(torch.float32)[:,None].expand(E,2) #E,2 float
|
312 |
+
vertex_is_inside.scatter_(dim=0,index=edges,src=src,reduce='multiply')
|
313 |
+
vertex_is_inside = vertex_is_inside.prod(dim=-1,dtype=torch.long) #V long
|
314 |
+
vertex_degree -= 2 * vertex_is_inside #V long
|
315 |
+
|
316 |
+
neighbor_degrees = vertex_degree[neighbors] #E,LR=2
|
317 |
+
edge_degrees = vertex_degree[edges] #E,2
|
318 |
+
#
|
319 |
+
# loss = Sum_over_affected_vertices((new_degree-6)**2)
|
320 |
+
# loss_change = Sum_over_neighbor_vertices((degree+1-6)**2-(degree-6)**2)
|
321 |
+
# + Sum_over_edge_vertices((degree-1-6)**2-(degree-6)**2)
|
322 |
+
# = 2 * (2 + Sum_over_neighbor_vertices(degree) - Sum_over_edge_vertices(degree))
|
323 |
+
#
|
324 |
+
loss_change = 2 + neighbor_degrees.sum(dim=-1) - edge_degrees.sum(dim=-1) #E
|
325 |
+
candidates = torch.logical_and(loss_change<0, edge_is_inside) #E
|
326 |
+
loss_change = loss_change[candidates] #E'
|
327 |
+
if loss_change.shape[0]==0:
|
328 |
+
return
|
329 |
+
|
330 |
+
edges_neighbors = torch.concat((edges[candidates],neighbors[candidates]),dim=-1) #E',4
|
331 |
+
_,order = loss_change.sort(descending=True, stable=stable) #E'
|
332 |
+
rank = torch.zeros_like(order)
|
333 |
+
rank[order] = torch.arange(0,len(rank),device=rank.device)
|
334 |
+
vertex_rank = torch.zeros((V,4),dtype=torch.long,device=device) #V,4
|
335 |
+
torch_scatter.scatter_max(src=rank[:,None].expand(-1,4),index=edges_neighbors,dim=0,out=vertex_rank)
|
336 |
+
vertex_rank,_ = vertex_rank.max(dim=-1) #V
|
337 |
+
neighborhood_rank,_ = vertex_rank[edges_neighbors].max(dim=-1) #E'
|
338 |
+
flip = rank==neighborhood_rank #E'
|
339 |
+
|
340 |
+
if with_normal_check:
|
341 |
+
# cl-<-----e1 e0,e1...edge, e0<e1
|
342 |
+
# | /A L,R....left and right face
|
343 |
+
# | L / | both triangles ordered counter clockwise
|
344 |
+
# | / R | normals pointing out of screen
|
345 |
+
# V/ |
|
346 |
+
# e0---->-cr
|
347 |
+
v = vertices[edges_neighbors] #E",4,3
|
348 |
+
v = v - v[:,0:1] #make relative to e0
|
349 |
+
e1 = v[:,1]
|
350 |
+
cl = v[:,2]
|
351 |
+
cr = v[:,3]
|
352 |
+
n = torch.cross(e1,cl) + torch.cross(cr,e1) #sum of old normal vectors
|
353 |
+
flip.logical_and_(torch.sum(n*torch.cross(cr,cl),dim=-1)>0) #first new face
|
354 |
+
flip.logical_and_(torch.sum(n*torch.cross(cl-e1,cr-e1),dim=-1)>0) #second new face
|
355 |
+
|
356 |
+
flip_edges_neighbors = edges_neighbors[flip] #E",4
|
357 |
+
flip_edge_to_face = edge_to_face[candidates,:,0][flip] #E",2
|
358 |
+
flip_faces = flip_edges_neighbors[:,[[0,3,2],[1,2,3]]] #E",2,3
|
359 |
+
faces.scatter_(dim=0,index=flip_edge_to_face.reshape(-1,1).expand(-1,3),src=flip_faces.reshape(-1,3))
|
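calc_edges() collapses the three directed edges of every face into unique undirected edges (lower vertex index first) plus a face-to-edge index map, with index 0 reserved for the dummy element. A tiny sanity check on two triangles sharing one edge (needs torch and torch_scatter installed, since core.remesh imports both):

import torch
from core.remesh import prepend_dummies, calc_edges

vertices = torch.rand(4, 3)
faces = torch.tensor([[0, 1, 2], [2, 1, 3]], dtype=torch.long)
vertices, faces = prepend_dummies(vertices, faces)   # slot 0 becomes the dummy
edges, face_to_edge = calc_edges(faces)
print(edges)          # 6 rows: the dummy (0,0) plus the 5 real undirected edges
print(face_to_edge)   # shape (3,3): per-face indices into the edges tensor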
econdataset.py
ADDED
@@ -0,0 +1,370 @@
1 |
+
# -*- coding: utf-8 -*-
|
2 |
+
|
3 |
+
# Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
|
4 |
+
# holder of all proprietary rights on this computer program.
|
5 |
+
# You can only use this computer program if you have closed
|
6 |
+
# a license agreement with MPG or you get the right to use the computer
|
7 |
+
# program from someone who is authorized to grant you that right.
|
8 |
+
# Any use of the computer program without a valid license is prohibited and
|
9 |
+
# liable to prosecution.
|
10 |
+
#
|
11 |
+
# Copyright©2019 Max-Planck-Gesellschaft zur Förderung
|
12 |
+
# der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
|
13 |
+
# for Intelligent Systems. All rights reserved.
|
14 |
+
#
|
15 |
+
# Contact: ps-license@tuebingen.mpg.de
|
16 |
+
|
17 |
+
from lib.hybrik.models.simple3dpose import HybrIKBaseSMPLCam
|
18 |
+
from lib.pixielib.utils.config import cfg as pixie_cfg
|
19 |
+
from lib.pixielib.pixie import PIXIE
|
20 |
+
import lib.smplx as smplx
|
21 |
+
# from lib.pare.pare.core.tester import PARETester
|
22 |
+
from lib.pymaf.utils.geometry import rot6d_to_rotmat, batch_rodrigues, rotation_matrix_to_angle_axis
|
23 |
+
from lib.pymaf.utils.imutils import process_image
|
24 |
+
from lib.common.imutils import econ_process_image
|
25 |
+
from lib.pymaf.core import path_config
|
26 |
+
from lib.pymaf.models import pymaf_net
|
27 |
+
from lib.common.config import cfg
|
28 |
+
from lib.common.render import Render
|
29 |
+
from lib.dataset.body_model import TetraSMPLModel
|
30 |
+
from lib.dataset.mesh_util import get_visibility
|
31 |
+
from utils.smpl_util import SMPLX
|
32 |
+
import os.path as osp
|
33 |
+
import os
|
34 |
+
import torch
|
35 |
+
import numpy as np
|
36 |
+
import random
|
37 |
+
from termcolor import colored
|
38 |
+
from PIL import ImageFile
|
39 |
+
from torchvision.models import detection
|
40 |
+
|
41 |
+
|
42 |
+
ImageFile.LOAD_TRUNCATED_IMAGES = True
|
43 |
+
|
44 |
+
|
45 |
+
class SMPLDataset():
|
46 |
+
|
47 |
+
def __init__(self, cfg, device):
|
48 |
+
|
49 |
+
random.seed(1993)
|
50 |
+
|
51 |
+
self.image_dir = cfg['image_dir']
|
52 |
+
self.seg_dir = cfg['seg_dir']
|
53 |
+
self.hps_type = cfg['hps_type']
|
54 |
+
self.smpl_type = 'smpl' if cfg['hps_type'] != 'pixie' else 'smplx'
|
55 |
+
self.smpl_gender = 'neutral'
|
56 |
+
self.colab = cfg['colab']
|
57 |
+
|
58 |
+
self.device = device
|
59 |
+
|
60 |
+
keep_lst = [f"{self.image_dir}/{i}" for i in sorted(os.listdir(self.image_dir))]
|
61 |
+
img_fmts = ['jpg', 'png', 'jpeg', "JPG", 'bmp']
|
62 |
+
keep_lst = [item for item in keep_lst if item.split(".")[-1] in img_fmts]
|
63 |
+
|
64 |
+
self.subject_list = [item for item in keep_lst if item.split(".")[-1] in img_fmts]
|
65 |
+
|
66 |
+
if self.colab:
|
67 |
+
self.subject_list = [self.subject_list[0]]
|
68 |
+
|
69 |
+
# smpl related
|
70 |
+
self.smpl_data = SMPLX()
|
71 |
+
|
72 |
+
# smpl-smplx correspondence
|
73 |
+
self.smpl_joint_ids_24 = np.arange(22).tolist() + [68, 73]
|
74 |
+
self.smpl_joint_ids_24_pixie = np.arange(22).tolist() + [68 + 61, 72 + 68]
|
75 |
+
self.get_smpl_model = lambda smpl_type, smpl_gender: smplx.create(model_path=self.smpl_data.
|
76 |
+
model_dir,
|
77 |
+
gender=smpl_gender,
|
78 |
+
model_type=smpl_type,
|
79 |
+
ext='npz')
|
80 |
+
|
81 |
+
# Load SMPL model
|
82 |
+
self.smpl_model = self.get_smpl_model(self.smpl_type, self.smpl_gender).to(self.device)
|
83 |
+
self.faces = self.smpl_model.faces
|
84 |
+
|
85 |
+
if self.hps_type == 'pymaf':
|
86 |
+
self.hps = pymaf_net(path_config.SMPL_MEAN_PARAMS, pretrained=True).to(self.device)
|
87 |
+
self.hps.load_state_dict(torch.load(path_config.CHECKPOINT_FILE)['model'], strict=True)
|
88 |
+
self.hps.eval()
|
89 |
+
|
90 |
+
elif self.hps_type == 'pare':
|
91 |
+
self.hps = PARETester(path_config.CFG, path_config.CKPT).model
|
92 |
+
elif self.hps_type == 'pixie':
|
93 |
+
self.hps = PIXIE(config=pixie_cfg, device=self.device)
|
94 |
+
self.smpl_model = self.hps.smplx
|
95 |
+
elif self.hps_type == 'hybrik':
|
96 |
+
smpl_path = osp.join(self.smpl_data.model_dir, "smpl/SMPL_NEUTRAL.pkl")
|
97 |
+
self.hps = HybrIKBaseSMPLCam(cfg_file=path_config.HYBRIK_CFG,
|
98 |
+
smpl_path=smpl_path,
|
99 |
+
data_path=path_config.hybrik_data_dir)
|
100 |
+
self.hps.load_state_dict(torch.load(path_config.HYBRIK_CKPT, map_location='cpu'),
|
101 |
+
strict=False)
|
102 |
+
self.hps.to(self.device)
|
103 |
+
elif self.hps_type == 'bev':
|
104 |
+
try:
|
105 |
+
import bev
|
106 |
+
except:
|
107 |
+
print('Could not find bev, installing via pip install --upgrade simple-romp')
|
108 |
+
os.system('pip install simple-romp==1.0.3')
|
109 |
+
import bev
|
110 |
+
settings = bev.main.default_settings
|
111 |
+
# change the argparse settings of bev here if you prefer other settings.
|
112 |
+
settings.mode = 'image'
|
113 |
+
settings.GPU = int(str(self.device).split(':')[1])
|
114 |
+
settings.show_largest = True
|
115 |
+
# settings.show = True # uncommit this to show the original BEV predictions
|
116 |
+
self.hps = bev.BEV(settings)
|
117 |
+
|
118 |
+
self.detector=detection.maskrcnn_resnet50_fpn(pretrained=True)
|
119 |
+
self.detector.eval()
|
120 |
+
print(colored(f"Using {self.hps_type} as HPS Estimator\n", "green"))
|
121 |
+
|
122 |
+
self.render = Render(size=512, device=device)
|
123 |
+
|
124 |
+
def __len__(self):
|
125 |
+
return len(self.subject_list)
|
126 |
+
|
127 |
+
def compute_vis_cmap(self, smpl_verts, smpl_faces):
|
128 |
+
|
129 |
+
(xy, z) = torch.as_tensor(smpl_verts).split([2, 1], dim=1)
|
130 |
+
smpl_vis = get_visibility(xy, -z, torch.as_tensor(smpl_faces).long())
|
131 |
+
smpl_cmap = self.smpl_data.cmap_smpl_vids(self.smpl_type)
|
132 |
+
|
133 |
+
return {
|
134 |
+
'smpl_vis': smpl_vis.unsqueeze(0).to(self.device),
|
135 |
+
'smpl_cmap': smpl_cmap.unsqueeze(0).to(self.device),
|
136 |
+
'smpl_verts': smpl_verts.unsqueeze(0)
|
137 |
+
}
|
138 |
+
|
139 |
+
def compute_voxel_verts(self, body_pose, global_orient, betas, trans, scale):
|
140 |
+
|
141 |
+
smpl_path = osp.join(self.smpl_data.model_dir, "smpl/SMPL_NEUTRAL.pkl")
|
142 |
+
tetra_path = osp.join(self.smpl_data.tedra_dir, 'tetra_neutral_adult_smpl.npz')
|
143 |
+
smpl_model = TetraSMPLModel(smpl_path, tetra_path, 'adult')
|
144 |
+
|
145 |
+
pose = torch.cat([global_orient[0], body_pose[0]], dim=0)
|
146 |
+
smpl_model.set_params(rotation_matrix_to_angle_axis(rot6d_to_rotmat(pose)), beta=betas[0])
|
147 |
+
|
148 |
+
verts = np.concatenate([smpl_model.verts, smpl_model.verts_added],
|
149 |
+
axis=0) * scale.item() + trans.detach().cpu().numpy()
|
150 |
+
faces = np.loadtxt(osp.join(self.smpl_data.tedra_dir, 'tetrahedrons_neutral_adult.txt'),
|
151 |
+
dtype=np.int32) - 1
|
152 |
+
|
153 |
+
pad_v_num = int(8000 - verts.shape[0])
|
154 |
+
pad_f_num = int(25100 - faces.shape[0])
|
155 |
+
|
156 |
+
verts = np.pad(verts,
|
157 |
+
((0, pad_v_num),
|
158 |
+
(0, 0)), mode='constant', constant_values=0.0).astype(np.float32) * 0.5
|
159 |
+
faces = np.pad(faces, ((0, pad_f_num), (0, 0)), mode='constant',
|
160 |
+
constant_values=0.0).astype(np.int32)
|
161 |
+
|
162 |
+
verts[:, 2] *= -1.0
|
163 |
+
|
164 |
+
voxel_dict = {
|
165 |
+
'voxel_verts': torch.from_numpy(verts).to(self.device).unsqueeze(0).float(),
|
166 |
+
'voxel_faces': torch.from_numpy(faces).to(self.device).unsqueeze(0).long(),
|
167 |
+
'pad_v_num': torch.tensor(pad_v_num).to(self.device).unsqueeze(0).long(),
|
168 |
+
'pad_f_num': torch.tensor(pad_f_num).to(self.device).unsqueeze(0).long()
|
169 |
+
}
|
170 |
+
|
171 |
+
return voxel_dict
|
172 |
+
|
173 |
+
def __getitem__(self, index):
|
174 |
+
|
175 |
+
img_path = self.subject_list[index]
|
176 |
+
img_name = img_path.split("/")[-1].rsplit(".", 1)[0]
|
177 |
+
print(img_name)
|
178 |
+
# smplx_param_path=f'./data/thuman2/smplx/{img_name[:-2]}.pkl'
|
179 |
+
# smplx_param = np.load(smplx_param_path, allow_pickle=True)
|
180 |
+
|
181 |
+
if self.seg_dir is None:
|
182 |
+
img_icon, img_hps, img_ori, img_mask, uncrop_param = process_image(
|
183 |
+
img_path, self.hps_type, 512, self.device)
|
184 |
+
|
185 |
+
data_dict = {
|
186 |
+
'name': img_name,
|
187 |
+
'image': img_icon.to(self.device).unsqueeze(0),
|
188 |
+
'ori_image': img_ori,
|
189 |
+
'mask': img_mask,
|
190 |
+
'uncrop_param': uncrop_param
|
191 |
+
}
|
192 |
+
|
193 |
+
else:
|
194 |
+
img_icon, img_hps, img_ori, img_mask, uncrop_param, segmentations = process_image(
|
195 |
+
img_path,
|
196 |
+
self.hps_type,
|
197 |
+
512,
|
198 |
+
self.device,
|
199 |
+
seg_path=os.path.join(self.seg_dir, f'{img_name}.json'))
|
200 |
+
data_dict = {
|
201 |
+
'name': img_name,
|
202 |
+
'image': img_icon.to(self.device).unsqueeze(0),
|
203 |
+
'ori_image': img_ori,
|
204 |
+
'mask': img_mask,
|
205 |
+
'uncrop_param': uncrop_param,
|
206 |
+
'segmentations': segmentations
|
207 |
+
}
|
208 |
+
|
209 |
+
arr_dict=econ_process_image(img_path,self.hps_type,True,512,self.detector)
|
210 |
+
data_dict['hands_visibility']=arr_dict['hands_visibility']
|
211 |
+
|
212 |
+
with torch.no_grad():
|
213 |
+
# import ipdb; ipdb.set_trace()
|
214 |
+
preds_dict = self.hps.forward(img_hps)
|
215 |
+
|
216 |
+
data_dict['smpl_faces'] = torch.Tensor(self.faces.astype(np.int64)).long().unsqueeze(0).to(
|
217 |
+
self.device)
|
218 |
+
|
219 |
+
if self.hps_type == 'pymaf':
|
220 |
+
output = preds_dict['smpl_out'][-1]
|
221 |
+
scale, tranX, tranY = output['theta'][0, :3]
|
222 |
+
data_dict['betas'] = output['pred_shape']
|
223 |
+
data_dict['body_pose'] = output['rotmat'][:, 1:]
|
224 |
+
data_dict['global_orient'] = output['rotmat'][:, 0:1]
|
225 |
+
data_dict['smpl_verts'] = output['verts'] # 不确定尺度是否一样
|
226 |
+
data_dict["type"] = "smpl"
|
227 |
+
|
228 |
+
elif self.hps_type == 'pare':
|
229 |
+
data_dict['body_pose'] = preds_dict['pred_pose'][:, 1:]
|
230 |
+
data_dict['global_orient'] = preds_dict['pred_pose'][:, 0:1]
|
231 |
+
data_dict['betas'] = preds_dict['pred_shape']
|
232 |
+
data_dict['smpl_verts'] = preds_dict['smpl_vertices']
|
233 |
+
scale, tranX, tranY = preds_dict['pred_cam'][0, :3]
|
234 |
+
data_dict["type"] = "smpl"
|
235 |
+
|
236 |
+
elif self.hps_type == 'pixie':
|
237 |
+
data_dict.update(preds_dict)
|
238 |
+
data_dict['body_pose'] = preds_dict['body_pose']
|
239 |
+
data_dict['global_orient'] = preds_dict['global_pose']
|
240 |
+
data_dict['betas'] = preds_dict['shape']
|
241 |
+
data_dict['smpl_verts'] = preds_dict['vertices']
|
242 |
+
scale, tranX, tranY = preds_dict['cam'][0, :3]
|
243 |
+
data_dict["type"] = "smplx"
|
244 |
+
|
245 |
+
elif self.hps_type == 'hybrik':
|
246 |
+
data_dict['body_pose'] = preds_dict['pred_theta_mats'][:, 1:]
|
247 |
+
data_dict['global_orient'] = preds_dict['pred_theta_mats'][:, [0]]
|
248 |
+
data_dict['betas'] = preds_dict['pred_shape']
|
249 |
+
data_dict['smpl_verts'] = preds_dict['pred_vertices']
|
250 |
+
scale, tranX, tranY = preds_dict['pred_camera'][0, :3]
|
251 |
+
scale = scale * 2
|
252 |
+
data_dict["type"] = "smpl"
|
253 |
+
|
254 |
+
elif self.hps_type == 'bev':
|
255 |
+
data_dict['betas'] = torch.from_numpy(preds_dict['smpl_betas'])[[0], :10].to(
|
256 |
+
self.device).float()
|
257 |
+
pred_thetas = batch_rodrigues(
|
258 |
+
torch.from_numpy(preds_dict['smpl_thetas'][0]).reshape(-1, 3)).float()
|
259 |
+
data_dict['body_pose'] = pred_thetas[1:][None].to(self.device)
|
260 |
+
data_dict['global_orient'] = pred_thetas[[0]][None].to(self.device)
|
261 |
+
data_dict['smpl_verts'] = torch.from_numpy(preds_dict['verts'][[0]]).to(
|
262 |
+
self.device).float()
|
263 |
+
tranX = preds_dict['cam_trans'][0, 0]
|
264 |
+
tranY = preds_dict['cam'][0, 1] + 0.28
|
265 |
+
scale = preds_dict['cam'][0, 0] * 1.1
|
266 |
+
data_dict["type"] = "smpl"
|
267 |
+
|
268 |
+
data_dict['scale'] = scale
|
269 |
+
data_dict['trans'] = torch.tensor([tranX, tranY, 0.0]).unsqueeze(0).to(self.device).float()
|
270 |
+
|
271 |
+
# data_dict info (key-shape):
|
272 |
+
# scale, tranX, tranY - tensor.float
|
273 |
+
# betas - [1,10] / [1, 200]
|
274 |
+
# body_pose - [1, 23, 3, 3] / [1, 21, 3, 3]
|
275 |
+
# global_orient - [1, 1, 3, 3]
|
276 |
+
# smpl_verts - [1, 6890, 3] / [1, 10475, 3]
|
277 |
+
|
278 |
+
# from rot_mat to rot_6d for better optimization
|
279 |
+
N_body = data_dict["body_pose"].shape[1]
|
280 |
+
data_dict["body_pose"] = data_dict["body_pose"][:, :, :, :2].reshape(1, N_body, -1)
|
281 |
+
data_dict["global_orient"] = data_dict["global_orient"][:, :, :, :2].reshape(1, 1, -1)
|
282 |
+
|
283 |
+
return data_dict
|
284 |
+
|
285 |
+
def render_normal(self, verts, faces):
|
286 |
+
|
287 |
+
# render optimized mesh (normal, T_normal, image [-1,1])
|
288 |
+
self.render.load_meshes(verts, faces)
|
289 |
+
return self.render.get_rgb_image()
|
290 |
+
|
291 |
+
def render_depth(self, verts, faces):
|
292 |
+
|
293 |
+
# render optimized mesh (normal, T_normal, image [-1,1])
|
294 |
+
self.render.load_meshes(verts, faces)
|
295 |
+
return self.render.get_depth_map(cam_ids=[0, 2])
|
296 |
+
|
297 |
+
def visualize_alignment(self, data):
|
298 |
+
|
299 |
+
import vedo
|
300 |
+
import trimesh
|
301 |
+
|
302 |
+
if self.hps_type != 'pixie':
|
303 |
+
smpl_out = self.smpl_model(betas=data['betas'],
|
304 |
+
body_pose=data['body_pose'],
|
305 |
+
global_orient=data['global_orient'],
|
306 |
+
pose2rot=False)
|
307 |
+
smpl_verts = ((smpl_out.vertices + data['trans']) *
|
308 |
+
data['scale']).detach().cpu().numpy()[0]
|
309 |
+
else:
|
310 |
+
smpl_verts, _, _ = self.smpl_model(shape_params=data['betas'],
|
311 |
+
expression_params=data['exp'],
|
312 |
+
body_pose=data['body_pose'],
|
313 |
+
global_pose=data['global_orient'],
|
314 |
+
jaw_pose=data['jaw_pose'],
|
315 |
+
left_hand_pose=data['left_hand_pose'],
|
316 |
+
right_hand_pose=data['right_hand_pose'])
|
317 |
+
|
318 |
+
smpl_verts = ((smpl_verts + data['trans']) * data['scale']).detach().cpu().numpy()[0]
|
319 |
+
|
320 |
+
smpl_verts *= np.array([1.0, -1.0, -1.0])
|
321 |
+
faces = data['smpl_faces'][0].detach().cpu().numpy()
|
322 |
+
|
323 |
+
image_P = data['image']
|
324 |
+
image_F, image_B = self.render_normal(smpl_verts, faces)
|
325 |
+
|
326 |
+
# create plot
|
327 |
+
vp = vedo.Plotter(title="", size=(1500, 1500))
|
328 |
+
vis_list = []
|
329 |
+
|
330 |
+
image_F = (0.5 * (1.0 + image_F[0].permute(1, 2, 0).detach().cpu().numpy()) * 255.0)
|
331 |
+
image_B = (0.5 * (1.0 + image_B[0].permute(1, 2, 0).detach().cpu().numpy()) * 255.0)
|
332 |
+
image_P = (0.5 * (1.0 + image_P[0].permute(1, 2, 0).detach().cpu().numpy()) * 255.0)
|
333 |
+
|
334 |
+
vis_list.append(
|
335 |
+
vedo.Picture(image_P * 0.5 + image_F * 0.5).scale(2.0 / image_P.shape[0]).pos(
|
336 |
+
-1.0, -1.0, 1.0))
|
337 |
+
vis_list.append(vedo.Picture(image_F).scale(2.0 / image_F.shape[0]).pos(-1.0, -1.0, -0.5))
|
338 |
+
vis_list.append(vedo.Picture(image_B).scale(2.0 / image_B.shape[0]).pos(-1.0, -1.0, -1.0))
|
339 |
+
|
340 |
+
# create a mesh
|
341 |
+
mesh = trimesh.Trimesh(smpl_verts, faces, process=False)
|
342 |
+
mesh.visual.vertex_colors = [200, 200, 0]
|
343 |
+
vis_list.append(mesh)
|
344 |
+
|
345 |
+
vp.show(*vis_list, bg="white", axes=1, interactive=True)
|
346 |
+
|
347 |
+
|
348 |
+
if __name__ == '__main__':
|
349 |
+
|
350 |
+
cfg.merge_from_file("./configs/icon-filter.yaml")
|
351 |
+
cfg.merge_from_file('./lib/pymaf/configs/pymaf_config.yaml')
|
352 |
+
|
353 |
+
cfg_show_list = ['test_gpus', ['0'], 'mcube_res', 512, 'clean_mesh', False]
|
354 |
+
|
355 |
+
cfg.merge_from_list(cfg_show_list)
|
356 |
+
cfg.freeze()
|
357 |
+
|
358 |
+
|
359 |
+
device = torch.device('cuda:0')
|
360 |
+
|
361 |
+
dataset = SMPLDataset(
|
362 |
+
{
|
363 |
+
'image_dir': "./examples",
|
364 |
+
'has_det': True, # w/ or w/o detection
|
365 |
+
'hps_type': 'bev' # pymaf/pare/pixie/hybrik/bev
|
366 |
+
},
|
367 |
+
device)
|
368 |
+
|
369 |
+
for i in range(len(dataset)):
|
370 |
+
dataset.visualize_alignment(dataset[i])
|
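The end of __getitem__ stores body_pose and global_orient as 6-D rotations (the first two columns of each rotation matrix, flattened) because that parameterization is friendlier to gradient-based optimization. The sketch below shows the representation and a Gram-Schmidt recovery; rot6d_to_rotmat_sketch is a hypothetical stand-in that is assumed to mirror lib.pymaf.utils.geometry.rot6d_to_rotmat, not copied from it:

import torch
import torch.nn.functional as F

def rotmat_to_rot6d(R):                        # R: (..., 3, 3)
    # keep the first two columns and flatten them, as done for body_pose above
    return R[..., :2].reshape(*R.shape[:-2], 6)

def rot6d_to_rotmat_sketch(d6):                # d6: (..., 6)
    m = d6.reshape(*d6.shape[:-1], 3, 2)
    a1, a2 = m[..., 0], m[..., 1]
    b1 = F.normalize(a1, dim=-1)
    b2 = F.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack([b1, b2, b3], dim=-1)

# round trip: build a rotation from a random 6-D vector, drop a column pair, recover it
R = rot6d_to_rotmat_sketch(torch.randn(1, 6))
assert torch.allclose(rot6d_to_rotmat_sketch(rotmat_to_rot6d(R)), R, atol=1e-5)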
examples/02986d0998ce01aa0aa67a99fbd1e09a.png
ADDED
examples/16171.png
ADDED
examples/26d2e846349647ff04c536816e0e8ca1.png
ADDED
examples/30755.png
ADDED
examples/3930.png
ADDED
examples/4656716-3016170581.png
ADDED
examples/663dcd6db19490de0b790da430bd5681.png
ADDED
examples/7332.png
ADDED
examples/85891251f52a2399e660a63c2a7fdf40.png
ADDED
examples/a689a48d23d6b8d58d67ff5146c6e088.png
ADDED
examples/b0d178743c7e3e09700aaee8d2b1ec47.png
ADDED
examples/case5.png
ADDED
examples/d40776a1e1582179d97907d36f84d776.png
ADDED
examples/durant.png
ADDED
examples/eedb9018-e0eb-45be-33bd-5a0108ca0d8b.png
ADDED
examples/f14f7d40b72062928461b21c6cc877407e69ee0c_high.png
ADDED
examples/f6317ac1b0498f4e6ef9d12bd991a9bd1ff4ae04f898-IQTEBw_fw1200.png
ADDED
examples/pexels-barbara-olsen-7869640.png
ADDED
examples/pexels-julia-m-cameron-4145040.png
ADDED
examples/pexels-marta-wave-6437749.png
ADDED
examples/pexels-photo-6311555-removebg.png
ADDED
examples/pexels-zdmit-6780091.png
ADDED
inference.py
ADDED
@@ -0,0 +1,221 @@
1 |
+
import argparse
|
2 |
+
import os
|
3 |
+
from typing import Dict, Optional, Tuple, List
|
4 |
+
from omegaconf import OmegaConf
|
5 |
+
from PIL import Image
|
6 |
+
from dataclasses import dataclass
|
7 |
+
from collections import defaultdict
|
8 |
+
import torch
|
9 |
+
import torch.utils.checkpoint
|
10 |
+
from torchvision.utils import make_grid, save_image
|
11 |
+
from accelerate.utils import set_seed
|
12 |
+
from tqdm.auto import tqdm
|
13 |
+
import torch.nn.functional as F
|
14 |
+
from einops import rearrange
|
15 |
+
from rembg import remove, new_session
|
16 |
+
import pdb
|
17 |
+
from mvdiffusion.pipelines.pipeline_mvdiffusion_unclip import StableUnCLIPImg2ImgPipeline
|
18 |
+
from econdataset import SMPLDataset
|
19 |
+
from reconstruct import ReMesh
|
20 |
+
providers = [
|
21 |
+
('CUDAExecutionProvider', {
|
22 |
+
'device_id': 0,
|
23 |
+
'arena_extend_strategy': 'kSameAsRequested',
|
24 |
+
'gpu_mem_limit': 8 * 1024 * 1024 * 1024,
|
25 |
+
'cudnn_conv_algo_search': 'HEURISTIC',
|
26 |
+
})
|
27 |
+
]
|
28 |
+
session = new_session(providers=providers)
|
29 |
+
|
30 |
+
weight_dtype = torch.float16
|
31 |
+
def tensor_to_numpy(tensor):
|
32 |
+
return tensor.mul(255).add_(0.5).clamp_(0, 255).permute(1, 2, 0).to("cpu", torch.uint8).numpy()
|
33 |
+
|
34 |
+
|
35 |
+
@dataclass
|
36 |
+
class TestConfig:
|
37 |
+
pretrained_model_name_or_path: str
|
38 |
+
revision: Optional[str]
|
39 |
+
validation_dataset: Dict
|
40 |
+
save_dir: str
|
41 |
+
seed: Optional[int]
|
42 |
+
validation_batch_size: int
|
43 |
+
dataloader_num_workers: int
|
44 |
+
# save_single_views: bool
|
45 |
+
save_mode: str
|
46 |
+
local_rank: int
|
47 |
+
|
48 |
+
pipe_kwargs: Dict
|
49 |
+
pipe_validation_kwargs: Dict
|
50 |
+
unet_from_pretrained_kwargs: Dict
|
51 |
+
validation_guidance_scales: float
|
52 |
+
validation_grid_nrow: int
|
53 |
+
|
54 |
+
num_views: int
|
55 |
+
enable_xformers_memory_efficient_attention: bool
|
56 |
+
with_smpl: Optional[bool]
|
57 |
+
|
58 |
+
recon_opt: Dict
|
59 |
+
|
60 |
+
|
61 |
+
def convert_to_numpy(tensor):
|
62 |
+
return tensor.mul(255).add_(0.5).clamp_(0, 255).permute(1, 2, 0).to("cpu", torch.uint8).numpy()
|
63 |
+
|
64 |
+
def convert_to_pil(tensor):
|
65 |
+
return Image.fromarray(convert_to_numpy(tensor))
|
66 |
+
|
67 |
+
def save_image(tensor, fp):
|
68 |
+
ndarr = convert_to_numpy(tensor)
|
69 |
+
# pdb.set_trace()
|
70 |
+
save_image_numpy(ndarr, fp)
|
71 |
+
return ndarr
|
72 |
+
|
73 |
+
def save_image_numpy(ndarr, fp):
|
74 |
+
im = Image.fromarray(ndarr)
|
75 |
+
im.save(fp)
|
76 |
+
|
77 |
+
def run_inference(dataloader, econdata, pipeline, carving, cfg: TestConfig, save_dir):
|
78 |
+
pipeline.set_progress_bar_config(disable=True)
|
79 |
+
|
80 |
+
if cfg.seed is None:
|
81 |
+
generator = None
|
82 |
+
else:
|
83 |
+
generator = torch.Generator(device=pipeline.unet.device).manual_seed(cfg.seed)
|
84 |
+
|
85 |
+
images_cond, pred_cat = [], defaultdict(list)
|
86 |
+
for case_id, batch in tqdm(enumerate(dataloader)):
|
87 |
+
images_cond.append(batch['imgs_in'][:, 0])
|
88 |
+
|
89 |
+
imgs_in = torch.cat([batch['imgs_in']]*2, dim=0)
|
90 |
+
num_views = imgs_in.shape[1]
|
91 |
+
imgs_in = rearrange(imgs_in, "B Nv C H W -> (B Nv) C H W")# (B*Nv, 3, H, W)
|
92 |
+
if cfg.with_smpl:
|
93 |
+
smpl_in = torch.cat([batch['smpl_imgs_in']]*2, dim=0)
|
94 |
+
smpl_in = rearrange(smpl_in, "B Nv C H W -> (B Nv) C H W")
|
95 |
+
else:
|
96 |
+
smpl_in = None
|
97 |
+
|
98 |
+
normal_prompt_embeddings, clr_prompt_embeddings = batch['normal_prompt_embeddings'], batch['color_prompt_embeddings']
|
99 |
+
prompt_embeddings = torch.cat([normal_prompt_embeddings, clr_prompt_embeddings], dim=0)
|
100 |
+
prompt_embeddings = rearrange(prompt_embeddings, "B Nv N C -> (B Nv) N C")
|
101 |
+
|
102 |
+
with torch.autocast("cuda"):
|
103 |
+
# B*Nv images
|
104 |
+
guidance_scale = cfg.validation_guidance_scales
|
105 |
+
unet_out = pipeline(
|
106 |
+
imgs_in, None, prompt_embeds=prompt_embeddings,
|
107 |
+
dino_feature=None, smpl_in=smpl_in,
|
108 |
+
generator=generator, guidance_scale=guidance_scale, output_type='pt', num_images_per_prompt=1,
|
109 |
+
**cfg.pipe_validation_kwargs
|
110 |
+
)
|
111 |
+
|
112 |
+
out = unet_out.images
|
113 |
+
bsz = out.shape[0] // 2
|
114 |
+
|
115 |
+
normals_pred = out[:bsz]
|
116 |
+
images_pred = out[bsz:]
|
117 |
+
if cfg.save_mode == 'concat': ## save concatenated color and normal---------------------
|
118 |
+
pred_cat[f"cfg{guidance_scale:.1f}"].append(torch.cat([normals_pred, images_pred], dim=-1)) # b, 3, h, w
|
119 |
+
cur_dir = os.path.join(save_dir, f"cropsize-{cfg.validation_dataset.crop_size}-cfg{guidance_scale:.1f}-seed{cfg.seed}-smpl-{cfg.with_smpl}")
|
120 |
+
os.makedirs(cur_dir, exist_ok=True)
|
121 |
+
for i in range(bsz//num_views):
|
122 |
+
scene = batch['filename'][i].split('.')[0]
|
123 |
+
|
124 |
+
img_in_ = images_cond[-1][i].to(out.device)
|
125 |
+
vis_ = [img_in_]
|
126 |
+
for j in range(num_views):
|
127 |
+
idx = i*num_views + j
|
128 |
+
normal = normals_pred[idx]
|
129 |
+
color = images_pred[idx]
|
130 |
+
|
131 |
+
vis_.append(color)
|
132 |
+
vis_.append(normal)
|
133 |
+
|
134 |
+
out_filename = f"{cur_dir}/{scene}.png"
|
135 |
+
vis_ = torch.stack(vis_, dim=0)
|
136 |
+
vis_ = make_grid(vis_, nrow=len(vis_), padding=0, value_range=(0, 1))
|
137 |
+
save_image(vis_, out_filename)
|
138 |
+
elif cfg.save_mode == 'rgb':
|
139 |
+
for i in range(bsz//num_views):
|
140 |
+
scene = batch['filename'][i].split('.')[0]
|
141 |
+
|
142 |
+
img_in_ = images_cond[-1][i].to(out.device)
|
143 |
+
normals, colors = [], []
|
144 |
+
for j in range(num_views):
|
145 |
+
idx = i*num_views + j
|
146 |
+
normal = normals_pred[idx]
|
147 |
+
if j == 0:
|
148 |
+
color = imgs_in[0].to(out.device)
|
149 |
+
else:
|
150 |
+
color = images_pred[idx]
|
151 |
+
if j in [3, 4]:
|
152 |
+
normal = torch.flip(normal, dims=[2])
|
153 |
+
color = torch.flip(color, dims=[2])
|
154 |
+
|
155 |
+
colors.append(color)
|
156 |
+
if j == 6:
|
157 |
+
normal = F.interpolate(normal.unsqueeze(0), size=(256, 256), mode='bilinear', align_corners=False).squeeze(0)
|
158 |
+
normals.append(normal)
|
159 |
+
|
160 |
+
## save color and normal---------------------
|
161 |
+
# normal_filename = f"normals_{view}_masked.png"
|
162 |
+
# rgb_filename = f"color_{view}_masked.png"
|
163 |
+
# save_image(normal, os.path.join(scene_dir, normal_filename))
|
164 |
+
# save_image(color, os.path.join(scene_dir, rgb_filename))
|
165 |
+
normals[0][:, :256, 256:512] = normals[-1]
|
166 |
+
|
167 |
+
colors = [remove(convert_to_pil(tensor), session=session) for tensor in colors[:6]]
|
168 |
+
normals = [remove(convert_to_pil(tensor), session=session) for tensor in normals[:6]]
|
169 |
+
pose = econdata.__getitem__(case_id)
|
170 |
+
carving.optimize_case(scene, pose, colors, normals)
|
171 |
+
torch.cuda.empty_cache()
|
172 |
+
|
173 |
+
|
174 |
+
|
175 |
+
def load_pshuman_pipeline(cfg):
|
176 |
+
pipeline = StableUnCLIPImg2ImgPipeline.from_pretrained(cfg.pretrained_model_name_or_path, torch_dtype=weight_dtype)
|
177 |
+
pipeline.unet.enable_xformers_memory_efficient_attention()
|
178 |
+
if torch.cuda.is_available():
|
179 |
+
pipeline.to('cuda')
|
180 |
+
return pipeline
|
181 |
+
|
182 |
+
def main(
|
183 |
+
cfg: TestConfig
|
184 |
+
):
|
185 |
+
|
186 |
+
# If passed along, set the training seed now.
|
187 |
+
if cfg.seed is not None:
|
188 |
+
set_seed(cfg.seed)
|
189 |
+
pipeline = load_pshuman_pipeline(cfg)
|
190 |
+
|
191 |
+
|
192 |
+
if cfg.with_smpl:
|
193 |
+
from mvdiffusion.data.testdata_with_smpl import SingleImageDataset
|
194 |
+
else:
|
195 |
+
from mvdiffusion.data.single_image_dataset import SingleImageDataset
|
196 |
+
|
197 |
+
# Get the dataset
|
198 |
+
validation_dataset = SingleImageDataset(
|
199 |
+
**cfg.validation_dataset
|
200 |
+
)
|
201 |
+
validation_dataloader = torch.utils.data.DataLoader(
|
202 |
+
validation_dataset, batch_size=cfg.validation_batch_size, shuffle=False, num_workers=cfg.dataloader_num_workers
|
203 |
+
)
|
204 |
+
dataset_param = {'image_dir': validation_dataset.root_dir, 'seg_dir': None, 'colab': False, 'has_det': True, 'hps_type': 'pixie'}
|
205 |
+
econdata = SMPLDataset(dataset_param, device='cuda')
|
206 |
+
|
207 |
+
carving = ReMesh(cfg.recon_opt, econ_dataset=econdata)
|
208 |
+
run_inference(validation_dataloader, econdata, pipeline, carving, cfg, cfg.save_dir)
|
209 |
+
|
210 |
+
|
211 |
+
if __name__ == '__main__':
|
212 |
+
parser = argparse.ArgumentParser()
|
213 |
+
parser.add_argument('--config', type=str, required=True)
|
214 |
+
args, extras = parser.parse_known_args()
|
215 |
+
from utils.misc import load_config
|
216 |
+
|
217 |
+
# parse YAML config to OmegaConf
|
218 |
+
cfg = load_config(args.config, cli_args=extras)
|
219 |
+
schema = OmegaConf.structured(TestConfig)
|
220 |
+
cfg = OmegaConf.merge(schema, cfg)
|
221 |
+
main(cfg)
|
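The __main__ block merges the YAML passed via --config (plus any extra CLI overrides) into the structured TestConfig, so unknown keys and type mismatches fail loudly. A minimal sketch of that OmegaConf pattern with a trimmed, hypothetical stand-in dataclass and placeholder values:

from dataclasses import dataclass
from typing import Optional
from omegaconf import OmegaConf

@dataclass
class MiniTestConfig:                              # trimmed, hypothetical stand-in for TestConfig
    pretrained_model_name_or_path: str = "???"     # mandatory, expected from the YAML
    save_dir: str = "mv_results"
    seed: Optional[int] = 42
    validation_batch_size: int = 1
    with_smpl: Optional[bool] = True

schema = OmegaConf.structured(MiniTestConfig)
yaml_cfg = OmegaConf.create({"pretrained_model_name_or_path": "path/to/model", "seed": 600})
cli_cfg = OmegaConf.from_dotlist(["save_dir=out"])   # stands in for the parsed CLI extras
cfg = OmegaConf.merge(schema, yaml_cfg, cli_cfg)     # typed merge: unknown keys or wrong types raise
print(cfg.save_dir, cfg.seed)                        # -> out 600

The real entry point is run the same way in spirit, e.g. python inference.py --config <config.yaml> [key=value ...], where load_config presumably folds the extra key=value pairs into the YAML before the structured merge shown in __main__.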
lib/__init__.py
ADDED
File without changes
|
lib/common/__init__.py
ADDED
File without changes
|
lib/common/cloth_extraction.py
ADDED
@@ -0,0 +1,182 @@
1 |
+
import numpy as np
|
2 |
+
import json
|
3 |
+
import os
|
4 |
+
import itertools
|
5 |
+
import trimesh
|
6 |
+
from matplotlib.path import Path
|
7 |
+
from collections import Counter
|
8 |
+
from sklearn.neighbors import KNeighborsClassifier
|
9 |
+
|
10 |
+
|
11 |
+
def load_segmentation(path, shape):
|
12 |
+
"""
|
13 |
+
Get a segmentation mask for a given image
|
14 |
+
Arguments:
|
15 |
+
path: path to the segmentation json file
|
16 |
+
shape: shape of the output mask
|
17 |
+
Returns:
|
18 |
+
Returns a segmentation mask
|
19 |
+
"""
|
20 |
+
with open(path) as json_file:
|
21 |
+
dict = json.load(json_file)
|
22 |
+
segmentations = []
|
23 |
+
for key, val in dict.items():
|
24 |
+
if not key.startswith('item'):
|
25 |
+
continue
|
26 |
+
|
27 |
+
# Each item can have multiple polygons. Combine them to one
|
28 |
+
# segmentation_coord = list(itertools.chain.from_iterable(val['segmentation']))
|
29 |
+
# segmentation_coord = np.round(np.array(segmentation_coord)).astype(int)
|
30 |
+
|
31 |
+
coordinates = []
|
32 |
+
for segmentation_coord in val['segmentation']:
|
33 |
+
# The format before is [x1,y1, x2, y2, ....]
|
34 |
+
x = segmentation_coord[::2]
|
35 |
+
y = segmentation_coord[1::2]
|
36 |
+
xy = np.vstack((x, y)).T
|
37 |
+
coordinates.append(xy)
|
38 |
+
|
39 |
+
segmentations.append({
|
40 |
+
'type': val['category_name'],
|
41 |
+
'type_id': val['category_id'],
|
42 |
+
'coordinates': coordinates
|
43 |
+
})
|
44 |
+
|
45 |
+
return segmentations
|
46 |
+
|
47 |
+
|
48 |
+
def smpl_to_recon_labels(recon, smpl, k=1):
|
49 |
+
"""
|
50 |
+
Get the bodypart labels for the recon object by using the labels from the corresponding smpl object
|
51 |
+
Arguments:
|
52 |
+
recon: trimesh object (fully clothed model)
|
53 |
+
shape: trimesh object (smpl model)
|
54 |
+
k: number of nearest neighbours to use
|
55 |
+
Returns:
|
56 |
+
Returns a dictionary containing the bodypart and the corresponding indices
|
57 |
+
"""
|
58 |
+
smpl_vert_segmentation = json.load(
|
59 |
+
open(
|
60 |
+
os.path.join(os.path.dirname(__file__),
|
61 |
+
'smpl_vert_segmentation.json')))
|
62 |
+
n = smpl.vertices.shape[0]
|
63 |
+
y = np.array([None] * n)
|
64 |
+
for key, val in smpl_vert_segmentation.items():
|
65 |
+
y[val] = key
|
66 |
+
|
67 |
+
classifier = KNeighborsClassifier(n_neighbors=1)
|
68 |
+
classifier.fit(smpl.vertices, y)
|
69 |
+
|
70 |
+
y_pred = classifier.predict(recon.vertices)
|
71 |
+
|
72 |
+
recon_labels = {}
|
73 |
+
for key in smpl_vert_segmentation.keys():
|
74 |
+
recon_labels[key] = list(
|
75 |
+
np.argwhere(y_pred == key).flatten().astype(int))
|
76 |
+
|
77 |
+
return recon_labels
|
78 |
+
|
79 |
+
|
80 |
+
def extract_cloth(recon, segmentation, K, R, t, smpl=None):
|
81 |
+
"""
|
82 |
+
Extract a portion of a mesh using 2d segmentation coordinates
|
83 |
+
Arguments:
|
84 |
+
recon: fully clothed mesh
|
85 |
+
seg_coord: segmentation coordinates in 2D (NDC)
|
86 |
+
K: intrinsic matrix of the projection
|
87 |
+
R: rotation matrix of the projection
|
88 |
+
t: translation vector of the projection
|
89 |
+
Returns:
|
90 |
+
Returns a submesh using the segmentation coordinates
|
91 |
+
"""
|
92 |
+
seg_coord = segmentation['coord_normalized']
|
93 |
+
mesh = trimesh.Trimesh(recon.vertices, recon.faces)
|
94 |
+
extrinsic = np.zeros((3, 4))
|
95 |
+
extrinsic[:3, :3] = R
|
96 |
+
extrinsic[:, 3] = t
|
97 |
+
P = K[:3, :3] @ extrinsic
|
98 |
+
|
99 |
+
P_inv = np.linalg.pinv(P)
|
100 |
+
|
101 |
+
# Each segmentation can contain multiple polygons
|
102 |
+
# We need to check them separately
|
103 |
+
points_so_far = []
|
104 |
+
faces = recon.faces
|
105 |
+
for polygon in seg_coord:
|
106 |
+
n = len(polygon)
|
107 |
+
coords_h = np.hstack((polygon, np.ones((n, 1))))
|
108 |
+
# Apply the inverse projection on homogeneus 2D coordinates to get the corresponding 3d Coordinates
|
109 |
+
XYZ = P_inv @ coords_h[:, :, None]
|
110 |
+
XYZ = XYZ.reshape((XYZ.shape[0], XYZ.shape[1]))
|
111 |
+
XYZ = XYZ[:, :3] / XYZ[:, 3, None]
|
112 |
+
|
113 |
+
p = Path(XYZ[:, :2])
|
114 |
+
|
115 |
+
grid = p.contains_points(recon.vertices[:, :2])
|
116 |
+
indeces = np.argwhere(grid == True)
|
117 |
+
points_so_far += list(indeces.flatten())
|
118 |
+
|
119 |
+
if smpl is not None:
|
120 |
+
num_verts = recon.vertices.shape[0]
|
121 |
+
recon_labels = smpl_to_recon_labels(recon, smpl)
|
122 |
+
body_parts_to_remove = [
|
123 |
+
'rightHand', 'leftToeBase', 'leftFoot', 'rightFoot', 'head',
|
124 |
+
'leftHandIndex1', 'rightHandIndex1', 'rightToeBase', 'leftHand',
|
125 |
+
'rightHand'
|
126 |
+
]
|
127 |
+
type = segmentation['type_id']
|
128 |
+
|
129 |
+
# Remove additional bodyparts that are most likely not part of the segmentation but might intersect (e.g. hand in front of torso)
|
130 |
+
# https://github.com/switchablenorms/DeepFashion2
|
131 |
+
# Short sleeve clothes
|
132 |
+
if type == 1 or type == 3 or type == 10:
|
133 |
+
body_parts_to_remove += ['leftForeArm', 'rightForeArm']
|
134 |
+
# No sleeves at all or lower body clothes
|
135 |
+
elif type == 5 or type == 6 or type == 12 or type == 13 or type == 8 or type == 9:
|
136 |
+
body_parts_to_remove += [
|
137 |
+
'leftForeArm', 'rightForeArm', 'leftArm', 'rightArm'
|
138 |
+
]
|
139 |
+
# Shorts
|
140 |
+
elif type == 7:
|
141 |
+
body_parts_to_remove += [
|
142 |
+
'leftLeg', 'rightLeg', 'leftForeArm', 'rightForeArm',
|
143 |
+
'leftArm', 'rightArm'
|
144 |
+
]
|
145 |
+
|
146 |
+
verts_to_remove = list(
|
147 |
+
itertools.chain.from_iterable(
|
148 |
+
[recon_labels[part] for part in body_parts_to_remove]))
|
149 |
+
|
150 |
+
label_mask = np.zeros(num_verts, dtype=bool)
|
151 |
+
label_mask[verts_to_remove] = True
|
152 |
+
|
153 |
+
seg_mask = np.zeros(num_verts, dtype=bool)
|
154 |
+
seg_mask[points_so_far] = True
|
155 |
+
|
156 |
+
# Remove points that belong to other bodyparts
|
157 |
+
# If a vertice in pointsSoFar is included in the bodyparts to remove, then these points should be removed
|
158 |
+
extra_verts_to_remove = np.array(list(seg_mask) and list(label_mask))
|
159 |
+
|
160 |
+
combine_mask = np.zeros(num_verts, dtype=bool)
|
161 |
+
combine_mask[points_so_far] = True
|
162 |
+
combine_mask[extra_verts_to_remove] = False
|
163 |
+
|
164 |
+
all_indices = np.argwhere(combine_mask == True).flatten()
|
165 |
+
|
166 |
+
i_x = np.where(np.in1d(faces[:, 0], all_indices))[0]
|
167 |
+
i_y = np.where(np.in1d(faces[:, 1], all_indices))[0]
|
168 |
+
i_z = np.where(np.in1d(faces[:, 2], all_indices))[0]
|
169 |
+
|
170 |
+
faces_to_keep = np.array(list(set(i_x).union(i_y).union(i_z)))
|
171 |
+
mask = np.zeros(len(recon.faces), dtype=bool)
|
172 |
+
if len(faces_to_keep) > 0:
|
173 |
+
mask[faces_to_keep] = True
|
174 |
+
|
175 |
+
mesh.update_faces(mask)
|
176 |
+
mesh.remove_unreferenced_vertices()
|
177 |
+
|
178 |
+
# mesh.rezero()
|
179 |
+
|
180 |
+
return mesh
|
181 |
+
|
182 |
+
return None
|
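smpl_to_recon_labels above transfers per-vertex body-part labels from a fitted SMPL mesh onto the clothed reconstruction with a nearest-neighbour classifier, which is what extract_cloth relies on to drop hands, head and other parts. A usage sketch with placeholder mesh paths (both meshes are assumed to be aligned in the same coordinate frame):

import trimesh
from lib.common.cloth_extraction import smpl_to_recon_labels

# hypothetical inputs: a clothed reconstruction and the SMPL fit aligned to it
recon = trimesh.load("results/subject_recon.obj", process=False)
smpl = trimesh.load("results/subject_smpl.obj", process=False)

labels = smpl_to_recon_labels(recon, smpl)   # dict: body-part name -> recon vertex indices
print(len(labels["leftForeArm"]), "reconstruction vertices labelled leftForeArm")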
lib/common/config.py
ADDED
@@ -0,0 +1,218 @@
1 |
+
# -*- coding: utf-8 -*-
|
2 |
+
|
3 |
+
# Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
|
4 |
+
# holder of all proprietary rights on this computer program.
|
5 |
+
# You can only use this computer program if you have closed
|
6 |
+
# a license agreement with MPG or you get the right to use the computer
|
7 |
+
# program from someone who is authorized to grant you that right.
|
8 |
+
# Any use of the computer program without a valid license is prohibited and
|
9 |
+
# liable to prosecution.
|
10 |
+
#
|
11 |
+
# Copyright©2019 Max-Planck-Gesellschaft zur Förderung
|
12 |
+
# der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
|
13 |
+
# for Intelligent Systems. All rights reserved.
|
14 |
+
#
|
15 |
+
# Contact: ps-license@tuebingen.mpg.de
|
16 |
+
|
17 |
+
from yacs.config import CfgNode as CN
|
18 |
+
import os
|
19 |
+
|
20 |
+
_C = CN(new_allowed=True)
|
21 |
+
|
22 |
+
# needed by trainer
|
23 |
+
_C.name = 'default'
|
24 |
+
_C.gpus = [0]
|
25 |
+
_C.test_gpus = [1]
|
26 |
+
_C.root = "./data/"
|
27 |
+
_C.ckpt_dir = './data/ckpt/'
|
28 |
+
_C.resume_path = ''
|
29 |
+
_C.normal_path = ''
|
30 |
+
_C.corr_path = ''
|
31 |
+
_C.results_path = './data/results/'
|
32 |
+
_C.projection_mode = 'orthogonal'
|
33 |
+
_C.num_views = 1
|
34 |
+
_C.sdf = False
|
35 |
+
_C.sdf_clip = 5.0
|
36 |
+
|
37 |
+
_C.lr_G = 1e-3
|
38 |
+
_C.lr_C = 1e-3
|
39 |
+
_C.lr_N = 2e-4
|
40 |
+
_C.weight_decay = 0.0
|
41 |
+
_C.momentum = 0.0
|
42 |
+
_C.optim = 'Adam'
|
43 |
+
_C.schedule = [5, 10, 15]
|
44 |
+
_C.gamma = 0.1
|
45 |
+
|
46 |
+
_C.overfit = False
|
47 |
+
_C.resume = False
|
48 |
+
_C.test_mode = False
|
49 |
+
_C.test_uv = False
|
50 |
+
_C.draw_geo_thres = 0.60
|
51 |
+
_C.num_sanity_val_steps = 2
|
52 |
+
_C.fast_dev = 0
|
53 |
+
_C.get_fit = False
|
54 |
+
_C.agora = False
|
55 |
+
_C.optim_cloth = False
|
56 |
+
_C.optim_body = False
|
57 |
+
_C.mcube_res = 256
|
58 |
+
_C.clean_mesh = True
|
59 |
+
_C.remesh = False
|
60 |
+
|
61 |
+
_C.batch_size = 4
|
62 |
+
_C.num_threads = 8
|
63 |
+
|
64 |
+
_C.num_epoch = 10
|
65 |
+
_C.freq_plot = 0.01
|
66 |
+
_C.freq_show_train = 0.1
|
67 |
+
_C.freq_show_val = 0.2
|
68 |
+
_C.freq_eval = 0.5
|
69 |
+
_C.accu_grad_batch = 4
|
70 |
+
|
71 |
+
_C.test_items = ['sv', 'mv', 'mv-fusion', 'hybrid', 'dc-pred', 'gt']
|
72 |
+
|
73 |
+
_C.net = CN()
|
74 |
+
_C.net.gtype = 'HGPIFuNet'
|
75 |
+
_C.net.ctype = 'resnet18'
|
76 |
+
_C.net.classifierIMF = 'MultiSegClassifier'
|
77 |
+
_C.net.netIMF = 'resnet18'
|
78 |
+
_C.net.norm = 'group'
|
79 |
+
_C.net.norm_mlp = 'group'
|
80 |
+
_C.net.norm_color = 'group'
|
81 |
+
_C.net.hg_down = 'conv128' #'ave_pool'
|
82 |
+
_C.net.num_views = 1
|
83 |
+
|
84 |
+
# kernel_size, stride, dilation, padding
|
85 |
+
|
86 |
+
_C.net.conv1 = [7, 2, 1, 3]
|
87 |
+
_C.net.conv3x3 = [3, 1, 1, 1]
|
88 |
+
|
89 |
+
_C.net.num_stack = 4
|
90 |
+
_C.net.num_hourglass = 2
|
91 |
+
_C.net.hourglass_dim = 256
|
92 |
+
_C.net.voxel_dim = 32
|
93 |
+
_C.net.resnet_dim = 120
|
94 |
+
_C.net.mlp_dim = [320, 1024, 512, 256, 128, 1]
|
95 |
+
_C.net.mlp_dim_knn = [320, 1024, 512, 256, 128, 3]
|
96 |
+
_C.net.mlp_dim_color = [513, 1024, 512, 256, 128, 3]
|
97 |
+
_C.net.mlp_dim_multiseg = [1088, 2048, 1024, 500]
|
98 |
+
_C.net.res_layers = [2, 3, 4]
|
99 |
+
_C.net.filter_dim = 256
|
100 |
+
_C.net.smpl_dim = 3
|
101 |
+
|
102 |
+
_C.net.cly_dim = 3
|
103 |
+
_C.net.soft_dim = 64
|
104 |
+
_C.net.z_size = 200.0
|
105 |
+
_C.net.N_freqs = 10
|
106 |
+
_C.net.geo_w = 0.1
|
107 |
+
_C.net.norm_w = 0.1
|
108 |
+
_C.net.dc_w = 0.1
|
109 |
+
_C.net.C_cat_to_G = False
|
110 |
+
|
111 |
+
_C.net.skip_hourglass = True
|
112 |
+
_C.net.use_tanh = False
|
113 |
+
_C.net.soft_onehot = True
|
114 |
+
_C.net.no_residual = False
|
115 |
+
_C.net.use_attention = False
|
116 |
+
|
117 |
+
_C.net.prior_type = "sdf"
|
118 |
+
_C.net.smpl_feats = ['sdf', 'cmap', 'norm', 'vis']
|
119 |
+
_C.net.use_filter = True
|
120 |
+
_C.net.use_cc = False
|
121 |
+
_C.net.use_PE = False
|
122 |
+
_C.net.use_IGR = False
|
123 |
+
_C.net.in_geo = ()
|
124 |
+
_C.net.in_nml = ()
|
125 |
+
|
126 |
+
_C.dataset = CN()
|
127 |
+
_C.dataset.root = ''
|
128 |
+
_C.dataset.set_splits = [0.95, 0.04]
|
129 |
+
_C.dataset.types = [
|
130 |
+
"3dpeople", "axyz", "renderpeople", "renderpeople_p27", "humanalloy"
|
131 |
+
]
|
132 |
+
_C.dataset.scales = [1.0, 100.0, 1.0, 1.0, 100.0 / 39.37]
|
133 |
+
_C.dataset.rp_type = "pifu900"
|
134 |
+
_C.dataset.th_type = 'train'
|
135 |
+
_C.dataset.input_size = 512
|
136 |
+
_C.dataset.rotation_num = 3
|
137 |
+
_C.dataset.num_sample_ray=128 # volume rendering
|
138 |
+
_C.dataset.num_precomp = 10 # Number of segmentation classifiers
|
139 |
+
_C.dataset.num_multiseg = 500 # Number of categories per classifier
|
140 |
+
_C.dataset.num_knn = 10 # for loss/error
|
141 |
+
_C.dataset.num_knn_dis = 20 # for accuracy
|
142 |
+
_C.dataset.num_verts_max = 20000
|
143 |
+
_C.dataset.zray_type = False
|
144 |
+
_C.dataset.online_smpl = False
|
145 |
+
_C.dataset.noise_type = ['z-trans', 'pose', 'beta']
|
146 |
+
_C.dataset.noise_scale = [0.0, 0.0, 0.0]
|
147 |
+
_C.dataset.num_sample_geo = 10000
|
148 |
+
_C.dataset.num_sample_color = 0
|
149 |
+
_C.dataset.num_sample_seg = 0
|
150 |
+
_C.dataset.num_sample_knn = 10000
|
151 |
+
|
152 |
+
_C.dataset.sigma_geo = 5.0
|
153 |
+
_C.dataset.sigma_color = 0.10
|
154 |
+
_C.dataset.sigma_seg = 0.10
|
155 |
+
_C.dataset.thickness_threshold = 20.0
|
156 |
+
_C.dataset.ray_sample_num = 2
|
157 |
+
_C.dataset.semantic_p = False
|
158 |
+
_C.dataset.remove_outlier = False
|
159 |
+
|
160 |
+
_C.dataset.train_bsize = 1.0
|
161 |
+
_C.dataset.val_bsize = 1.0
|
162 |
+
_C.dataset.test_bsize = 1.0
|
163 |
+
|
164 |
+
|
165 |
+
def get_cfg_defaults():
|
166 |
+
"""Get a yacs CfgNode object with default values for my_project."""
|
167 |
+
# Return a clone so that the defaults will not be altered
|
168 |
+
# This is for the "local variable" use pattern
|
169 |
+
return _C.clone()
|
170 |
+
|
171 |
+
|
172 |
+
# Alternatively, provide a way to import the defaults as
|
173 |
+
# a global singleton:
|
174 |
+
cfg = _C # users can `from config import cfg`
|
175 |
+
|
176 |
+
# cfg = get_cfg_defaults()
|
177 |
+
# cfg.merge_from_file('./configs/example.yaml')
|
178 |
+
|
179 |
+
# # Now override from a list (opts could come from the command line)
|
180 |
+
# opts = ['dataset.root', './data/XXXX', 'learning_rate', '1e-2']
|
181 |
+
# cfg.merge_from_list(opts)
|
182 |
+
|
183 |
+
|
184 |
+
def update_cfg(cfg_file):
|
185 |
+
# cfg = get_cfg_defaults()
|
186 |
+
_C.merge_from_file(cfg_file)
|
187 |
+
# return cfg.clone()
|
188 |
+
return _C
|
189 |
+
|
190 |
+
|
191 |
+
def parse_args(args):
|
192 |
+
cfg_file = args.cfg_file
|
193 |
+
if args.cfg_file is not None:
|
194 |
+
cfg = update_cfg(args.cfg_file)
|
195 |
+
else:
|
196 |
+
cfg = get_cfg_defaults()
|
197 |
+
|
198 |
+
# if args.misc is not None:
|
199 |
+
# cfg.merge_from_list(args.misc)
|
200 |
+
|
201 |
+
return cfg
|
202 |
+
|
203 |
+
|
204 |
+
def parse_args_extend(args):
|
205 |
+
if args.resume:
|
206 |
+
if not os.path.exists(args.log_dir):
|
207 |
+
raise ValueError(
|
208 |
+
'Experiment are set to resume mode, but log directory does not exist.'
|
209 |
+
)
|
210 |
+
|
211 |
+
# load log's cfg
|
212 |
+
cfg_file = os.path.join(args.log_dir, 'cfg.yaml')
|
213 |
+
cfg = update_cfg(cfg_file)
|
214 |
+
|
215 |
+
if args.misc is not None:
|
216 |
+
cfg.merge_from_list(args.misc)
|
217 |
+
else:
|
218 |
+
parse_args(args)
|
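lib/common/config.py follows the usual yacs pattern: a module-level CfgNode with defaults, get_cfg_defaults() for a clone, and merge/freeze at call sites (the __main__ block of econdataset.py earlier uses exactly this). A short sketch; the commented YAML path is the one used there:

from lib.common.config import get_cfg_defaults

cfg = get_cfg_defaults()                       # clone of the defaults, module-level _C stays untouched
# cfg.merge_from_file("./configs/icon-filter.yaml")            # optional YAML overrides
cfg.merge_from_list(["mcube_res", 512, "clean_mesh", False])   # alternating key/value overrides
cfg.freeze()                                   # later accidental writes or typos now raise
print(cfg.mcube_res, cfg.net.mlp_dim)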
lib/common/imutils.py
ADDED
@@ -0,0 +1,364 @@
1 |
+
import os
|
2 |
+
os.environ["OPENCV_IO_ENABLE_OPENEXR"]="1"
|
3 |
+
import cv2
|
4 |
+
import mediapipe as mp
|
5 |
+
import torch
|
6 |
+
import numpy as np
|
7 |
+
import torch.nn.functional as F
|
8 |
+
from PIL import Image
|
9 |
+
from lib.pymafx.core import constants
|
10 |
+
from rembg import remove
|
11 |
+
# from rembg.session_factory import new_session
|
12 |
+
from torchvision import transforms
|
13 |
+
from kornia.geometry.transform import get_affine_matrix2d, warp_affine
|
14 |
+
|
15 |
+
|
16 |
+
def transform_to_tensor(res, mean=None, std=None, is_tensor=False):
|
17 |
+
all_ops = []
|
18 |
+
if res is not None:
|
19 |
+
all_ops.append(transforms.Resize(size=res))
|
20 |
+
if not is_tensor:
|
21 |
+
all_ops.append(transforms.ToTensor())
|
22 |
+
if mean is not None and std is not None:
|
23 |
+
all_ops.append(transforms.Normalize(mean=mean, std=std))
|
24 |
+
return transforms.Compose(all_ops)
|
25 |
+
|
26 |
+
|
27 |
+
def get_affine_matrix_wh(w1, h1, w2, h2):
|
28 |
+
|
29 |
+
transl = torch.tensor([(w2 - w1) / 2.0, (h2 - h1) / 2.0]).unsqueeze(0)
|
30 |
+
center = torch.tensor([w1 / 2.0, h1 / 2.0]).unsqueeze(0)
|
31 |
+
scale = torch.min(torch.tensor([w2 / w1, h2 / h1])).repeat(2).unsqueeze(0)
|
32 |
+
M = get_affine_matrix2d(transl, center, scale, angle=torch.tensor([0.]))
|
33 |
+
|
34 |
+
return M
|
35 |
+
|
36 |
+
|
37 |
+
def get_affine_matrix_box(boxes, w2, h2):
|
38 |
+
|
39 |
+
# boxes [left, top, right, bottom]
|
40 |
+
width = boxes[:, 2] - boxes[:, 0] #(N,)
|
41 |
+
height = boxes[:, 3] - boxes[:, 1] #(N,)
|
42 |
+
center = torch.tensor(
|
43 |
+
[(boxes[:, 0] + boxes[:, 2]) / 2.0, (boxes[:, 1] + boxes[:, 3]) / 2.0]
|
44 |
+
).T #(N,2)
|
45 |
+
scale = torch.min(torch.tensor([w2 / width, h2 / height]),
|
46 |
+
dim=0)[0].unsqueeze(1).repeat(1, 2) * 0.9 #(N,2)
|
47 |
+
transl = torch.cat([w2 / 2.0 - center[:, 0:1], h2 / 2.0 - center[:, 1:2]], dim=1) #(N,2)
|
48 |
+
M = get_affine_matrix2d(transl, center, scale, angle=torch.tensor([0.,]*transl.shape[0]))
|
49 |
+
|
50 |
+
return M
|
51 |
+
|
52 |
+
|
53 |
+
def load_img(img_file):
|
54 |
+
|
55 |
+
if img_file.endswith("exr"):
|
56 |
+
img = cv2.imread(img_file, cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)
|
57 |
+
else :
|
58 |
+
img = cv2.imread(img_file, cv2.IMREAD_UNCHANGED)
|
59 |
+
|
60 |
+
# considering non 8-bit image
|
61 |
+
if img.dtype != np.uint8 :
|
62 |
+
img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)
|
63 |
+
|
64 |
+
if len(img.shape) == 2:
|
65 |
+
img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
|
66 |
+
|
67 |
+
if not img_file.endswith("png"):
|
68 |
+
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
|
69 |
+
else:
|
70 |
+
img = cv2.cvtColor(img, cv2.COLOR_RGBA2BGR)
|
71 |
+
|
72 |
+
return torch.tensor(img).permute(2, 0, 1).unsqueeze(0).float(), img.shape[:2]
|
73 |
+
|
74 |
+
|
75 |
+
def get_keypoints(image):
|
76 |
+
def collect_xyv(x, body=True):
|
77 |
+
lmk = x.landmark
|
78 |
+
all_lmks = []
|
79 |
+
for i in range(len(lmk)):
|
80 |
+
visibility = lmk[i].visibility if body else 1.0
|
81 |
+
all_lmks.append(torch.Tensor([lmk[i].x, lmk[i].y, lmk[i].z, visibility]))
|
82 |
+
return torch.stack(all_lmks).view(-1, 4)
|
83 |
+
|
84 |
+
mp_holistic = mp.solutions.holistic
|
85 |
+
|
86 |
+
with mp_holistic.Holistic(
|
87 |
+
static_image_mode=True,
|
88 |
+
model_complexity=2,
|
89 |
+
) as holistic:
|
90 |
+
results = holistic.process(image)
|
91 |
+
|
92 |
+
fake_kps = torch.zeros(33, 4)
|
93 |
+
|
94 |
+
result = {}
|
95 |
+
result["body"] = collect_xyv(results.pose_landmarks) if results.pose_landmarks else fake_kps
|
96 |
+
result["lhand"] = collect_xyv(
|
97 |
+
results.left_hand_landmarks, False
|
98 |
+
) if results.left_hand_landmarks else fake_kps
|
99 |
+
result["rhand"] = collect_xyv(
|
100 |
+
results.right_hand_landmarks, False
|
101 |
+
) if results.right_hand_landmarks else fake_kps
|
102 |
+
result["face"] = collect_xyv(
|
103 |
+
results.face_landmarks, False
|
104 |
+
) if results.face_landmarks else fake_kps
|
105 |
+
|
106 |
+
return result
|
107 |
+
|
108 |
+
|
109 |
+
def get_pymafx(image, landmarks):
|
110 |
+
|
111 |
+
# image [3,512,512]
|
112 |
+
|
113 |
+
item = {
|
114 |
+
'img_body':
|
115 |
+
F.interpolate(image.unsqueeze(0), size=224, mode='bicubic', align_corners=True)[0]
|
116 |
+
}
|
117 |
+
|
118 |
+
for part in ['lhand', 'rhand', 'face']:
|
119 |
+
kp2d = landmarks[part]
|
120 |
+
kp2d_valid = kp2d[kp2d[:, 3] > 0.]
|
121 |
+
if len(kp2d_valid) > 0:
|
122 |
+
bbox = [
|
123 |
+
min(kp2d_valid[:, 0]),
|
124 |
+
min(kp2d_valid[:, 1]),
|
125 |
+
max(kp2d_valid[:, 0]),
|
126 |
+
max(kp2d_valid[:, 1])
|
127 |
+
]
|
128 |
+
center_part = [(bbox[2] + bbox[0]) / 2., (bbox[3] + bbox[1]) / 2.]
|
129 |
+
scale_part = 2. * max(bbox[2] - bbox[0], bbox[3] - bbox[1]) / 2
|
130 |
+
|
131 |
+
# handle invalid part keypoints
|
132 |
+
if len(kp2d_valid) < 1 or scale_part < 0.01:
|
133 |
+
center_part = [0, 0]
|
134 |
+
scale_part = 0.5
|
135 |
+
kp2d[:, 3] = 0
|
136 |
+
|
137 |
+
center_part = torch.tensor(center_part).float()
|
138 |
+
|
139 |
+
theta_part = torch.zeros(1, 2, 3)
|
140 |
+
theta_part[:, 0, 0] = scale_part
|
141 |
+
theta_part[:, 1, 1] = scale_part
|
142 |
+
theta_part[:, :, -1] = center_part
|
143 |
+
|
144 |
+
grid = F.affine_grid(theta_part, torch.Size([1, 3, 224, 224]), align_corners=False)
|
145 |
+
img_part = F.grid_sample(image.unsqueeze(0), grid, align_corners=False).squeeze(0).float()
|
146 |
+
|
147 |
+
item[f'img_{part}'] = img_part
|
148 |
+
|
149 |
+
theta_i_inv = torch.zeros_like(theta_part)
|
150 |
+
theta_i_inv[:, 0, 0] = 1. / theta_part[:, 0, 0]
|
151 |
+
theta_i_inv[:, 1, 1] = 1. / theta_part[:, 1, 1]
|
152 |
+
theta_i_inv[:, :, -1] = -theta_part[:, :, -1] / theta_part[:, 0, 0].unsqueeze(-1)
|
153 |
+
item[f'{part}_theta_inv'] = theta_i_inv[0]
|
154 |
+
|
155 |
+
return item
|
156 |
+
|
157 |
+
|
158 |
+
def remove_floats(mask):
|
159 |
+
|
160 |
+
# 1. find all the contours
|
161 |
+
# 2. fillPoly "True" for the largest one
|
162 |
+
# 3. fillPoly "False" for its childrens
|
163 |
+
|
164 |
+
new_mask = np.zeros(mask.shape)
|
165 |
+
cnts, hier = cv2.findContours(mask.astype(np.uint8), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
|
166 |
+
cnt_index = sorted(range(len(cnts)), key=lambda k: cv2.contourArea(cnts[k]), reverse=True)
|
167 |
+
body_cnt = cnts[cnt_index[0]]
|
168 |
+
childs_cnt_idx = np.where(np.array(hier)[0, :, -1] == cnt_index[0])[0]
|
169 |
+
childs_cnt = [cnts[idx] for idx in childs_cnt_idx]
|
170 |
+
cv2.fillPoly(new_mask, [body_cnt], 1)
|
171 |
+
cv2.fillPoly(new_mask, childs_cnt, 0)
|
172 |
+
|
173 |
+
return new_mask
|
174 |
+
|
175 |
+
|
176 |
+
def econ_process_image(img_file, hps_type, single, input_res, detector):
|
177 |
+
|
178 |
+
img_raw, (in_height, in_width) = load_img(img_file)
|
179 |
+
tgt_res = input_res * 2
|
180 |
+
M_square = get_affine_matrix_wh(in_width, in_height, tgt_res, tgt_res)
|
181 |
+
img_square = warp_affine(
|
182 |
+
img_raw,
|
183 |
+
M_square[:, :2], (tgt_res, ) * 2,
|
184 |
+
mode='bilinear',
|
185 |
+
padding_mode='zeros',
|
186 |
+
align_corners=True
|
187 |
+
)
|
188 |
+
|
189 |
+
# detection for bbox
|
190 |
+
predictions = detector(img_square / 255.)[0]
|
191 |
+
|
192 |
+
if single:
|
193 |
+
top_score = predictions["scores"][predictions["labels"] == 1].max()
|
194 |
+
human_ids = torch.where(predictions["scores"] == top_score)[0]
|
195 |
+
else:
|
196 |
+
human_ids = torch.logical_and(predictions["labels"] == 1,
|
197 |
+
predictions["scores"] > 0.9).nonzero().squeeze(1)
|
198 |
+
|
199 |
+
boxes = predictions["boxes"][human_ids, :].detach().cpu().numpy()
|
200 |
+
masks = predictions["masks"][human_ids, :, :].permute(0, 2, 3, 1).detach().cpu().numpy()
|
201 |
+
|
202 |
+
M_crop = get_affine_matrix_box(boxes, input_res, input_res)
|
203 |
+
|
204 |
+
img_icon_lst = []
|
205 |
+
img_crop_lst = []
|
206 |
+
img_hps_lst = []
|
207 |
+
img_mask_lst = []
|
208 |
+
landmark_lst = []
|
209 |
+
hands_visibility_lst = []
|
210 |
+
img_pymafx_lst = []
|
211 |
+
|
212 |
+
uncrop_param = {
|
213 |
+
"ori_shape": [in_height, in_width],
|
214 |
+
"box_shape": [input_res, input_res],
|
215 |
+
"square_shape": [tgt_res, tgt_res],
|
216 |
+
"M_square": M_square,
|
217 |
+
"M_crop": M_crop
|
218 |
+
}
|
219 |
+
|
220 |
+
for idx in range(len(boxes)):
|
221 |
+
|
222 |
+
# mask out the pixels of others
|
223 |
+
if len(masks) > 1:
|
224 |
+
mask_detection = (masks[np.arange(len(masks)) != idx]).max(axis=0)
|
225 |
+
else:
|
226 |
+
mask_detection = masks[0] * 0.
|
227 |
+
|
228 |
+
img_square_rgba = torch.cat(
|
229 |
+
[img_square.squeeze(0).permute(1, 2, 0),
|
230 |
+
torch.tensor(mask_detection < 0.4) * 255],
|
231 |
+
dim=2
|
232 |
+
)
|
233 |
+
|
234 |
+
img_crop = warp_affine(
|
235 |
+
img_square_rgba.unsqueeze(0).permute(0, 3, 1, 2),
|
236 |
+
M_crop[idx:idx + 1, :2], (input_res, ) * 2,
|
237 |
+
mode='bilinear',
|
238 |
+
padding_mode='zeros',
|
239 |
+
align_corners=True
|
240 |
+
).squeeze(0).permute(1, 2, 0).numpy().astype(np.uint8)
|
241 |
+
|
242 |
+
# get accurate person segmentation mask
|
243 |
+
img_rembg = remove(img_crop) #post_process_mask=True)
|
244 |
+
img_mask = remove_floats(img_rembg[:, :, [3]])
|
245 |
+
|
246 |
+
mean_icon = std_icon = (0.5, 0.5, 0.5)
|
247 |
+
img_np = (img_rembg[..., :3] * img_mask).astype(np.uint8)
|
248 |
+
img_icon = transform_to_tensor(512, mean_icon, std_icon)(
|
249 |
+
Image.fromarray(img_np)
|
250 |
+
) * torch.tensor(img_mask).permute(2, 0, 1)
|
251 |
+
img_hps = transform_to_tensor(224, constants.IMG_NORM_MEAN,
|
252 |
+
constants.IMG_NORM_STD)(Image.fromarray(img_np))
|
253 |
+
|
254 |
+
landmarks = get_keypoints(img_np)
|
255 |
+
|
256 |
+
# get hands visibility
|
257 |
+
hands_visibility = [True, True]
|
258 |
+
if landmarks['lhand'][:, -1].mean() == 0.:
|
259 |
+
hands_visibility[0] = False
|
260 |
+
if landmarks['rhand'][:, -1].mean() == 0.:
|
261 |
+
hands_visibility[1] = False
|
262 |
+
+        hands_visibility_lst.append(hands_visibility)
+
+        if hps_type == 'pymafx':
+            img_pymafx_lst.append(
+                get_pymafx(
+                    transform_to_tensor(512, constants.IMG_NORM_MEAN,
+                                        constants.IMG_NORM_STD)(Image.fromarray(img_np)), landmarks
+                )
+            )
+
+        img_crop_lst.append(torch.tensor(img_crop).permute(2, 0, 1) / 255.0)
+        img_icon_lst.append(img_icon)
+        img_hps_lst.append(img_hps)
+        img_mask_lst.append(torch.tensor(img_mask[..., 0]))
+        landmark_lst.append(landmarks['body'])
+
+    # required image tensors / arrays
+
+    # img_icon (tensor): (-1, 1), [3,512,512]
+    # img_hps (tensor): (-2.11, 2.44), [3,224,224]
+
+    # img_np (array): (0, 255), [512,512,3]
+    # img_rembg (array): (0, 255), [512,512,4]
+    # img_mask (array): (0, 1), [512,512,1]
+    # img_crop (array): (0, 255), [512,512,4]
+
+    return_dict = {
+        "img_icon": torch.stack(img_icon_lst).float(),    # [N, 3, res, res]
+        "img_crop": torch.stack(img_crop_lst).float(),    # [N, 4, res, res]
+        "img_hps": torch.stack(img_hps_lst).float(),    # [N, 3, res, res]
+        "img_raw": img_raw,    # [1, 3, H, W]
+        "img_mask": torch.stack(img_mask_lst).float(),    # [N, res, res]
+        "uncrop_param": uncrop_param,
+        "landmark": torch.stack(landmark_lst),    # [N, 33, 4]
+        "hands_visibility": hands_visibility_lst,
+    }
+
+    img_pymafx = {}
+
+    if len(img_pymafx_lst) > 0:
+        for idx in range(len(img_pymafx_lst)):
+            for key in img_pymafx_lst[idx].keys():
+                if key not in img_pymafx.keys():
+                    img_pymafx[key] = [img_pymafx_lst[idx][key]]
+                else:
+                    img_pymafx[key] += [img_pymafx_lst[idx][key]]
+
+        for key in img_pymafx.keys():
+            img_pymafx[key] = torch.stack(img_pymafx[key]).float()
+
+        return_dict.update({"img_pymafx": img_pymafx})
+
+    return return_dict
+
+
+def blend_rgb_norm(norms, data):
+
+    # norms [N, 3, res, res]
+    masks = (norms.sum(dim=1) != norms[0, :, 0, 0].sum()).float().unsqueeze(1)
+    norm_mask = F.interpolate(
+        torch.cat([norms, masks], dim=1).detach(),
+        size=data["uncrop_param"]["box_shape"],
+        mode="bilinear",
+        align_corners=False
+    )
+    final = data["img_raw"].type_as(norm_mask)
+
+    for idx in range(len(norms)):
+
+        norm_pred = (norm_mask[idx:idx + 1, :3, :, :] + 1.0) * 255.0 / 2.0
+        mask_pred = norm_mask[idx:idx + 1, 3:4, :, :].repeat(1, 3, 1, 1)
+
+        norm_ori = unwrap(norm_pred, data["uncrop_param"], idx)
+        mask_ori = unwrap(mask_pred, data["uncrop_param"], idx)
+
+        final = final * (1.0 - mask_ori) + norm_ori * mask_ori
+
+    return final.detach().cpu()
+
+
+def unwrap(image, uncrop_param, idx):
+
+    device = image.device
+
+    img_square = warp_affine(
+        image,
+        torch.inverse(uncrop_param["M_crop"])[idx:idx + 1, :2].to(device),
+        uncrop_param["square_shape"],
+        mode='bilinear',
+        padding_mode='zeros',
+        align_corners=True
+    )
+
+    img_ori = warp_affine(
+        img_square,
+        torch.inverse(uncrop_param["M_square"])[:, :2].to(device),
+        uncrop_param["ori_shape"],
+        mode='bilinear',
+        padding_mode='zeros',
+        align_corners=True
+    )
+
+    return img_ori
lib/common/render.py
ADDED
@@ -0,0 +1,398 @@
+# -*- coding: utf-8 -*-
+
+# Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
+# holder of all proprietary rights on this computer program.
+# You can only use this computer program if you have closed
+# a license agreement with MPG or you get the right to use the computer
+# program from someone who is authorized to grant you that right.
+# Any use of the computer program without a valid license is prohibited and
+# liable to prosecution.
+#
+# Copyright©2019 Max-Planck-Gesellschaft zur Förderung
+# der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
+# for Intelligent Systems. All rights reserved.
+#
+# Contact: ps-license@tuebingen.mpg.de
+
+from pytorch3d.renderer import (
+    BlendParams,
+    blending,
+    look_at_view_transform,
+    FoVOrthographicCameras,
+    PointLights,
+    RasterizationSettings,
+    PointsRasterizationSettings,
+    PointsRenderer,
+    AlphaCompositor,
+    PointsRasterizer,
+    MeshRenderer,
+    MeshRasterizer,
+    SoftPhongShader,
+    SoftSilhouetteShader,
+    TexturesVertex,
+)
+from pytorch3d.renderer.mesh import TexturesVertex
+from pytorch3d.structures import Meshes
+from lib.dataset.mesh_util import get_visibility, get_visibility_color
+
+import lib.common.render_utils as util
+import torch
+import numpy as np
+from PIL import Image
+from tqdm import tqdm
+import os
+import cv2
+import math
+from termcolor import colored
+
+
+def image2vid(images, vid_path):
+
+    w, h = images[0].size
+    videodims = (w, h)
+    fourcc = cv2.VideoWriter_fourcc(*'XVID')
+    video = cv2.VideoWriter(vid_path, fourcc, len(images) / 5.0, videodims)
+    for image in images:
+        video.write(cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR))
+    video.release()
+
+
+def query_color(verts, faces, image, device, predicted_color):
+    """query colors from points and image
+
+    Args:
+        verts ([N, 3]): query vertices
+        faces ([M, 3]): query faces
+        image ([B, 3, H, W]): full image
+
+    Returns:
+        [np.float]: per-vertex colors
+    """
+
+    verts = verts.float().to(device)
+    faces = faces.long().to(device)
+    predicted_color = predicted_color.to(device)
+    (xy, z) = verts.split([2, 1], dim=1)
+    visibility = get_visibility_color(xy, z, faces[:, [0, 2, 1]]).flatten()
+    uv = xy.unsqueeze(0).unsqueeze(2)    # [B, N, 2]
+    uv = uv * torch.tensor([1.0, -1.0]).type_as(uv)
+    colors = (torch.nn.functional.grid_sample(
+        image, uv, align_corners=True)[0, :, :, 0].permute(1, 0) +
+              1.0) * 0.5 * 255.0
+    colors[visibility == 0.0] = (predicted_color * 255.0)[visibility == 0.0]
+
+    return colors.detach().cpu()
+
+
+class cleanShader(torch.nn.Module):
+
+    def __init__(self, device="cpu", cameras=None, blend_params=None):
+        super().__init__()
+        self.cameras = cameras
+        self.blend_params = blend_params if blend_params is not None else BlendParams(
+        )
+
+    def forward(self, fragments, meshes, **kwargs):
+        cameras = kwargs.get("cameras", self.cameras)
+        if cameras is None:
+            msg = "Cameras must be specified either at initialization \
+                or in the forward pass of TexturedSoftPhongShader"
+
+            raise ValueError(msg)
+
+        # get renderer output
+        blend_params = kwargs.get("blend_params", self.blend_params)
+        texels = meshes.sample_textures(fragments)
+        images = blending.softmax_rgb_blend(texels,
+                                            fragments,
+                                            blend_params,
+                                            znear=-256,
+                                            zfar=256)
+
+        return images
+
+
+class Render:
+
+    def __init__(self, size=512, device=torch.device("cuda:0")):
+        self.device = device
+        self.size = size
+
+        # camera setting
+        self.dis = 100.0
+        self.scale = 100.0
+        self.mesh_y_center = 0.0
+
+        self.reload_cam()
+
+        self.type = "color"
+
+        self.mesh = None
+        self.deform_mesh = None
+        self.pcd = None
+        self.renderer = None
+        self.meshRas = None
+
+        self.uv_rasterizer = util.Pytorch3dRasterizer(self.size)
+
+    def reload_cam(self):
+
+        self.cam_pos = [
+            (0, self.mesh_y_center, self.dis),
+            (self.dis, self.mesh_y_center, 0),
+            (0, self.mesh_y_center, -self.dis),
+            (-self.dis, self.mesh_y_center, 0),
+            (0, self.mesh_y_center + self.dis, 0),
+            (0, self.mesh_y_center - self.dis, 0),
+        ]
+
+    def get_camera(self, cam_id):
+
+        if cam_id == 4:
+            R, T = look_at_view_transform(
+                eye=[self.cam_pos[cam_id]],
+                at=((0, self.mesh_y_center, 0), ),
+                up=((0, 0, 1), ),
+            )
+        elif cam_id == 5:
+            R, T = look_at_view_transform(
+                eye=[self.cam_pos[cam_id]],
+                at=((0, self.mesh_y_center, 0), ),
+                up=((0, 0, 1), ),
+            )
+
+        else:
+            R, T = look_at_view_transform(
+                eye=[self.cam_pos[cam_id]],
+                at=((0, self.mesh_y_center, 0), ),
+                up=((0, 1, 0), ),
+            )
+
+        camera = FoVOrthographicCameras(
+            device=self.device,
+            R=R,
+            T=T,
+            znear=100.0,
+            zfar=-100.0,
+            max_y=100.0,
+            min_y=-100.0,
+            max_x=100.0,
+            min_x=-100.0,
+            scale_xyz=(self.scale * np.ones(3), ),
+        )
+
+        return camera
+
+    def init_renderer(self, camera, type="clean_mesh", bg="gray"):
+
+        if "mesh" in type:
+
+            # rasterizer
+            self.raster_settings_mesh = RasterizationSettings(
+                image_size=self.size,
+                blur_radius=np.log(1.0 / 1e-4) * 1e-7,
+                faces_per_pixel=30,
+            )
+            self.meshRas = MeshRasterizer(
+                cameras=camera, raster_settings=self.raster_settings_mesh)
+
+        if bg == "black":
+            blendparam = BlendParams(1e-4, 1e-4, (0.0, 0.0, 0.0))
+        elif bg == "white":
+            blendparam = BlendParams(1e-4, 1e-8, (1.0, 1.0, 1.0))
+        elif bg == "gray":
+            blendparam = BlendParams(1e-4, 1e-8, (0.5, 0.5, 0.5))
+
+        if type == "ori_mesh":
+
+            lights = PointLights(
+                device=self.device,
+                ambient_color=((0.8, 0.8, 0.8), ),
+                diffuse_color=((0.2, 0.2, 0.2), ),
+                specular_color=((0.0, 0.0, 0.0), ),
+                location=[[0.0, 200.0, 0.0]],
+            )
+
+            self.renderer = MeshRenderer(
+                rasterizer=self.meshRas,
+                shader=SoftPhongShader(
+                    device=self.device,
+                    cameras=camera,
+                    lights=None,
+                    blend_params=blendparam,
+                ),
+            )
+
+        if type == "silhouette":
+            self.raster_settings_silhouette = RasterizationSettings(
+                image_size=self.size,
+                blur_radius=np.log(1.0 / 1e-4 - 1.0) * 5e-5,
+                faces_per_pixel=50,
+                cull_backfaces=True,
+            )
+
+            self.silhouetteRas = MeshRasterizer(
+                cameras=camera,
+                raster_settings=self.raster_settings_silhouette)
+            self.renderer = MeshRenderer(rasterizer=self.silhouetteRas,
+                                         shader=SoftSilhouetteShader())
+
+        if type == "pointcloud":
+            self.raster_settings_pcd = PointsRasterizationSettings(
+                image_size=self.size, radius=0.006, points_per_pixel=10)
+
+            self.pcdRas = PointsRasterizer(
+                cameras=camera, raster_settings=self.raster_settings_pcd)
+            self.renderer = PointsRenderer(
+                rasterizer=self.pcdRas,
+                compositor=AlphaCompositor(background_color=(0, 0, 0)),
+            )
+
+        if type == "clean_mesh":
+
+            self.renderer = MeshRenderer(
+                rasterizer=self.meshRas,
+                shader=cleanShader(device=self.device,
+                                   cameras=camera,
+                                   blend_params=blendparam),
+            )
+
+    def VF2Mesh(self, verts, faces, vertex_texture=None):
+
+        if not torch.is_tensor(verts):
+            verts = torch.tensor(verts)
+        if not torch.is_tensor(faces):
+            faces = torch.tensor(faces)
+
+        if verts.ndimension() == 2:
+            verts = verts.unsqueeze(0).float()
+        if faces.ndimension() == 2:
+            faces = faces.unsqueeze(0).long()
+
+        verts = verts.to(self.device)
+        faces = faces.to(self.device)
+        if vertex_texture is not None:
+            vertex_texture = vertex_texture.to(self.device)
+
+        mesh = Meshes(verts, faces).to(self.device)
+
+        if vertex_texture is None:
+            mesh.textures = TexturesVertex(
+                verts_features=(mesh.verts_normals_padded() + 1.0) * 0.5)    # modify
+        else:
+            mesh.textures = TexturesVertex(
+                verts_features=vertex_texture.unsqueeze(0))    # modify
+        return mesh
+
+    def load_meshes(self, verts, faces, offset=None, vertex_texture=None):
+        """load mesh into the pytorch3d renderer
+
+        Args:
+            verts ([N,3]): verts
+            faces ([N,3]): faces
+            offset ([N,3]): offset
+        """
+        if offset is not None:
+            verts = verts + offset
+
+        if isinstance(verts, list):
+            self.meshes = []
+            for V, F in zip(verts, faces):
+                if vertex_texture is None:
+                    self.meshes.append(self.VF2Mesh(V, F))
+                else:
+                    self.meshes.append(self.VF2Mesh(V, F, vertex_texture))
+        else:
+            if vertex_texture is None:
+                self.meshes = [self.VF2Mesh(verts, faces)]
+            else:
+                self.meshes = [self.VF2Mesh(verts, faces, vertex_texture)]
+
+    def get_depth_map(self, cam_ids=[0, 2]):
+
+        depth_maps = []
+        for cam_id in cam_ids:
+            self.init_renderer(self.get_camera(cam_id), "clean_mesh", "gray")
+            fragments = self.meshRas(self.meshes[0])
+            depth_map = fragments.zbuf[..., 0].squeeze(0)
+            if cam_id == 2:
+                depth_map = torch.fliplr(depth_map)
+            depth_maps.append(depth_map)
+
+        return depth_maps
+
+    def get_rgb_image(self, cam_ids=[0, 2], bg='gray'):
+
+        images = []
+        for cam_id in range(len(self.cam_pos)):
+            if cam_id in cam_ids:
+                self.init_renderer(self.get_camera(cam_id), "clean_mesh", bg)
+                if len(cam_ids) == 4:
+                    rendered_img = (self.renderer(
+                        self.meshes[0])[0:1, :, :, :3].permute(0, 3, 1, 2) -
+                                    0.5) * 2.0
+                else:
+                    rendered_img = (self.renderer(
+                        self.meshes[0])[0:1, :, :, :3].permute(0, 3, 1, 2) -
+                                    0.5) * 2.0
+                if cam_id == 2 and len(cam_ids) == 2:
+                    rendered_img = torch.flip(rendered_img, dims=[3])
+                images.append(rendered_img)
+
+        return images
+
+    def get_rendered_video(self, images, save_path):
+
+        self.cam_pos = []
+        for angle in range(360):
+            self.cam_pos.append((
+                100.0 * math.cos(np.pi / 180 * angle),
+                self.mesh_y_center,
+                100.0 * math.sin(np.pi / 180 * angle),
+            ))
+
+        old_shape = np.array(images[0].shape[:2])
+        new_shape = np.around(
+            (self.size / old_shape[0]) * old_shape).astype(int)
+
+        fourcc = cv2.VideoWriter_fourcc(*"mp4v")
+        video = cv2.VideoWriter(save_path, fourcc, 10,
+                                (self.size * len(self.meshes) +
+                                 new_shape[1] * len(images), self.size))
+
+        pbar = tqdm(range(len(self.cam_pos)))
+        pbar.set_description(
+            colored(f"exporting video {os.path.basename(save_path)}...",
+                    "blue"))
+        for cam_id in pbar:
+            self.init_renderer(self.get_camera(cam_id), "clean_mesh", "gray")
+
+            img_lst = [
+                np.array(Image.fromarray(img).resize(new_shape[::-1])).astype(
+                    np.uint8)[:, :, [2, 1, 0]] for img in images
+            ]
+
+            for mesh in self.meshes:
+                rendered_img = ((self.renderer(mesh)[0, :, :, :3] *
+                                 255.0).detach().cpu().numpy().astype(
+                                     np.uint8))
+
+                img_lst.append(rendered_img)
+            final_img = np.concatenate(img_lst, axis=1)
+            video.write(final_img)
+
+        video.release()
+        self.reload_cam()
+
+    def get_silhouette_image(self, cam_ids=[0, 2]):
+
+        images = []
+        for cam_id in range(len(self.cam_pos)):
+            if cam_id in cam_ids:
+                self.init_renderer(self.get_camera(cam_id), "silhouette")
+                rendered_img = self.renderer(self.meshes[0])[0:1, :, :, 3]
+                if cam_id == 2 and len(cam_ids) == 2:
+                    rendered_img = torch.flip(rendered_img, dims=[2])
+                images.append(rendered_img)
+
+        return images
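As a quick orientation for the Render class added above, the sketch below shows the call order its methods assume: construct the renderer, call load_meshes, then render from the fixed orthographic cameras. It is not part of the repository; the toy triangle, the CPU fallback, and the variable names are illustrative assumptions, and only the class and method signatures come from the file above.

import torch
from lib.common.render import Render

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
render = Render(size=512, device=device)

# A single dummy triangle; VF2Mesh wraps it into a pytorch3d Meshes whose
# vertex colors encode the normalized vertex normals.
verts = torch.tensor([[0.0, 0.5, 0.0], [-0.5, -0.5, 0.0], [0.5, -0.5, 0.0]])
faces = torch.tensor([[0, 1, 2]])

render.load_meshes(verts, faces)                     # builds render.meshes
rgb = render.get_rgb_image(cam_ids=[0, 2])           # list of [1, 3, 512, 512] tensors in (-1, 1)
depth = render.get_depth_map(cam_ids=[0, 2])         # list of [512, 512] z-buffers (back view flipped)
sil = render.get_silhouette_image(cam_ids=[0, 2])    # list of [1, 512, 512] soft silhouettes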