metadata
title: Opdmulti Demo
emoji: 🌍
colorFrom: gray
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: mit

OPDMulti: Openable Part Detection for Multiple Objects

Xiaohao Sun*, Hanxiao Jiang*, Manolis Savva, Angel Xuan Chang

This repository is a deployment of a demo for the OPDMulti project. Please refer to the main OPDMulti project for more information about the project and its implementation.

arXiv  Website

Installation

Requirements

For the Docker build, you only need Docker to build and run the container. Otherwise, you will need:

  • python 3.10 (this definitely does not work with 3.11, and you may need to downgrade some packages to work with earlier versions of Python)
  • git
  • cmake
  • libosmesa6-dev (for open3d headless rendering)

A full list of other packages can be found in the Dockerfile, or in Open3D/util/install_deps_ubuntu.sh.
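Before a local install, it can help to confirm that the command-line tools listed above are on your PATH. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper name, and it only checks executables, so it does not verify that the libosmesa6-dev package is installed.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 when all are found, 1 otherwise.
# (check_tools is an illustrative helper, not something the repo provides.)
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the executables named in the requirements list.
check_tools python3.10 git cmake || echo "install the missing tools before continuing"
```

If nothing is printed, all three tools were found.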

BEFORE BUILDING: as of writing, you need to copy the model file manually to .data/models/motion_state_pred_opdformerp_rgb.pth in the repository. Do this before running the docker build or, if you are building locally, before running the app. Future work will remove this step.
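The copy step could look like the sketch below. It assumes you have already obtained the model file by some other means; `install_model` and the example source path are hypothetical, and only the destination path comes from this README.

```shell
# Copy a downloaded checkpoint into the location the demo expects.
#   $1: path to the model file you obtained (placeholder, wherever you saved it)
#   $2: path to the repository root (defaults to the current directory)
# (install_model is an illustrative helper, not something the repo provides.)
install_model() {
    local src="$1" repo="${2:-.}"
    local dest="$repo/.data/models/motion_state_pred_opdformerp_rgb.pth"
    if [ ! -f "$src" ]; then
        echo "model file not found: $src" >&2
        return 1
    fi
    # Create .data/models/ if it does not exist yet, then copy the file in.
    mkdir -p "$(dirname "$dest")"
    cp "$src" "$dest"
    echo "$dest"
}
```

For example, from the repository root: `install_model ~/Downloads/motion_state_pred_opdformerp_rgb.pth`, where the download location is only a placeholder.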

Docker (preferred)

To build the docker container, run

docker build -f Dockerfile -t opdmulti-demo .

Local

To set up the environment, run the following (a virtual environment is recommended):

# install base requirements
python3.10 -m pip install -r requirements.txt

# install detectron2 (must be done after some of the libraries in requirements.txt)
python3.10 -m pip install git+https://github.com/facebookresearch/detectron2.git@fc9c33b1f6e5d4c37bbb46dde19af41afc1ddb2a

# build library for model
cd mask2former/modeling/pixel_decoder/ops
python setup.py build install

# INSTALL OPEN3D
# --------------
# Option A: running locally only
pip install open3d==0.17.0

# Option B: running over ssh connection / headless environment
# in a separate folder
git clone https://github.com/isl-org/Open3D.git
cd Open3D/
mkdir build && cd build
cmake -DENABLE_HEADLESS_RENDERING=ON -DBUILD_GUI=OFF -DBUILD_WEBRTC=OFF -DUSE_SYSTEM_GLEW=OFF -DUSE_SYSTEM_GLFW=OFF ..
make -j$(nproc)
make install-pip-package
# to test custom build
cd ../examples/python/visualization/
python headless_rendering.py

Usage

Docker (preferred)

To run the docker container, execute

docker run -d --network host -t opdmulti-demo

If you want to see the output of the container or interact with it,

  • use -it to run in interactive mode, and remove the -d option
  • add bash to the end to open into a console rather than running the app directly
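Combining the two options above gives a command that opens a console in the container instead of starting the app. It is wrapped in a function here only so that it can be defined without running; `opd_shell` is a hypothetical name, and the sketch assumes the image was built with the tag opdmulti-demo as in the build step.

```shell
# Open an interactive shell inside the container rather than launching the app.
# -it replaces -d -t so the container stays attached to your terminal.
# (opd_shell is an illustrative wrapper, not something the repo provides.)
opd_shell() {
    docker run -it --network host opdmulti-demo bash
}
```

Call `opd_shell` to start the session, or run the `docker run` line directly.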

Local

To start the application locally, run

gradio app.py

You can view the app on the specified port (usually 7860). To run over an ssh connection, set up port forwarding by passing -L 7860:localhost:7860 when you create your ssh connection. Note that you will need to install Open3D with headless rendering enabled for this to work, as described above.
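As a sketch of the port forwarding described above, the function below opens an ssh session that forwards local port 7860 to port 7860 on the remote machine. `opd_tunnel` is a hypothetical name and the login passed to it is a placeholder for your own remote host.

```shell
# Forward local port 7860 to port 7860 on the remote machine.
#   $1: remote login such as user@remote-host (placeholder, use your own)
# (opd_tunnel is an illustrative wrapper, not something the repo provides.)
opd_tunnel() {
    ssh -L 7860:localhost:7860 "$1"
}
```

After running `opd_tunnel user@remote-host` and starting the app in that session, the demo should be reachable at localhost:7860 in a browser on your own machine.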

Citation

If you find this code useful, please consider citing:

@article{sun2023opdmulti,
  title={OPDMulti: Openable Part Detection for Multiple Objects},
  author={Sun, Xiaohao and Jiang, Hanxiao and Savva, Manolis and Chang, Angel Xuan},
  journal={arXiv preprint arXiv:2303.14087},
  year={2023}
}

@article{mao2022multiscan,
  title={MultiScan: Scalable RGBD scanning for 3D environments with articulated objects},
  author={Mao, Yongsen and Zhang, Yiming and Jiang, Hanxiao and Chang, Angel and Savva, Manolis},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={9058--9071},
  year={2022}
}

@inproceedings{jiang2022opd,
  title={OPD: Single-view 3D openable part detection},
  author={Jiang, Hanxiao and Mao, Yongsen and Savva, Manolis and Chang, Angel X},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part XXXIX},
  pages={410--426},
  year={2022},
  organization={Springer}
}

@inproceedings{cheng2022masked,
  title={Masked-attention mask transformer for universal image segmentation},
  author={Cheng, Bowen and Misra, Ishan and Schwing, Alexander G and Kirillov, Alexander and Girdhar, Rohit},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1290--1299},
  year={2022}
}