Spaces: Runtime error
Commit: refactor
This view is limited to 50 files because it contains too many changes. See raw diff.
- .gitignore +4 -0
- .idea/{3D_Zeroshot_Neural_Style_Transfer.iml → Thesis_3DStyleTransfer.iml} +1 -5
- .idea/misc.xml +4 -0
- .idea/modules.xml +1 -1
- .idea/vcs.xml +0 -1
- README.md +0 -13
- __pycache__/inference.cpython-38.pyc +0 -0
- app.py +0 -4
- configs/feature_baseline.yml +65 -0
- configs/llff.txt +0 -35
- configs/llff_feature.txt +0 -26
- configs/llff_style.txt +0 -28
- configs/nerf_synthetic.txt +0 -33
- configs/nerf_synthetic_feature.txt +0 -24
- configs/nerf_synthetic_style.txt +0 -22
- configs/style_baseline.yml +80 -0
- configs/style_inference.yml +80 -0
- dataLoader/__init__.py +0 -13
- dataLoader/__pycache__/__init__.cpython-38.pyc +0 -0
- dataLoader/__pycache__/nsvf.cpython-38.pyc +0 -0
- dataLoader/__pycache__/styleLoader.cpython-38.pyc +0 -0
- dataLoader/__pycache__/tankstemple.cpython-38.pyc +0 -0
- dataLoader/__pycache__/your_own_data.cpython-38.pyc +0 -0
- dataLoader/colmap2nerf.py +0 -305
- dataLoader/nsvf.py +0 -160
- dataLoader/styleLoader.py +0 -16
- dataLoader/tankstemple.py +0 -247
- dataLoader/your_own_data.py +0 -129
- extra/auto_run_paramsets.py +0 -207
- inference.py +75 -0
- main.py +75 -0
- models/__pycache__/__init__.cpython-38.pyc +0 -0
- opt.py +0 -153
- requirements.txt +12 -2
- scripts/test.sh +0 -5
- scripts/test_feature.sh +0 -10
- scripts/test_style.sh +0 -13
- scripts/train.sh +0 -1
- scripts/train_feature.sh +0 -1
- scripts/train_style.sh +0 -1
- {models → src}/__init__.py +0 -0
- src/__pycache__/__init__.cpython-38.pyc +0 -0
- src/callback/__init__.py +16 -0
- src/callback/__pycache__/__init__.cpython-38.pyc +0 -0
- src/dataset/__init__.py +10 -0
- src/dataset/__pycache__/__init__.cpython-38.pyc +0 -0
- dataLoader/__pycache__/blender.cpython-38.pyc → src/dataset/__pycache__/blender_dataset.cpython-38.pyc +0 -0
- dataLoader/__pycache__/llff.cpython-38.pyc → src/dataset/__pycache__/llff_dataset.cpython-38.pyc +0 -0
- {dataLoader → src/dataset}/__pycache__/ray_utils.cpython-38.pyc +0 -0
- src/dataset/__pycache__/style_dataset.cpython-38.pyc +0 -0
.gitignore
CHANGED
@@ -1,5 +1,9 @@
 data/
 checkpoints/
 venv/
+log_feature/
+log/
+trex/
 log_style/
 
+
.idea/{3D_Zeroshot_Neural_Style_Transfer.iml → Thesis_3DStyleTransfer.iml}
RENAMED
@@ -2,11 +2,7 @@
 <module type="PYTHON_MODULE" version="4">
   <component name="NewModuleRootManager">
     <content url="file://$MODULE_DIR$" />
-    <orderEntry type="
+    <orderEntry type="jdk" jdkName="Python 3.8 (Thesis_3DStyleTransfer)" jdkType="Python SDK" />
     <orderEntry type="sourceFolder" forTests="false" />
   </component>
-  <component name="PyDocumentationSettings">
-    <option name="format" value="PLAIN" />
-    <option name="myDocStringFormat" value="Plain" />
-  </component>
 </module>
.idea/misc.xml
ADDED
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ProjectRootManager" version="2" project-jdk-name="Python 3.8 (Thesis_3DStyleTransfer)" project-jdk-type="Python SDK" />
+</project>
.idea/modules.xml
CHANGED
@@ -2,7 +2,7 @@
 <project version="4">
   <component name="ProjectModuleManager">
     <modules>
-      <module fileurl="file://$PROJECT_DIR$/.idea/
+      <module fileurl="file://$PROJECT_DIR$/.idea/Thesis_3DStyleTransfer.iml" filepath="$PROJECT_DIR$/.idea/Thesis_3DStyleTransfer.iml" />
     </modules>
   </component>
 </project>
.idea/vcs.xml
CHANGED
@@ -1,7 +1,6 @@
 <?xml version="1.0" encoding="UTF-8"?>
 <project version="4">
   <component name="VcsDirectoryMappings">
-    <mapping directory="" vcs="Git" />
     <mapping directory="$PROJECT_DIR$" vcs="Git" />
   </component>
 </project>
README.md
DELETED
@@ -1,13 +0,0 @@
----
-title: 3D Zeroshot Neural Style Transfer
-emoji: π
-colorFrom: purple
-colorTo: green
-sdk: streamlit
-sdk_version: 1.26.0
-app_file: app.py
-pinned: false
-license: unlicense
----
-
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
__pycache__/inference.cpython-38.pyc
ADDED
Binary file (2.57 kB)
app.py
DELETED
@@ -1,4 +0,0 @@
-import streamlit as st
-
-x = st.slider('Select a value')
-st.write(x, 'squared is', x * x)
configs/feature_baseline.yml
ADDED
@@ -0,0 +1,65 @@
+global:
+  username: totuanan06
+  project_name: thesis
+  name: 3d-style-transfer
+  expname: "lego"
+  base_dir: "./log"
+  save_dir: "./runs"
+  verbose: true
+  pretrained: null
+  resume: 0
+  SEED: 20211202
+dataset:
+  name: BlenderDataset
+  train:
+    params:
+      datadir: "data/nerf_synthetic/lego"
+      split: "train"
+      downsample: 1.0
+      is_stack: True
+      N_vis: -1
+  val:
+    params:
+      datadir: "data/nerf_synthetic/lego"
+      split: "val"
+      downsample: 1.0
+      is_stack: False
+      N_vis: -1
+sampler:
+  name: SimpleSampler
+  params:
+    N_voxel_init: 2097156 # 128**3
+    step_ratio: 0.5
+    batch_size: 2048
+    chunk_size: 1024
+    patch_size: 256
+    n_samples: 1000000
+model:
+  name: StyleRF
+  type: "feature"
+  tensorf:
+    ckpt: "log/lego/lego.th"
+    model_name: "TensorVMSplit"
+    lamb_sh: [48,48,48]
+    rm_weight_mask_thre: 0.01
+    TV_weight_feature: 0
+optimizer:
+  lr_init: 0.02
+  lr_basis: 1e-4
+  lr_decay_iters: -1
+  lr_decay_target_ratio: 0.1
+trainer:
+  use_fp16: false
+  debug: false
+  n_iters: 30000
+  evaluate_interval: 1
+  log_interval: 1
+  save_interval: 1
+callbacks:
+  - name: ModelCheckpoint
+    params:
+      filename: "baseline-{epoch}-{NN:.4f}-{mAP:.4f}-{train_loss:.4f}-{val_loss:.4f}"
+      monitor: "NN"
+      verbose: True
+      save_top_k: 1
+      mode: min
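For reference, here is a minimal sketch of how a nested YAML config such as configs/feature_baseline.yml above could be read in Python. This is an illustration only, not code from this commit: it assumes PyYAML is installed, and load_config is a hypothetical helper (the commit's own main.py / inference.py loading code is not shown in this 50-file view).

import yaml

def load_config(path="configs/feature_baseline.yml"):
    # illustration: parse the nested mapping shown above into a plain dict
    with open(path, "r") as f:
        return yaml.safe_load(f)

cfg = load_config()
print(cfg["global"]["expname"])          # "lego"
print(cfg["model"]["tensorf"]["ckpt"])   # "log/lego/lego.th"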
configs/llff.txt
DELETED
@@ -1,35 +0,0 @@
-
-dataset_name = llff
-datadir = ./data/nerf_llff_data/trex
-expname = trex
-basedir = ./log
-
-downsample_train = 4.0
-ndc_ray = 1
-
-n_iters = 25000
-batch_size = 4096
-
-N_voxel_init = 2097156 # 128**3
-N_voxel_final = 262144000 # 640**3
-upsamp_list = [2000,3000,4000,5500]
-update_AlphaMask_list = [2500]
-
-N_vis = -1 # vis all testing images
-vis_every = 10000
-
-render_test = 1
-render_path = 1
-
-n_lamb_sigma = [16,4,4]
-n_lamb_sh = [48,12,12]
-
-shadingMode = MLP_Fea
-fea2denseAct = relu
-
-view_pe = 0
-fea_pe = 0
-
-TV_weight_density = 1.0
-TV_weight_app = 1.0
-
configs/llff_feature.txt
DELETED
@@ -1,26 +0,0 @@
-dataset_name = llff
-datadir = ./data/nerf_llff_data/trex
-ckpt = ./log/trex/trex.th
-expname = trex
-basedir = ./log_feature
-
-TV_weight_feature = 80
-
-downsample_train = 4.0
-ndc_ray = 1
-
-n_iters = 25000
-patch_size = 256
-batch_size = 4096
-chunk_size = 4096
-
-N_voxel_init = 2097156 # 128**3
-N_voxel_final = 262144000 # 640**3
-upsamp_list = [2000,3000,4000,5500]
-update_AlphaMask_list = [2500]
-
-n_lamb_sigma = [16,4,4]
-n_lamb_sh = [48,12,12]
-
-fea2denseAct = relu
-
configs/llff_style.txt
DELETED
@@ -1,28 +0,0 @@
-dataset_name = llff
-datadir = ./data/nerf_llff_data/trex
-ckpt = ./log_feature/trex/trex.th
-expname = trex
-basedir = ./log_style
-
-nSamples = 300
-patch_size = 256
-chunk_size = 2048
-
-content_weight = 1
-style_weight = 20
-featuremap_tv_weight = 0
-image_tv_weight = 0
-
-rm_weight_mask_thre = 0.001
-
-downsample_train = 4.0
-ndc_ray = 1
-
-n_iters = 25000
-
-n_lamb_sigma = [16,4,4]
-n_lamb_sh = [48,12,12]
-N_voxel_init = 2097156 # 128**3
-N_voxel_final = 262144000 # 640**3
-
-fea2denseAct = relu
configs/nerf_synthetic.txt
DELETED
@@ -1,33 +0,0 @@
-
-dataset_name = blender
-datadir = ./data/nerf_synthetic/lego
-expname = lego
-basedir = ./log
-
-n_iters = 30000
-batch_size = 4096
-
-N_voxel_init = 2097156 # 128**3
-N_voxel_final = 27000000 # 300**3
-upsamp_list = [2000,3000,4000,5500,7000]
-update_AlphaMask_list = [2000,4000]
-
-N_vis = 5
-vis_every = 10000
-
-render_test = 1
-
-n_lamb_sigma = [16,16,16]
-n_lamb_sh = [48,48,48]
-model_name = TensorVMSplit
-
-
-shadingMode = MLP_Fea
-fea2denseAct = softplus
-
-view_pe = 2
-fea_pe = 2
-
-L1_weight_inital = 8e-5
-L1_weight_rest = 4e-5
-rm_weight_mask_thre = 1e-4
configs/nerf_synthetic_feature.txt
DELETED
@@ -1,24 +0,0 @@
-dataset_name = blender
-datadir = ./data/nerf_synthetic/lego
-ckpt = ./log/lego/lego.th
-expname = lego
-basedir = ./log_feature
-
-TV_weight_feature = 10
-
-n_iters = 25000
-patch_size = 256
-batch_size = 4096
-chunk_size = 4096
-
-N_voxel_init = 2097156 # 128**3
-N_voxel_final = 27000000 # 300**3
-upsamp_list = [2000,3000,4000,5500,7000]
-update_AlphaMask_list = [2000,4000]
-
-rm_weight_mask_thre = 0.01
-
-n_lamb_sigma = [16,16,16]
-n_lamb_sh = [48,48,48]
-
-fea2denseAct = softplus
configs/nerf_synthetic_style.txt
DELETED
@@ -1,22 +0,0 @@
-dataset_name = blender
-datadir = ./data/nerf_synthetic/lego
-ckpt = ./log_feature/lego/lego.th
-expname = lego
-basedir = ./log_style
-
-patch_size = 256
-chunk_size = 2048
-
-content_weight = 1
-style_weight = 20
-
-rm_weight_mask_thre = 0.01
-
-n_iters = 25000
-
-n_lamb_sigma = [16,16,16]
-n_lamb_sh = [48,48,48]
-N_voxel_init = 2097156 # 128**3
-N_voxel_final = 27000000 # 300**3
-
-fea2denseAct = softplus
configs/style_baseline.yml
ADDED
@@ -0,0 +1,80 @@
+global:
+  username: totuanan06
+  project_name: thesis
+  name: 3d-style-transfer
+  expname: "trex"
+  base_dir: "./log"
+  save_dir: "./runs"
+  style_img: "image_style/example.jpg"
+  verbose: true
+  pretrained: null
+  resume: 0
+  SEED: 20211202
+dataset:
+  name: BlenderDataset
+  train:
+    params:
+      datadir: "data/nerf_llff/trex"
+      split: "train"
+      downsample: 4.0
+      is_stack: True
+      N_vis: -1
+  val:
+    params:
+      datadir: "data/nerf_llff/trex"
+      split: "val"
+      downsample: 1.0
+      is_stack: False
+      N_vis: -1
+sampler:
+  name: SimpleSampler
+  params:
+    N_voxel_init: 2097156 # 128**3
+    step_ratio: 0.5
+    batch_size: 2048
+    chunk_size: 2048
+    patch_size: 256
+    n_samples: 300
+model:
+  name: StyleRF
+  type: "style"
+  tensorf:
+    ckpt: "log_style/trex/trex.th"
+    model_name: "TensorVMSplit"
+    lamb_sh: [48,12,12]
+    rm_weight_mask_thre: 0.001
+    TV_weight_feature: 0
+    ndc_ray: 1
+style_dataset:
+  name: StyleDataset
+  train:
+    params:
+      datadir: "..."
+      batch_size: 1
+      image_side_length: 256
+      num_workers: 2
+style_config:
+  content_weight: 1
+  style_weight: 20
+  featuremap_tv_weight: 0
+  image_tv_weight: 0
+optimizer:
+  lr_init: 0.02
+  lr_basis: 1e-4
+  lr_decay_iters: -1
+  lr_decay_target_ratio: 0.1
+trainer:
+  use_fp16: false
+  debug: false
+  n_iters: 25000
+  evaluate_interval: 1
+  log_interval: 1
+  save_interval: 1
+callbacks:
+  - name: ModelCheckpoint
+    params:
+      filename: "baseline-{epoch}-{NN:.4f}-{mAP:.4f}-{train_loss:.4f}-{val_loss:.4f}"
+      monitor: "NN"
+      verbose: True
+      save_top_k: 1
+      mode: min
configs/style_inference.yml
ADDED
@@ -0,0 +1,80 @@
+global:
+  username: totuanan06
+  project_name: thesis
+  name: 3d-style-transfer
+  expname: "trex"
+  base_dir: "./log"
+  save_dir: "./runs"
+  style_img: "image_style/example.jpg"
+  verbose: true
+  pretrained: null
+  resume: 0
+  SEED: 20211202
+dataset:
+  name: LLFFDataset
+  train:
+    params:
+      datadir: "data/nerf_llff_data/trex"
+      split: "train"
+      downsample: 4.0
+      is_stack: True
+      N_vis: -1
+  val:
+    params:
+      datadir: "data/nerf_llff_data/trex"
+      split: "test"
+      downsample: 4.0
+      is_stack: True
+      N_vis: -1
+sampler:
+  name: SimpleSampler
+  params:
+    N_voxel_init: 2097156 # 128**3
+    step_ratio: 0.5
+    batch_size: 2048
+    chunk_size: 1024
+    patch_size: 256
+    n_samples: 300
+model:
+  name: StyleRF
+  type: "style"
+  tensorf:
+    ckpt: "log_style/trex/trex.th"
+    model_name: "TensorVMSplit"
+    lamb_sh: [48,12,12]
+    rm_weight_mask_thre: 0.001
+    TV_weight_feature: 0
+    ndc_ray: 1
+style_dataset:
+  name: StyleDataset
+  train:
+    params:
+      datadir: "..."
+      batch_size: 1
+      image_side_length: 256
+      num_workers: 2
+style_config:
+  content_weight: 1
+  style_weight: 20
+  featuremap_tv_weight: 0
+  image_tv_weight: 0
+optimizer:
+  lr_init: 0.02
+  lr_basis: 1e-4
+  lr_decay_iters: -1
+  lr_decay_target_ratio: 0.1
+trainer:
+  use_fp16: false
+  debug: false
+  n_iters: 5
+  evaluate_interval: 1
+  log_interval: 1
+  save_interval: 1
+callbacks:
+  - name: ModelCheckpoint
+    params:
+      filename: "baseline-{epoch}-{NN:.4f}-{mAP:.4f}-{train_loss:.4f}-{val_loss:.4f}"
+      monitor: "NN"
+      verbose: True
+      save_top_k: 1
+      mode: min
dataLoader/__init__.py
DELETED
@@ -1,13 +0,0 @@
-from .llff import LLFFDataset
-from .blender import BlenderDataset
-from .nsvf import NSVF
-from .tankstemple import TanksTempleDataset
-from .your_own_data import YourOwnDataset
-
-
-
-dataset_dict = {'blender': BlenderDataset,
-               'llff':LLFFDataset,
-               'tankstemple':TanksTempleDataset,
-               'nsvf':NSVF,
-               'own_data':YourOwnDataset}
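For context on the deletion above: dataset_dict was the usual TensoRF-style registry mapping a dataset_name string to a dataset class. A typical use (illustrative only, based on the constructor signatures visible elsewhere in this diff, not code taken from the repo's training script) looked like:

# Illustration of how the removed registry was consumed (pre-refactor layout).
from dataLoader import dataset_dict

dataset_cls = dataset_dict['blender']                  # e.g. from dataset_name in a config
train_set = dataset_cls('./data/nerf_synthetic/lego',  # datadir
                        split='train', downsample=1.0, is_stack=False)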
dataLoader/__pycache__/__init__.cpython-38.pyc
DELETED
Binary file (445 Bytes)
dataLoader/__pycache__/nsvf.cpython-38.pyc
DELETED
Binary file (6.47 kB)
dataLoader/__pycache__/styleLoader.cpython-38.pyc
DELETED
Binary file (735 Bytes)
dataLoader/__pycache__/tankstemple.cpython-38.pyc
DELETED
Binary file (9.75 kB)
dataLoader/__pycache__/your_own_data.cpython-38.pyc
DELETED
Binary file (4.24 kB)
dataLoader/colmap2nerf.py
DELETED
@@ -1,305 +0,0 @@
|
|
1 |
-
#!/usr/bin/env python3
|
2 |
-
|
3 |
-
# Copyright (c) 2020-2022, NVIDIA CORPORATION. All rights reserved.
|
4 |
-
#
|
5 |
-
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
6 |
-
# and proprietary rights in and to this software, related documentation
|
7 |
-
# and any modifications thereto. Any use, reproduction, disclosure or
|
8 |
-
# distribution of this software and related documentation without an express
|
9 |
-
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
10 |
-
|
11 |
-
import argparse
|
12 |
-
import os
|
13 |
-
from pathlib import Path, PurePosixPath
|
14 |
-
|
15 |
-
import numpy as np
|
16 |
-
import json
|
17 |
-
import sys
|
18 |
-
import math
|
19 |
-
import cv2
|
20 |
-
import os
|
21 |
-
import shutil
|
22 |
-
|
23 |
-
def parse_args():
|
24 |
-
parser = argparse.ArgumentParser(description="convert a text colmap export to nerf format transforms.json; optionally convert video to images, and optionally run colmap in the first place")
|
25 |
-
|
26 |
-
parser.add_argument("--video_in", default="", help="run ffmpeg first to convert a provided video file into a set of images. uses the video_fps parameter also")
|
27 |
-
parser.add_argument("--video_fps", default=2)
|
28 |
-
parser.add_argument("--time_slice", default="", help="time (in seconds) in the format t1,t2 within which the images should be generated from the video. eg: \"--time_slice '10,300'\" will generate images only from 10th second to 300th second of the video")
|
29 |
-
parser.add_argument("--run_colmap", action="store_true", help="run colmap first on the image folder")
|
30 |
-
parser.add_argument("--colmap_matcher", default="sequential", choices=["exhaustive","sequential","spatial","transitive","vocab_tree"], help="select which matcher colmap should use. sequential for videos, exhaustive for adhoc images")
|
31 |
-
parser.add_argument("--colmap_db", default="colmap.db", help="colmap database filename")
|
32 |
-
parser.add_argument("--images", default="images", help="input path to the images")
|
33 |
-
parser.add_argument("--text", default="colmap_text", help="input path to the colmap text files (set automatically if run_colmap is used)")
|
34 |
-
parser.add_argument("--aabb_scale", default=16, choices=["1","2","4","8","16"], help="large scene scale factor. 1=scene fits in unit cube; power of 2 up to 16")
|
35 |
-
parser.add_argument("--skip_early", default=0, help="skip this many images from the start")
|
36 |
-
parser.add_argument("--out", default="transforms.json", help="output path")
|
37 |
-
args = parser.parse_args()
|
38 |
-
return args
|
39 |
-
|
40 |
-
def do_system(arg):
|
41 |
-
print(f"==== running: {arg}")
|
42 |
-
err = os.system(arg)
|
43 |
-
if err:
|
44 |
-
print("FATAL: command failed")
|
45 |
-
sys.exit(err)
|
46 |
-
|
47 |
-
def run_ffmpeg(args):
|
48 |
-
if not os.path.isabs(args.images):
|
49 |
-
args.images = os.path.join(os.path.dirname(args.video_in), args.images)
|
50 |
-
images = args.images
|
51 |
-
video = args.video_in
|
52 |
-
fps = float(args.video_fps) or 1.0
|
53 |
-
print(f"running ffmpeg with input video file={video}, output image folder={images}, fps={fps}.")
|
54 |
-
if (input(f"warning! folder '{images}' will be deleted/replaced. continue? (Y/n)").lower().strip()+"y")[:1] != "y":
|
55 |
-
sys.exit(1)
|
56 |
-
try:
|
57 |
-
shutil.rmtree(images)
|
58 |
-
except:
|
59 |
-
pass
|
60 |
-
do_system(f"mkdir {images}")
|
61 |
-
|
62 |
-
time_slice_value = ""
|
63 |
-
time_slice = args.time_slice
|
64 |
-
if time_slice:
|
65 |
-
start, end = time_slice.split(",")
|
66 |
-
time_slice_value = f",select='between(t\,{start}\,{end})'"
|
67 |
-
do_system(f"ffmpeg -i {video} -qscale:v 1 -qmin 1 -vf \"fps={fps}{time_slice_value}\" {images}/%04d.jpg")
|
68 |
-
|
69 |
-
def run_colmap(args):
|
70 |
-
db=args.colmap_db
|
71 |
-
images=args.images
|
72 |
-
db_noext=str(Path(db).with_suffix(""))
|
73 |
-
|
74 |
-
if args.text=="text":
|
75 |
-
args.text=db_noext+"_text"
|
76 |
-
text=args.text
|
77 |
-
sparse=db_noext+"_sparse"
|
78 |
-
print(f"running colmap with:\n\tdb={db}\n\timages={images}\n\tsparse={sparse}\n\ttext={text}")
|
79 |
-
if (input(f"warning! folders '{sparse}' and '{text}' will be deleted/replaced. continue? (Y/n)").lower().strip()+"y")[:1] != "y":
|
80 |
-
sys.exit(1)
|
81 |
-
if os.path.exists(db):
|
82 |
-
os.remove(db)
|
83 |
-
do_system(f"colmap feature_extractor --ImageReader.camera_model OPENCV --SiftExtraction.estimate_affine_shape=true --SiftExtraction.domain_size_pooling=true --ImageReader.single_camera 1 --database_path {db} --image_path {images}")
|
84 |
-
do_system(f"colmap {args.colmap_matcher}_matcher --SiftMatching.guided_matching=true --database_path {db}")
|
85 |
-
try:
|
86 |
-
shutil.rmtree(sparse)
|
87 |
-
except:
|
88 |
-
pass
|
89 |
-
do_system(f"mkdir {sparse}")
|
90 |
-
do_system(f"colmap mapper --database_path {db} --image_path {images} --output_path {sparse}")
|
91 |
-
do_system(f"colmap bundle_adjuster --input_path {sparse}/0 --output_path {sparse}/0 --BundleAdjustment.refine_principal_point 1")
|
92 |
-
try:
|
93 |
-
shutil.rmtree(text)
|
94 |
-
except:
|
95 |
-
pass
|
96 |
-
do_system(f"mkdir {text}")
|
97 |
-
do_system(f"colmap model_converter --input_path {sparse}/0 --output_path {text} --output_type TXT")
|
98 |
-
|
99 |
-
def variance_of_laplacian(image):
|
100 |
-
return cv2.Laplacian(image, cv2.CV_64F).var()
|
101 |
-
|
102 |
-
def sharpness(imagePath):
|
103 |
-
image = cv2.imread(imagePath)
|
104 |
-
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
|
105 |
-
fm = variance_of_laplacian(gray)
|
106 |
-
return fm
|
107 |
-
|
108 |
-
def qvec2rotmat(qvec):
|
109 |
-
return np.array([
|
110 |
-
[
|
111 |
-
1 - 2 * qvec[2]**2 - 2 * qvec[3]**2,
|
112 |
-
2 * qvec[1] * qvec[2] - 2 * qvec[0] * qvec[3],
|
113 |
-
2 * qvec[3] * qvec[1] + 2 * qvec[0] * qvec[2]
|
114 |
-
], [
|
115 |
-
2 * qvec[1] * qvec[2] + 2 * qvec[0] * qvec[3],
|
116 |
-
1 - 2 * qvec[1]**2 - 2 * qvec[3]**2,
|
117 |
-
2 * qvec[2] * qvec[3] - 2 * qvec[0] * qvec[1]
|
118 |
-
], [
|
119 |
-
2 * qvec[3] * qvec[1] - 2 * qvec[0] * qvec[2],
|
120 |
-
2 * qvec[2] * qvec[3] + 2 * qvec[0] * qvec[1],
|
121 |
-
1 - 2 * qvec[1]**2 - 2 * qvec[2]**2
|
122 |
-
]
|
123 |
-
])
|
124 |
-
|
125 |
-
def rotmat(a, b):
|
126 |
-
a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
|
127 |
-
v = np.cross(a, b)
|
128 |
-
c = np.dot(a, b)
|
129 |
-
s = np.linalg.norm(v)
|
130 |
-
kmat = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
|
131 |
-
return np.eye(3) + kmat + kmat.dot(kmat) * ((1 - c) / (s ** 2 + 1e-10))
|
132 |
-
|
133 |
-
def closest_point_2_lines(oa, da, ob, db): # returns point closest to both rays of form o+t*d, and a weight factor that goes to 0 if the lines are parallel
|
134 |
-
da = da / np.linalg.norm(da)
|
135 |
-
db = db / np.linalg.norm(db)
|
136 |
-
c = np.cross(da, db)
|
137 |
-
denom = np.linalg.norm(c)**2
|
138 |
-
t = ob - oa
|
139 |
-
ta = np.linalg.det([t, db, c]) / (denom + 1e-10)
|
140 |
-
tb = np.linalg.det([t, da, c]) / (denom + 1e-10)
|
141 |
-
if ta > 0:
|
142 |
-
ta = 0
|
143 |
-
if tb > 0:
|
144 |
-
tb = 0
|
145 |
-
return (oa+ta*da+ob+tb*db) * 0.5, denom
|
146 |
-
|
147 |
-
if __name__ == "__main__":
|
148 |
-
args = parse_args()
|
149 |
-
if args.video_in != "":
|
150 |
-
run_ffmpeg(args)
|
151 |
-
if args.run_colmap:
|
152 |
-
run_colmap(args)
|
153 |
-
AABB_SCALE = int(args.aabb_scale)
|
154 |
-
SKIP_EARLY = int(args.skip_early)
|
155 |
-
IMAGE_FOLDER = args.images
|
156 |
-
TEXT_FOLDER = args.text
|
157 |
-
OUT_PATH = args.out
|
158 |
-
print(f"outputting to {OUT_PATH}...")
|
159 |
-
with open(os.path.join(TEXT_FOLDER,"cameras.txt"), "r") as f:
|
160 |
-
angle_x = math.pi / 2
|
161 |
-
for line in f:
|
162 |
-
# 1 SIMPLE_RADIAL 2048 1536 1580.46 1024 768 0.0045691
|
163 |
-
# 1 OPENCV 3840 2160 3178.27 3182.09 1920 1080 0.159668 -0.231286 -0.00123982 0.00272224
|
164 |
-
# 1 RADIAL 1920 1080 1665.1 960 540 0.0672856 -0.0761443
|
165 |
-
if line[0] == "#":
|
166 |
-
continue
|
167 |
-
els = line.split(" ")
|
168 |
-
w = float(els[2])
|
169 |
-
h = float(els[3])
|
170 |
-
fl_x = float(els[4])
|
171 |
-
fl_y = float(els[4])
|
172 |
-
k1 = 0
|
173 |
-
k2 = 0
|
174 |
-
p1 = 0
|
175 |
-
p2 = 0
|
176 |
-
cx = w / 2
|
177 |
-
cy = h / 2
|
178 |
-
if els[1] == "SIMPLE_PINHOLE":
|
179 |
-
cx = float(els[5])
|
180 |
-
cy = float(els[6])
|
181 |
-
elif els[1] == "PINHOLE":
|
182 |
-
fl_y = float(els[5])
|
183 |
-
cx = float(els[6])
|
184 |
-
cy = float(els[7])
|
185 |
-
elif els[1] == "SIMPLE_RADIAL":
|
186 |
-
cx = float(els[5])
|
187 |
-
cy = float(els[6])
|
188 |
-
k1 = float(els[7])
|
189 |
-
elif els[1] == "RADIAL":
|
190 |
-
cx = float(els[5])
|
191 |
-
cy = float(els[6])
|
192 |
-
k1 = float(els[7])
|
193 |
-
k2 = float(els[8])
|
194 |
-
elif els[1] == "OPENCV":
|
195 |
-
fl_y = float(els[5])
|
196 |
-
cx = float(els[6])
|
197 |
-
cy = float(els[7])
|
198 |
-
k1 = float(els[8])
|
199 |
-
k2 = float(els[9])
|
200 |
-
p1 = float(els[10])
|
201 |
-
p2 = float(els[11])
|
202 |
-
else:
|
203 |
-
print("unknown camera model ", els[1])
|
204 |
-
# fl = 0.5 * w / tan(0.5 * angle_x);
|
205 |
-
angle_x = math.atan(w / (fl_x * 2)) * 2
|
206 |
-
angle_y = math.atan(h / (fl_y * 2)) * 2
|
207 |
-
fovx = angle_x * 180 / math.pi
|
208 |
-
fovy = angle_y * 180 / math.pi
|
209 |
-
|
210 |
-
print(f"camera:\n\tres={w,h}\n\tcenter={cx,cy}\n\tfocal={fl_x,fl_y}\n\tfov={fovx,fovy}\n\tk={k1,k2} p={p1,p2} ")
|
211 |
-
|
212 |
-
with open(os.path.join(TEXT_FOLDER,"images.txt"), "r") as f:
|
213 |
-
i = 0
|
214 |
-
bottom = np.array([0.0, 0.0, 0.0, 1.0]).reshape([1, 4])
|
215 |
-
out = {
|
216 |
-
"camera_angle_x": angle_x,
|
217 |
-
"camera_angle_y": angle_y,
|
218 |
-
"fl_x": fl_x,
|
219 |
-
"fl_y": fl_y,
|
220 |
-
"k1": k1,
|
221 |
-
"k2": k2,
|
222 |
-
"p1": p1,
|
223 |
-
"p2": p2,
|
224 |
-
"cx": cx,
|
225 |
-
"cy": cy,
|
226 |
-
"w": w,
|
227 |
-
"h": h,
|
228 |
-
"aabb_scale": AABB_SCALE,
|
229 |
-
"frames": [],
|
230 |
-
}
|
231 |
-
|
232 |
-
up = np.zeros(3)
|
233 |
-
for line in f:
|
234 |
-
line = line.strip()
|
235 |
-
if line[0] == "#":
|
236 |
-
continue
|
237 |
-
i = i + 1
|
238 |
-
if i < SKIP_EARLY*2:
|
239 |
-
continue
|
240 |
-
if i % 2 == 1:
|
241 |
-
elems=line.split(" ") # 1-4 is quat, 5-7 is trans, 9ff is filename (9, if filename contains no spaces)
|
242 |
-
#name = str(PurePosixPath(Path(IMAGE_FOLDER, elems[9])))
|
243 |
-
# why is this requireing a relitive path while using ^
|
244 |
-
image_rel = os.path.relpath(IMAGE_FOLDER)
|
245 |
-
name = str(f"./{image_rel}/{'_'.join(elems[9:])}")
|
246 |
-
b=sharpness(name)
|
247 |
-
print(name, "sharpness=",b)
|
248 |
-
image_id = int(elems[0])
|
249 |
-
qvec = np.array(tuple(map(float, elems[1:5])))
|
250 |
-
tvec = np.array(tuple(map(float, elems[5:8])))
|
251 |
-
R = qvec2rotmat(-qvec)
|
252 |
-
t = tvec.reshape([3,1])
|
253 |
-
m = np.concatenate([np.concatenate([R, t], 1), bottom], 0)
|
254 |
-
c2w = np.linalg.inv(m)
|
255 |
-
c2w[0:3,2] *= -1 # flip the y and z axis
|
256 |
-
c2w[0:3,1] *= -1
|
257 |
-
c2w = c2w[[1,0,2,3],:] # swap y and z
|
258 |
-
c2w[2,:] *= -1 # flip whole world upside down
|
259 |
-
|
260 |
-
up += c2w[0:3,1]
|
261 |
-
|
262 |
-
frame={"file_path":name,"sharpness":b,"transform_matrix": c2w}
|
263 |
-
out["frames"].append(frame)
|
264 |
-
nframes = len(out["frames"])
|
265 |
-
up = up / np.linalg.norm(up)
|
266 |
-
print("up vector was", up)
|
267 |
-
R = rotmat(up,[0,0,1]) # rotate up vector to [0,0,1]
|
268 |
-
R = np.pad(R,[0,1])
|
269 |
-
R[-1, -1] = 1
|
270 |
-
|
271 |
-
|
272 |
-
for f in out["frames"]:
|
273 |
-
f["transform_matrix"] = np.matmul(R, f["transform_matrix"]) # rotate up to be the z axis
|
274 |
-
|
275 |
-
# find a central point they are all looking at
|
276 |
-
print("computing center of attention...")
|
277 |
-
totw = 0.0
|
278 |
-
totp = np.array([0.0, 0.0, 0.0])
|
279 |
-
for f in out["frames"]:
|
280 |
-
mf = f["transform_matrix"][0:3,:]
|
281 |
-
for g in out["frames"]:
|
282 |
-
mg = g["transform_matrix"][0:3,:]
|
283 |
-
p, w = closest_point_2_lines(mf[:,3], mf[:,2], mg[:,3], mg[:,2])
|
284 |
-
if w > 0.01:
|
285 |
-
totp += p*w
|
286 |
-
totw += w
|
287 |
-
totp /= totw
|
288 |
-
print(totp) # the cameras are looking at totp
|
289 |
-
for f in out["frames"]:
|
290 |
-
f["transform_matrix"][0:3,3] -= totp
|
291 |
-
|
292 |
-
avglen = 0.
|
293 |
-
for f in out["frames"]:
|
294 |
-
avglen += np.linalg.norm(f["transform_matrix"][0:3,3])
|
295 |
-
avglen /= nframes
|
296 |
-
print("avg camera distance from origin", avglen)
|
297 |
-
for f in out["frames"]:
|
298 |
-
f["transform_matrix"][0:3,3] *= 4.0 / avglen # scale to "nerf sized"
|
299 |
-
|
300 |
-
for f in out["frames"]:
|
301 |
-
f["transform_matrix"] = f["transform_matrix"].tolist()
|
302 |
-
print(nframes,"frames")
|
303 |
-
print(f"writing {OUT_PATH}")
|
304 |
-
with open(OUT_PATH, "w") as outfile:
|
305 |
-
json.dump(out, outfile, indent=2)
|
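To make the pose handling in the deleted colmap2nerf.py above easier to follow, here is a small self-contained sketch (an illustration with made-up numbers, not part of the commit) of what qvec2rotmat plus the matrix assembly in the main loop do for a single COLMAP images.txt entry:

import numpy as np

def qvec2rotmat(qvec):
    # same formula as in the deleted script: COLMAP quaternion (w, x, y, z) -> 3x3 rotation
    w, x, y, z = qvec
    return np.array([
        [1 - 2*y*y - 2*z*z, 2*x*y - 2*w*z,     2*z*x + 2*w*y],
        [2*x*y + 2*w*z,     1 - 2*x*x - 2*z*z, 2*y*z - 2*w*x],
        [2*z*x - 2*w*y,     2*y*z + 2*w*x,     1 - 2*x*x - 2*y*y],
    ])

qvec = np.array([0.995, 0.05, 0.03, 0.08])   # example rotation quaternion from images.txt
tvec = np.array([0.1, -0.2, 2.5])            # example translation
R = qvec2rotmat(-qvec)                       # the script negates the quaternion (same rotation)
m = np.concatenate([np.concatenate([R, tvec.reshape(3, 1)], 1),
                    np.array([[0.0, 0.0, 0.0, 1.0]])], 0)
c2w = np.linalg.inv(m)                       # world-to-camera -> camera-to-world
print(c2w.round(3))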
dataLoader/nsvf.py
DELETED
@@ -1,160 +0,0 @@
|
|
1 |
-
import torch
|
2 |
-
from torch.utils.data import Dataset
|
3 |
-
from tqdm import tqdm
|
4 |
-
import os
|
5 |
-
from PIL import Image
|
6 |
-
from torchvision import transforms as T
|
7 |
-
|
8 |
-
from .ray_utils import *
|
9 |
-
|
10 |
-
trans_t = lambda t : torch.Tensor([
|
11 |
-
[1,0,0,0],
|
12 |
-
[0,1,0,0],
|
13 |
-
[0,0,1,t],
|
14 |
-
[0,0,0,1]]).float()
|
15 |
-
|
16 |
-
rot_phi = lambda phi : torch.Tensor([
|
17 |
-
[1,0,0,0],
|
18 |
-
[0,np.cos(phi),-np.sin(phi),0],
|
19 |
-
[0,np.sin(phi), np.cos(phi),0],
|
20 |
-
[0,0,0,1]]).float()
|
21 |
-
|
22 |
-
rot_theta = lambda th : torch.Tensor([
|
23 |
-
[np.cos(th),0,-np.sin(th),0],
|
24 |
-
[0,1,0,0],
|
25 |
-
[np.sin(th),0, np.cos(th),0],
|
26 |
-
[0,0,0,1]]).float()
|
27 |
-
|
28 |
-
|
29 |
-
def pose_spherical(theta, phi, radius):
|
30 |
-
c2w = trans_t(radius)
|
31 |
-
c2w = rot_phi(phi/180.*np.pi) @ c2w
|
32 |
-
c2w = rot_theta(theta/180.*np.pi) @ c2w
|
33 |
-
c2w = torch.Tensor(np.array([[-1,0,0,0],[0,0,1,0],[0,1,0,0],[0,0,0,1]])) @ c2w
|
34 |
-
return c2w
|
35 |
-
|
36 |
-
class NSVF(Dataset):
|
37 |
-
"""NSVF Generic Dataset."""
|
38 |
-
def __init__(self, datadir, split='train', downsample=1.0, wh=[800,800], is_stack=False):
|
39 |
-
self.root_dir = datadir
|
40 |
-
self.split = split
|
41 |
-
self.is_stack = is_stack
|
42 |
-
self.downsample = downsample
|
43 |
-
self.img_wh = (int(wh[0]/downsample),int(wh[1]/downsample))
|
44 |
-
self.define_transforms()
|
45 |
-
|
46 |
-
self.white_bg = True
|
47 |
-
self.near_far = [0.5,6.0]
|
48 |
-
self.scene_bbox = torch.from_numpy(np.loadtxt(f'{self.root_dir}/bbox.txt')).float()[:6].view(2,3)
|
49 |
-
self.blender2opencv = np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])
|
50 |
-
self.read_meta()
|
51 |
-
self.define_proj_mat()
|
52 |
-
|
53 |
-
self.center = torch.mean(self.scene_bbox, axis=0).float().view(1, 1, 3)
|
54 |
-
self.radius = (self.scene_bbox[1] - self.center).float().view(1, 1, 3)
|
55 |
-
|
56 |
-
def bbox2corners(self):
|
57 |
-
corners = self.scene_bbox.unsqueeze(0).repeat(4,1,1)
|
58 |
-
for i in range(3):
|
59 |
-
corners[i,[0,1],i] = corners[i,[1,0],i]
|
60 |
-
return corners.view(-1,3)
|
61 |
-
|
62 |
-
|
63 |
-
def read_meta(self):
|
64 |
-
with open(os.path.join(self.root_dir, "intrinsics.txt")) as f:
|
65 |
-
focal = float(f.readline().split()[0])
|
66 |
-
self.intrinsics = np.array([[focal,0,400.0],[0,focal,400.0],[0,0,1]])
|
67 |
-
self.intrinsics[:2] *= (np.array(self.img_wh)/np.array([800,800])).reshape(2,1)
|
68 |
-
|
69 |
-
pose_files = sorted(os.listdir(os.path.join(self.root_dir, 'pose')))
|
70 |
-
img_files = sorted(os.listdir(os.path.join(self.root_dir, 'rgb')))
|
71 |
-
|
72 |
-
if self.split == 'train':
|
73 |
-
pose_files = [x for x in pose_files if x.startswith('0_')]
|
74 |
-
img_files = [x for x in img_files if x.startswith('0_')]
|
75 |
-
elif self.split == 'val':
|
76 |
-
pose_files = [x for x in pose_files if x.startswith('1_')]
|
77 |
-
img_files = [x for x in img_files if x.startswith('1_')]
|
78 |
-
elif self.split == 'test':
|
79 |
-
test_pose_files = [x for x in pose_files if x.startswith('2_')]
|
80 |
-
test_img_files = [x for x in img_files if x.startswith('2_')]
|
81 |
-
if len(test_pose_files) == 0:
|
82 |
-
test_pose_files = [x for x in pose_files if x.startswith('1_')]
|
83 |
-
test_img_files = [x for x in img_files if x.startswith('1_')]
|
84 |
-
pose_files = test_pose_files
|
85 |
-
img_files = test_img_files
|
86 |
-
|
87 |
-
# ray directions for all pixels, same for all images (same H, W, focal)
|
88 |
-
self.directions = get_ray_directions(self.img_wh[1], self.img_wh[0], [self.intrinsics[0,0],self.intrinsics[1,1]], center=self.intrinsics[:2,2]) # (h, w, 3)
|
89 |
-
self.directions = self.directions / torch.norm(self.directions, dim=-1, keepdim=True)
|
90 |
-
|
91 |
-
|
92 |
-
self.render_path = torch.stack([pose_spherical(angle, -30.0, 4.0) for angle in np.linspace(-180,180,40+1)[:-1]], 0)
|
93 |
-
|
94 |
-
self.poses = []
|
95 |
-
self.all_rays = []
|
96 |
-
self.all_rgbs = []
|
97 |
-
|
98 |
-
assert len(img_files) == len(pose_files)
|
99 |
-
for img_fname, pose_fname in tqdm(zip(img_files, pose_files), desc=f'Loading data {self.split} ({len(img_files)})'):
|
100 |
-
image_path = os.path.join(self.root_dir, 'rgb', img_fname)
|
101 |
-
img = Image.open(image_path)
|
102 |
-
if self.downsample!=1.0:
|
103 |
-
img = img.resize(self.img_wh, Image.LANCZOS)
|
104 |
-
img = self.transform(img) # (4, h, w)
|
105 |
-
img = img.view(img.shape[0], -1).permute(1, 0) # (h*w, 4) RGBA
|
106 |
-
if img.shape[-1]==4:
|
107 |
-
img = img[:, :3] * img[:, -1:] + (1 - img[:, -1:]) # blend A to RGB
|
108 |
-
self.all_rgbs += [img]
|
109 |
-
|
110 |
-
c2w = np.loadtxt(os.path.join(self.root_dir, 'pose', pose_fname)) #@ self.blender2opencv
|
111 |
-
c2w = torch.FloatTensor(c2w)
|
112 |
-
self.poses.append(c2w) # C2W
|
113 |
-
rays_o, rays_d = get_rays(self.directions, c2w) # both (h*w, 3)
|
114 |
-
self.all_rays += [torch.cat([rays_o, rays_d], 1)] # (h*w, 8)
|
115 |
-
|
116 |
-
# w2c = torch.inverse(c2w)
|
117 |
-
#
|
118 |
-
|
119 |
-
self.poses = torch.stack(self.poses)
|
120 |
-
if 'train' == self.split:
|
121 |
-
if self.is_stack:
|
122 |
-
self.all_rays = torch.stack(self.all_rays, 0).reshape(-1,*self.img_wh[::-1], 6) # (len(self.meta['frames])*h*w, 3)
|
123 |
-
self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames])*h*w, 3)
|
124 |
-
else:
|
125 |
-
self.all_rays = torch.cat(self.all_rays, 0) # (len(self.meta['frames])*h*w, 3)
|
126 |
-
self.all_rgbs = torch.cat(self.all_rgbs, 0) # (len(self.meta['frames])*h*w, 3)
|
127 |
-
else:
|
128 |
-
self.all_rays = torch.stack(self.all_rays, 0) # (len(self.meta['frames]),h*w, 3)
|
129 |
-
self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames]),h,w,3)
|
130 |
-
|
131 |
-
|
132 |
-
def define_transforms(self):
|
133 |
-
self.transform = T.ToTensor()
|
134 |
-
|
135 |
-
def define_proj_mat(self):
|
136 |
-
self.proj_mat = torch.from_numpy(self.intrinsics[:3,:3]).unsqueeze(0).float() @ torch.inverse(self.poses)[:,:3]
|
137 |
-
|
138 |
-
def world2ndc(self, points):
|
139 |
-
device = points.device
|
140 |
-
return (points - self.center.to(device)) / self.radius.to(device)
|
141 |
-
|
142 |
-
def __len__(self):
|
143 |
-
if self.split == 'train':
|
144 |
-
return len(self.all_rays)
|
145 |
-
return len(self.all_rgbs)
|
146 |
-
|
147 |
-
def __getitem__(self, idx):
|
148 |
-
|
149 |
-
if self.split == 'train': # use data in the buffers
|
150 |
-
sample = {'rays': self.all_rays[idx],
|
151 |
-
'rgbs': self.all_rgbs[idx]}
|
152 |
-
|
153 |
-
else: # create data for each image separately
|
154 |
-
|
155 |
-
img = self.all_rgbs[idx]
|
156 |
-
rays = self.all_rays[idx]
|
157 |
-
|
158 |
-
sample = {'rays': rays,
|
159 |
-
'rgbs': img}
|
160 |
-
return sample
|
dataLoader/styleLoader.py
DELETED
@@ -1,16 +0,0 @@
-from torch.utils.data import DataLoader
-from torchvision import datasets
-import torchvision.transforms as T
-
-
-def getDataLoader(dataset_path, batch_size, sampler, image_side_length=256, num_workers=2):
-    transform = T.Compose([
-        T.Resize(size=(image_side_length*2, image_side_length*2)),
-        T.RandomCrop(image_side_length),
-        T.ToTensor(),
-    ])
-
-    train_dataset = datasets.ImageFolder(dataset_path, transform=transform)
-    dataloader = DataLoader(train_dataset, batch_size=batch_size, sampler=sampler(len(train_dataset)), num_workers=num_workers)
-
-    return dataloader
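As a quick illustration of how the removed style-image loader above was called (not code from the repo's training script): the sampler argument is a callable that receives the dataset length and returns a torch Sampler, and the dataset path below is an example placeholder.

from torch.utils.data import RandomSampler

def sampler_factory(n):
    # stand-in for the project's SimpleSampler: any Sampler built from the dataset size works
    return RandomSampler(range(n))

style_loader = getDataLoader("data/wikiart", batch_size=1,
                             sampler=sampler_factory,
                             image_side_length=256, num_workers=2)
styles, _ = next(iter(style_loader))   # ImageFolder yields (image_batch, class_index)
print(styles.shape)                    # torch.Size([1, 3, 256, 256])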
dataLoader/tankstemple.py
DELETED
@@ -1,247 +0,0 @@
|
|
1 |
-
import torch
|
2 |
-
from torch.utils.data import Dataset
|
3 |
-
from tqdm import tqdm
|
4 |
-
import os
|
5 |
-
from PIL import Image
|
6 |
-
from torchvision import transforms as T
|
7 |
-
import random
|
8 |
-
|
9 |
-
from .ray_utils import *
|
10 |
-
|
11 |
-
|
12 |
-
def circle(radius=3.5, h=0.0, axis='z', t0=0, r=1):
|
13 |
-
if axis == 'z':
|
14 |
-
return lambda t: [radius * np.cos(r * t + t0), radius * np.sin(r * t + t0), h]
|
15 |
-
elif axis == 'y':
|
16 |
-
return lambda t: [radius * np.cos(r * t + t0), h, radius * np.sin(r * t + t0)]
|
17 |
-
else:
|
18 |
-
return lambda t: [h, radius * np.cos(r * t + t0), radius * np.sin(r * t + t0)]
|
19 |
-
|
20 |
-
|
21 |
-
def cross(x, y, axis=0):
|
22 |
-
T = torch if isinstance(x, torch.Tensor) else np
|
23 |
-
return T.cross(x, y, axis)
|
24 |
-
|
25 |
-
|
26 |
-
def normalize(x, axis=-1, order=2):
|
27 |
-
if isinstance(x, torch.Tensor):
|
28 |
-
l2 = x.norm(p=order, dim=axis, keepdim=True)
|
29 |
-
return x / (l2 + 1e-8), l2
|
30 |
-
|
31 |
-
else:
|
32 |
-
l2 = np.linalg.norm(x, order, axis)
|
33 |
-
l2 = np.expand_dims(l2, axis)
|
34 |
-
l2[l2 == 0] = 1
|
35 |
-
return x / l2,
|
36 |
-
|
37 |
-
|
38 |
-
def cat(x, axis=1):
|
39 |
-
if isinstance(x[0], torch.Tensor):
|
40 |
-
return torch.cat(x, dim=axis)
|
41 |
-
return np.concatenate(x, axis=axis)
|
42 |
-
|
43 |
-
|
44 |
-
def look_at_rotation(camera_position, at=None, up=None, inverse=False, cv=False):
|
45 |
-
"""
|
46 |
-
This function takes a vector 'camera_position' which specifies the location
|
47 |
-
of the camera in world coordinates and two vectors `at` and `up` which
|
48 |
-
indicate the position of the object and the up directions of the world
|
49 |
-
coordinate system respectively. The object is assumed to be centered at
|
50 |
-
the origin.
|
51 |
-
The output is a rotation matrix representing the transformation
|
52 |
-
from world coordinates -> view coordinates.
|
53 |
-
Input:
|
54 |
-
camera_position: 3
|
55 |
-
at: 1 x 3 or N x 3 (0, 0, 0) in default
|
56 |
-
up: 1 x 3 or N x 3 (0, 1, 0) in default
|
57 |
-
"""
|
58 |
-
|
59 |
-
if at is None:
|
60 |
-
at = torch.zeros_like(camera_position)
|
61 |
-
else:
|
62 |
-
at = torch.tensor(at).type_as(camera_position)
|
63 |
-
if up is None:
|
64 |
-
up = torch.zeros_like(camera_position)
|
65 |
-
up[2] = -1
|
66 |
-
else:
|
67 |
-
up = torch.tensor(up).type_as(camera_position)
|
68 |
-
|
69 |
-
z_axis = normalize(at - camera_position)[0]
|
70 |
-
x_axis = normalize(cross(up, z_axis))[0]
|
71 |
-
y_axis = normalize(cross(z_axis, x_axis))[0]
|
72 |
-
|
73 |
-
R = cat([x_axis[:, None], y_axis[:, None], z_axis[:, None]], axis=1)
|
74 |
-
return R
|
75 |
-
|
76 |
-
|
77 |
-
def gen_path(pos_gen, at=(0, 0, 0), up=(0, -1, 0), frames=180):
|
78 |
-
c2ws = []
|
79 |
-
for t in range(frames):
|
80 |
-
c2w = torch.eye(4)
|
81 |
-
cam_pos = torch.tensor(pos_gen(t * (360.0 / frames) / 180 * np.pi))
|
82 |
-
cam_rot = look_at_rotation(cam_pos, at=at, up=up, inverse=False, cv=True)
|
83 |
-
c2w[:3, 3], c2w[:3, :3] = cam_pos, cam_rot
|
84 |
-
c2ws.append(c2w)
|
85 |
-
return torch.stack(c2ws)
|
86 |
-
|
87 |
-
class TanksTempleDataset(Dataset):
|
88 |
-
"""NSVF Generic Dataset."""
|
89 |
-
def __init__(self, datadir, split='train', downsample=4.0, wh=[1920,1080], is_stack=False):
|
90 |
-
self.root_dir = datadir
|
91 |
-
self.split = split
|
92 |
-
self.is_stack = is_stack
|
93 |
-
self.downsample = downsample
|
94 |
-
self.img_wh = (int(wh[0]/downsample),int(wh[1]/downsample))
|
95 |
-
self.define_transforms()
|
96 |
-
|
97 |
-
self.white_bg = True
|
98 |
-
self.near_far = [0.01,6.0]
|
99 |
-
self.scene_bbox = torch.from_numpy(np.loadtxt(f'{self.root_dir}/bbox.txt')).float()[:6].view(2,3)*1.2
|
100 |
-
|
101 |
-
self.blender2opencv = np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])
|
102 |
-
self.read_meta()
|
103 |
-
self.define_proj_mat()
|
104 |
-
|
105 |
-
self.center = torch.mean(self.scene_bbox, axis=0).float().view(1, 1, 3)
|
106 |
-
self.radius = (self.scene_bbox[1] - self.center).float().view(1, 1, 3)
|
107 |
-
|
108 |
-
def bbox2corners(self):
|
109 |
-
corners = self.scene_bbox.unsqueeze(0).repeat(4,1,1)
|
110 |
-
for i in range(3):
|
111 |
-
corners[i,[0,1],i] = corners[i,[1,0],i]
|
112 |
-
return corners.view(-1,3)
|
113 |
-
|
114 |
-
|
115 |
-
def read_meta(self):
|
116 |
-
|
117 |
-
self.intrinsics = np.loadtxt(os.path.join(self.root_dir, "intrinsics.txt"))
|
118 |
-
self.intrinsics[:2] *= (np.array(self.img_wh)/np.array([1920,1080])).reshape(2,1)
|
119 |
-
pose_files = sorted(os.listdir(os.path.join(self.root_dir, 'pose')))
|
120 |
-
img_files = sorted(os.listdir(os.path.join(self.root_dir, 'rgb')))
|
121 |
-
|
122 |
-
if self.split == 'train':
|
123 |
-
pose_files = [x for idx,x in enumerate(pose_files) if x.startswith('0_') and idx%3==0]
|
124 |
-
img_files = [x for idx,x in enumerate(img_files) if x.startswith('0_') and idx%3==0]
|
125 |
-
elif self.split == 'test':
|
126 |
-
pose_files = [x for idx,x in enumerate(pose_files) if x.startswith('2_') and idx%3==0]
|
127 |
-
img_files = [x for idx,x in enumerate(img_files) if x.startswith('2_') and idx%3==0]
|
128 |
-
if len(test_pose_files) == 0:
|
129 |
-
test_pose_files = [x for idx,x in enumerate(pose_files) if x.startswith('1_') and idx%3==0]
|
130 |
-
test_img_files = [x for idx,x in enumerate(img_files) if x.startswith('1_') and idx%3==0]
|
131 |
-
pose_files = test_pose_files
|
132 |
-
img_files = test_img_files
|
133 |
-
|
134 |
-
|
135 |
-
|
136 |
-
# ray directions for all pixels, same for all images (same H, W, focal)
|
137 |
-
self.directions = get_ray_directions(self.img_wh[1], self.img_wh[0], [self.intrinsics[0,0],self.intrinsics[1,1]], center=self.intrinsics[:2,2]) # (h, w, 3)
|
138 |
-
self.directions = self.directions / torch.norm(self.directions, dim=-1, keepdim=True)
|
139 |
-
|
140 |
-
w, h = self.img_wh
|
141 |
-
|
142 |
-
self.poses = []
|
143 |
-
self.all_rays = []
|
144 |
-
self.all_rgbs = []
|
145 |
-
self.all_masks = []
|
146 |
-
|
147 |
-
assert len(img_files) == len(pose_files)
|
148 |
-
for img_fname, pose_fname in tqdm(zip(img_files, pose_files), desc=f'Loading data {self.split} ({len(img_files)})'):
|
149 |
-
image_path = os.path.join(self.root_dir, 'rgb', img_fname)
|
150 |
-
img = Image.open(image_path)
|
151 |
-
if self.downsample!=1.0:
|
152 |
-
img = img.resize(self.img_wh, Image.LANCZOS)
|
153 |
-
img = self.transform(img) # (3, h, w)
|
154 |
-
img = img.view(img.shape[0], -1).permute(1, 0) # (h*w, 3) RGBA
|
155 |
-
mask = torch.where(
|
156 |
-
img.sum(-1, keepdim=True) == 3.,
|
157 |
-
1.,
|
158 |
-
0.
|
159 |
-
)
|
160 |
-
self.all_masks.append(mask.reshape(h,w,1)) # (h, w, 1) A
|
161 |
-
|
162 |
-
if img.shape[-1]==4:
|
163 |
-
img = img[:, :3] * img[:, -1:] + (1 - img[:, -1:]) # blend A to RGB
|
164 |
-
self.all_rgbs.append(img)
|
165 |
-
|
166 |
-
|
167 |
-
c2w = np.loadtxt(os.path.join(self.root_dir, 'pose', pose_fname))# @ cam_trans
|
168 |
-
c2w = torch.FloatTensor(c2w)
|
169 |
-
self.poses.append(c2w) # C2W
|
170 |
-
rays_o, rays_d = get_rays(self.directions, c2w) # both (h*w, 3)
|
171 |
-
self.all_rays += [torch.cat([rays_o, rays_d], 1)] # (h*w, 8)
|
172 |
-
|
173 |
-
self.poses = torch.stack(self.poses)
|
174 |
-
|
175 |
-
center = torch.mean(self.scene_bbox, dim=0)
|
176 |
-
radius = torch.norm(self.scene_bbox[1]-center)*1.2
|
177 |
-
up = torch.mean(self.poses[:, :3, 1], dim=0).tolist()
|
178 |
-
pos_gen = circle(radius=radius, h=-0.2*up[1], axis='y')
|
179 |
-
self.render_path = gen_path(pos_gen, up=up,frames=100)
|
180 |
-
self.render_path[:, :3, 3] += center
|
181 |
-
|
182 |
-
|
183 |
-
all_rays = self.all_rays
|
184 |
-
all_rgbs = self.all_rgbs
|
185 |
-
self.all_masks = torch.stack(self.all_masks) # (n_frames, h, w, 1)
|
186 |
-
self.all_rays = torch.cat(self.all_rays, 0) # (len(self.meta['frames])*h*w,6)
|
187 |
-
self.all_rgbs = torch.cat(self.all_rgbs, 0) # (len(self.meta['frames])*h*w,3)
|
188 |
-
|
189 |
-
if self.is_stack:
|
190 |
-
self.all_rays_stack = torch.stack(all_rays, 0).reshape(-1,*self.img_wh[::-1], 6) # (len(self.meta['frames]),h,w,6)
|
191 |
-
avg_pool = torch.nn.AvgPool2d(4, ceil_mode=True)
|
192 |
-
self.ds_all_rays_stack = avg_pool(self.all_rays_stack.permute(0,3,1,2)).permute(0,2,3,1) # (len(self.meta['frames]),h/4,w/4,6)
|
193 |
-
self.all_rgbs_stack = torch.stack(all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames]),h,w,3)
|
194 |
-
|
195 |
-
@torch.no_grad()
|
196 |
-
def prepare_feature_data(self, encoder, chunk=4):
|
197 |
-
'''
|
198 |
-
Prepare feature maps as training data.
|
199 |
-
'''
|
200 |
-
assert self.is_stack, 'Dataset should contain original stacked taining data!'
|
201 |
-
print('====> prepare_feature_data ...')
|
202 |
-
|
203 |
-
frames_num, h, w, _ = self.all_rgbs_stack.size()
|
204 |
-
features = []
|
205 |
-
|
206 |
-
for chunk_idx in tqdm(range(frames_num // chunk + int(frames_num % chunk > 0))):
|
207 |
-
rgbs_chunk = self.all_rgbs_stack[chunk_idx*chunk : (chunk_idx+1)*chunk].cuda()
|
208 |
-
features_chunk = encoder(normalize_vgg(rgbs_chunk.permute(0,3,1,2))).relu3_1
|
209 |
-
# resize to the size of rgb map so that rays can match
|
210 |
-
features_chunk = T.functional.resize(features_chunk, size=(h,w),
|
211 |
-
interpolation=T.InterpolationMode.BILINEAR)
|
212 |
-
features.append(features_chunk.detach().cpu().requires_grad_(False))
|
213 |
-
|
214 |
-
self.all_features_stack = torch.cat(features).permute(0,2,3,1) # (len(self.meta['frames]),h,w,256)
|
215 |
-
self.all_features = self.all_features_stack.reshape(-1, 256)
|
216 |
-
print('prepare_feature_data Done!')
|
217 |
-
|
218 |
-
|
219 |
-
def define_transforms(self):
|
220 |
-
self.transform = T.ToTensor()
|
221 |
-
|
222 |
-
def define_proj_mat(self):
|
223 |
-
self.proj_mat = torch.from_numpy(self.intrinsics[:3,:3]).unsqueeze(0).float() @ torch.inverse(self.poses)[:,:3]
|
224 |
-
|
225 |
-
def world2ndc(self, points):
|
226 |
-
device = points.device
|
227 |
-
return (points - self.center.to(device)) / self.radius.to(device)
|
228 |
-
|
229 |
-
def __len__(self):
|
230 |
-
if self.split == 'train':
|
231 |
-
return len(self.all_rays)
|
232 |
-
return len(self.all_rgbs)
|
233 |
-
|
234 |
-
def __getitem__(self, idx):
|
235 |
-
|
236 |
-
if self.split == 'train': # use data in the buffers
|
237 |
-
sample = {'rays': self.all_rays[idx],
|
238 |
-
'rgbs': self.all_rgbs[idx]}
|
239 |
-
|
240 |
-
else: # create data for each image separately
|
241 |
-
|
242 |
-
img = self.all_rgbs[idx]
|
243 |
-
rays = self.all_rays[idx]
|
244 |
-
|
245 |
-
sample = {'rays': rays,
|
246 |
-
'rgbs': img}
|
247 |
-
return sample
|
dataLoader/your_own_data.py
DELETED
@@ -1,129 +0,0 @@
|
|
1 |
-
import torch,cv2
|
2 |
-
from torch.utils.data import Dataset
|
3 |
-
import json
|
4 |
-
from tqdm import tqdm
|
5 |
-
import os
|
6 |
-
from PIL import Image
|
7 |
-
from torchvision import transforms as T
|
8 |
-
|
9 |
-
|
10 |
-
from .ray_utils import *
|
11 |
-
|
12 |
-
|
13 |
-
class YourOwnDataset(Dataset):
|
14 |
-
def __init__(self, datadir, split='train', downsample=1.0, is_stack=False, N_vis=-1):
|
15 |
-
|
16 |
-
self.N_vis = N_vis
|
17 |
-
self.root_dir = datadir
|
18 |
-
self.split = split
|
19 |
-
-        self.is_stack = is_stack
-        self.downsample = downsample
-        self.define_transforms()
-
-        self.scene_bbox = torch.tensor([[-1.5, -1.5, -1.5], [1.5, 1.5, 1.5]])
-        self.blender2opencv = np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])
-        self.read_meta()
-        self.define_proj_mat()
-
-        self.white_bg = True
-        self.near_far = [0.1, 100.0]
-
-        self.center = torch.mean(self.scene_bbox, axis=0).float().view(1, 1, 3)
-        self.radius = (self.scene_bbox[1] - self.center).float().view(1, 1, 3)
-        self.downsample = downsample
-
-    def read_depth(self, filename):
-        depth = np.array(read_pfm(filename)[0], dtype=np.float32)  # (800, 800)
-        return depth
-
-    def read_meta(self):
-
-        with open(os.path.join(self.root_dir, f"transforms_{self.split}.json"), 'r') as f:
-            self.meta = json.load(f)
-
-        w, h = int(self.meta['w'] / self.downsample), int(self.meta['h'] / self.downsample)
-        self.img_wh = [w, h]
-        self.focal_x = 0.5 * w / np.tan(0.5 * self.meta['camera_angle_x'])  # original focal length
-        self.focal_y = 0.5 * h / np.tan(0.5 * self.meta['camera_angle_y'])  # original focal length
-        self.cx, self.cy = self.meta['cx'], self.meta['cy']
-
-        # ray directions for all pixels, same for all images (same H, W, focal)
-        self.directions = get_ray_directions(h, w, [self.focal_x, self.focal_y], center=[self.cx, self.cy])  # (h, w, 3)
-        self.directions = self.directions / torch.norm(self.directions, dim=-1, keepdim=True)
-        self.intrinsics = torch.tensor([[self.focal_x, 0, self.cx], [0, self.focal_y, self.cy], [0, 0, 1]]).float()
-
-        self.image_paths = []
-        self.poses = []
-        self.all_rays = []
-        self.all_rgbs = []
-        self.all_masks = []
-        self.all_depth = []
-
-        img_eval_interval = 1 if self.N_vis < 0 else len(self.meta['frames']) // self.N_vis
-        idxs = list(range(0, len(self.meta['frames']), img_eval_interval))
-        for i in tqdm(idxs, desc=f'Loading data {self.split} ({len(idxs)})'):  # img_list
-
-            frame = self.meta['frames'][i]
-            pose = np.array(frame['transform_matrix']) @ self.blender2opencv
-            c2w = torch.FloatTensor(pose)
-            self.poses += [c2w]
-
-            image_path = os.path.join(self.root_dir, f"{frame['file_path']}.png")
-            self.image_paths += [image_path]
-            img = Image.open(image_path)
-
-            if self.downsample != 1.0:
-                img = img.resize(self.img_wh, Image.LANCZOS)
-            img = self.transform(img)  # (4, h, w)
-            img = img.view(-1, w * h).permute(1, 0)  # (h*w, 4) RGBA
-            if img.shape[-1] == 4:
-                img = img[:, :3] * img[:, -1:] + (1 - img[:, -1:])  # blend A to RGB
-            self.all_rgbs += [img]
-
-            rays_o, rays_d = get_rays(self.directions, c2w)  # both (h*w, 3)
-            self.all_rays += [torch.cat([rays_o, rays_d], 1)]  # (h*w, 6)
-
-        self.poses = torch.stack(self.poses)
-        if not self.is_stack:
-            self.all_rays = torch.cat(self.all_rays, 0)  # (len(self.meta['frames'])*h*w, 3)
-            self.all_rgbs = torch.cat(self.all_rgbs, 0)  # (len(self.meta['frames'])*h*w, 3)
-            # self.all_depth = torch.cat(self.all_depth, 0)  # (len(self.meta['frames'])*h*w, 3)
-        else:
-            self.all_rays = torch.stack(self.all_rays, 0)  # (len(self.meta['frames']), h*w, 3)
-            self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1, *self.img_wh[::-1], 3)  # (len(self.meta['frames']), h, w, 3)
-            # self.all_masks = torch.stack(self.all_masks, 0).reshape(-1, *self.img_wh[::-1])  # (len(self.meta['frames']), h, w)
-
-    def define_transforms(self):
-        self.transform = T.ToTensor()
-
-    def define_proj_mat(self):
-        self.proj_mat = self.intrinsics.unsqueeze(0) @ torch.inverse(self.poses)[:, :3]
-
-    def world2ndc(self, points, lindisp=None):
-        device = points.device
-        return (points - self.center.to(device)) / self.radius.to(device)
-
-    def __len__(self):
-        return len(self.all_rgbs)
-
-    def __getitem__(self, idx):
-
-        if self.split == 'train':  # use data in the buffers
-            sample = {'rays': self.all_rays[idx],
-                      'rgbs': self.all_rgbs[idx]}
-
-        else:  # create data for each image separately
-
-            img = self.all_rgbs[idx]
-            rays = self.all_rays[idx]
-            mask = self.all_masks[idx]  # for quantity evaluation
-
-            sample = {'rays': rays,
-                      'rgbs': img}
-        return sample
extra/auto_run_paramsets.py
DELETED
@@ -1,207 +0,0 @@
-import os
-import threading, queue
-import numpy as np
-import time
-
-
-def getFolderLocker(logFolder):
-    while True:
-        try:
-            os.makedirs(logFolder+"/lockFolder")
-            break
-        except:
-            time.sleep(0.01)
-
-def releaseFolderLocker(logFolder):
-    os.removedirs(logFolder+"/lockFolder")
-
-def getStopFolder(logFolder):
-    return os.path.isdir(logFolder+"/stopFolder")
-
-
-def get_param_str(key, val):
-    if key == 'data_name':
-        return f'--datadir {datafolder}/{val} '
-    else:
-        return f'--{key} {val} '
-
-def get_param_list(param_dict):
-    param_keys = list(param_dict.keys())
-    param_modes = len(param_keys)
-    param_nums = [len(param_dict[key]) for key in param_keys]
-
-    param_ids = np.zeros(param_nums+[param_modes], dtype=int)
-    for i in range(param_modes):
-        broad_tuple = np.ones(param_modes, dtype=int).tolist()
-        broad_tuple[i] = param_nums[i]
-        broad_tuple = tuple(broad_tuple)
-        print(broad_tuple)
-        param_ids[..., i] = np.arange(param_nums[i]).reshape(broad_tuple)
-    param_ids = param_ids.reshape(-1, param_modes)
-    # print(param_ids)
-    print(len(param_ids))
-
-    params = []
-    expnames = []
-    for i in range(param_ids.shape[0]):
-        one = ""
-        name = ""
-        param_id = param_ids[i]
-        for j in range(param_modes):
-            key = param_keys[j]
-            val = param_dict[key][param_id[j]]
-            if type(key) is tuple:
-                assert len(key) == len(val)
-                for k in range(len(key)):
-                    one += get_param_str(key[k], val[k])
-                    name += f'{val[k]},'
-                name = name[:-1] + '-'
-            else:
-                one += get_param_str(key, val)
-                name += f'{val}-'
-        params.append(one)
-        name = name.replace(' ', '')
-        print(name)
-        expnames.append(name[:-1])
-    # print(params)
-    return params, expnames
-
-
-if __name__ == '__main__':
-
-    # nerf
-    expFolder = "nerf/"
-    # parameters to iterate, use tuple to couple multiple parameters
-    datafolder = '/mnt/new_disk_2/anpei/Dataset/nerf_synthetic/'
-    param_dict = {
-        'data_name': ['ship', 'mic', 'chair', 'lego', 'drums', 'ficus', 'hotdog', 'materials'],
-        'data_dim_color': [13, 27, 54]
-    }
-
-    # n_iters = 30000
-    # for data_name in ['Robot']:  # 'Bike','Lifestyle','Palace','Robot','Spaceship','Steamtrain','Toad','Wineholder'
-    #     cmd = f'CUDA_VISIBLE_DEVICES={cuda} python train.py ' \
-    #           f'--dataset_name nsvf --datadir /mnt/new_disk_2/anpei/Dataset/TeRF/Synthetic_NSVF/{data_name} ' \
-    #           f'--expname {data_name} --batch_size {batch_size} ' \
-    #           f'--n_iters {n_iters} ' \
-    #           f'--N_voxel_init {128**3} --N_voxel_final {300**3} ' \
-    #           f'--N_vis {5} ' \
-    #           f'--n_lamb_sigma "[16,16,16]" --n_lamb_sh "[48,48,48]" ' \
-    #           f'--upsamp_list "[2000, 3000, 4000, 5500,7000]" --update_AlphaMask_list "[3000,4000]" ' \
-    #           f'--shadingMode MLP_Fea --fea2denseAct softplus --view_pe {2} --fea_pe {2} ' \
-    #           f'--L1_weight_inital {8e-5} --L1_weight_rest {4e-5} --rm_weight_mask_thre {1e-4} --add_timestamp 0 ' \
-    #           f'--render_test 1 '
-    #     print(cmd)
-    #     os.system(cmd)
-
-    # nsvf
-    # expFolder = "nsvf_0227/"
-    # datafolder = '/mnt/new_disk_2/anpei/Dataset/TeRF/Synthetic_NSVF/'
-    # param_dict = {
-    #     'data_name': ['Robot','Steamtrain','Bike','Lifestyle','Palace','Spaceship','Toad','Wineholder'],
-    #     'shadingMode': ['SH'],
-    #     ('n_lamb_sigma', 'n_lamb_sh'): [("[8,8,8]", "[8,8,8]")],
-    #     ('view_pe', 'fea_pe', 'featureC','fea2denseAct','N_voxel_init'): [(2, 2, 128, 'softplus',128**3)],
-    #     ('L1_weight_inital', 'L1_weight_rest', 'rm_weight_mask_thre'): [(4e-5, 4e-5, 1e-4)],
-    #     ('n_iters','N_voxel_final'): [(30000,300**3)],
-    #     ('dataset_name','N_vis','render_test'): [("nsvf",5,1)],
-    #     ('upsamp_list','update_AlphaMask_list'): [("[2000,3000,4000,5500,7000]","[3000,4000]")]
-    # }
-
-    # tankstemple
-    # expFolder = "tankstemple_0304/"
-    # datafolder = '/mnt/new_disk_2/anpei/Dataset/TeRF/TanksAndTemple/'
-    # param_dict = {
-    #     'data_name': ['Truck','Barn','Caterpillar','Family','Ignatius'],
-    #     'shadingMode': ['MLP_Fea'],
-    #     ('n_lamb_sigma', 'n_lamb_sh'): [("[16,16,16]", "[48,48,48]")],
-    #     ('view_pe', 'fea_pe','fea2denseAct','N_voxel_init','render_test'): [(2, 2, 'softplus',128**3,1)],
-    #     ('TV_weight_density','TV_weight_app'): [(0.1,0.01)],
-    #     # ('L1_weight_inital', 'L1_weight_rest', 'rm_weight_mask_thre'): [(4e-5, 4e-5, 1e-4)],
-    #     ('n_iters','N_voxel_final'): [(15000,300**3)],
-    #     ('dataset_name','N_vis'): [("tankstemple",5)],
-    #     ('upsamp_list','update_AlphaMask_list'): [("[2000,3000,4000,5500,7000]","[2000,4000]")]
-    # }
-
-    # llff (real iconic)
-    # expFolder = "real_iconic/"
-    # datafolder = '/mnt/new_disk_2/anpei/Dataset/MVSNeRF/real_iconic/'
-    # List = os.listdir(datafolder)
-    # param_dict = {
-    #     'data_name': List,
-    #     ('shadingMode', 'view_pe', 'fea_pe','fea2denseAct', 'nSamples','N_voxel_init'): [('MLP_Fea', 0, 0, 'relu',512,128**3)],
-    #     ('n_lamb_sigma', 'n_lamb_sh'): [("[16,4,4]", "[48,12,12]")],
-    #     ('TV_weight_density', 'TV_weight_app'): [(1.0,1.0)],
-    #     ('n_iters','N_voxel_final'): [(25000,640**3)],
-    #     ('dataset_name','downsample_train','ndc_ray','N_vis','render_path'): [("llff",4.0, 1,-1,1)],
-    #     ('upsamp_list','update_AlphaMask_list'): [("[2000,3000,4000,5500,7000]","[2500]")],
-    # }
-
-    # llff
-    # expFolder = "llff/"
-    # datafolder = '/mnt/new_disk_2/anpei/Dataset/MVSNeRF/nerf_llff_data'
-    # param_dict = {
-    #     'data_name': ['fern', 'flower', 'room', 'leaves', 'horns', 'trex', 'fortress', 'orchids'],
-    #     ('n_lamb_sigma', 'n_lamb_sh'): [("[16,4,4]", "[48,12,12]")],
-    #     ('shadingMode', 'view_pe', 'fea_pe', 'featureC','fea2denseAct', 'nSamples','N_voxel_init'): [('MLP_Fea', 0, 0, 128, 'relu',512,128**3),('SH', 0, 0, 128, 'relu',512,128**3)],
-    #     ('TV_weight_density', 'TV_weight_app'): [(1.0,1.0)],
-    #     ('n_iters','N_voxel_final'): [(25000,640**3)],
-    #     ('dataset_name','downsample_train','ndc_ray','N_vis','render_test','render_path'): [("llff",4.0, 1,-1,1,1)],
-    #     ('upsamp_list','update_AlphaMask_list'): [("[2000,3000,4000,5500,7000]","[2500]")],
-    # }
-
-    # setting available gpus
-    gpus_que = queue.Queue(3)
-    for i in [1,2,3]:
-        gpus_que.put(i)
-
-    os.makedirs(f"log/{expFolder}", exist_ok=True)
-
-    def run_program(gpu, expname, param):
-        cmd = f'CUDA_VISIBLE_DEVICES={gpu} python train.py ' \
-              f'--expname {expname} --basedir ./log/{expFolder} --config configs/lego.txt ' \
-              f'{param}' \
-              f'> "log/{expFolder}{expname}/{expname}.txt"'
-        print(cmd)
-        os.system(cmd)
-        gpus_que.put(gpu)
-
-    params, expnames = get_param_list(param_dict)
-
-    logFolder = f"log/{expFolder}"
-    os.makedirs(logFolder, exist_ok=True)
-
-    ths = []
-    for i in range(len(params)):
-
-        if getStopFolder(logFolder):
-            break
-
-        targetFolder = f"log/{expFolder}{expnames[i]}"
-        gpu = gpus_que.get()
-        getFolderLocker(logFolder)
-        if os.path.isdir(targetFolder):
-            releaseFolderLocker(logFolder)
-            gpus_que.put(gpu)
-            continue
-        else:
-            os.makedirs(targetFolder, exist_ok=True)
-            print("making", targetFolder, "running", expnames[i], params[i])
-            releaseFolderLocker(logFolder)
-
-        t = threading.Thread(target=run_program, args=(gpu, expnames[i], params[i]), daemon=True)
-        t.start()
-        ths.append(t)
-
-    for th in ths:
-        th.join()
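For reference, `get_param_list` in the deleted script expands `param_dict` into the Cartesian product of its value lists and returns one CLI-argument string plus one experiment name per combination. A trimmed-down, illustrative grid (not one of the grids from the file):

    param_dict = {
        'data_name': ['lego', 'ship'],       # rendered as --datadir {datafolder}/{val}
        'data_dim_color': [13, 27, 54],      # rendered as --data_dim_color {val}
    }
    params, expnames = get_param_list(param_dict)
    # 2 * 3 = 6 combinations; e.g. expnames[0] == 'lego-13' and
    # params[0] == '--datadir <datafolder>/lego --data_dim_color 13 '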
inference.py
ADDED
@@ -0,0 +1,75 @@
+import os
+from pathlib import Path
+
+import torch
+from lightning_fabric import seed_everything
+from PIL import Image, ImageFile
+
+from src.dataset import DATASET_REGISTRY
+from src.decoder import DECODER_REGISTRY
+from src.utils.opt import Opts
+import torchvision.transforms as T
+
+from src.utils.renderer import evaluation_feature, evaluation_feature_path, OctreeRender_trilinear_fast
+
+
+def inference(cfg, render_mode: str, image=None):
+    device = "cuda" if torch.cuda.is_available() else "cpu"
+
+    ckpt = torch.load(cfg["model"]["tensorf"]["ckpt"], map_location=device)
+    kwargs = ckpt['kwargs']
+    kwargs.update({'device': device})
+    print(device)
+    tensorf = DECODER_REGISTRY.get(cfg["model"]["tensorf"]["model_name"])(**kwargs)
+    tensorf.change_to_feature_mod(cfg["model"]["tensorf"]["lamb_sh"], device)
+    tensorf.change_to_style_mod(device)
+    tensorf.load(ckpt)
+    tensorf.eval()
+    tensorf.rayMarch_weight_thres = cfg["model"]["tensorf"]["rm_weight_mask_thre"]
+
+    logfolder = os.path.dirname("./checkpoints")
+    renderer = OctreeRender_trilinear_fast
+
+    trans = T.Compose([T.Resize(size=(256, 256)), T.ToTensor()])
+    if image:
+        style_img = trans(image).cuda()[None, ...]
+    else:
+        style_img = trans(Image.open(cfg["global"]["style_img"])).cuda()[None, ...]
+    style_name = Path(cfg["global"]["style_img"]).stem
+
+    if render_mode == "render_train":
+        dataset = DATASET_REGISTRY.get(cfg["dataset"]["name"])(
+            **cfg["dataset"]["train"]["params"],
+        )
+        os.makedirs(f'{logfolder}/{cfg["global"]["expname"]}/imgs_train_all/{style_name}', exist_ok=True)
+        result = evaluation_feature(dataset, tensorf, renderer, cfg["sampler"]["params"]["chunk_size"],
+                                    f'{logfolder}/{cfg["global"]["expname"]}/imgs_train_all/{style_name}',
+                                    N_vis=-1, N_samples=-1, white_bg=dataset.white_bg, ndc_ray=cfg["model"]["tensorf"]["ndc_ray"],
+                                    style_img=style_img, device=device)
+
+    if render_mode == "render_test":
+        dataset = DATASET_REGISTRY.get(cfg["dataset"]["name"])(
+            **cfg["dataset"]["val"]["params"],
+        )
+        os.makedirs(f'{logfolder}/{cfg["global"]["expname"]}/imgs_train_all/{style_name}', exist_ok=True)
+        result = evaluation_feature(dataset, tensorf, renderer, cfg["sampler"]["params"]["chunk_size"],
+                                    f'{logfolder}/{cfg["global"]["expname"]}/imgs_train_all/{style_name}',
+                                    N_vis=-1, N_samples=-1, white_bg=dataset.white_bg, ndc_ray=cfg["model"]["tensorf"]["ndc_ray"],
+                                    style_img=style_img, device=device)
+
+    if render_mode == "render_path":
+        dataset = DATASET_REGISTRY.get(cfg["dataset"]["name"])(
+            **cfg["dataset"]["val"]["params"],
+        )
+        c2ws = dataset.render_path
+        os.makedirs(f'{logfolder}/{cfg["global"]["expname"]}/imgs_path_all/{style_name}', exist_ok=True)
+        result = evaluation_feature_path(dataset, tensorf, c2ws, renderer, cfg["sampler"]["params"]["chunk_size"],
+                                         f'{logfolder}/{cfg["global"]["expname"]}/imgs_path_all/{style_name}',
+                                         N_vis=-1, N_samples=-1, white_bg=dataset.white_bg, ndc_ray=cfg["model"]["tensorf"]["ndc_ray"],
+                                         style_img=style_img, device=device)
+    return result
+
+
+if __name__ == "__main__":
+    cfg = Opts(cfg="configs/style_inference.yml").parse_args()
+    seed_everything(seed=cfg["global"]["SEED"])
+    inference(cfg, "render_test")
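The new `inference.py` is driven entirely by `configs/style_inference.yml` plus a `render_mode` string ('render_train', 'render_test', or 'render_path'); `main.py` below calls it with an in-memory PIL style image. A minimal sketch of that programmatic use, assuming the config already points at a trained checkpoint and dataset (the style-image path is illustrative):

    from PIL import Image
    from inference import inference
    from src.utils.opt import Opts

    cfg = Opts(cfg="configs/style_inference.yml").parse_args()
    style = Image.open("image_style/example.jpg")        # any style image
    result = inference(cfg, "render_test", image=style)
    # main.py plays the stylised render back from result["video_path"]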
main.py
ADDED
@@ -0,0 +1,75 @@
+import streamlit as st
+from PIL import Image
+import imageio.v3 as iio
+from inference import inference
+from src.utils.opt import Opts
+import os
+
+st.set_page_config(layout='wide')
+
+st.markdown(
+    """
+    <style>
+    div[data-testid="column"]:nth-of-type(1)
+    {
+    }
+
+    div[data-testid="column"]:nth-of-type(2)
+    {
+    }
+    </style>
+    """, unsafe_allow_html=True
+)
+
+col1, col2, col3 = st.columns(3)
+
+if 'counter' not in st.session_state:
+    st.session_state.video_path = None
+    st.session_state.image = None
+    st.session_state.counter = 0
+
+def showVideo(image):
+    if st.session_state.image is not None:
+        cfg = Opts(cfg="configs/style_inference.yml").parse_args()
+        result = inference(cfg, "render_test", image=image)
+        st.session_state.video_path = result["video_path"]
+        st.session_state.counter += 1
+    else:
+        col2.write("No uploaded image")
+
+with col1:
+    col1.subheader("Source multiview images")
+    filteredImages = []
+    for image_file in os.listdir('data/nerf_llff_data/trex/streamlit_images'):
+        filteredImages.append(Image.open(os.path.join('data/nerf_llff_data/trex/streamlit_images', image_file)))
+    id = 0
+    for img in range(0, len(filteredImages), 4):
+        cols = col1.columns(4)
+        cols[0].image(filteredImages[id], use_column_width=True)
+        id += 1
+        cols[1].image(filteredImages[id], use_column_width=True)
+        id += 1
+        cols[2].image(filteredImages[id], use_column_width=True)
+        id += 1
+        cols[3].image(filteredImages[id], use_column_width=True)
+        id += 1
+
+with col2:
+    col2.subheader("Style image")
+    uploaded_file = col2.file_uploader("Choose a image file")
+    if uploaded_file:
+        st.session_state.image = Image.open(uploaded_file)
+        img = col2.image(st.session_state.image, caption='Style Image', use_column_width=True)
+        col2.button('Run Style Transfer', on_click=showVideo, args=([st.session_state.image]))
+
+col3.subheader("Style videos")
+if st.session_state.counter > 0:
+    video_file = open(st.session_state.video_path, 'rb')
+    video_bytes = video_file.read()
+    col3.video(video_bytes)
models/__pycache__/__init__.cpython-38.pyc
DELETED
Binary file (142 Bytes)
opt.py
DELETED
@@ -1,153 +0,0 @@
-import configargparse
-
-def config_parser(cmd=None):
-    parser = configargparse.ArgumentParser()
-    parser.add_argument('--config', is_config_file=True,
-                        help='config file path')
-    parser.add_argument("--expname", type=str,
-                        help='experiment name')
-    parser.add_argument("--basedir", type=str, default='./log',
-                        help='where to store ckpts and logs')
-    parser.add_argument("--add_timestamp", type=int, default=0,
-                        help='add timestamp to dir')
-    parser.add_argument("--datadir", type=str, default='./data/llff/fern',
-                        help='input data directory')
-    parser.add_argument("--wikiartdir", type=str, default='./data/WikiArt',
-                        help='input data directory')
-    parser.add_argument("--progress_refresh_rate", type=int, default=10,
-                        help='how many iterations to show psnrs or iters')
-
-    parser.add_argument('--with_depth', action='store_true')
-    parser.add_argument('--downsample_train', type=float, default=1.0)
-    parser.add_argument('--downsample_test', type=float, default=1.0)
-
-    parser.add_argument('--model_name', type=str, default='TensorVMSplit',
-                        choices=['TensorVMSplit', 'TensorCP'])
-
-    # loader options
-    parser.add_argument("--batch_size", type=int, default=4096)
-    parser.add_argument("--n_iters", type=int, default=30000)
-    parser.add_argument('--dataset_name', type=str, default='blender',
-                        choices=['blender', 'llff', 'nsvf', 'dtu', 'tankstemple', 'own_data'])
-
-    # training options
-    parser.add_argument("--patch_size", type=int, default=256,
-                        help='patch_size for training')
-    parser.add_argument("--chunk_size", type=int, default=4096,
-                        help='chunk_size for training')
-    # learning rate
-    parser.add_argument("--lr_init", type=float, default=0.02,
-                        help='learning rate')
-    parser.add_argument("--lr_basis", type=float, default=1e-4,
-                        help='learning rate')
-    parser.add_argument("--lr_finetune", type=float, default=1e-5,
-                        help='learning rate')
-    parser.add_argument("--lr_decay_iters", type=int, default=-1,
-                        help='number of iterations the lr will decay to the target ratio; -1 will set it to n_iters')
-    parser.add_argument("--lr_decay_target_ratio", type=float, default=0.1,
-                        help='the target decay ratio; after decay_iters inital lr decays to lr*ratio')
-    parser.add_argument("--lr_upsample_reset", type=int, default=1,
-                        help='reset lr to inital after upsampling')
-
-    # loss
-    parser.add_argument("--L1_weight_inital", type=float, default=0.0,
-                        help='loss weight')
-    parser.add_argument("--L1_weight_rest", type=float, default=0,
-                        help='loss weight')
-    parser.add_argument("--Ortho_weight", type=float, default=0.0,
-                        help='loss weight')
-    parser.add_argument("--TV_weight_density", type=float, default=0.0,
-                        help='loss weight')
-    parser.add_argument("--TV_weight_app", type=float, default=0.0,
-                        help='loss weight')
-    parser.add_argument("--TV_weight_feature", type=float, default=0.0,
-                        help='loss weight')
-    parser.add_argument("--style_weight", type=float, default=0,
-                        help='loss weight')
-    parser.add_argument("--content_weight", type=float, default=0,
-                        help='loss weight')
-    parser.add_argument("--image_tv_weight", type=float, default=0,
-                        help='loss weight')
-    parser.add_argument("--featuremap_tv_weight", type=float, default=0,
-                        help='loss weight')
-
-    # model
-    # volume options
-    parser.add_argument("--n_lamb_sigma", type=int, action="append")
-    parser.add_argument("--n_lamb_sh", type=int, action="append")
-    parser.add_argument("--data_dim_color", type=int, default=27)
-
-    parser.add_argument("--rm_weight_mask_thre", type=float, default=0.0001,
-                        help='mask points in ray marching')
-    parser.add_argument("--alpha_mask_thre", type=float, default=0.0001,
-                        help='threshold for creating alpha mask volume')
-    parser.add_argument("--distance_scale", type=float, default=25,
-                        help='scaling sampling distance for computation')
-    parser.add_argument("--density_shift", type=float, default=-10,
-                        help='shift density in softplus; making density = 0 when feature == 0')
-
-    # network decoder
-    parser.add_argument("--shadingMode", type=str, default="MLP_Fea",
-                        help='which shading mode to use')
-    parser.add_argument("--pos_pe", type=int, default=6,
-                        help='number of pe for pos')
-    parser.add_argument("--view_pe", type=int, default=6,
-                        help='number of pe for view')
-    parser.add_argument("--fea_pe", type=int, default=6,
-                        help='number of pe for features')
-    parser.add_argument("--featureC", type=int, default=128,
-                        help='hidden feature channel in MLP')
-
-    # test option
-    parser.add_argument("--ckpt", type=str, default=None,
-                        help='specific weights npy file to reload for coarse network')
-    parser.add_argument("--render_only", type=int, default=0)
-    parser.add_argument("--render_test", type=int, default=0)
-    parser.add_argument("--render_train", type=int, default=0)
-    parser.add_argument("--render_path", type=int, default=0)
-    parser.add_argument("--export_mesh", type=int, default=0)
-    parser.add_argument("--style_img", type=str, required=False)
-
-    # rendering options
-    parser.add_argument('--lindisp', default=False, action="store_true",
-                        help='use disparity depth sampling')
-    parser.add_argument("--perturb", type=float, default=1.,
-                        help='set to 0. for no jitter, 1. for jitter')
-    parser.add_argument("--accumulate_decay", type=float, default=0.998)
-    parser.add_argument("--fea2denseAct", type=str, default='softplus')
-    parser.add_argument('--ndc_ray', type=int, default=0)
-    parser.add_argument('--nSamples', type=int, default=1e6,
-                        help='sample point each ray, pass 1e6 if automatic adjust')
-    parser.add_argument('--step_ratio', type=float, default=0.5)
-
-    ## blender flags
-    parser.add_argument("--white_bkgd", action='store_true',
-                        help='set to render synthetic data on a white bkgd (always use for dvoxels)')
-
-    parser.add_argument('--N_voxel_init',
-                        type=int,
-                        default=100**3)
-    parser.add_argument('--N_voxel_final',
-                        type=int,
-                        default=300**3)
-    parser.add_argument("--upsamp_list", type=int, action="append")
-    parser.add_argument("--update_AlphaMask_list", type=int, action="append")
-
-    parser.add_argument('--idx_view',
-                        type=int,
-                        default=0)
-    # logging/saving options
-    parser.add_argument("--N_vis", type=int, default=5,
-                        help='N images to vis')
-    parser.add_argument("--vis_every", type=int, default=10000,
-                        help='frequency of visualize the image')
-    if cmd is not None:
-        return parser.parse_args(cmd)
-    else:
-        return parser.parse_args()
requirements.txt
CHANGED
@@ -1,11 +1,21 @@
+pytorch_lightning==1.9.0
+transformers==4.26.1
+pytorch_metric_learning==2.0.1
+pandas==1.5.3
+pymeshlab
+tabulate
+torchvision
+matplotlib
+wandb==0.13.10
 tqdm
 configargparse
 Pillow
 imageio
 kornia
 opencv-python
-torch
-torchvision
 scipy
 plyfile
 streamlit
+lpips
+imageio[ffmpeg]
+pdf2image
scripts/test.sh
DELETED
@@ -1,5 +0,0 @@
-CUDA_VISIBLE_DEVICES=$1 python train.py \
-    --config configs/llff.txt \
-    --ckpt log/trex/trex.th \
-    --render_only 1
-    --render_test 1
scripts/test_feature.sh
DELETED
@@ -1,10 +0,0 @@
-expname=trex
-CUDA_VISIBLE_DEVICES=$1 python train_feature.py \
-    --config configs/llff_feature.txt \
-    --datadir ./data/nerf_llff_data/trex \
-    --expname $expname \
-    --ckpt ./log_feature/$expname/$expname.th \
-    --render_only 1 \
-    --render_test 0 \
-    --render_path 1 \
-    --chunk_size 1024
scripts/test_style.sh
DELETED
@@ -1,13 +0,0 @@
-expname=trex
-python train_style.py \
-    --config configs/llff_style.txt \
-    --datadir ./data/nerf_llff_data/trex \
-    --expname $expname \
-    --ckpt log_style/$expname/$expname.th \
-    --style_img image_style/example.jpg \
-    --render_only 1 \
-    --render_train 0 \
-    --render_test 0 \
-    --render_path 1 \
-    --chunk_size 1024 \
-    --rm_weight_mask_thre 0.0001 \
scripts/train.sh
DELETED
@@ -1 +0,0 @@
-CUDA_VISIBLE_DEVICES=$1 python train.py --config=configs/llff.txt
scripts/train_feature.sh
DELETED
@@ -1 +0,0 @@
-CUDA_VISIBLE_DEVICES=$1 python train_feature.py --config=configs/llff_feature.txt
scripts/train_style.sh
DELETED
@@ -1 +0,0 @@
-CUDA_VISIBLE_DEVICES=$1 python train_style.py --config=configs/llff_style.txt
{models → src}/__init__.py
RENAMED
File without changes
src/__pycache__/__init__.cpython-38.pyc
ADDED
Binary file (154 Bytes)
src/callback/__init__.py
ADDED
@@ -0,0 +1,16 @@
+from pytorch_lightning.callbacks import (
+    ModelCheckpoint,
+    LearningRateMonitor,
+    EarlyStopping,
+)
+from src.utils.registry import Registry
+
+# from src.callback.visualizer_callbacks import VisualizerCallback
+
+CALLBACK_REGISTRY = Registry("CALLBACK")
+
+CALLBACK_REGISTRY.register(EarlyStopping)
+CALLBACK_REGISTRY.register(ModelCheckpoint)
+CALLBACK_REGISTRY.register(LearningRateMonitor)
+# TODO: add WandB visualizer callback
+# CALLBACK_REGISTRY.register(VisualizerCallback)
src/callback/__pycache__/__init__.cpython-38.pyc
ADDED
Binary file (420 Bytes)
src/dataset/__init__.py
ADDED
@@ -0,0 +1,10 @@
+from src.dataset.blender_dataset import BlenderDataset
+from src.dataset.llff_dataset import LLFFDataset
+from src.dataset.style_dataset import StyleDataset
+from src.utils.registry import Registry
+
+DATASET_REGISTRY = Registry("DATASET")
+
+DATASET_REGISTRY.register(BlenderDataset)
+DATASET_REGISTRY.register(LLFFDataset)
+DATASET_REGISTRY.register(StyleDataset)
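These `__init__.py` files set up the registry pattern used throughout the refactor: components register themselves once and are later looked up by the name stored in the YAML config (see `DATASET_REGISTRY.get(cfg["dataset"]["name"])` in `inference.py`). A small sketch of a manual lookup; the constructor keyword arguments are placeholders for whatever `cfg["dataset"]["train"]["params"]` actually carries:

    from src.dataset import DATASET_REGISTRY

    dataset_cls = DATASET_REGISTRY.get("LLFFDataset")                  # name normally comes from the config
    dataset = dataset_cls(**{"datadir": "data/nerf_llff_data/trex"})   # illustrative kwargs only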
src/dataset/__pycache__/__init__.cpython-38.pyc
ADDED
Binary file (484 Bytes)
dataLoader/__pycache__/blender.cpython-38.pyc → src/dataset/__pycache__/blender_dataset.cpython-38.pyc
RENAMED
Binary files a/dataLoader/__pycache__/blender.cpython-38.pyc and b/src/dataset/__pycache__/blender_dataset.cpython-38.pyc differ
dataLoader/__pycache__/llff.cpython-38.pyc → src/dataset/__pycache__/llff_dataset.cpython-38.pyc
RENAMED
Binary files a/dataLoader/__pycache__/llff.cpython-38.pyc and b/src/dataset/__pycache__/llff_dataset.cpython-38.pyc differ
{dataLoader → src/dataset}/__pycache__/ray_utils.cpython-38.pyc
RENAMED
Binary files a/dataLoader/__pycache__/ray_utils.cpython-38.pyc and b/src/dataset/__pycache__/ray_utils.cpython-38.pyc differ
src/dataset/__pycache__/style_dataset.cpython-38.pyc
ADDED
Binary file (954 Bytes)