ADSKAILab/WaLa-SV-1B

Image-to-3D

English

wala

single-view-to-3d

Hooman committed on Oct 9, 2024

Commit b66e1c4 (verified) · 1 parent: 0bb6e93

Upload README.md with huggingface_hub

Files changed (1)

README.md +8 -8

@@ -5,10 +5,10 @@ license: other
 license_name: autodesk-non-commercial-3d-generative-v1.0
 tags:
 - wala
-- SV-to-3d
 ---
-# Model Card for WaLa-single-view-1B
 This model is part of the Wavelet Latent Diffusion (WaLa) paper, capable of generating high-quality 3D shapes from single-view images with detailed geometry and complex structures.
@@ -16,7 +16,7 @@ This model is part of the Wavelet Latent Diffusion (WaLa) paper, capable of gene
 ### Model Description
-WaLa-single-view-1B is a large-scale 3D generative model trained on a massive dataset of over 10 million publicly-available 3D shapes. It can efficiently generate a wide range of high-quality 3D shapes from single-view image inputs in just 2-4 seconds. The model uses a wavelet-based compact latent encoding and a billion-parameter architecture to achieve superior performance in terms of geometric detail and structural plausibility.
 - **Developed by:** Aditya Sanghi, Aliasghar Khani, Chinthala Pradyumna Reddy, Arianna Rampini, Derek Cheung, Kamal Rahimi Malekshan, Kanika Madan, Hooman Shayani
 - **Model type:** 3D Generative Model
@@ -26,15 +26,15 @@ For more information please look at the [Project](TBD) [Page](TBD) and [the pape
 ### Model Sources
-- **Repository:** [TBD]
-- **Paper:** [ArXiv:TBD]
-- **Demo:** [TBD]
 ## Uses
 ### Direct Use
-This model is released by Autodesk and intended for academic and research purposes only for the theoretical exploration and demonstration of the WaLa 3D generative framework.  Please see [here](TBD) for inferencing instructions.
 ### Out-of-Scope Use
@@ -119,7 +119,7 @@ On the MAS validation dataset:
 ### Model Architecture and Objective
-The model uses a U-ViT architecture with modifications. It employs a wavelet-based compact latent encoding to effectively capture both coarse and fine details of 3D shapes.
 ### Compute Infrastructure

 license_name: autodesk-non-commercial-3d-generative-v1.0
 tags:
 - wala
+- single-view-to-3d
 ---
+# Model Card for WaLa-SV-1B
 This model is part of the Wavelet Latent Diffusion (WaLa) paper, capable of generating high-quality 3D shapes from single-view images with detailed geometry and complex structures.
 ### Model Description
+WaLa-SV-1B is a large-scale 3D generative model trained on a massive dataset of over 10 million publicly-available 3D shapes. It can efficiently generate a wide range of high-quality 3D shapes from single-view image inputs in just 2.5 seconds. The model uses a wavelet-based compact latent encoding and a billion-parameter architecture to achieve superior performance in terms of geometric detail and structural plausibility.
 - **Developed by:** Aditya Sanghi, Aliasghar Khani, Chinthala Pradyumna Reddy, Arianna Rampini, Derek Cheung, Kamal Rahimi Malekshan, Kanika Madan, Hooman Shayani
 - **Model type:** 3D Generative Model
 ### Model Sources
+- **Repository:** [Github](https://github.com/AutodeskAILab/WaLa)
+- **Paper:** [ArXiv:TBD](TBD)
+- **Demo:** [TBD](TBD)
 ## Uses
 ### Direct Use
+This model is released by Autodesk and intended for academic and research purposes only for the theoretical exploration and demonstration of the WaLa 3D generative framework. Please see [here](TBD) for inferencing instructions.
 ### Out-of-Scope Use
 ### Model Architecture and Objective
+The model uses a U-ViT architecture with modifications. It employs a wavelet-based compact latent encoding to effectively capture both coarse and fine details of 3D shapes from single-view inputs. The input view is processed through the DINO v2 encoder to extract feature representations, which then serve as the condition latent vectors for the generative model.
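To make the "coarse and fine details" idea concrete, here is a minimal sketch of a one-level Haar wavelet transform on a 1D signal. This is an illustration only: the Haar filter, the 1D setting, and the function names `haar_forward` and `haar_inverse` are assumptions chosen for brevity, not the wavelet or the code used by WaLa, which operates on 3D shape data.

```python
import math


def haar_forward(signal):
    """Split an even-length sequence into coarse and detail coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Scaled pairwise sums keep the low-frequency (coarse) content.
    coarse = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Scaled pairwise differences keep the high-frequency (fine) content.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return coarse, detail


def haar_inverse(coarse, detail):
    """Rebuild the original sequence from coarse and detail coefficients."""
    s = math.sqrt(2.0)
    signal = []
    for c, d in zip(coarse, detail):
        signal.append((c + d) / s)
        signal.append((c - d) / s)
    return signal


# Hypothetical sample values, used only to exercise the functions.
values = [4.0, 4.0, 2.0, 6.0, 5.0, 5.0, 1.0, 3.0]
coarse, detail = haar_forward(values)
restored = haar_inverse(coarse, detail)
```

The coarse half is a half-resolution summary of the signal, while the detail half is zero wherever neighbouring samples agree, so smooth regions cost little to represent. The transform is invertible, which is why a wavelet representation can be compact without discarding fine structure.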
 ### Compute Infrastructure