Commit b66e1c4, committed by Hooman
1 Parent(s): 0bb6e93

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -5,10 +5,10 @@ license: other
  license_name: autodesk-non-commercial-3d-generative-v1.0
  tags:
  - wala
- - SV-to-3d
+ - single-view-to-3d
  ---
 
- # Model Card for WaLa-single-view-1B
+ # Model Card for WaLa-SV-1B
 
  This model is part of the Wavelet Latent Diffusion (WaLa) paper, capable of generating high-quality 3D shapes from single-view images with detailed geometry and complex structures.
 
@@ -16,7 +16,7 @@ This model is part of the Wavelet Latent Diffusion (WaLa) paper, capable of gene
 
  ### Model Description
 
- WaLa-single-view-1B is a large-scale 3D generative model trained on a massive dataset of over 10 million publicly-available 3D shapes. It can efficiently generate a wide range of high-quality 3D shapes from single-view image inputs in just 2-4 seconds. The model uses a wavelet-based compact latent encoding and a billion-parameter architecture to achieve superior performance in terms of geometric detail and structural plausibility.
+ WaLa-SV-1B is a large-scale 3D generative model trained on a massive dataset of over 10 million publicly-available 3D shapes. It can efficiently generate a wide range of high-quality 3D shapes from single-view image inputs in just 2.5 seconds. The model uses a wavelet-based compact latent encoding and a billion-parameter architecture to achieve superior performance in terms of geometric detail and structural plausibility.
 
  - **Developed by:** Aditya Sanghi, Aliasghar Khani, Chinthala Pradyumna Reddy, Arianna Rampini, Derek Cheung, Kamal Rahimi Malekshan, Kanika Madan, Hooman Shayani
  - **Model type:** 3D Generative Model
@@ -26,15 +26,15 @@ For more information please look at the [Project](TBD) [Page](TBD) and [the pape
 
  ### Model Sources
 
- - **Repository:** [TBD]
- - **Paper:** [ArXiv:TBD]
- - **Demo:** [TBD]
+ - **Repository:** [Github](https://github.com/AutodeskAILab/WaLa)
+ - **Paper:** [ArXiv:TBD](TBD)
+ - **Demo:** [TBD](TBD)
 
  ## Uses
 
  ### Direct Use
 
- This model is released by Autodesk and intended for academic and research purposes only for the theoretical exploration and demonstration of the WaLa 3D generative framework. Please see [here](TBD) for inferencing instructions.
+ This model is released by Autodesk and intended for academic and research purposes only for the theoretical exploration and demonstration of the WaLa 3D generative framework. Please see [here](TBD) for inferencing instructions.
 
  ### Out-of-Scope Use
 
@@ -119,7 +119,7 @@ On the MAS validation dataset:
 
  ### Model Architecture and Objective
 
- The model uses a U-ViT architecture with modifications. It employs a wavelet-based compact latent encoding to effectively capture both coarse and fine details of 3D shapes.
+ The model uses a U-ViT architecture with modifications. It employs a wavelet-based compact latent encoding to effectively capture both coarse and fine details of 3D shapes from single-view inputs. The input view is processed through the DINO v2 encoder to extract feature representations, which then serve as the condition latent vectors for the generative model.
 
  ### Compute Infrastructure
 
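
The architecture paragraph in this commit says the model relies on a wavelet-based compact latent encoding to capture both coarse and fine detail. As a rough illustration of that idea only, the sketch below applies one level of a Haar transform to a short 1D signal in plain Python. The Haar wavelet, the 1D toy signal, and the function names `haar_forward` and `haar_inverse` are assumptions made for illustration; none of them are taken from the WaLa code or paper, which work on 3D shapes and may use a different wavelet.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform (illustrative helper).

    Splits an even-length list into a coarse half (scaled pairwise sums)
    and a detail half (scaled pairwise differences).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    coarse = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return coarse, detail


def haar_inverse(coarse, detail):
    """Invert haar_forward, rebuilding the original samples."""
    scale = math.sqrt(2.0)
    restored = []
    for c, d in zip(coarse, detail):
        restored.append((c + d) / scale)
        restored.append((c - d) / scale)
    return restored


# Toy 1D "shape profile"; a real model would work on 3D grids instead.
samples = [4.0, 4.0, 2.0, 0.0, 1.0, 3.0, 5.0, 5.0]
coarse, detail = haar_forward(samples)
restored = haar_inverse(coarse, detail)
```

Flat regions of the signal produce zero detail coefficients, so most of the information sits in the half-length coarse part. That property is what makes wavelet coefficients a plausible basis for a compact encoding, although how WaLa compresses and diffuses over them is not described in this diff.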