---
license: apache-2.0
pipeline_tag: image-to-3d
---
# [ECCV 2024] VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models
[Project page](https://junlinhan.github.io/projects/vfusion3d.html), [Paper link](https://arxiv.org/abs/2403.12034)
VFusion3D is a large, feed-forward 3D generative model trained with a small amount of 3D data and a large volume of synthetic multi-view data. It is the first work exploring scalable 3D generative/reconstruction models as a step towards a 3D foundation model.
[VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models](https://junlinhan.github.io/projects/vfusion3d.html)<br>
[Junlin Han](https://junlinhan.github.io/), [Filippos Kokkinos](https://www.fkokkinos.com/), [Philip Torr](https://www.robots.ox.ac.uk/~phst/)<br>
GenAI, Meta and TVG, University of Oxford<br>
European Conference on Computer Vision (ECCV), 2024
## News
- [25.07.2024] Released the weights and inference code for VFusion3D.
## Quick Start
Getting started with VFusion3D is super easy! 🤗 Here’s how you can use the model with Hugging Face:
### Load model directly
```python
from transformers import AutoModel
model = AutoModel.from_pretrained("jadechoghari/vfusion3d", trust_remote_code=True)
```
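The one-liner above can be wrapped in a small helper that also picks a device and switches the model to inference mode. This is a minimal sketch, not the authors' reference code: `pick_device` and `load_vfusion3d` are hypothetical helper names, and it assumes the class loaded through `trust_remote_code=True` is a standard PyTorch module that supports `.to()` and `.eval()`. The optional `revision` argument is the standard `from_pretrained` parameter for pinning a specific commit, which is worth considering because `trust_remote_code=True` executes code from the repository.

```python
from typing import Optional


def pick_device() -> str:
    """Return "cuda" when a GPU is usable, otherwise "cpu"."""
    try:
        import torch  # imported lazily so this helper works without torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_vfusion3d(
    repo_id: str = "jadechoghari/vfusion3d",
    device: Optional[str] = None,
    revision: Optional[str] = None,
):
    """Load the model from the Hub and prepare it for inference.

    Assumption: the remote model class behaves like a torch.nn.Module.
    """
    from transformers import AutoModel

    model = AutoModel.from_pretrained(
        repo_id,
        trust_remote_code=True,  # runs custom model code from the repo
        revision=revision,       # None means the default branch
    )
    model.to(device or pick_device())
    model.eval()  # disable dropout and other training-only behavior
    return model


if __name__ == "__main__":
    # Downloads the weights on first use.
    model = load_vfusion3d()
    print(type(model).__name__)
```

Keeping the `torch` and `transformers` imports inside the functions means the module can be imported on a machine that only needs the device check.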
Check out our [demo app](https://huggingface.co/spaces/jadechoghari/vfusion3d-app) to see VFusion3D in action! 🤗
## Results and Comparisons
### 3D Generation Results
<img src='assets/gif1.gif' width=950>
<img src='assets/gif2.gif' width=950>
### User Study Results
<img src='assets/user.png' width=950>
## Acknowledgement
- The inference code of VFusion3D borrows heavily from [OpenLRM](https://github.com/3DTopia/OpenLRM).
## Citation
If you find this work useful, please cite us:
```
@article{han2024vfusion3d,
title={VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models},
author={Junlin Han and Filippos Kokkinos and Philip Torr},
journal={European Conference on Computer Vision (ECCV)},
year={2024}
}
```
## License
- The majority of VFusion3D is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: OpenLRM as a whole is licensed under the Apache License, Version 2.0, while certain components are covered by NVIDIA's proprietary license.
- The model weights of VFusion3D are also licensed under CC-BY-NC.