# 🛹 RollingDepth: Video Depth without Video Models

[![Website](doc/badges/badge-website.svg)](https://rollingdepth.github.io)
[![Hugging Face Model](https://img.shields.io/badge/🤗%20Hugging%20Face-Model-green)](https://huggingface.co/prs-eth/rollingdepth-v1-0)
<!-- [![arXiv](https://img.shields.io/badge/arXiv-PDF-b31b1b)]() -->

This repository is the official implementation of the paper "Video Depth without Video Models".

[Bingxin Ke](http://www.kebingxin.com/)<sup>1</sup>,
[Dominik Narnhofer](https://scholar.google.com/citations?user=tFx8AhkAAAAJ&hl=en)<sup>1</sup>,
[Shengyu Huang](https://shengyuh.github.io/)<sup>1</sup>,
[Lei Ke](https://www.kelei.site/)<sup>2</sup>,
[Torben Peters](https://scholar.google.com/citations?user=F2C3I9EAAAAJ&hl=de)<sup>1</sup>,
[Katerina Fragkiadaki](https://www.cs.cmu.edu/~katef/)<sup>2</sup>,
[Anton Obukhov](https://www.obukhov.ai/)<sup>1</sup>,
[Konrad Schindler](https://scholar.google.com/citations?user=FZuNgqIAAAAJ&hl=en)<sup>1</sup>


<sup>1</sup>ETH Zurich, 
<sup>2</sup>Carnegie Mellon University



## 📢 News
2024-11-28: Inference code is released.<br>



## 🛠️ Setup
The inference code was tested on: Debian 12, Python 3.12.7 (venv), CUDA 12.4, GeForce RTX 3090

### 📦 Repository
```bash
git clone https://github.com/prs-eth/RollingDepth.git
cd RollingDepth
```

### 🐍 Python environment
Create python environment:
```bash
# with venv
python -m venv venv/rollingdepth
source venv/rollingdepth/bin/activate

# or with conda
conda create --name rollingdepth python=3.12
conda activate rollingdepth
```

### 💻 Dependencies
Install dependencies:
```bash
pip install -r requirements.txt

# Install modified diffusers with cross-frame self-attention
bash script/install_diffusers_dev.sh 
```
We use [pyav](https://github.com/PyAV-Org/PyAV) for video I/O, which relies on [ffmpeg](https://www.ffmpeg.org/).
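
Because video I/O depends on ffmpeg, a quick pre-flight check can catch a missing tool before inference starts. The following is a minimal sketch, not part of the repository: `require_tool` is a hypothetical helper name, and the check only looks for an `ffmpeg` executable on `PATH`, which may or may not be what your PyAV installation actually links against.

```bash
# Hypothetical helper (not part of the repository): report whether a
# command-line tool is available on PATH. Returns non-zero if it is missing.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Print a hint instead of failing if ffmpeg is not found.
require_tool ffmpeg || echo "Install ffmpeg with your system package manager."
```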


## 🏃 Test on your videos
All scripts are designed to run from the project root directory.

### 📷 Prepare input videos
1. Use sample videos:
    ```bash
    bash script/download_sample_data.sh
    ```

1. Or place your videos in a directory, for example, under `data/samples`.

### 🚀 Run with presets
```bash
python run_video.py \
    -i data/samples \
    -o output/samples_fast \
    -p fast \
    --save-npy true \
    --verbose
```
- `-p` or `--preset`: preset options
    - `fast` for **fast inference**, with dilations [1, 25] (flexible), fp16, without refinement, at max. resolution 768.
    - `fast1024` for **fast inference at resolution 1024**
    - `full` for **better details**, with dilations [1, 10, 25] (flexible), fp16, with 10 refinement steps, at max. resolution 1024.
    - `paper` for **reproducing paper numbers**, with (fixed) dilations [1, 10, 25], fp32, with 10 refinement steps, at max. resolution 768.
- `-i` or `--input-video`: path to input data, can be a single video file, a text file with video paths, or a directory of videos.
- `-o` or `--output-dir`: output directory.
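
Since `-i` also accepts a text file with video paths, such a list can be generated from a directory. This is a sketch under assumptions: `make_video_list` is a hypothetical helper, the one-path-per-line format is assumed rather than documented here, and the `.mp4` filter and file names are placeholders.

```bash
# Hypothetical helper (not part of the repository): write the paths of all
# .mp4 files under a directory to a list file, one path per line (assumed
# format), sorted so the order is stable between runs.
make_video_list() {
    src_dir="$1"
    list_file="$2"
    find "$src_dir" -type f -name '*.mp4' | sort > "$list_file"
}

# Example usage (paths are placeholders):
# make_video_list data/samples video_list.txt
# python run_video.py -i video_list.txt -o output/samples_fast -p fast
```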

Passing the other arguments below may override the preset settings:
- Coming soon
<!-- TODO: explain all arguments in detailed -->


## ⬇ Checkpoint cache
By default, the [checkpoint](https://huggingface.co/prs-eth/rollingdepth-v1-0) is stored in the Hugging Face cache. The HF_HOME environment variable defines its location and can be overridden, e.g.:

```bash
export HF_HOME=$(pwd)/cache
```

Alternatively, use the following script to download the checkpoint weights locally, then specify the checkpoint path with `-c checkpoint/rollingdepth-v1-0`:

```bash
bash script/download_weight.sh
```
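
To switch between the local weights and the default cache without editing the command, the `-c` option can be added only when the local directory exists. This is a minimal sketch: `checkpoint_args` is a hypothetical helper, and the default directory is simply the path mentioned above.

```bash
# Hypothetical helper (not part of the repository): print "-c <dir>" if a
# local checkpoint directory exists; otherwise print nothing, so the default
# checkpoint location (the Hugging Face cache) is used.
checkpoint_args() {
    ckpt_dir="${1:-checkpoint/rollingdepth-v1-0}"
    if [ -d "$ckpt_dir" ]; then
        printf '%s %s\n' "-c" "$ckpt_dir"
    fi
}

# Example usage (the unquoted expansion is intentional, so the output splits
# into two arguments; this breaks if the path contains spaces):
# python run_video.py -i data/samples -o output/samples_fast -p fast $(checkpoint_args)
```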


## 🦿 Evaluation on test datasets
Coming soon


<!-- ## 🎓 Citation
TODO -->


## 🙏 Acknowledgments
We thank Yue Pan, Shuchang Liu, Nando Metzger, and Nikolai Kalischek for fruitful discussions. 
 
We are grateful to [redmond.ai](https://redmond.ai/) (robin@redmond.ai) for providing GPU resources.

## 🎫 License

The code of this work is licensed under the Apache License, Version 2.0 (as defined in [LICENSE](LICENSE.txt)).

The model is licensed under the RAIL++-M License (as defined in [LICENSE-MODEL](LICENSE-MODEL.txt)).

By downloading and using the code and model, you agree to the terms in [LICENSE](LICENSE.txt) and [LICENSE-MODEL](LICENSE-MODEL.txt), respectively.