halimb, nielsr (HF staff) committed
Commit 6077af5 (0 parents)

Duplicate from LiheYoung/depth-anything-large-hf

Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (5)
  1. .gitattributes +35 -0
  2. README.md +98 -0
  3. config.json +81 -0
  4. model.safetensors +3 -0
  5. preprocessor_config.json +26 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
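
The patterns above route matching files through Git LFS instead of storing them directly in the repository. As a rough illustration, the sketch below checks a few file names against a subset of these patterns. It uses Python's `fnmatch`, which only approximates gitattributes matching rules, and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch

# A subset of the patterns from the .gitattributes file above.
LFS_PATTERNS = [
    "*.bin",
    "*.safetensors",
    "*.onnx",
    "*.tar.*",
    "saved_model/**/*",
    "*tfevents*",
]


def is_lfs_tracked(path: str) -> bool:
    """Return True if `path` matches any listed pattern (approximate matching)."""
    # Assumption: fnmatch stands in for Git's own pattern matcher. Patterns are
    # tried against both the full path and the base name.
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(path, pat) or fnmatch(name, pat) for pat in LFS_PATTERNS)


if __name__ == "__main__":
    for candidate in ["model.safetensors", "config.json", "README.md"]:
        print(candidate, is_lfs_tracked(candidate))
```

Under these assumptions, only `model.safetensors` among the five files in this commit would be stored as an LFS pointer, which is consistent with its three-line diff further down.
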
README.md ADDED
@@ -0,0 +1,98 @@
+ ---
+ license: apache-2.0
+ tags:
+ - vision
+ pipeline_tag: depth-estimation
+ widget:
+ - inference: false
+ ---
+
+ # Depth Anything (large-sized model, Transformers version)
+
+ Depth Anything model. It was introduced in the paper [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891) by Lihe Yang et al. and first released in [this repository](https://github.com/LiheYoung/Depth-Anything).
+
+ [Online demo](https://huggingface.co/spaces/LiheYoung/Depth-Anything) is also provided.
+
+ Disclaimer: The team releasing Depth Anything did not write a model card for this model so this model card has been written by the Hugging Face team.
+
+ ## Model description
+
+ Depth Anything leverages the [DPT](https://huggingface.co/docs/transformers/model_doc/dpt) architecture with a [DINOv2](https://huggingface.co/docs/transformers/model_doc/dinov2) backbone.
+
+ The model is trained on ~62 million images, obtaining state-of-the-art results for both relative and absolute depth estimation.
+
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/depth_anything_overview.jpg"
+ alt="drawing" width="600"/>
+
+ <small> Depth Anything overview. Taken from the <a href="https://arxiv.org/abs/2401.10891">original paper</a>.</small>
+
+ ## Intended uses & limitations
+
+ You can use the raw model for tasks like zero-shot depth estimation. See the [model hub](https://huggingface.co/models?search=depth-anything) to look for
+ other versions on a task that interests you.
+
+ ### How to use
+
+ Here is how to use this model to perform zero-shot depth estimation:
+
+ ```python
+ from transformers import pipeline
+ from PIL import Image
+ import requests
+
+ # load pipe
+ pipe = pipeline(task="depth-estimation", model="LiheYoung/depth-anything-large-hf")
+
+ # load image
+ url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
+ image = Image.open(requests.get(url, stream=True).raw)
+
+ # inference
+ depth = pipe(image)["depth"]
+ ```
+
+ Alternatively, one can use the classes themselves:
+
+ ```python
+ from transformers import AutoImageProcessor, AutoModelForDepthEstimation
+ import torch
+ import numpy as np
+ from PIL import Image
+ import requests
+
+ url = "http://images.cocodataset.org/val2017/000000039769.jpg"
+ image = Image.open(requests.get(url, stream=True).raw)
+
+ image_processor = AutoImageProcessor.from_pretrained("LiheYoung/depth-anything-large-hf")
+ model = AutoModelForDepthEstimation.from_pretrained("LiheYoung/depth-anything-large-hf")
+
+ # prepare image for the model
+ inputs = image_processor(images=image, return_tensors="pt")
+
+ with torch.no_grad():
+     outputs = model(**inputs)
+     predicted_depth = outputs.predicted_depth
+
+ # interpolate to original size
+ prediction = torch.nn.functional.interpolate(
+     predicted_depth.unsqueeze(1),
+     size=image.size[::-1],
+     mode="bicubic",
+     align_corners=False,
+ )
+ ```
+ For more code examples, we refer to the [documentation](https://huggingface.co/transformers/main/model_doc/depth_anything.html#).
+
+
+ ### BibTeX entry and citation info
+
+ ```bibtex
+ @misc{yang2024depth,
+       title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
+       author={Lihe Yang and Bingyi Kang and Zilong Huang and Xiaogang Xu and Jiashi Feng and Hengshuang Zhao},
+       year={2024},
+       eprint={2401.10891},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV}
+ }
+ ```
config.json ADDED
@@ -0,0 +1,81 @@
+ {
+   "_commit_hash": null,
+   "architectures": [
+     "DepthAnythingForDepthEstimation"
+   ],
+   "backbone": null,
+   "backbone_config": {
+     "architectures": [
+       "Dinov2Model"
+     ],
+     "hidden_size": 1024,
+     "image_size": 518,
+     "model_type": "dinov2",
+     "num_attention_heads": 16,
+     "num_hidden_layers": 24,
+     "out_features": [
+       "stage21",
+       "stage22",
+       "stage23",
+       "stage24"
+     ],
+     "out_indices": [
+       21,
+       22,
+       23,
+       24
+     ],
+     "patch_size": 14,
+     "reshape_hidden_states": false,
+     "stage_names": [
+       "stem",
+       "stage1",
+       "stage2",
+       "stage3",
+       "stage4",
+       "stage5",
+       "stage6",
+       "stage7",
+       "stage8",
+       "stage9",
+       "stage10",
+       "stage11",
+       "stage12",
+       "stage13",
+       "stage14",
+       "stage15",
+       "stage16",
+       "stage17",
+       "stage18",
+       "stage19",
+       "stage20",
+       "stage21",
+       "stage22",
+       "stage23",
+       "stage24"
+     ],
+     "torch_dtype": "float32"
+   },
+   "fusion_hidden_size": 256,
+   "head_hidden_size": 32,
+   "head_in_index": -1,
+   "initializer_range": 0.02,
+   "model_type": "depth_anything",
+   "neck_hidden_sizes": [
+     256,
+     512,
+     1024,
+     1024
+   ],
+   "patch_size": 14,
+   "reassemble_factors": [
+     4,
+     2,
+     1,
+     0.5
+   ],
+   "reassemble_hidden_size": 1024,
+   "torch_dtype": "float32",
+   "transformers_version": null,
+   "use_pretrained_backbone": false
+ }
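
To make the backbone settings above more concrete, here is a minimal sketch that derives the patch grid implied by `image_size` and `patch_size`. The values are copied from the `backbone_config` above; the helper name `patch_grid` is hypothetical, and the sketch assumes a square input at the nominal size with non-overlapping patches.

```python
import json

# Only the fields needed for this illustration, copied from config.json above.
CONFIG = json.loads("""
{
  "backbone_config": {
    "hidden_size": 1024,
    "image_size": 518,
    "patch_size": 14,
    "out_indices": [21, 22, 23, 24]
  }
}
""")


def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    side = image_size // patch_size
    return side, side * side


backbone = CONFIG["backbone_config"]
side, total = patch_grid(backbone["image_size"], backbone["patch_size"])

if __name__ == "__main__":
    print(f"{side} x {side} patches = {total} patch tokens")
    print(f"each token has {backbone['hidden_size']} features")
    print(f"feature maps are taken from stages {backbone['out_indices']}")
```

With these numbers a 518-pixel side divides into 37 patches of 14 pixels, so the nominal input yields 1369 patch tokens, and the four listed stages are the last four of the 24 hidden layers.
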
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bc27360a3e6906e5ddd8f618e2dcde11362327361918b8f76793e42e25de31b3
+ size 1341322868
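
The three lines above are a Git LFS pointer, not the weights themselves. The sketch below parses such a pointer and makes a rough parameter-count estimate from the recorded size. The function name `parse_lfs_pointer` is hypothetical, and the estimate assumes 4 bytes per parameter (the `float32` dtype listed in config.json) while ignoring the safetensors header, so it is only approximate.

```python
# Pointer text copied from the model.safetensors diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:bc27360a3e6906e5ddd8f618e2dcde11362327361918b8f76793e42e25de31b3
size 1341322868
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its key/value fields."""
    fields = dict(
        line.split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER_TEXT)

# Assumption: float32 weights (4 bytes each); file header overhead is ignored.
approx_params = info["size_bytes"] // 4

if __name__ == "__main__":
    print(f"size: {info['size_bytes'] / 2**30:.2f} GiB")
    print(f"roughly {approx_params / 1e6:.0f} million float32 parameters")
```
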
preprocessor_config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "do_normalize": true,
+   "do_pad": false,
+   "do_rescale": true,
+   "do_resize": true,
+   "ensure_multiple_of": 14,
+   "image_mean": [
+     0.485,
+     0.456,
+     0.406
+   ],
+   "image_processor_type": "DPTImageProcessor",
+   "image_std": [
+     0.229,
+     0.224,
+     0.225
+   ],
+   "keep_aspect_ratio": true,
+   "resample": 3,
+   "rescale_factor": 0.00392156862745098,
+   "size": {
+     "height": 518,
+     "width": 518
+   },
+   "size_divisor": null
+ }
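
The preprocessing settings above can be read as three steps: resize toward 518x518 while keeping the aspect ratio and snapping each side to a multiple of 14, rescale pixel values by 1/255, then normalize with the listed mean and standard deviation. The sketch below is a simplified illustration of those steps and is not taken from the library's code. The function names are hypothetical, and the rule of picking the scale closest to 1 when preserving aspect ratio is an assumption.

```python
# Values copied from preprocessor_config.json above.
TARGET_HEIGHT, TARGET_WIDTH = 518, 518
MULTIPLE = 14
RESCALE_FACTOR = 0.00392156862745098  # equals 1 / 255
IMAGE_MEAN = (0.485, 0.456, 0.406)
IMAGE_STD = (0.229, 0.224, 0.225)


def constrain_to_multiple(value: float, multiple: int = MULTIPLE) -> int:
    """Round to the nearest multiple, never returning less than one multiple."""
    return max(multiple, round(value / multiple) * multiple)


def output_size(height: int, width: int) -> tuple[int, int]:
    """Compute a resized (height, width) that keeps the aspect ratio."""
    scale_h = TARGET_HEIGHT / height
    scale_w = TARGET_WIDTH / width
    # Assumption: use one scale for both sides, the one that changes the
    # image the least (closest to 1).
    scale = scale_w if abs(1 - scale_w) < abs(1 - scale_h) else scale_h
    return constrain_to_multiple(scale * height), constrain_to_multiple(scale * width)


def normalize_pixel(rgb: tuple[int, int, int]) -> tuple[float, ...]:
    """Rescale 8-bit channel values to [0, 1], then standardize per channel."""
    return tuple(
        (channel * RESCALE_FACTOR - mean) / std
        for channel, mean, std in zip(rgb, IMAGE_MEAN, IMAGE_STD)
    )


if __name__ == "__main__":
    print(output_size(480, 640))
    print(normalize_pixel((255, 255, 255)))
```

Snapping to multiples of 14 matters because the backbone's `patch_size` is 14, so each side of the resized image then divides into whole patches.
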