Commit 2a03f49
Parent(s): ae54f3c
Update README.md
README.md CHANGED
@@ -1,22 +1,16 @@
 ---
-language:
-- en
 license: mit
 tags:
 - object-detection
 - object-tracking
 - video
 - video-object-segmentation
-datasets:
-- imagenet-1k
-metrics:
-- accuracy
 ---

-#
+# unicorn_track_large_mask

 ## Table of Contents
-- [
+- [unicorn_track_large_mask](#-model_id--defaultmymodelname-true)
 - [Table of Contents](#table-of-contents)
 - [Model Details](#model-details)
 - [How to Get Started with the Model](#how-to-get-started-with-the-model)
@@ -37,76 +31,37 @@ metrics:

 ## Model Details

-
+Unicorn accomplishes the great unification of the network architecture and the learning paradigm for four tracking tasks. Unicorn puts forwards new state-of-the-art performance on many challenging tracking benchmarks using the same model parameters. This model has an input size of 800x1280.

-EfficientFormer-L3, developed by [Snap Research](https://github.com/snap-research), is one of three EfficientFormer models. The EfficientFormer models were released as part of an effort to prove that properly designed transformers can reach extremely low latency on mobile devices while maintaining high performance.
-
-This checkpoint of EfficientFormer-L3 was trained for 300 epochs.
-
-- Developed by: Yanyu Li, Geng Yuan, Yang Wen, Eric Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren
-- Language(s): English
 - License: This model is licensed under the apache-2.0 license
 - Resources for more information:
-  - [Research Paper](https://arxiv.org/abs/
-  - [GitHub Repo](https://github.com/
+  - [Research Paper](https://arxiv.org/abs/2111.12085)
+  - [GitHub Repo](https://github.com/MasterBin-IIAU/Unicorn)

 </model_details>

-<how_to_start>
-
-## How to Get Started with the Model
-
-Use the code below to get started with the model.
-
-```python
-# A nice code snippet here that describes how to use the model...
-```
-</how_to_start>
-
 <uses>

 ## Uses

 #### Direct Use

-This model can be used for
-
-<Limitations_and_Biases>
-
-## Limitations and Biases
-
-Though most designs in EfficientFormer are general-purposed, e.g., dimension-consistent design and 4D block with CONV-BN fusion, the actual speed of EfficientFormer may vary on other platforms. For instance, if GeLU is not well supported while HardSwish is efficiently implemented on specific hardware and compiler, the operator may need to be modified accordingly. The proposed latency-driven slimming is simple and fast. However, better results may be achieved if search cost is not a concern and an enumeration-based brute search is performed.
+This model can be used for:

-
-
-
-
-<Training>
-
-## Training
-
-#### Training Data
-
-This model was trained on ImageNet-1K.
-
-See the [data card](https://huggingface.co/datasets/imagenet-1k) for additional information.
-
-#### Training Procedure
-
-* Parameters: 31.3 M
-* GMACs: 3.9
-* Train. Epochs: 300
-
-Trained on a cluster with NVIDIA A100 and V100 GPUs.
-
-</Training>
+* Single Object Tracking (SOT)
+* Multiple Object Tracking (MOT)
+* Video Object Segmentation (VOS)
+* Multi-Object Tracking and Segmentation (MOTS)

 <Eval_Results>

 ## Evaluation Results

-
-
+LaSOT AUC (%): 68.5
+BDD100K mMOTA (%): 41.2
+DAVIS17 J&F (%): 69.2
+BDD100K MOTS mMOTSA (%): 29.6
+

 </Eval_Results>

@@ -115,10 +70,10 @@ Latency: 3.0 ms
 ## Citation Information

 ```bibtex
-@
-title={
-author={
-
+@inproceedings{unicorn,
+title={Towards Grand Unification of Object Tracking},
+author={Yan, Bin and Jiang, Yi and Sun, Peize and Wang, Dong and Yuan, Zehuan and Luo, Ping and Lu, Huchuan},
+booktitle={ECCV},
 year={2022}
 }
 ```
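The commit removes the placeholder "How to Get Started with the Model" snippet from the old card without adding a replacement, even though the section is still linked from the table of contents. Below is a minimal, illustrative sketch of what such a snippet could look like, assuming a PyTorch workflow, the 800x1280 input size stated in the model details, and ImageNet normalization statistics. `UnicornTracker` and its methods are hypothetical placeholders, not an API provided by the MasterBin-IIAU/Unicorn repository.

```python
# Minimal sketch (assumptions): frames are resized to the 800x1280 input
# resolution stated in the card and normalized with common ImageNet statistics.
# `UnicornTracker` is a hypothetical wrapper, NOT an API from the Unicorn repo.
import torch
from PIL import Image
import torchvision.transforms as T

preprocess = T.Compose([
    T.Resize((800, 1280)),   # (height, width) per the card's stated input size
    T.ToTensor(),            # HWC uint8 -> CHW float in [0, 1]
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),  # assumed ImageNet statistics
])

frame = Image.open("frame_000001.jpg").convert("RGB")
batch = preprocess(frame).unsqueeze(0)       # shape: (1, 3, 800, 1280)

# tracker = UnicornTracker.from_pretrained("unicorn_track_large_mask")  # hypothetical
# boxes, masks = tracker.step(batch)  # per-frame boxes / segmentation masks
print(batch.shape)
```

Whatever loading and inference entry points the Unicorn repository actually provides should replace the hypothetical wrapper above.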