zhengrongzhang committed · Commit ff1446e
1 Parent(s): 1bddc94
init model
Browse files
- README.md +90 -0
- data/widerface/val/wider_val.txt +0 -0
- requirements.txt +14 -0
- utils.py +243 -0
- weights/RetinaFace_int.onnx +3 -0
- widerface_evaluate/README.md +27 -0
- widerface_evaluate/box_overlaps.pyx +55 -0
- widerface_evaluate/evaluation.py +311 -0
- widerface_evaluate/ground_truth/wider_easy_val.mat +0 -0
- widerface_evaluate/ground_truth/wider_face_val.mat +0 -0
- widerface_evaluate/ground_truth/wider_hard_val.mat +0 -0
- widerface_evaluate/ground_truth/wider_medium_val.mat +0 -0
- widerface_evaluate/setup.py +13 -0
- widerface_onnx_evalute.py +136 -0
- widerface_onnx_inference.py +151 -0
README.md
ADDED
@@ -0,0 +1,90 @@
---
license: apache-2.0
datasets:
- wider_face
tags:
- RyzenAI
- Object Detection
- Computer Vision
- Face
- WiderFace
- MobileNet
- RetinaNet
- ONNX
---

# RetinaFace model trained on WiderFace

RetinaFace trained on the WiderFace dataset at a resolution of 640x640, using MobileNet-0.25 as the backbone network.
It was introduced in the paper [RetinaFace: Single-stage Dense Face Localisation in the Wild](https://arxiv.org/abs/1905.00641) by Jiankang Deng et al.
The code we use is based on [this repository](https://github.com/biubug6/Pytorch_Retinaface.git).

We provide a modified version that is supported by [AMD Ryzen AI](https://ryzenai.docs.amd.com).


## Model description

RetinaFace is an advanced algorithm for face detection and facial keypoint localization. It is based on deep learning techniques and can accurately detect faces in images while providing precise positions of facial landmarks.


## Intended uses & limitations

You can use the raw model for face detection. See the [model hub](https://huggingface.co/models?sort=trending&search=amd%2FRetinaface) to look for all available RetinaFace models.


## How to use

### Installation

Follow [Ryzen AI Installation](https://ryzenai.docs.amd.com/en/latest/inst.html) to prepare the environment for Ryzen AI.
Run the following script to install pre-requisites for this model.
```bash
pip install -r requirements.txt
```


### Data Preparation (optional: for accuracy evaluation)
1. Download the [WIDERFACE](http://shuoyang1213.me/WIDERFACE/index.html) dataset.
2. Organize the dataset directory as follows:
   (Note: `train` and `test` are not necessary for accuracy evaluation on the validation set. `wider_val.txt` only includes the val file names, not label information.)
```Shell
./data/widerface/
    val/
        images/
        wider_val.txt
```

### Test & Evaluation
- Run inference for a single image (a Python sketch for calling the model programmatically appears at the end of this section)
```bash
python widerface_onnx_inference.py -m .\weights\RetinaFace_int.onnx --image_path \WIDERFACE_VAL_IMAGE_PATH --ipu --provider_config Path\To\vaip_config.json
# returns three lists: boxes, confs, landms
# to run on a different image, select another image from the WiderFace val dataset and set --image_path new_image_path
```
*Note: __vaip_config.json__ is located in the setup package of Ryzen AI (refer to [Installation](#installation))*

- Test accuracy of the quantized model
1. Generate txt files
```bash
python widerface_onnx_evalute.py --ipu --provider_config Path\To\vaip_config.json
```
2. Evaluate the txt results. The evaluation code comes from [here](https://github.com/wondervictor/WiderFace-Evaluation)
```Shell
cd ./widerface_evaluate
python setup.py build_ext --inplace
python evaluation.py  # please modify the evaluation path
```
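For programmatic use, the sketch below shows roughly how `widerface_onnx_inference.py` drives the model: it builds an ONNX Runtime session, preprocesses one image with the helpers in `utils.py`, and decodes the outputs with `postprocess`. It is a minimal sketch, assuming the repository root is the working directory and CPU execution is acceptable; the image path is a placeholder, and the VitisAI provider plus `vaip_config.json` can be swapped in as shown above to run on the IPU.
```python
import cv2
import torch
import onnxruntime as ort
from utils import preprocess, postprocess

# Same anchor/variance config as the inference and evaluation scripts.
CFG = {
    "name": "mobilenet0.25",
    "min_sizes": [[16, 32], [64, 128], [256, 512]],
    "steps": [8, 16, 32],
    "variance": [0.1, 0.2],
    "clip": False,
}

session = ort.InferenceSession("./weights/RetinaFace_int.onnx",
                               providers=["CPUExecutionProvider"])
img_raw = cv2.imread("face.jpg", cv2.IMREAD_COLOR)  # placeholder: any WiderFace val image
img, scale, resize = preprocess(img_raw, [608, 640], torch.device("cpu"))
outputs = session.run(None, {session.get_inputs()[0].name: img})
dets = postprocess(CFG, img, outputs, scale, resize,
                   confidence_threshold=0.4, nms_threshold=0.4,
                   device=torch.device("cpu"))
# Each row of dets is [x1, y1, x2, y2, score, 10 landmark coordinates].
print(dets[:, :4], dets[:, 4], dets[:, 5:])
```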

### Performance
| Model | easy | medium | hard |
|:-|:-:|:-:|:-:|
| RetinaFace_onnx_model (608x640) | 88.67% | 82.10% | 52.16% |


```bibtex
@inproceedings{deng2019retinaface,
  title={RetinaFace: Single-stage Dense Face Localisation in the Wild},
  author={Deng, Jiankang and Guo, Jia and Yuxiang, Zhou and Jinke Yu and Irene Kotsia and Zafeiriou, Stefanos},
  booktitle={arxiv},
  year={2019}
}
```
data/widerface/val/wider_val.txt
ADDED
The diff for this file is too large to render.
See raw diff
requirements.txt
ADDED
@@ -0,0 +1,14 @@
torch==1.9.1
torchvision==0.10.1
opencv-python
Cython
IPython
scipy
tqdm
argparse
#numpy
#onnxruntime
#random
#math
#itertools
#time
utils.py
ADDED
@@ -0,0 +1,243 @@
import numpy as np
import cv2
from itertools import product as product
from math import ceil

import torch
import torch.nn.functional as F


class PriorBox(object):
    def __init__(self, cfg, image_size=None, phase="train"):
        super(PriorBox, self).__init__()
        self.min_sizes = cfg["min_sizes"]
        self.steps = cfg["steps"]
        self.clip = cfg["clip"]
        self.image_size = image_size
        self.feature_maps = [
            [ceil(self.image_size[0] / step), ceil(self.image_size[1] / step)]
            for step in self.steps
        ]

    def forward(self):
        anchors = []
        for k, f in enumerate(self.feature_maps):
            min_sizes = self.min_sizes[k]
            for i, j in product(range(f[0]), range(f[1])):
                for min_size in min_sizes:
                    s_kx = min_size / self.image_size[1]
                    s_ky = min_size / self.image_size[0]
                    dense_cx = [
                        x * self.steps[k] / self.image_size[1] for x in [j + 0.5]
                    ]
                    dense_cy = [
                        y * self.steps[k] / self.image_size[0] for y in [i + 0.5]
                    ]
                    for cy, cx in product(dense_cy, dense_cx):
                        anchors += [cx, cy, s_kx, s_ky]
        # back to torch land
        output = torch.Tensor(anchors).view(-1, 4)
        if self.clip:
            output.clamp_(max=1, min=0)
        return output


def py_cpu_nms(dets, thresh):
    """Pure Python NMS baseline.
    Args:
        dets: detections before nms
        thresh: nms threshold
    Return:
        keep: index after nms
    """
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    scores = dets[:, 4]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= thresh)[0]
        order = order[inds + 1]
    return keep


def decode(loc, priors, variances):
    """Decode locations from predictions using priors to undo
    the encoding we did for offset regression at train time.
    Args:
        loc (tensor): location predictions for loc layers,
            Shape: [num_priors,4]
        priors (tensor): Prior boxes in center-offset form.
            Shape: [num_priors,4].
        variances: (list[float]) Variances of priorboxes
    Return:
        decoded bounding box predictions
    """

    boxes = torch.cat(
        (
            priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
            priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1]),
        ),
        1,
    )
    boxes[:, :2] -= boxes[:, 2:] / 2
    boxes[:, 2:] += boxes[:, :2]
    return boxes


def decode_landm(pre, priors, variances):
    """Decode landm from predictions using priors to undo
    the encoding we did for offset regression at train time.
    Args:
        pre (tensor): landm predictions for loc layers,
            Shape: [num_priors,10]
        priors (tensor): Prior boxes in center-offset form.
            Shape: [num_priors,4].
        variances: (list[float]) Variances of priorboxes
    Return:
        decoded landm predictions
    """
    landms = torch.cat(
        (
            priors[:, :2] + pre[:, :2] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 2:4] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 4:6] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 6:8] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 8:10] * variances[0] * priors[:, 2:],
        ),
        dim=1,
    )
    return landms


def pad_image(image, h, w, size, padvalue):
    pad_image = image.copy()
    pad_h = max(size[0] - h, 0)
    pad_w = max(size[1] - w, 0)
    if pad_h > 0 or pad_w > 0:
        pad_image = cv2.copyMakeBorder(image, 0, pad_h, 0,
                                       pad_w, cv2.BORDER_CONSTANT,
                                       value=padvalue)
    return pad_image


def resize_image(image, re_size, keep_ratio=True):
    """Resize image
    Args:
        image: origin image
        re_size: resize scale
        keep_ratio: keep aspect ratio. Default is set to true.
    Returns:
        re_image: resized image
        resize_ratio: resize ratio
    """
    if not keep_ratio:
        re_image = cv2.resize(image, (re_size[0], re_size[1])).astype('float32')
        # return two values so the caller's unpacking stays consistent;
        # the ratio is not meaningful when the aspect ratio is not kept
        return re_image, 1.0
    ratio = re_size[0] * 1.0 / re_size[1]
    h, w = image.shape[0:2]
    if h * 1.0 / w <= ratio:
        resize_ratio = re_size[1] * 1.0 / w
        re_h, re_w = int(h * resize_ratio), re_size[1]
    else:
        resize_ratio = re_size[0] * 1.0 / h
        re_h, re_w = re_size[0], int(w * resize_ratio)

    re_image = cv2.resize(image, (re_w, re_h)).astype('float32')
    re_image = pad_image(re_image, re_h, re_w, re_size, (0.0, 0.0, 0.0))
    return re_image, resize_ratio


def preprocess(img_raw, input_size, device):
    """preprocess
    Args:
        img_raw: origin image
    Returns:
        img: resized image
        scale: resized image scale
        resize: resize ratio
    """
    img = np.float32(img_raw)
    # resize image
    img, resize = resize_image(img, input_size)
    scale = torch.Tensor([img.shape[1], img.shape[0], img.shape[1], img.shape[0]])
    img -= (104, 117, 123)
    img = img.transpose(2, 0, 1)
    img = torch.from_numpy(img).unsqueeze(0)
    img = img.numpy()
    scale = scale.to(device)
    return img, scale, resize


def postprocess(cfg, img, outputs, scale, resize, confidence_threshold, nms_threshold, device):
    """post_process
    Args:
        img: resized image
        outputs: forward outputs
        scale: resized image scale
        resize: resize ratio
        confidence_threshold: confidence threshold
        nms_threshold: non-maximum suppression threshold
    Returns:
        detection results
    """
    _, _, im_height, im_width = img.shape
    loc = torch.from_numpy(outputs[0])
    conf = torch.from_numpy(outputs[1])
    landms = torch.from_numpy(outputs[2])
    # softmax
    conf = F.softmax(conf, dim=-1)

    priorbox = PriorBox(cfg, image_size=(im_height, im_width))
    priors = priorbox.forward()
    priors = priors.to(device)
    prior_data = priors.data
    boxes = decode(loc.squeeze(0), prior_data, cfg["variance"])
    boxes = boxes * scale / resize
    boxes = boxes.cpu().numpy()
    scores = conf.squeeze(0).data.cpu().numpy()[:, 1]
    landms = decode_landm(landms.squeeze(0), prior_data, cfg["variance"])
    scale1 = torch.Tensor(
        [img.shape[3], img.shape[2], img.shape[3], img.shape[2], img.shape[3],
         img.shape[2], img.shape[3], img.shape[2], img.shape[3], img.shape[2],]
    )
    scale1 = scale1.to(device)
    landms = landms * scale1 / resize
    landms = landms.cpu().numpy()

    # ignore low scores
    inds = np.where(scores > confidence_threshold)[0]
    boxes = boxes[inds]
    landms = landms[inds]
    scores = scores[inds]

    # keep top-K before NMS
    order = scores.argsort()[::-1]
    boxes = boxes[order]
    landms = landms[order]
    scores = scores[order]

    # do NMS
    dets = np.hstack((boxes, scores[:, np.newaxis])).astype(np.float32, copy=False)
    keep = py_cpu_nms(dets, nms_threshold)
    dets = dets[keep, :]
    landms = landms[keep]
    dets = np.concatenate((dets, landms), axis=1)
    return dets
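As a quick sanity check of the anchor layout produced by `PriorBox` above, the hedged sketch below instantiates it with the same config the inference/evaluation scripts use and counts the priors generated for the 608x640 input; the counts follow from the `steps` and `min_sizes` values and are shown here only for illustration.
```python
# Illustrative sketch: count the priors for the 608x640 input (assumes utils.py is importable).
from utils import PriorBox

cfg = {
    "min_sizes": [[16, 32], [64, 128], [256, 512]],
    "steps": [8, 16, 32],
    "variance": [0.1, 0.2],
    "clip": False,
}
priors = PriorBox(cfg, image_size=(608, 640)).forward()
# (76*80 + 38*40 + 19*20) grid cells x 2 min_sizes per level = 15960 priors,
# each stored as a normalized (cx, cy, w, h) box.
print(priors.shape)  # torch.Size([15960, 4])
```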
weights/RetinaFace_int.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7377847aafa8aece44f06a6acb6e48e7810e9b2057a599ef0a896c1b726f35b9
size 1767381
widerface_evaluate/README.md
ADDED
@@ -0,0 +1,27 @@
# WiderFace-Evaluation
Python evaluation code for the [Wider Face Dataset](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/)


## Usage


##### Before evaluating

````
python3 setup.py build_ext --inplace
````

##### Evaluating

**GroundTruth:** `wider_face_val.mat`, `wider_easy_val.mat`, `wider_medium_val.mat`, `wider_hard_val.mat`

````
python3 evaluation.py -p <your prediction dir> -g <ground truth dir>
````

## Bugs & Problems
Please open an issue.

## Acknowledgements

Some code is borrowed from Sergey Karayev.
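The same evaluation can be driven from Python once the extension is built; a minimal sketch, assuming the working directory is `widerface_evaluate/` and the prediction txt files sit in the script's default locations:
```python
# Sketch: run the WiderFace evaluation from Python (build the bbox extension first
# with `python setup.py build_ext --inplace`).
from evaluation import evaluation

# Arguments mirror the -p/-g defaults of evaluation.py.
evaluation("./widerface_txt/", "./ground_truth/")  # prints Easy/Medium/Hard val AP
```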
widerface_evaluate/box_overlaps.pyx
ADDED
@@ -0,0 +1,55 @@
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Sergey Karayev
# --------------------------------------------------------

cimport cython
import numpy as np
cimport numpy as np

DTYPE = np.float64
ctypedef np.float_t DTYPE_t

def bbox_overlaps(
        np.ndarray[DTYPE_t, ndim=2] boxes,
        np.ndarray[DTYPE_t, ndim=2] query_boxes):
    """
    Parameters
    ----------
    boxes: (N, 4) ndarray of float
    query_boxes: (K, 4) ndarray of float
    Returns
    -------
    overlaps: (N, K) ndarray of overlap between boxes and query_boxes
    """
    cdef unsigned int N = boxes.shape[0]
    cdef unsigned int K = query_boxes.shape[0]
    cdef np.ndarray[DTYPE_t, ndim=2] overlaps = np.zeros((N, K), dtype=DTYPE)
    cdef DTYPE_t iw, ih, box_area
    cdef DTYPE_t ua
    cdef unsigned int k, n
    for k in range(K):
        box_area = (
            (query_boxes[k, 2] - query_boxes[k, 0] + 1) *
            (query_boxes[k, 3] - query_boxes[k, 1] + 1)
        )
        for n in range(N):
            iw = (
                min(boxes[n, 2], query_boxes[k, 2]) -
                max(boxes[n, 0], query_boxes[k, 0]) + 1
            )
            if iw > 0:
                ih = (
                    min(boxes[n, 3], query_boxes[k, 3]) -
                    max(boxes[n, 1], query_boxes[k, 1]) + 1
                )
                if ih > 0:
                    ua = float(
                        (boxes[n, 2] - boxes[n, 0] + 1) *
                        (boxes[n, 3] - boxes[n, 1] + 1) +
                        box_area - iw * ih
                    )
                    overlaps[n, k] = iw * ih / ua
    return overlaps
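For reference, a hedged sketch of how this extension is built and called; the build command mirrors `widerface_evaluate/setup.py`, and the box coordinates are made-up example values.
```python
# Build first (from widerface_evaluate/):  python setup.py build_ext --inplace
import numpy as np
from bbox import bbox_overlaps  # compiled from box_overlaps.pyx

boxes = np.array([[0.0, 0.0, 10.0, 10.0]], dtype=np.float64)  # one predicted box (x1, y1, x2, y2)
query = np.array([[5.0, 5.0, 15.0, 15.0]], dtype=np.float64)  # one ground-truth box
# Returns a (1, 1) IoU matrix; with the +1 pixel convention this is 36/206, roughly 0.175.
print(bbox_overlaps(boxes, query))
```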
widerface_evaluate/evaluation.py
ADDED
@@ -0,0 +1,311 @@
"""
WiderFace evaluation code
author: wondervictor
mail: tianhengcheng@gmail.com
copyright@wondervictor
"""

import os
import tqdm
import pickle
import argparse
import numpy as np
from scipy.io import loadmat
from bbox import bbox_overlaps
from IPython import embed


def get_gt_boxes(gt_dir):
    """gt dir: (wider_face_val.mat, wider_easy_val.mat, wider_medium_val.mat, wider_hard_val.mat)"""

    gt_mat = loadmat(os.path.join(gt_dir, "wider_face_val.mat"))
    hard_mat = loadmat(os.path.join(gt_dir, "wider_hard_val.mat"))
    medium_mat = loadmat(os.path.join(gt_dir, "wider_medium_val.mat"))
    easy_mat = loadmat(os.path.join(gt_dir, "wider_easy_val.mat"))

    facebox_list = gt_mat["face_bbx_list"]
    event_list = gt_mat["event_list"]
    file_list = gt_mat["file_list"]

    hard_gt_list = hard_mat["gt_list"]
    medium_gt_list = medium_mat["gt_list"]
    easy_gt_list = easy_mat["gt_list"]

    return (
        facebox_list,
        event_list,
        file_list,
        hard_gt_list,
        medium_gt_list,
        easy_gt_list,
    )


def get_gt_boxes_from_txt(gt_path, cache_dir):
    cache_file = os.path.join(cache_dir, "gt_cache.pkl")
    if os.path.exists(cache_file):
        f = open(cache_file, "rb")
        boxes = pickle.load(f)
        f.close()
        return boxes

    f = open(gt_path, "r")
    state = 0
    lines = f.readlines()
    lines = list(map(lambda x: x.rstrip("\r\n"), lines))
    boxes = {}
    print(len(lines))
    f.close()
    current_boxes = []
    current_name = None
    for line in lines:
        if state == 0 and "--" in line:
            state = 1
            current_name = line
            continue
        if state == 1:
            state = 2
            continue

        if state == 2 and "--" in line:
            state = 1
            boxes[current_name] = np.array(current_boxes).astype("float32")
            current_name = line
            current_boxes = []
            continue

        if state == 2:
            box = [float(x) for x in line.split(" ")[:4]]
            current_boxes.append(box)
            continue

    f = open(cache_file, "wb")
    pickle.dump(boxes, f)
    f.close()
    return boxes


def read_pred_file(filepath):
    with open(filepath, "r") as f:
        lines = f.readlines()
        img_file = lines[0].rstrip("\n\r")
        lines = lines[2:]

    # b = lines[0].rstrip('\r\n').split(' ')[:-1]
    # c = float(b)
    # a = map(lambda x: [[float(a[0]), float(a[1]), float(a[2]), float(a[3]), float(a[4])] for a in x.rstrip('\r\n').split(' ')], lines)
    boxes = []
    for line in lines:
        line = line.rstrip("\r\n").split(" ")
        if line[0] == "":
            continue
        # a = float(line[4])
        boxes.append(
            [
                float(line[0]),
                float(line[1]),
                float(line[2]),
                float(line[3]),
                float(line[4]),
            ]
        )
    boxes = np.array(boxes)
    # boxes = np.array(list(map(lambda x: [float(a) for a in x.rstrip('\r\n').split(' ')], lines))).astype('float')
    return img_file.split("/")[-1], boxes


def get_preds(pred_dir):
    events = os.listdir(pred_dir)
    boxes = dict()
    pbar = tqdm.tqdm(events)

    for event in pbar:
        pbar.set_description("Reading Predictions ")
        event_dir = os.path.join(pred_dir, event)
        event_images = os.listdir(event_dir)
        current_event = dict()
        for imgtxt in event_images:
            imgname, _boxes = read_pred_file(os.path.join(event_dir, imgtxt))
            current_event[imgname.rstrip(".jpg")] = _boxes
        boxes[event] = current_event
    return boxes


def norm_score(pred):
    """norm score
    pred {key: [[x1,y1,x2,y2,s]]}
    """

    max_score = 0
    min_score = 1

    for _, k in pred.items():
        for _, v in k.items():
            if len(v) == 0:
                continue
            _min = np.min(v[:, -1])
            _max = np.max(v[:, -1])
            max_score = max(_max, max_score)
            min_score = min(_min, min_score)

    diff = max_score - min_score
    for _, k in pred.items():
        for _, v in k.items():
            if len(v) == 0:
                continue
            v[:, -1] = (v[:, -1] - min_score) / diff


def image_eval(pred, gt, ignore, iou_thresh):
    """single image evaluation
    pred: Nx5
    gt: Nx4
    ignore:
    """

    _pred = pred.copy()
    _gt = gt.copy()
    pred_recall = np.zeros(_pred.shape[0])
    recall_list = np.zeros(_gt.shape[0])
    proposal_list = np.ones(_pred.shape[0])

    _pred[:, 2] = _pred[:, 2] + _pred[:, 0]
    _pred[:, 3] = _pred[:, 3] + _pred[:, 1]
    _gt[:, 2] = _gt[:, 2] + _gt[:, 0]
    _gt[:, 3] = _gt[:, 3] + _gt[:, 1]

    overlaps = bbox_overlaps(_pred[:, :4], _gt)

    for h in range(_pred.shape[0]):
        gt_overlap = overlaps[h]
        max_overlap, max_idx = gt_overlap.max(), gt_overlap.argmax()
        if max_overlap >= iou_thresh:
            if ignore[max_idx] == 0:
                recall_list[max_idx] = -1
                proposal_list[h] = -1
            elif recall_list[max_idx] == 0:
                recall_list[max_idx] = 1

        r_keep_index = np.where(recall_list == 1)[0]
        pred_recall[h] = len(r_keep_index)
    return pred_recall, proposal_list


def img_pr_info(thresh_num, pred_info, proposal_list, pred_recall):
    pr_info = np.zeros((thresh_num, 2)).astype("float")
    for t in range(thresh_num):
        thresh = 1 - (t + 1) / thresh_num
        r_index = np.where(pred_info[:, 4] >= thresh)[0]
        if len(r_index) == 0:
            pr_info[t, 0] = 0
            pr_info[t, 1] = 0
        else:
            r_index = r_index[-1]
            p_index = np.where(proposal_list[: r_index + 1] == 1)[0]
            pr_info[t, 0] = len(p_index)
            pr_info[t, 1] = pred_recall[r_index]
    return pr_info


def dataset_pr_info(thresh_num, pr_curve, count_face):
    _pr_curve = np.zeros((thresh_num, 2))
    for i in range(thresh_num):
        _pr_curve[i, 0] = pr_curve[i, 1] / pr_curve[i, 0]
        _pr_curve[i, 1] = pr_curve[i, 1] / count_face
    return _pr_curve


def voc_ap(rec, prec):
    # correct AP calculation
    # first append sentinel values at the end
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))

    # compute the precision envelope
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])

    # to calculate area under PR curve, look for points
    # where X axis (recall) changes value
    i = np.where(mrec[1:] != mrec[:-1])[0]

    # and sum (\Delta recall) * prec
    ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap


def evaluation(pred, gt_path, iou_thresh=0.5):
    pred = get_preds(pred)
    norm_score(pred)
    (
        facebox_list,
        event_list,
        file_list,
        hard_gt_list,
        medium_gt_list,
        easy_gt_list,
    ) = get_gt_boxes(gt_path)
    event_num = len(event_list)
    thresh_num = 1000
    settings = ["easy", "medium", "hard"]
    setting_gts = [easy_gt_list, medium_gt_list, hard_gt_list]
    aps = []
    for setting_id in range(3):
        # different setting
        gt_list = setting_gts[setting_id]
        count_face = 0
        pr_curve = np.zeros((thresh_num, 2)).astype("float")
        # [hard, medium, easy]
        pbar = tqdm.tqdm(range(event_num))
        for i in pbar:
            pbar.set_description("Processing {}".format(settings[setting_id]))
            event_name = str(event_list[i][0][0])
            img_list = file_list[i][0]
            pred_list = pred[event_name]
            sub_gt_list = gt_list[i][0]
            # img_pr_info_list = np.zeros((len(img_list), thresh_num, 2))
            gt_bbx_list = facebox_list[i][0]

            for j in range(len(img_list)):
                pred_info = pred_list[str(img_list[j][0][0])]

                gt_boxes = gt_bbx_list[j][0].astype("float")
                keep_index = sub_gt_list[j][0]
                count_face += len(keep_index)

                if len(gt_boxes) == 0 or len(pred_info) == 0:
                    continue
                ignore = np.zeros(gt_boxes.shape[0])
                if len(keep_index) != 0:
                    ignore[keep_index - 1] = 1
                pred_recall, proposal_list = image_eval(
                    pred_info, gt_boxes, ignore, iou_thresh
                )

                _img_pr_info = img_pr_info(
                    thresh_num, pred_info, proposal_list, pred_recall
                )

                pr_curve += _img_pr_info
        pr_curve = dataset_pr_info(thresh_num, pr_curve, count_face)

        propose = pr_curve[:, 0]
        recall = pr_curve[:, 1]

        ap = voc_ap(recall, propose)
        aps.append(ap)

    print("==================== Results ====================")
    print("Easy Val AP: {}".format(aps[0]))
    print("Medium Val AP: {}".format(aps[1]))
    print("Hard Val AP: {}".format(aps[2]))
    print("=================================================")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--pred", default="./widerface_txt/")
    parser.add_argument("-g", "--gt", default="./ground_truth/")

    args = parser.parse_args()
    evaluation(args.pred, args.gt)
widerface_evaluate/ground_truth/wider_easy_val.mat
ADDED
Binary file (409 kB). View file
widerface_evaluate/ground_truth/wider_face_val.mat
ADDED
Binary file (398 kB). View file
widerface_evaluate/ground_truth/wider_hard_val.mat
ADDED
Binary file (424 kB). View file
widerface_evaluate/ground_truth/wider_medium_val.mat
ADDED
Binary file (413 kB). View file
widerface_evaluate/setup.py
ADDED
@@ -0,0 +1,13 @@
"""
WiderFace evaluation code
author: wondervictor
mail: tianhengcheng@gmail.com
copyright@wondervictor
"""

from distutils.core import setup, Extension
from Cython.Build import cythonize
import numpy

package = Extension("bbox", ["box_overlaps.pyx"], include_dirs=[numpy.get_include()])
setup(ext_modules=cythonize([package]))
widerface_onnx_evalute.py
ADDED
@@ -0,0 +1,136 @@
import os
import argparse

import onnxruntime as ort
from utils import *


CFG = {
    "name": "mobilenet0.25",
    "min_sizes": [[16, 32], [64, 128], [256, 512]],
    "steps": [8, 16, 32],
    "variance": [0.1, 0.2],
    "clip": False,
}
INPUT_SIZE = [608, 640]  # resize scale
DEVICE = torch.device("cpu")


def save_result(img_name, dets, save_folder):
    """Save detection results
    Args:
        img_name: origin image name
        dets: detection results
        save_folder: results path
    """
    if not os.path.exists(save_folder):
        os.makedirs(save_folder)
    save_name = save_folder + img_name[:-4] + ".txt"
    dirname = os.path.dirname(save_name)
    if not os.path.isdir(dirname):
        os.makedirs(dirname)
    with open(save_name, "w") as fw:
        bboxs = dets
        file_name = os.path.basename(save_name)[:-4] + "\n"
        bboxs_num = str(len(bboxs)) + "\n"
        fw.write(file_name)
        fw.write(bboxs_num)
        for box in bboxs:
            x = int(box[0])
            y = int(box[1])
            w = int(box[2]) - int(box[0])
            h = int(box[3]) - int(box[1])
            confidence = str(box[4])
            line = (str(x) + " " + str(y) + " " + str(w) + " " + str(h) + " " + confidence + " \n")
            fw.write(line)


def Retinaface_evalute(run_ort, args):
    """RetinaFace evaluation function
    Args:
        run_ort : ONNX Runtime session to evaluate.
        args : parsed command-line parameters.
    Returns:
        prediction results : written under the "--save_folder" path.
    """
    # testing dataset
    testset_folder = args.dataset_folder
    testset_list = args.dataset_folder[:-7] + "wider_val.txt"

    with open(testset_list, "r") as fr:
        test_dataset = fr.read().split()
    num_images = len(test_dataset)

    # testing begin
    for i, img_name in enumerate(test_dataset):
        image_path = testset_folder + img_name
        img_raw = cv2.imread(image_path, cv2.IMREAD_COLOR)
        # preprocess
        img, scale, resize = preprocess(img_raw, INPUT_SIZE, DEVICE)
        # forward
        outputs = run_ort.run(None, {run_ort.get_inputs()[0].name: img})
        # postprocess
        dets = postprocess(CFG, img, outputs, scale, resize, args.confidence_threshold, args.nms_threshold, DEVICE)

        # save predicted result
        save_result(img_name, dets, args.save_folder)
        print("im_detect: {:d}/{:d}".format(i + 1, num_images))


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Retinaface")
    parser.add_argument(
        "-m",
        "--trained_model",
        default="./weights/RetinaFace_int.onnx",
        type=str,
        help="Trained state_dict file path to open",
    )
    parser.add_argument(
        "--save_folder",
        default="./widerface_evaluate/widerface_txt/",
        type=str,
        help="Dir to save txt results",
    )
    parser.add_argument(
        "--dataset_folder",
        default="./data/widerface/val/images/",
        type=str,
        help="dataset path",
    )
    parser.add_argument(
        "--confidence_threshold",
        default=0.02,
        type=float,
        help="confidence_threshold",
    )
    parser.add_argument(
        "--nms_threshold",
        default=0.4,
        type=float,
        help="nms_threshold",
    )
    parser.add_argument(
        "--ipu",
        action="store_true",
        help="Use IPU for inference.",
    )
    parser.add_argument(
        "--provider_config",
        type=str,
        default="vaip_config.json",
        help="Path of the config file for setting provider_options.",
    )

    args = parser.parse_args()
    if args.ipu:
        providers = ["VitisAIExecutionProvider"]
        provider_options = [{"config_file": args.provider_config}]
    else:
        providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
        provider_options = None

    print("Loading pretrained model from {}".format(args.trained_model))
    run_ort = ort.InferenceSession(args.trained_model, providers=providers, provider_options=provider_options)

    Retinaface_evalute(run_ort, args)
widerface_onnx_inference.py
ADDED
@@ -0,0 +1,151 @@
import os
import argparse

import onnxruntime as ort
from utils import *


CFG = {
    "name": "mobilenet0.25",
    "min_sizes": [[16, 32], [64, 128], [256, 512]],
    "steps": [8, 16, 32],
    "variance": [0.1, 0.2],
    "clip": False,
}
INPUT_SIZE = [608, 640]  # resize scale
DEVICE = torch.device("cpu")


def vis(img_raw, dets, vis_thres):
    """Visualize detections on the original image
    Args:
        img_raw: origin image
        dets: detections
        vis_thres: visualization threshold
    Returns:
        visualization results saved under ./results/
    """
    for b in dets:
        if b[4] < vis_thres:
            continue
        text = "{:.4f}".format(b[4])
        b = list(map(int, b))
        cv2.rectangle(img_raw, (b[0], b[1]), (b[2], b[3]), (0, 0, 255), 2)
        cx = b[0]
        cy = b[1] + 12
        cv2.putText(img_raw, text, (cx, cy), cv2.FONT_HERSHEY_DUPLEX, 0.5, (255, 255, 255),)

        # landms
        cv2.circle(img_raw, (b[5], b[6]), 1, (0, 0, 255), 4)
        cv2.circle(img_raw, (b[7], b[8]), 1, (0, 255, 255), 4)
        cv2.circle(img_raw, (b[9], b[10]), 1, (255, 0, 255), 4)
        cv2.circle(img_raw, (b[11], b[12]), 1, (0, 255, 0), 4)
        cv2.circle(img_raw, (b[13], b[14]), 1, (255, 0, 0), 4)
    # save image
    if not os.path.exists("./results/"):
        os.makedirs("./results/")
    name = "./results/" + 'result' + ".jpg"
    cv2.imwrite(name, img_raw)


def Retinaface_inference(run_ort, args):
    """Infer an image with an ONNX Runtime session
    Args:
        run_ort: ONNX Runtime session
        args: including image path and hyperparameters
    Returns: boxes_list, confidence_list, landm_list
        boxes_list = [[left, top, right, bottom]...]
        confidence_list = [[confidence]...]
        landm_list = [[landms(dim=10)]...]
    """
    img_raw = cv2.imread(args.image_path, cv2.IMREAD_COLOR)
    # preprocess
    img, scale, resize = preprocess(img_raw, INPUT_SIZE, DEVICE)
    # forward
    outputs = run_ort.run(None, {run_ort.get_inputs()[0].name: img})
    # postprocess
    dets = postprocess(CFG, img, outputs, scale, resize, args.confidence_threshold, args.nms_threshold, DEVICE)

    # result list
    boxes = dets[:, :4]
    confidences = dets[:, 4:5]
    landms = dets[:, 5:]
    boxes_list = [box.tolist() for box in boxes]
    confidence_list = [confidence.tolist() for confidence in confidences]
    landm_list = [landm.tolist() for landm in landms]

    # save image
    if args.save_image:
        vis(img_raw, dets, args.vis_thres)

    return boxes_list, confidence_list, landm_list


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Retinaface")
    parser.add_argument(
        "-m",
        "--trained_model",
        default="./weights/RetinaFace_int.onnx",
        type=str,
        help="Trained state_dict file path to open",
    )
    parser.add_argument(
        "--image_path",
        default="./data/widerface/val/images/18--Concerts/18_Concerts_Concerts_18_38.jpg",
        type=str,
        help="image path",
    )
    parser.add_argument(
        "--confidence_threshold",
        default=0.4,
        type=float,
        help="confidence_threshold"
    )
    parser.add_argument(
        "--nms_threshold",
        default=0.4,
        type=float,
        help="nms_threshold"
    )
    parser.add_argument(
        "-s",
        "--save_image",
        action="store_true",
        default=False,
        help="show detection results",
    )
    parser.add_argument(
        "--vis_thres",
        default=0.5,
        type=float,
        help="visualization_threshold"
    )
    parser.add_argument(
        "--ipu",
        action="store_true",
        help="Use IPU for inference.",
    )
    parser.add_argument(
        "--provider_config",
        type=str,
        default="vaip_config.json",
        help="Path of the config file for setting provider_options.",
    )

    args = parser.parse_args()

    if args.ipu:
        providers = ["VitisAIExecutionProvider"]
        provider_options = [{"config_file": args.provider_config}]
    else:
        providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
        provider_options = None

    print("Loading pretrained model from {}".format(args.trained_model))
    run_ort = ort.InferenceSession(args.trained_model, providers=providers, provider_options=provider_options)

    boxes_list, confidence_list, landm_list = Retinaface_inference(run_ort, args)
    print('inference done!')
    print(boxes_list, confidence_list, landm_list)