---
title: Detection Metrics
emoji: 📈
colorFrom: green
colorTo: indigo
sdk: static
app_file: README.md
pinned: true
---

![Detection Metrics](https://huggingface.co/spaces/rafaelpadilla/detection_metrics/resolve/main/assets/metrics_small.png)

This project implements object detection **Average Precision** metrics in the COCO style.

With `Detection Metrics` you can easily compute all 12 COCO metrics from the bounding boxes output by your object detection model:

### Average Precision (AP):
1. **AP**: AP at IoU=.50:.05:.95 (see the formula after this list)  
2. **AP<sup>IoU=.50</sup>**: AP at IoU=.50 (similar to the PASCAL VOC mAP metric)  
3. **AP<sup>IoU=.75</sup>**: AP at IoU=.75 (strict metric)  

### AP Across Scales:
4. **AP<sup>small</sup>**: AP for small objects: area < 32<sup>2</sup>  
5. **AP<sup>medium</sup>**: AP for medium objects: 32<sup>2</sup> < area < 96<sup>2</sup>  
6. **AP<sup>large</sup>**: AP for large objects: area > 96<sup>2</sup>  

### Average Recall (AR):
7. **AR<sup>max=1</sup>**: AR given 1 detection per image  
8. **AR<sup>max=10</sup>**: AR given 10 detections per image  
9. **AR<sup>max=100</sup>**: AR given 100 detections per image 

### AR Across Scales:
10. **AR<sup>small</sup>**: AR for small objects: area < 32<sup>2</sup>  
11. **AR<sup>medium</sup>**: AR for medium objects: 32<sup>2</sup> < area < 96<sup>2</sup>  
12. **AR<sup>large</sup>**: AR for large objects: area > 96<sup>2</sup>
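
For reference, the primary **AP** metric (item 1) is the average of the AP values computed at ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05:

$$
\mathrm{AP} = \frac{1}{10} \sum_{t \in \{0.50,\ 0.55,\ \ldots,\ 0.95\}} \mathrm{AP}_{\mathrm{IoU}=t}
$$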

## How to use detection metrics?

Basically, you just need to create your ground-truth data and prepare your evaluation loop to output the boxes, confidences and classes in the required format. Follow these steps:  


### Step 1: Prepare your ground-truth dataset

Convert your ground-truth annotations to JSON following the COCO format.
COCO ground-truth annotations are represented in a dictionary containing 3 elements: "images", "annotations" and "categories".
The snippet below shows an example of the dictionary, and you can find a more detailed walkthrough of the COCO format [here](https://towardsdatascience.com/how-to-work-with-object-detection-datasets-in-coco-format-9bf4fb5848a4).

```
{
    "images": [
        {
            "id": 212226,
            "width": 500,
            "height": 335
        },
        ...
    ],
    "annotations": [
        {
            "id": 489885,
            "category_id": 1,
            "iscrowd": 0,
            "image_id": 212226,
            "area": 12836,
            "bbox": [
                235.6300048828125, # x
                84.30999755859375, # y
                158.08999633789062, # w
                185.9499969482422 # h
            ]
        },
        ...
    ],
    "categories": [
        {
            "supercategory": "none",
            "id": 1,
            "name": "person"
        },
        ...
    ]
}
```
You do not need to save the JSON to disk; you can keep it in memory as a Python dictionary.
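
The snippet below is a minimal sketch of building that dictionary in memory; `my_images` and `my_annotations` are hypothetical placeholders for wherever your own annotations live.

```python
# Hypothetical in-memory annotations; replace with your own data source.
my_images = [
    {"id": 212226, "width": 500, "height": 335},
]
my_annotations = [
    {"image_id": 212226, "category_id": 1, "bbox": [235.63, 84.31, 158.09, 185.95]},
]

# Assemble the COCO-style ground-truth dictionary expected by the evaluator.
ground_truth_annotations = {
    "images": [
        {"id": img["id"], "width": img["width"], "height": img["height"]}
        for img in my_images
    ],
    "annotations": [
        {
            "id": idx,                                # unique annotation id
            "image_id": ann["image_id"],
            "category_id": ann["category_id"],
            "iscrowd": 0,
            "area": ann["bbox"][2] * ann["bbox"][3],  # w * h
            "bbox": ann["bbox"],                      # [x, y, w, h]
        }
        for idx, ann in enumerate(my_annotations)
    ],
    "categories": [{"supercategory": "none", "id": 1, "name": "person"}],
}
```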

### Step 2: Load the object detection evaluator:

Install Hugging Face's `evaluate` library (`pip install evaluate`) to load the evaluator. More instructions [here](https://huggingface.co/docs/evaluate/installation).

Load the object detection evaluator, passing the JSON created in the previous step through the `json_gt` argument:
`evaluator = evaluate.load("rafaelpadilla/detection_metrics", json_gt=ground_truth_annotations, iou_type="bbox")`
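
Put together, loading the evaluator looks like the short sketch below, where `ground_truth_annotations` is the dictionary built in Step 1:

```python
import evaluate

# Load the metric from the Hugging Face Hub, passing the in-memory
# COCO-style ground truth through `json_gt`.
evaluator = evaluate.load(
    "rafaelpadilla/detection_metrics",
    json_gt=ground_truth_annotations,
    iou_type="bbox",
)
```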

### Step 3: Loop through your dataset samples to obtain the predictions:

```python
# Loop through your dataset
for batch in dataloader_train:

    # Get the image(s) from the batch
    images = batch["images"]
    # Get the image ids of the images in the batch
    image_ids = batch["image_ids"]

    # Pass the image(s) to your model to obtain bounding boxes, scores and labels
    predictions = model.predict_boxes(images)
    # Pass the predictions and image ids to the evaluator
    evaluator.add(prediction=predictions, reference=image_ids)

# Call compute to obtain your results
results = evaluator.compute()
print(results)
```

Regardless of your model's architecture, your predictions must be converted into a list of dictionaries (one per image), each containing the 3 fields shown below:

```python
predictions = [
    {
        "scores": [0.55, 0.95, 0.87], 
        "labels": [6, 1, 1],
        "boxes": [[100, 30, 40, 28], [40, 32, 50, 28], [128, 44, 23, 69]]
    },
    ...
]
```
* `scores`: List or torch tensor containing the confidence scores of your detections. A confidence score is a value between 0 and 1.  
* `labels`: List or torch tensor with the indices representing the labels of your detections.  
* `boxes`: List or torch tensor with the detected bounding boxes in the format `x,y,w,h`.  

The `reference` added to the evaluator in each iteration is a list of dictionaries containing the image ids of the images in that batch.

For example, in a batch containing two images with ids 508101 and 1853, the `reference` argument must receive `image_ids` in the following format:

```python
image_ids = [ {'image_id': [508101]}, {'image_id': [1853]} ]
```
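
As an illustration, the sketch below converts hypothetical raw torch outputs for such a batch into the `predictions` and `image_ids` structures described above; `outputs` and `batch_image_ids` are placeholders for whatever your model and dataloader actually return.

```python
import torch

# Hypothetical raw outputs for a batch of two images; in practice these would
# come from your model (e.g. the `model.predict_boxes(images)` call above).
outputs = [
    {"scores": torch.tensor([0.95, 0.87]),
     "labels": torch.tensor([1, 1]),
     "boxes": torch.tensor([[40.0, 32.0, 50.0, 28.0], [128.0, 44.0, 23.0, 69.0]])},
    {"scores": torch.tensor([0.55]),
     "labels": torch.tensor([6]),
     "boxes": torch.tensor([[100.0, 30.0, 40.0, 28.0]])},
]
batch_image_ids = [508101, 1853]

# Convert to the structures expected by the evaluator.
predictions = [
    {
        "scores": out["scores"].tolist(),
        "labels": out["labels"].tolist(),
        "boxes": out["boxes"].tolist(),   # each box as [x, y, w, h]
    }
    for out in outputs
]
image_ids = [{"image_id": [img_id]} for img_id in batch_image_ids]

evaluator.add(prediction=predictions, reference=image_ids)
```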

After the loop, call `evaluator.compute()` to obtain your results as a dictionary. The metrics are also printed to the console as:

```
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.415
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.613
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.436
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.209
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.449
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.601
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.333
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.531
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.572
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.321
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.624
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.794
```

The scheme below illustrates what your `for` loop should look like:

![Evaluation loop scheme](https://huggingface.co/spaces/rafaelpadilla/detection_metrics/resolve/main/assets/scheme_coco_evaluate.png)

-----------------------

## References and further readings:

1. [COCO Evaluation Metrics](https://cocodataset.org/#detection-eval)  
2. [A Survey on performance metrics for object-detection algorithms](https://www.researchgate.net/profile/Rafael-Padilla/publication/343194514_A_Survey_on_Performance_Metrics_for_Object-Detection_Algorithms/links/5f1b5a5e45851515ef478268/A-Survey-on-Performance-Metrics-for-Object-Detection-Algorithms.pdf)  
3. [A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit](https://www.mdpi.com/2079-9292/10/3/279/pdf)  
4. [COCO ground-truth annotations for your datasets in JSON](https://towardsdatascience.com/how-to-work-with-object-detection-datasets-in-coco-format-9bf4fb5848a4)