---
license: apache-2.0
tags:
- object-detection
- license-plate-detection
- vehicle-detection
datasets:
- coco
- license-plate-detection
widget:
- src: https://drive.google.com/uc?id=1j9VZQ4NDS4gsubFf3m2qQoTMWLk552bQ
  example_title: "Skoda 1"
- src: https://drive.google.com/uc?id=1p9wJIqRz3W50e2f_A0D8ftla8hoXz4T5
  example_title: "Skoda 2"
metrics:
- average precision
- recall
- IOU
model-index:
- name: yolos-small-rego-plates-detection
  results: []
---
# YOLOS (small-sized) model

The original YOLOS model was fine-tuned on COCO 2017 object detection (118k annotated images). It was introduced in the paper [You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection](https://arxiv.org/abs/2106.00666) by Fang et al. and first released in [this repository](https://github.com/hustvl/YOLOS).
This model was further fine-tuned on the [license plate dataset](https://www.kaggle.com/datasets/andrewmvd/car-plate-detection) from Kaggle. The dataset consists of 735 annotated images, with objects categorised as "vehicle" or "license-plate". The model was trained for 200 epochs on a single GPU using Google Colab.

## Model description

YOLOS is a Vision Transformer (ViT) trained using the DETR loss. Despite its simplicity, a base-sized YOLOS model achieves 42 AP on COCO 2017 validation, similar to DETR and to more complex frameworks such as Faster R-CNN.

## Intended uses & limitations

You can use the raw model for object detection. See the [model hub](https://huggingface.co/models?search=hustvl/yolos) to look for all available YOLOS models.

### How to use

Here is how to use this model:

```python
from transformers import YolosFeatureExtractor, YolosForObjectDetection
from PIL import Image
import requests

url = 'https://drive.google.com/uc?id=1p9wJIqRz3W50e2f_A0D8ftla8hoXz4T5'
image = Image.open(requests.get(url, stream=True).raw)
feature_extractor = YolosFeatureExtractor.from_pretrained('nickmuchi/yolos-small-rego-plates-detection')
model = YolosForObjectDetection.from_pretrained('nickmuchi/yolos-small-rego-plates-detection')
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)

# model predicts bounding boxes and the corresponding classes ("vehicle", "license-plate")
logits = outputs.logits
bboxes = outputs.pred_boxes
```
Currently, both the feature extractor and model support PyTorch. 
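
The `logits` and `pred_boxes` returned above are raw per-query outputs, not final detections. The following is a minimal sketch of one way to turn them into pixel-space boxes. It assumes the DETR-style output layout that YOLOS uses: the last logit of each query is the "no object" class, and each box is `(center_x, center_y, width, height)` normalised to `[0, 1]`. The function name `postprocess` and the `0.9` score threshold are illustrative choices, not part of this model or of `transformers`.

```python
import math


def postprocess(logits, boxes, image_size, threshold=0.9):
    """Convert raw per-query outputs into (label_id, score, [x0, y0, x1, y1]) tuples.

    logits:     list of per-query class scores; the last entry is assumed
                to be the "no object" class.
    boxes:      list of per-query (cx, cy, w, h), normalised to [0, 1].
    image_size: (width, height) in pixels, as returned by PIL's Image.size.
    threshold:  hypothetical default; tune it for your own data.
    """
    width, height = image_size
    detections = []
    for query_logits, (cx, cy, w, h) in zip(logits, boxes):
        # Numerically stable softmax over all classes, including "no object".
        peak = max(query_logits)
        exps = [math.exp(v - peak) for v in query_logits]
        total = sum(exps)
        probs = [e / total for e in exps]

        # Pick the best real class, ignoring the trailing "no object" entry.
        label = max(range(len(probs) - 1), key=probs.__getitem__)
        score = probs[label]
        if score < threshold:
            continue

        # Convert the normalised centre format to pixel corner coordinates.
        box = [
            (cx - w / 2) * width,
            (cy - h / 2) * height,
            (cx + w / 2) * width,
            (cy + h / 2) * height,
        ]
        detections.append((label, score, box))
    return detections
```

With the objects from the snippet above, this could be called as `postprocess(outputs.logits[0].tolist(), outputs.pred_boxes[0].tolist(), image.size)`. The returned label ids can be mapped to names through `model.config.id2label`.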

## Training data

The YOLOS model was pre-trained on [ImageNet-1k](https://huggingface.co/datasets/imagenet2012) and fine-tuned on [COCO 2017 object detection](https://cocodataset.org/#download), a dataset consisting of 118k/5k annotated images for training/validation respectively.

### Training

This model was fine-tuned for 200 epochs on the [license plate dataset](https://www.kaggle.com/datasets/andrewmvd/car-plate-detection).

## Evaluation results

This model achieves an AP (average precision) of **47.9**, averaged over IoU thresholds 0.50:0.95.

IoU metric: bbox

| Metric                 | IoU       | Area   | maxDets | Value |
| ---------------------- | --------- | ------ | ------- | ----- |
| Average Precision (AP) | 0.50:0.95 | all    | 100     | 0.479 |
| Average Precision (AP) | 0.50      | all    | 100     | 0.752 |
| Average Precision (AP) | 0.75      | all    | 100     | 0.555 |
| Average Precision (AP) | 0.50:0.95 | small  | 100     | 0.147 |
| Average Precision (AP) | 0.50:0.95 | medium | 100     | 0.420 |
| Average Precision (AP) | 0.50:0.95 | large  | 100     | 0.804 |
| Average Recall (AR)    | 0.50:0.95 | all    | 1       | 0.437 |
| Average Recall (AR)    | 0.50:0.95 | all    | 10      | 0.641 |
| Average Recall (AR)    | 0.50:0.95 | all    | 100     | 0.676 |
| Average Recall (AR)    | 0.50:0.95 | small  | 100     | 0.268 |
| Average Recall (AR)    | 0.50:0.95 | medium | 100     | 0.641 |
| Average Recall (AR)    | 0.50:0.95 | large  | 100     | 0.870 |
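
Every figure above depends on IoU (intersection over union) between a predicted box and a ground-truth box. As a rough illustration of what the `IoU=0.50:0.95` setting means, the sketch below computes IoU for two corner-format boxes and counts how many of the ten COCO thresholds (0.50 to 0.95 in steps of 0.05) a prediction would clear. The helper names `box_iou` and `thresholds_passed` are made up for this example; the numbers in this card came from the evaluation run itself, not from this code.

```python
def box_iou(a, b):
    """IoU of two boxes given as [x0, y0, x1, y1] in the same coordinate system."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b

    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h

    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "IoU=0.50:0.95": 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(iou):
    """Number of COCO IoU thresholds at which a match with this IoU counts as correct."""
    return sum(iou >= t for t in COCO_IOU_THRESHOLDS)
```

A prediction that overlaps its target loosely can count as correct at `IoU=0.50` and still fail at `IoU=0.75`, which is why AP drops from 0.752 to 0.555 between those two rows.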