# Prepare Datasets for Mask2Former

A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog)
for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc.).
This document explains how to set up the builtin datasets so they can be used by the above APIs.
[Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive on how to use `DatasetCatalog` and `MetadataCatalog`,
and how to add new datasets to them.

Mask2Former has builtin support for a few datasets.
The datasets are assumed to exist in a directory specified by the environment variable
`DETECTRON2_DATASETS`.
Under this directory, detectron2 will look for datasets in the structure described below, if needed.
```
$DETECTRON2_DATASETS/
  ADEChallengeData2016/
  coco/
  cityscapes/
  mapillary_vistas/
```
You can set the location for builtin datasets with `export DETECTRON2_DATASETS=/path/to/datasets`.
If left unset, the default is `./datasets` relative to your current working directory.

The [model zoo](https://github.com/facebookresearch/MaskFormer/blob/master/MODEL_ZOO.md)
contains configs and models that use these builtin datasets.
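
To see which directory will actually be used, the lookup rule described above (environment variable first, `./datasets` as the fallback) can be sketched in a few lines. This is only an illustration: `dataset_root` and `available_datasets` are hypothetical helper names, not part of detectron2 or Mask2Former.

```python
import os
from pathlib import Path
from typing import List

# Top-level directory names copied from the tree in this document.
EXPECTED_DATASETS = ("ADEChallengeData2016", "coco", "cityscapes", "mapillary_vistas")


def dataset_root() -> Path:
    """Return $DETECTRON2_DATASETS, or ./datasets when the variable is unset."""
    return Path(os.getenv("DETECTRON2_DATASETS", "datasets"))


def available_datasets(root: Path) -> List[str]:
    """List the expected dataset directories that exist under `root`."""
    return [name for name in EXPECTED_DATASETS if (root / name).is_dir()]


if __name__ == "__main__":
    root = dataset_root()
    print(f"dataset root: {root.resolve()}")
    print(f"found: {available_datasets(root)}")
```

Running this from the directory where you launch training shows whether the relative default resolves to the place you expect.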
## Expected dataset structure for [COCO](https://cocodataset.org/#download):
```
coco/
  annotations/
    instances_{train,val}2017.json
    panoptic_{train,val}2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
  panoptic_{train,val}2017/  # png annotations
  panoptic_semseg_{train,val}2017/  # generated by the script mentioned below
```
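
The `{train,val}` notation in these trees is shell-style brace shorthand for one entry per split. A minimal sketch of a layout check that expands the shorthand and reports what is missing is given here. It assumes the two JSON files sit under `annotations/`; `expand_braces`, `missing_entries`, and `COCO_LAYOUT` are hypothetical names introduced for illustration.

```python
import re
from pathlib import Path
from typing import List

_BRACES = re.compile(r"\{([^{}]*)\}")

# Relative paths taken from the COCO tree in this document.
COCO_LAYOUT = [
    "annotations/instances_{train,val}2017.json",
    "annotations/panoptic_{train,val}2017.json",
    "{train,val}2017",
    "panoptic_{train,val}2017",
    "panoptic_semseg_{train,val}2017",
]


def expand_braces(pattern: str) -> List[str]:
    """Expand shell-style {a,b} groups, e.g. 'x_{train,val}.json'."""
    match = _BRACES.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def missing_entries(dataset_dir: Path, layout: List[str]) -> List[str]:
    """Return every expanded relative path that does not exist yet."""
    missing = []
    for pattern in layout:
        for relative in expand_braces(pattern):
            if not (dataset_dir / relative).exists():
                missing.append(relative)
    return missing
```

The same helper works for the other datasets in this document if you pass a layout list written from their trees. Entries that are produced by a preparation script will be reported as missing until that script has been run.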
Install panopticapi by:
```
pip install git+https://github.com/cocodataset/panopticapi.git
```
Then run `python datasets/prepare_coco_semantic_annos_from_panoptic_annos.py` to extract semantic annotations from the panoptic annotations (only used for evaluation).
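
For readers curious what this extraction involves, here is a rough pure-Python sketch. It relies on the COCO panoptic convention that each PNG pixel encodes a segment id as `R + 256 * G + 256 * 256 * B`, and that each entry in `segments_info` carries an `id` and a `category_id`. The function names, the `category_to_train_id` mapping, and the ignore value of 255 are assumptions for illustration; the actual script may differ in details.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def rgb_to_segment_id(pixel: Pixel) -> int:
    """Decode a COCO panoptic PNG pixel into its segment id."""
    r, g, b = pixel
    return r + 256 * g + 256 * 256 * b


def panoptic_to_semantic(
    pan_rows: Sequence[Sequence[Pixel]],
    segments_info: Sequence[Dict[str, int]],
    category_to_train_id: Dict[int, int],
    ignore_label: int = 255,
) -> List[List[int]]:
    """Replace each segment id by the training id of the segment's category.

    Pixels that belong to no listed segment receive `ignore_label`
    (an assumed convention, not confirmed by this document).
    """
    segment_to_label = {
        seg["id"]: category_to_train_id[seg["category_id"]]
        for seg in segments_info
    }
    return [
        [segment_to_label.get(rgb_to_segment_id(p), ignore_label) for p in row]
        for row in pan_rows
    ]
```

A real implementation would operate on NumPy arrays for speed; nested lists are used here only to keep the sketch dependency-free.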
## Expected dataset structure for [cityscapes](https://www.cityscapes-dataset.com/downloads/):
```
cityscapes/
  gtFine/
    train/
      aachen/
        color.png, instanceIds.png, labelIds.png, polygons.json,
        labelTrainIds.png
      ...
    val/
    test/
    # below are generated Cityscapes panoptic annotations
    cityscapes_panoptic_train.json
    cityscapes_panoptic_train/
    cityscapes_panoptic_val.json
    cityscapes_panoptic_val/
    cityscapes_panoptic_test.json
    cityscapes_panoptic_test/
  leftImg8bit/
    train/
    val/
    test/
```
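
The file names listed under `aachen/` are suffixes rather than full names. In the Cityscapes release, an image such as `aachen_000000_000019_leftImg8bit.png` is paired with annotation files named `aachen_000000_000019_gtFine_<suffix>`. A small sketch of that pairing follows; `label_path_for` is a hypothetical helper, and it assumes the image path has the form `<root>/leftImg8bit/<split>/<city>/<file>`.

```python
from pathlib import Path

IMAGE_SUFFIX = "_leftImg8bit.png"


def label_path_for(image_path: Path, kind: str = "labelTrainIds") -> Path:
    """Map a leftImg8bit image path to the matching gtFine annotation path.

    `kind` is one of the PNG suffixes in the tree, e.g. "labelIds",
    "instanceIds", "color", or "labelTrainIds".
    """
    if not image_path.name.endswith(IMAGE_SUFFIX):
        raise ValueError(f"not a leftImg8bit image: {image_path}")
    stem = image_path.name[: -len(IMAGE_SUFFIX)]
    city_dir = image_path.parent          # <root>/leftImg8bit/<split>/<city>
    split_dir = city_dir.parent           # <root>/leftImg8bit/<split>
    root = split_dir.parent.parent        # <root>
    return root / "gtFine" / split_dir.name / city_dir.name / f"{stem}_gtFine_{kind}.png"
```

Checking that `label_path_for(image)` exists for every image is a quick way to confirm that `labelTrainIds.png` files have been generated for a split.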
Install cityscapesScripts by:
```
pip install git+https://github.com/mcordts/cityscapesScripts.git
```
Note: to create labelTrainIds.png, first prepare the above structure, then run cityscapesScripts with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
```
These files are not needed for instance segmentation.

Note: to generate the Cityscapes panoptic dataset, run cityscapesScripts with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createPanopticImgs.py
```
These files are not needed for semantic or instance segmentation.
## Expected dataset structure for [ADE20k](http://sceneparsing.csail.mit.edu/):
```
ADEChallengeData2016/
  images/
  annotations/
  objectInfo150.txt
  # download instance annotation
  annotations_instance/
  # generated by prepare_ade20k_sem_seg.py
  annotations_detectron2/
  # below are generated by prepare_ade20k_pan_seg.py
  ade20k_panoptic_{train,val}.json
  ade20k_panoptic_{train,val}/
  # below are generated by prepare_ade20k_ins_seg.py
  ade20k_instance_{train,val}.json
```
The directory `annotations_detectron2` is generated by running `python datasets/prepare_ade20k_sem_seg.py`.
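
As background, the detectron2 version of this script is understood to shift every label down by one, so that the original value 0 (unlabeled) wraps around to 255 and the 150 classes become 0 to 149. The sketch below shows that arithmetic on plain lists; it is an assumption about the conversion, so check the script in your checkout before relying on it. `shift_ade20k_labels` is a hypothetical name.

```python
from typing import List, Sequence


def shift_ade20k_labels(rows: Sequence[Sequence[int]]) -> List[List[int]]:
    """Shift 8-bit labels by -1 with wrap-around: 0 -> 255, 1..150 -> 0..149."""
    return [[(value - 1) % 256 for value in row] for row in rows]
```

With this convention, 255 can serve as the ignore label during training and evaluation.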
Install panopticapi by:
```bash
pip install git+https://github.com/cocodataset/panopticapi.git
```
Download the instance annotations from http://sceneparsing.csail.mit.edu/ and extract the archive so that `annotations_instance/` ends up inside `ADEChallengeData2016/`:
```bash
wget http://sceneparsing.csail.mit.edu/data/ChallengeData2017/annotations_instance.tar
```
Then run `python datasets/prepare_ade20k_pan_seg.py` to combine the semantic and instance annotations into panoptic annotations.
Finally, run `python datasets/prepare_ade20k_ins_seg.py` to extract instance annotations in COCO format.
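
Since ADE20k needs three preparation scripts, it can help to check which of them still have to be run. The following sketch pairs each script with the outputs listed for it in the tree for this dataset and reports the scripts whose outputs are incomplete. `ADE20K_STEPS` and `pending_steps` are hypothetical names, and only the existence of outputs is checked, not their contents.

```python
from pathlib import Path
from typing import List

# (script, outputs it generates), in the order given in this document.
ADE20K_STEPS = [
    ("datasets/prepare_ade20k_sem_seg.py", ["annotations_detectron2"]),
    (
        "datasets/prepare_ade20k_pan_seg.py",
        [
            "ade20k_panoptic_train.json",
            "ade20k_panoptic_val.json",
            "ade20k_panoptic_train",
            "ade20k_panoptic_val",
        ],
    ),
    (
        "datasets/prepare_ade20k_ins_seg.py",
        ["ade20k_instance_train.json", "ade20k_instance_val.json"],
    ),
]


def pending_steps(ade_dir: Path) -> List[str]:
    """Return the scripts whose listed outputs are not all present."""
    return [
        script
        for script, outputs in ADE20K_STEPS
        if not all((ade_dir / name).exists() for name in outputs)
    ]
```

For example, `pending_steps(Path("datasets/ADEChallengeData2016"))` returns an empty list once all three scripts have produced their outputs.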
## Expected dataset structure for [Mapillary Vistas](https://www.mapillary.com/dataset/vistas):
```
mapillary_vistas/
  training/
    images/
    instances/
    labels/
    panoptic/
  validation/
    images/
    instances/
    labels/
    panoptic/
  mapillary_vistas_instance_{train,val}.json  # generated by the script mentioned below
```
No preprocessing is needed for semantic or panoptic segmentation on Mapillary Vistas.

If you want to evaluate instance segmentation on Mapillary Vistas, run `python datasets/prepare_mapillary_vistas_ins_seg.py` to generate COCO-style instance annotations.
## Expected dataset structure for [YouTubeVIS 2019](https://competitions.codalab.org/competitions/20128):
```
ytvis_2019/
  {train,valid,test}.json
  {train,valid,test}/
    Annotations/
    JPEGImages/
```
## Expected dataset structure for [YouTubeVIS 2021](https://competitions.codalab.org/competitions/28988):
```
ytvis_2021/
  {train,valid,test}.json
  {train,valid,test}/
    Annotations/
    JPEGImages/
```
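
To confirm that a YouTubeVIS split was unpacked correctly, one option is to compare the frames named in the JSON file with the files under `JPEGImages/`. The sketch below assumes the JSON has a top-level `videos` list whose entries carry a `file_names` list of paths relative to `JPEGImages/`; that schema is an assumption here, so inspect your JSON file first. `missing_frames` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import List


def missing_frames(split_dir: Path, annotation_file: Path) -> List[str]:
    """Return frame paths listed in the JSON that are absent from JPEGImages/."""
    with annotation_file.open() as handle:
        data = json.load(handle)
    missing = []
    # Assumed schema: {"videos": [{"file_names": ["<video>/<frame>.jpg", ...]}]}
    for video in data.get("videos", []):
        for name in video.get("file_names", []):
            if not (split_dir / "JPEGImages" / name).is_file():
                missing.append(name)
    return missing
```

For the 2019 training split this would be called as `missing_frames(Path("ytvis_2019/train"), Path("ytvis_2019/train.json"))`, with both paths taken relative to the dataset root.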