sebasmos committed on
Commit
8c62a56
1 Parent(s): d8ed789

Update README.md (#1)


- Update README.md (f81d08ef957c78ab3207e2886611253516b0835f)

Files changed (1)
  1. README.md +27 -7
README.md CHANGED
@@ -22,7 +22,7 @@ Project proposal ([READ MORE](https://eo4society.esa.int/wp-content/uploads/2023
22
  - **Repository:** [Code](https://github.com/sebasmos/satellite.extractor)
23
  - The proposed dataset format is shared in [Metadengue](https://github.com/sebasmos/MetaDengue)
24
  - **ESA project:** Sponsoring request ID 1c081a - Towards a Smart Eco-epidemiological Model of Dengue in Colombia using Satellite in Collaboration with MIT Critical Data Colombia
25
- - **Point of Contact:** [Sebastian A. Cajas Ordóñez](mailto:scajasordonez@gmail.com)
26
 
27
 
28
  ## Summary
@@ -36,19 +36,25 @@ Here below find all the dataset's versions and descriptions.
36
  * **SAT1_dataset_5_best_cities**: Top 5 municipalities based on Baseline method from satellite extractor
37
 
38
  * **SAT2_dataset_10_best_cities**: Top 10 municipalities based on Baseline method from satellite extractor
 
39
 
40
  * **SAT3_FULL_COLOMBIA**: Top 81 municipalities based on Baseline method from satellite extractor
 
 
41
 
42
- * **SAT4_dataset_10_best_cities_augmented_v1**: Augmented data with aligned metadata. Data was extracted using recursive artifact removal, cloud removal based on LeastCC, and nearest-neighbor interpolation for spatial resolution. Implemented [here](https://github.com/sebasmos/satellite.extractor/blob/main/notebooks/satellite_imagery_augmentation.ipynb), with augmentations applied to the RGB channels while leaving the other satellite channels unchanged:
 
 
43
 
44
- *Pre-processing*: First, Contrast Limited Adaptive Histogram Equalization (CLAHE) is applied to the image with a clip limit of 6.0 and a 16 by 16 tile grid; this enhances contrast while limiting the amplification of noise. Second, the RGBShift augmentation randomly shifts the pixel values of the red, green, and blue channels, with a probability of 100% and a shift limit of 30 per channel. Finally, RandomBrightnessContrast is applied with a probability of 50% to randomly adjust brightness and contrast, adding variation to the dataset.
 
 
 
 
45
 
46
  * **SAT5_dataset_10_best_cities_augmented_v2**: Improved version of the augmented dataset with aligned metadata. Near-black images are removed using a recursive [forward-backward artefact removal algorithm with inter-band data augmentation on satellite imagery](https://github.com/sebasmos/satellite.extractor/tree/main/src/PART_2_satellite-augmentation), and extra augmented data is generated with Albumentations wrapper modules. Data was extracted using recursive artifact removal, cloud removal based on LeastCC, and nearest-neighbor interpolation for spatial resolution. Implemented in this [notebook](https://github.com/sebasmos/satellite.extractor/blob/main/notebooks/PART_2_satellite_imagery_augmentation.ipynb), with augmentations applied to the RGB channels while leaving the other satellite channels unchanged.
47
 
48
- *Pre-processing*: The augmentation pipeline adds Gaussian noise with a probability of 20%, using IAAAdditiveGaussianNoise (zero mean, standard deviation between 0.01 * 255 and 0.05 * 255) and GaussNoise (zero mean, default variance range of (10.0, 50.0)). Blurring uses MotionBlur(p=.2), MedianBlur(blur_limit=3, p=0.1), and Blur(blur_limit=3, p=0.1); geometric jitter uses ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.2, rotate_limit=45, p=0.2). Distortion techniques include OpticalDistortion(p=0.3), GridDistortion(p=.1), IAAPiecewiseAffine(p=0.3), and IAAAffine(scale=(0.8, 1.2), translate_percent=0.1, rotate=15, shear=10, p=0.2). Finally, contrast, sharpness, and color adjustments use CLAHE(clip_limit=2), IAASharpen(), IAAEmboss(), RandomBrightnessContrast(), and HueSaturationValue(p=0.3).
49
 
50
- * **Creating Cloud-Cloudless Paired Dataset**: This dataset, derived from imagery in five Colombian municipalities, consists of 1640 images (820 pairs), where each of the 164 images per municipality is paired with a previously identified optimal cloudless image. The Cloud2CloudlesDataset class organizes these pairs into a new folder (DATASET), with images renamed to indicate ground truth and cloud presence. The class, initialized with source and destination paths, includes tests for image count verification and folder existence confirmation.
51
-
52
  ## Reading data
53
 
54
  The data can be read as (example):
@@ -89,4 +95,18 @@ MIT Critical data Colombia Team: Sebastian A. Cajas, David Restrepo, Kuan-Ting
89
 
90
  Please cite our work if you find the resources in this repository useful:
91
 
92
- Sebastian Andres Cajas Ordoñez, David Restrepo, Kuan-Ting Kuo, Dana Moukheiber, Atika Rahman Paddo, Leo Anthony Celi, Po-Chih Kuo (2022). Towards a Smart Eco-epidemiological Model of Dengue in Colombia using Satellite [Source code]. GitHub. https://github.com/sebasmos/satellite.extractor
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
22
  - **Repository:** [Code](https://github.com/sebasmos/satellite.extractor)
23
  - The proposed dataset format is shared in [Metadengue](https://github.com/sebasmos/MetaDengue)
24
  - **ESA project:** Sponsoring request ID 1c081a - Towards a Smart Eco-epidemiological Model of Dengue in Colombia using Satellite in Collaboration with MIT Critical Data Colombia
25
+ - **Point of Contact:** [Sebastian A. Cajas Ordóñez](mailto:sebasmos@mit.edu)
26
 
27
 
28
  ## Summary
 
36
  * **SAT1_dataset_5_best_cities**: Top 5 municipalities based on Baseline method from satellite extractor
37
 
38
  * **SAT2_dataset_10_best_cities**: Top 10 municipalities based on Baseline method from satellite extractor
39
+ *RGB-Version*: [[Link](https://huggingface.co/datasets/MITCriticalData/10_municipalities_RGB)]
40
 
41
  * **SAT3_FULL_COLOMBIA**: Top 81 municipalities based on Baseline method from satellite extractor
42
+ *DATASET_81_CITIES_v1.0*: [[link](https://huggingface.co/datasets/MITCriticalData/DATASET_81_CITIES_v1.0)]
43
+ *DATASET_81_CITIES_v2.0*: [[link](https://huggingface.co/datasets/MITCriticalData/DATASET_81_CITIES_v2.0)]
44
 
45
+ * **Creating Cloud-Cloudless Paired Dataset**: This dataset, derived from imagery in five Colombian municipalities, consists of 1640 images (820 pairs), where each of the 164 images per municipality is paired with a previously identified optimal cloudless image. The Cloud2CloudlesDataset class organizes these pairs into a new folder (DATASET), with images renamed to indicate ground truth and cloud presence. The class, initialized with source and destination paths, includes tests for image count verification and folder existence confirmation.
46
+
47
+ * **Dataset on Rio de Janeiro, 2016-2023**: [[link](https://huggingface.co/MITCriticalData/dataset_rio_de_janeiro_2018_2023)] The datasets DATASET_rio_de_janeiro.zip and DATASET_rio_de_janeiro_forward_backwardv2.zip cover central Rio de Janeiro from 2016 to 2023, each containing 416 images (one per epidemiological week). DATASET_rio_de_janeiro.zip uses single-forward artifact removal, which can leave black images, while DATASET_rio_de_janeiro_forward_backwardv2.zip applies forward-backward artifact removal, which replaces those black images. See https://github.com/sebasmos/satellite.extractor/tree/main/satellite_extractor/PART_1_satellite-augmentation for details.
48
 
49
+ * **Landsat Colombia 2008-2016**: [[Link](https://huggingface.co/datasets/MITCriticalData/L7_Dataset_2008_2015)]
50
+
51
+ * **MODIS 2 2008-2016**: [[link](https://huggingface.co/datasets/MITCriticalData/dataset_modis_2_2008_2015)]
52
+
53
+ * **SAT4_dataset_10_best_cities_augmented_v1**: Augmented data with aligned metadata. Data was extracted using recursive artifact removal, cloud removal based on LeastCC, and nearest-neighbor interpolation for spatial resolution. Implemented [here](https://github.com/sebasmos/satellite.extractor/blob/main/notebooks/satellite_imagery_augmentation.ipynb), with augmentations applied to the RGB channels while leaving the other satellite channels unchanged.
54
 
55
  * **SAT5_dataset_10_best_cities_augmented_v2**: Improved version of the augmented dataset with aligned metadata. Near-black images are removed using a recursive [forward-backward artefact removal algorithm with inter-band data augmentation on satellite imagery](https://github.com/sebasmos/satellite.extractor/tree/main/src/PART_2_satellite-augmentation), and extra augmented data is generated with Albumentations wrapper modules. Data was extracted using recursive artifact removal, cloud removal based on LeastCC, and nearest-neighbor interpolation for spatial resolution. Implemented in this [notebook](https://github.com/sebasmos/satellite.extractor/blob/main/notebooks/PART_2_satellite_imagery_augmentation.ipynb), with augmentations applied to the RGB channels while leaving the other satellite channels unchanged.
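
The Cloud-Cloudless pairing described in the list above can be illustrated with a minimal sketch. This is not the `Cloud2CloudlesDataset` class itself: the function names, the per-municipality folder layout, the `.tiff` extension, and the `_cloud` / `_gt` file suffixes are all assumptions made for illustration. It only shows how pairing each of 164 images in 5 municipalities with one cloudless reference yields 820 pairs (1640 files).

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, Tuple


def build_paired_dataset(source: Path, destination: Path, best_images: Dict[str, str]) -> int:
    """Pair every image with its municipality's cloudless reference.

    `best_images` maps a municipality folder name to the file name of the
    previously chosen cloudless image (hypothetical layout, not the real one).
    Returns the number of pairs written to `destination`.
    """
    destination.mkdir(parents=True, exist_ok=True)
    pairs = 0
    for city_dir in sorted(p for p in source.iterdir() if p.is_dir()):
        reference = city_dir / best_images[city_dir.name]
        for image in sorted(city_dir.glob("*.tiff")):
            stem = f"{city_dir.name}_{image.stem}"
            # Renamed copies mark the original image and its ground-truth partner.
            shutil.copy(image, destination / f"{stem}_cloud.tiff")
            shutil.copy(reference, destination / f"{stem}_gt.tiff")
            pairs += 1
    return pairs


def make_demo_source(root: Path, cities: int = 5, images_per_city: int = 164) -> Dict[str, str]:
    """Create empty placeholder files that mimic the described image counts."""
    best = {}
    for c in range(cities):
        city_dir = root / f"city_{c}"
        city_dir.mkdir(parents=True)
        for i in range(images_per_city):
            (city_dir / f"image_{i:03d}.tiff").touch()
        # Assumption for the demo: the first image is the cloudless reference.
        best[city_dir.name] = "image_000.tiff"
    return best


def run_demo() -> Tuple[int, int]:
    """Build the demo dataset in a temporary folder and count the results."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source"
        destination = Path(tmp) / "DATASET"
        best = make_demo_source(source)
        pairs = build_paired_dataset(source, destination, best)
        files = len(list(destination.iterdir()))
    return pairs, files
```

Calling `run_demo()` returns the pair count and the file count, which mirrors the image-count verification the README attributes to the class's tests.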
56
 
 
57
 
 
 
58
  ## Reading data
59
 
60
  The data can be read as (example):
 
95
 
96
  Please cite our work if you find the resources in this repository useful:
97
 
98
+ Satellite extractor, [Source code] GitHub. https://github.com/sebasmos/satellite.extractor
99
+
100
+ ```
101
+ @article{cajasmulti,
102
+ title={A Multi-Modal Satellite Imagery Dataset for Public Health Analysis in Colombia},
103
+ author={Cajas, Sebastian A and Restrepo, David and Moukheiber, Dana and Kuo, Kuan Ting and Wu, Chenwei and Chicangana, David Santiago Garcia and Paddo, Atika Rahman and Moukheiber, Mira and Moukheiber, Lama and Moukheiber, Sulaiman and others}
104
+ }
105
+
106
+ @article{kuo2024denguenet,
107
+ title={DengueNet: Dengue Prediction using Spatiotemporal Satellite Imagery for Resource-Limited Countries},
108
+ author={Kuo, Kuan-Ting and Moukheiber, Dana and Ordonez, Sebastian Cajas and Restrepo, David and Paddo, Atika Rahman and Chen, Tsung-Yu and Moukheiber, Lama and Moukheiber, Mira and Moukheiber, Sulaiman and Purkayastha, Saptarshi and others},
109
+ journal={arXiv preprint arXiv:2401.11114},
110
+ year={2024}
111
+ }
112
+ ```