First Global and Dense Open Embedding Dataset of Earth!
Introducing the Major TOM embeddings dataset, created in collaboration with CloudFerro S.A. and Φ-lab at the European Space Agency (ESA). Together with @mikonvergence and Jędrzej S. Bojanowski, we present the first open-access dataset of Copernicus embeddings, offering dense, global coverage across the full acquisition areas of Sentinel-1 and Sentinel-2 sensors.
Highlights:
Data: Over 8 million Sentinel-1 & Sentinel-2 images processed, distilling insights from 9.368 trillion pixels of raw data.
Models: Foundation models include SigLIP, DINOv2, and SSL4EO.
Scale: 62 TB of raw satellite data processed into 170M+ embeddings.
This project delivers open and free vectorized expansions of Major-TOM/README datasets, setting a new standard for embedding releases and enabling lightweight, scalable ingestion of Earth Observation (EO) data for countless applications.
Explore the datasets:
Major-TOM/Core-S2L1C-SSL4EO
Major-TOM/Core-S1RTC-SSL4EO
Major-TOM/Core-S2RGB-DINOv2
Major-TOM/Core-S2RGB-SigLIP
Check the paper: Global and Dense Embeddings of Earth: Major TOM Floating in the Latent Space (2412.05600)
Code notebook: https://github.com/ESA-PhiLab/Major-TOM/blob/main/05-Generate-Major-TOM-Embeddings.ipynb
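
To give a feel for what lightweight use of precomputed embeddings can look like, here is a minimal sketch of a cosine-similarity lookup. It is not taken from the Major TOM code or notebook. The patch ids and the 3-dimensional vectors are made-up stand-ins, and the post does not describe the datasets' column layout, so loading real embeddings is left out entirely.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    embeddings: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (key, score) pairs most similar to `query`, best first."""
    scored = ((key, cosine_similarity(query, vec)) for key, vec in embeddings.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy stand-in for real embeddings: hypothetical patch ids mapped to 3-d vectors.
toy_embeddings = {
    "patch_a": [1.0, 0.0, 0.0],
    "patch_b": [0.9, 0.1, 0.0],
    "patch_c": [0.0, 1.0, 0.0],
    "patch_d": [-1.0, 0.0, 0.0],
}

results = top_k_similar([1.0, 0.0, 0.0], toy_embeddings, k=2)
print(results)
```

With real data, the query vector would come from embedding a reference image with the same model that produced the stored vectors. At the scale mentioned in the post, a vectorized or approximate nearest-neighbour index would be a more realistic choice than this pure-Python loop.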