Open SAR Toolkit - Tutorial 2, version 1.1, July 2020. Andreas Vollrath, ESA/ESRIN phi-lab


OST Tutorial II

How to access and download Sentinel-1 data with OST


Short description

This notebook introduces you to OST’s main class Generic, and its subclass Sentinel1. The Generic class handles the basic structure of any OST batch processing project, while the Sentinel1 class provides methods to search, refine and download sets of acquisitions for the EU Copernicus Sentinel-1 mission.

This notebook is of interest for users who only want to search and download Sentinel-1 data in an efficient way.

  • I: Get to know the Generic main class for setting up a OST Project

  • II: Get to know the Sentinel1 subclass, which features functions for data search and access


Requirements

NOTE: all cells whose number is followed by an * can be executed without changing any code.

0* - Install OST and dependencies

NOTE: This applies only if you have not yet fully installed OST and its dependencies, e.g. on Google Colab. In that case, uncomment the lines below.

[ ]:
# !apt-get -y install wget
# !wget https://raw.githubusercontent.com/ESA-PhiLab/OST_Notebooks/master/install_ost.sh
# !bash install_ost.sh

I-1* - Import python libraries necessary for processing

[ ]:
# these imports are needed to handle folders independently of the OS
from pathlib import Path
from pprint import pprint

# the Generic class handles the basic structure of an OST batch processing project
from ost import Generic

I-2 - Data selection parameters

In order to set up your project, you need to define 3 main attributes.

1 Area of Interest:

The Area of Interest can be defined in different ways:

  1. One possibility is to use the low resolution layer of country boundaries from geopandas. To select a specific country you need to specify its ISO3 code. You can find a collection of all ISO3 codes here.

  2. Another possibility is to provide a Well-Known Text formatted string, which is the format OST uses internally.

  3. A third possibility is to provide a path to a valid vector file supported by OGR (e.g. GeoJSON, GeoPackage, KML, Esri Shapefile). Try to keep that as simple as possible. If your layer contains lots of different entries (e.g. crop fields), create a convex hull beforehand and use this.
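For illustration, a WKT string (option 2) can be built from a simple bounding box with shapely, which OST depends on. The coordinates below are purely illustrative, roughly covering Austria:

```python
from shapely.geometry import box

# build a rectangular AOI from (minx, miny, maxx, maxy) in lon/lat (WGS84);
# the coordinates are illustrative, roughly covering Austria
aoi_wkt = box(9.5, 46.3, 17.2, 49.0).wkt
print(aoi_wkt)
```

The resulting string can be passed directly as the aoi parameter instead of an ISO3 code or a file path.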

2 Time of Interest:

The time of interest is defined by a start and an end date, each given as a string in the format ‘YYYY-MM-DD’. If a parameter is not defined, a default value is used: 2014-10-01 for the start and today for the end of the TOI.
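As a small sketch, such date strings can also be derived programmatically, e.g. a TOI covering the last 30 days:

```python
from datetime import date, timedelta

# illustrative: build a TOI covering the last 30 days,
# in the 'YYYY-MM-DD' format OST expects
end = date.today().strftime("%Y-%m-%d")
start = (date.today() - timedelta(days=30)).strftime("%Y-%m-%d")
print(start, end)
```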

3 Project directory

Here we set a high-level directory where all of the project-related data (i.e. inventory, download, processed files) will be stored or created.

[ ]:
# ----------------------------
# Area of interest
# ----------------------------

# Here we can either point to a shapefile, an ISO3 country code, or a WKT string
aoi = "AUT"  # AUT is the ISO3 country code of Austria

# ----------------------------
# Time of interest
# ----------------------------
# set the start and end date of the time of interest
start = "2019-06-01"
end = "2019-08-31"

# ----------------------------
# Project folder
# ----------------------------

# get home folder
home = Path.home()

# create a processing directory
project_dir = home / "OST_Tutorials" / "Tutorial_2"

# ------------------------------
# Print out AOI and start date
# ------------------------------
print("AOI: ", aoi)
print("TOI start: ", start)
print("TOI end: ", end)
print("Project Directory: ", project_dir)

I-3* - Initialize the Generic class

The variables defined above are used to initialize the class with its main attributes.

[ ]:
# create an OST Generic class instance
ost_generic = Generic(project_dir=project_dir, aoi=aoi, start=start, end=end)

# list the folders inside the project directory (UNIX only)
print("")
print("We use the linux ls command for listing the directories inside our project folder:")
!ls {project_dir}

I-4* Customise project parameters

The initialisation of the class creates a config file in which all project attributes are stored, for example the location of the download or the processing folder. Those can be customised as shown below. Also note that, independent of the input format, the AOI will be stored as a Well-Known Text string. The possible input formats for the AOI definition will be covered in later tutorials.

[ ]:
# Default config as created by the class initialisation
print(" Before customisation")
print("---------------------------------------------------------------------")
pprint(ost_generic.config_dict)
print("---------------------------------------------------------------------")

# customisation
ost_generic.config_dict["download_dir"] = "/download"
ost_generic.config_dict["temp_dir"] = "/tmp"

print("")
print(" After customisation (note the change in download_dir and temp_dir)")
print("---------------------------------------------------------------------")
pprint(ost_generic.config_dict)

II-1* - The Sentinel1 class

The Sentinel1 class, as a subclass of the Generic class, inherits all attributes and methods from the Generic class and adds specific new ones for searching and downloading data.

[ ]:
# the import of the Sentinel1 class
from ost import Sentinel1

II-2* Initialize the Sentinel1 class

In addition to the AOI, TOI and project directory parameters needed for the initialization of the Generic class, three more Sentinel-1-specific attributes can be defined:

  1. product_type: this can be either RAW, SLC, GRD or OCN (default is ‘*’ for all)

  2. beam_mode: this can be either IW, SM, EW or WV (default is ‘*’ for all)

  3. polarisation: this can be either VV, VH, HV, HH, or a combination such as VV,VH or HH,HV (default is ‘*’ for all)

Have a look at https://sentinel.esa.int/web/sentinel/user-guides/sentinel-1-sar/acquisition-modes for further information on Sentinel-1 acquisition modes and https://sentinel.esa.int/web/sentinel/missions/sentinel-1/observation-scenario for information on the global observation scenario.

[ ]:
# initialize the Sentinel1 class
ost_s1 = Sentinel1(
    project_dir=project_dir,
    aoi=aoi,
    start=start,
    end=end,
    product_type="SLC",
    beam_mode="IW",
    polarisation="*",
)

II-3* Searching for data

The search method of our Sentinel1 class instance will trigger a search query on the scihub catalogue and return the results in 2 ways:

  • writing them into a shapefile (inside your inventory directory).

  • storing them as an instance attribute in the form of a GeoPandas GeoDataFrame that can be accessed via ost_s1.inventory

You will need a valid scihub account to do this step. In case you do not have a scihub account yet, please go here to register.

IMPORTANT: By default, OST queries the Copernicus Apihub (i.e. a different server than the one you access via your web browser), to which user credentials are transferred only one week after registration on the standard open scihub (more info here). If this is an issue, use the commented line with the specified base_url and comment out the standard search command.

So you may need to wait a couple of days after first registration before it works.

[ ]:
# ---------------------------------------------------
# for plotting purposes we use this iPython magic
%matplotlib inline
%pylab inline
pylab.rcParams["figure.figsize"] = (13, 13)
# ---------------------------------------------------

# search command
ost_s1.search()

# uncomment in case you have issues with the registration procedure
# ost_s1.search(base_url='https://scihub.copernicus.eu/dhus')

# we plot the full Inventory on a map
ost_s1.plot_inventory(transparency=0.1)

II-4* The inventory attribute

The results of the search are stored in the inventory attribute of the class instance ost_s1. This is actually a GeoPandas GeoDataFrame that stores all the available metadata from the scihub catalogue. Therefore, all (geo)pandas functionality can be applied for filtering, plotting and selection.
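As a sketch of what this enables, standard pandas boolean indexing can be used to subset the inventory. The tiny DataFrame below is only a stand-in for ost_s1.inventory, and the column names are assumptions based on the scihub metadata (the real inventory columns may differ slightly):

```python
import pandas as pd

# stand-in for ost_s1.inventory; column names are assumed, values are made up
demo = pd.DataFrame(
    {
        "identifier": ["scene_1", "scene_2", "scene_3"],
        "orbitdirection": ["ASCENDING", "DESCENDING", "ASCENDING"],
        "polarisationmode": ["VV VH", "VV VH", "VV"],
    }
)

# the same boolean indexing works on the real inventory GeoDataFrame
ascending = demo[demo.orbitdirection == "ASCENDING"]
print(len(ascending))  # 2
```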

[ ]:
print(
    "-----------------------------------------------------------------------------------------------------------"
)
print(" INFO: We found a total of {} products for our project definition".format(len(ost_s1.inventory)))
print(
    "-----------------------------------------------------------------------------------------------------------"
)
print("")
# combine the OST class attribute with pandas commands to print the inventory columns and its last 5 rows
print(
    "-----------------------------------------------------------------------------------------------------------"
)
print("The columns of our inventory:")
print("")
print(ost_s1.inventory.columns)
print(
    "-----------------------------------------------------------------------------------------------------------"
)

print("")
print(
    "-----------------------------------------------------------------------------------------------------------"
)
print(" The last 5 rows of our inventory:")
print(ost_s1.inventory.tail(5))

II-5* Search Refinement

The results returned by the search algorithm on the Copernicus scihub might not fully match what we are looking for. In this step we refine the results, addressing possible issues and reducing later processing needs.

A first step splits the data by orbit direction (i.e. ascending and descending) and polarisation mode (i.e. VV, VV/VH, HH, HH/HV). The routine then checks the AOI coverage for each resulting combination (e.g. descending VV/VH polarisation). If a combination does not fully cover the AOI, all further steps are skipped for it. If full coverage is possible, further refinement steps are taken:

  1. Some of the acquisition frames might have been processed and/or stored more than once in the ESA ground segment. They therefore appear twice, with scene identifiers that differ only in the last 4 digits. These duplicates need to be identified in order to avoid redundancy. We keep the ones with the latest ingestion date to ensure the latest processor was used.

  2. Some of the scenes returned by the search query do not actually overlap the AOI. This is because the search algorithm checks for data within the rectangle defined by the outer bounds of the AOI geometry, not the AOI itself. The refinement keeps only the frames overlapping the AOI in order to reduce unnecessary processing later on.

  3. In the case of ascending tracks that cross the equator, the orbit number of the frames will increase by 1 even though they are practically from the same acquisition. During processing the frames need to be merged and the relative orbit numbers (i.e. tracks) should be the same. The metadata in the inventory is therefore updated in order to normalize the relative orbit number for the project.

  4. (optional) The tracks of Sentinel-1 overlap to a certain degree. The data inventory might return tracks that only marginally cross the AOI, while their portion of the AOI is already covered by the adjacent track. Tracks that do not contribute to the overall coverage of the AOI are therefore disregarded.

  5. (optional) Some acquisitions might not cross the entire AOI. For the subsequent time-series/timescan processing this becomes problematic, since the generation of the time-series will only consider the overlapping region for all acquisitions per track.

  6. A similar issue appears when one track crosses the AOI twice: some frames in the middle of the track do not overlap the AOI and have already been disregarded in step 2. Assembling the remaining non-consecutive frames during processing would fail. The metadata in the inventory is therefore updated, renaming the first part of the relative orbit number to XXX.1, the second part to XXX.2, and so on. During processing those acquisitions are handled as 2 different tracks and only merged during the final mosaicking.

  7. (optional) A last step ensures that a mosaic in time consisting of different tracks is covered only once by each track.
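Refinement step 1, for example, boils down to a pandas-style deduplication. A minimal sketch with made-up data and column names (OST's internal implementation may differ):

```python
import pandas as pd

# two copies of frame "A" with different ingestion dates;
# column names and values are illustrative only
frames = pd.DataFrame(
    {
        "frame_id": ["A", "A", "B"],
        "ingestiondate": pd.to_datetime(["2019-06-02", "2019-07-01", "2019-06-05"]),
    }
)

# keep only the most recently ingested copy of each frame
latest = frames.sort_values("ingestiondate").drop_duplicates("frame_id", keep="last")
print(len(latest))  # 2
```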

[ ]:
ost_s1.refine_inventory()

II-6* - Selecting the right data

The results of the refinement are stored in a new attribute called refined_inventory_dict. This is a Python dictionary whose keys identify the mosaic combinations (e.g. ASCENDING_VVVH) and whose values are the corresponding refined GeoDataFrames.

[ ]:
pylab.rcParams["figure.figsize"] = (19, 19)

key = "ASCENDING_VVVH"
ost_s1.refined_inventory_dict[key]
ost_s1.plot_inventory(ost_s1.refined_inventory_dict[key], 0.1)

II-7* Downloading the data

Now that we have a refined selection of the scenes we want to download, we can choose between different data mirrors. By executing the following cell, OST will ask you from which data portal you want to download.

ESA’s Scihub catalogue

The main entry point is the official scihub catalogue from ESA. It is, however, limited to 2 concurrent downloads at the same time. Also note that it is a rolling archive, so for historical data a special procedure has to be applied (see the Tips and Tricks notebook).

Alternative I - Alaska Satellite Facility:

A good alternative is the download mirror from the Alaska Satellite Facility, which provides the full archive of Sentinel-1 data. To get registered, go to their data portal. If you already have a NASA Earthdata account, make sure you have signed the specific EULA needed to access the Copernicus data. A good practice is to try a download directly from the vertex data portal to make sure everything works.

Alternative II - PEPS server from CNES:

Another good alternative is the Peps server from the French Space Agency CNES. While it is also a rolling archive, copies of historic data are stored on tape and can easily be transferred to the online storage. OST takes care of that automatically. You can register for an account here.

Alternative III - ONDA DIAS by Serco:

Another good alternative is the free data access portal from the ONDA DIAS. This is especially well suited for SLC data, for which it holds the full archive. GRD data is available via a rolling archive. You can register for an account here.

NOTE: While scihub has a limit of 2 concurrent downloads, ASF, PEPS and ONDA do not have such strict limits. For ASF the limit is 10, which we can set with the concurrent keyword.

[ ]:
ost_s1.download(ost_s1.refined_inventory_dict[key], concurrent=10)