Knowledge Base

AGU 2018 Fall Meeting: NSIDC DAAC Hands-on Tutorial

Programmatic Access to Data with Filtering and Customization Services


Introduction


This tutorial will walk you through the steps needed to successfully access data and customization services programmatically from the National Snow and Ice Data Center Distributed Active Archive Center (NSIDC DAAC) using our Application Programming Interface, or API. By the end of this 45-minute tutorial, you will have successfully determined the number of files available for a given data set and study area, requested an Earthdata Login token, and received subsetted and reformatted data using spatial and temporal filters, all using the curl command line tool.

Why use this service?

This API enables you to enter spatial and temporal filters as well as customization options into a single access command, without the need to script against our data directory structure. While the same filtering and customization services are available through NASA Earthdata Search, you may want to access data in a more programmatic way outside of a web interface, especially if you frequently download NSIDC DAAC data and are interested in incorporating data access commands into your analysis or visualization code.

Before we begin, please visit the Prerequisites and System Check page to make sure you have an Earthdata Login account, as well as an updated version of curl and TLS.


Lesson 1: Query Data Set Information

Lesson 2: Download ICESat/GLAS data

Lesson 3: Download more than 10 IceBridge files

Extra Credit

Additional Resources

Feedback form
 

Lesson 1: Query Data Set Information


The Common Metadata Repository (CMR) is a high-performance metadata system that provides search capabilities for data at NSIDC. During this lesson, we will perform an example search query of an ICESat/GLAS data set utilizing the CMR search API, which provides a wealth of metadata information that can be requested programmatically.

What is an API? API stands for Application Programming Interface. You can think of it as a middleman between an application or end user (in this case, us) and a data provider (in this case, the Common Metadata Repository and NSIDC). These APIs are essentially structured as a URL: a base followed by individual key-value pairs (KVPs) separated by ‘&’. You can see an example of how the same API we're using in this lesson is utilized in NASA Earthdata Search to display data granules (files) filtered by space and time. We’ll also describe the syntax of NSIDC’s API structure in more detail in the next lesson.
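As a quick sketch of this base-plus-KVPs structure, the query URL can be assembled from its parts in the shell (the values here are the ones used later in this lesson):

```shell
# A CMR query URL is a base plus key-value pairs (KVPs) joined by '&'.
base="https://cmr.earthdata.nasa.gov/search/granules?"

# Each KVP is key=value; additional KVPs are appended with '&'.
kvps="short_name=GLAH12&version=034"

url="${base}${kvps}"
echo "$url"
```

Building the URL in a variable like this also makes it easy to reuse the same query with different KVPs in a script.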


Step 1: Open a blank text document (e.g. TextEdit on Mac or Notepad on Windows).

Set this to plain text if possible.


Step 2: Open a command line window. 

Mac OS X: Open a Terminal window, which is a UNIX-based command line application. 

Windows: Open a Command Prompt window. On Windows 10 with an Ubuntu BASH shell enabled, you can also open Ubuntu and follow the Mac/UNIX commands in the following lessons.

Linux: You can also follow the Mac/UNIX commands if you are on a Linux machine.


Step 3. Find the number of ICESat/GLAS files available within a time range and location.

The base URL of the CMR API granule (file) search is:

https://cmr.earthdata.nasa.gov/search/granules?

We will append the following Key-Value-Pairs (KVPs) to the base URL, separated by ‘&’:

Key: short_name
Value: GLAH12
Description: GLAS/ICESat L2 Antarctic and Greenland Ice Sheet Altimetry Data (HDF5)

Key: version
Value: 034
Description: Specifies the data set version number. If not specified, all available versions will be returned. This can be either a 3-digit or 1-digit number.

Key: bounding_box
Value: -50.33333,68.56667,-49.33333,69.56667
Description: Specifies a spatial filter in decimal degrees, in WEST,SOUTH,EAST,NORTH format.

Key: temporal
Value: 2009-01-01T00:00:00,2009-12-31T23:59:59
Description: Specifies a temporal range filter in START_TIME,END_TIME format.


Using curl and either grep or findstr, depending on your operating system, you can retrieve the number of granules returned by CMR.

Copy the entire command in gray into your command line window and hit return:

Mac OS X/UNIX:

curl "https://cmr.earthdata.nasa.gov/search/granules?short_name=GLAH12&version=034&bounding_box=-50.33333,68.56667,-49.33333,69.56667&temporal=2009-01-01T00:00:00,2009-12-31T23:59:59&pretty=true" | grep "hits"

Windows:

curl "https://cmr.earthdata.nasa.gov/search/granules?short_name=GLAH12&version=034&bounding_box=-50.33333,68.56667,-49.33333,69.56667&temporal=2009-01-01T00:00:00,2009-12-31T23:59:59&pretty=true" | findstr "hits"


In this case, CMR returned 6 GLAH12 Version 034 hits, or granules, over my area and time of interest, as displayed in my command line window:

<hits>6</hits>


The default page size is 10, meaning that only up to 10 granules (files) will be returned by CMR unless the page_size KVP is included, which we will demonstrate in Lesson 3.
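If you want the hit count as a number you can use in a script rather than reading it off the screen, the `<hits>` value can be extracted from a saved response. This is a sketch; `response.xml` is an assumed filename (add `-o response.xml` to the curl command to create it), and the file written below is just a stand-in for a real CMR response:

```shell
# Stand-in for a real CMR response; normally created with: curl ... -o response.xml
printf '<hits>6</hits>\n' > response.xml

# Pull the numeric hit count out of the <hits> tag.
hits=$(sed -n 's/.*<hits>\([0-9]*\)<\/hits>.*/\1/p' response.xml)
echo "$hits"
```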


 

Lesson 2: Download ICESat/GLAS data


During this lesson, we will generate a token needed in order to access data using your Earthdata Login credentials, and we will apply that token to the API in order to download subsetted ICESat/GLAS data.

Step 1. Compile information required for Earthdata Login token generation.

You will need the following pieces of information:

a. Earthdata Login username and password

b. Your IP address: Go to WhatIsMyIp.Host and record your IPv4 address in your blank document.


Step 2. Request an Earthdata Login token.

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -X POST --header "Content-Type: application/xml" -d "<token><username>EARTHDATA_LOGIN_USER_NAME</username><password>EARTHDATA_LOGIN_PASSWORD</password><client_id>NSIDC_client_id</client_id><user_ip_address>YOUR_IP_ADDRESS</user_ip_address></token>" https://cmr.earthdata.nasa.gov/legacy-services/rest/tokens

b. Replace all items in CAPS (EARTHDATA_LOGIN_USER_NAME, EARTHDATA_LOGIN_PASSWORD, and YOUR_IP_ADDRESS) with your personal credentials that you collected in Step 1.

c. Copy the command into your command line window and hit return. This command can be used across all operating systems.

Alternatively, you can copy the entire command in gray above directly into your command line window. Use your cursor to move to each item in CAPS and replace with your personal credentials within your command line, and then hit return.

You should receive the token in the response; it is valid for 30 days, even if you log out and log back in to Earthdata in a browser. The value in the <id> tag is the token you will use in the rest of this tutorial.

d. Copy the token (<id> string) into your Text Editor page to be used in the API examples below. If you are having technical difficulties creating a token, you can copy the following token into your Text Editor page, which was created for use during this tutorial.

<token>
<id>93EA0855-20FB-8A6B-4249-CB5EE2ED0D32</id>
<username>earthdata_login_user_name</username>
<client_id>NSIDC_client_id</client_id>
<user_ip_address>your_origin_ip_address</user_ip_address>
</token>
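If you would rather not copy the token by hand, you can capture the <id> value into a shell variable with sed. This is a sketch; `token.xml` is an assumed filename (add `-o token.xml` to the curl command in step c to create it), and the file written below is only a stand-in for a real response:

```shell
# Stand-in for a real token response; normally created with: curl ... -o token.xml
cat > token.xml <<'EOF'
<token>
<id>93EA0855-20FB-8A6B-4249-CB5EE2ED0D32</id>
<username>earthdata_login_user_name</username>
</token>
EOF

# Extract the value between <id> and </id>.
token=$(sed -n 's/.*<id>\(.*\)<\/id>.*/\1/p' token.xml)
echo "$token"
```

The `$token` variable can then be appended to later requests with `&token=$token`.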


Step 3. Download ICESat/GLAS data.

We will now download subsetted and reformatted GLAH12 data using the same spatial and temporal filters used in Lesson 1. The granules we queried above will be subsetted and downloaded to the directory of your choosing as a single zip file.

The base URL of the NSIDC DAAC API is:

https://n5eil02u.ecs.nsidc.org/egi/

We will append /request? to the URL, along with a series of Key-Value-Pairs (KVPs):

https://n5eil02u.ecs.nsidc.org/egi/request?CMR_kvp=CMR_kvp&service_kvp=service_kvp&token=token

We will use the following KVPs to request GLAH12 data. We are using the same CMR spatial and temporal filters that we used in Lesson 1, along with subsetting (cropping) and reformatting service values. In addition, we need to append the token that we created in Step 2 of this lesson:

Key: short_name
Value: GLAH12
Description: GLAS/ICESat L2 Antarctic and Greenland Ice Sheet Altimetry Data (HDF5)

Key: version
Value: 034
Description: Specifies the data set version number. If not specified, all available versions will be returned. This can be either a 3-digit or 1-digit number.

Key: temporal
Value: 2009-01-01T00:00:00,2009-12-31T23:59:59
Description: Temporal filter CMR parameter, specified in START_TIME,END_TIME format.

Key: bounding_box
Value: -50.33333,68.56667,-49.33333,69.56667
Description: Spatial filter CMR parameter, specified in decimal degrees of latitude and longitude in WEST,SOUTH,EAST,NORTH format.

Key: bbox
Value: -50.33333,68.56667,-49.33333,69.56667
Description: Spatial subset service parameter, specified in decimal degrees of latitude and longitude in WEST,SOUTH,EAST,NORTH format.

Key: format
Value: TABULAR_ASCII
Description: Reformatting service parameter.

Key: token
Value: 93EA0855-20FB-8A6B-4249-CB5EE2ED0D32
Description: Token used to access data using your Earthdata Login credentials.

(For reference, our full API documentation includes a table of all available service KVPs)

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=GLAH12&version=034&bounding_box=-50.33333,68.56667,-49.33333,69.56667&bbox=-50.33333,68.56667,-49.33333,69.56667&time=2009-01-01T00:00:00,2009-12-31T23:59:59&format=TABULAR_ASCII&token=TOKEN-FROM-STEP-2"

b. Replace TOKEN-FROM-STEP-2 with the token you generated in Step 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return. This command can be used across all operating systems. 

Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

Four GLAH12 granules will be downloaded as a single .zip file to the current working directory, e.g.:

curl: Saved to filename '5000000130000.zip'

d. Open a Finder or File Explorer window to locate the downloaded file, or you can search for the name of the file, e.g. 5000000130000.zip, on your computer. 

e. Unzip the file by double-clicking on the file or using a zip utility. You will see 4 subfolders each containing a single ASCII file. Open a file using any text or CSV reader to see the data returned by the API.

In the curl command, we used the --dump-header option to return an HTTP response status code along with file name and/or debugging info in the response-header.txt file. The -O option will save the result in a local file and the -J option will name the saved file according to the returned header. See the Resources section below for additional curl option documentation. 
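Once a request finishes, the dumped header file is a convenient place to check both the HTTP status and the name of the file that was saved. This is a sketch; the header contents written below are a stand-in for a real response-header.txt produced by --dump-header:

```shell
# Stand-in for a real dumped header; normally produced by: curl --dump-header response-header.txt ...
cat > response-header.txt <<'EOF'
HTTP/1.1 200 OK
Content-Type: application/zip
Content-Disposition: attachment; filename="5000000130000.zip"
EOF

# The first line carries the HTTP status code.
status=$(head -n 1 response-header.txt)

# The Content-Disposition line carries the saved filename (what -J uses).
filename=$(sed -n 's/.*filename="\(.*\)".*/\1/p' response-header.txt)

echo "$status"
echo "$filename"
```

A status other than 200 (for example 401 for an invalid token) means the request failed and the saved file will contain an error message rather than data.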


 

Lesson 3: Download more than 10 IceBridge files


Our last lesson will explore a request example that returns more than 10 granules (files). Because the default number of granules returned is set to 10, we will need to use an additional Key-Value-Pair (KVP) to designate the page size. Customization services are currently not available for Operation IceBridge data, therefore we also need to specify a no-processing KVP:

Key: page_size
Value: 14
Description: Designates the number of files returned per page. The default is 10.

Key: agent
Value: NO
Description: Specifies that no customization service processing will be performed.


Step 1. Download 14 IceBridge files. 

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=ILATM1B&version=2&time=2017-05-01T00:00:00,2017-05-01T23:59:59&bounding_box=-50.33333,68.56667,-49.33333,69.56667&agent=NO&page_size=14&token=TOKEN-FROM-LESSON-2"

b. Replace TOKEN-FROM-LESSON-2 with your token from Lesson 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return. This command can be used across all operating systems. 

Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

Fourteen ILATM1B granules will be downloaded as a single .zip file to the current working directory, e.g.:

curl: Saved to filename '5000000130001.zip'

d. Open a Finder or File Explorer window to locate the downloaded file, or you can search for the name of the file, e.g. 5000000130001.zip, on your computer. 

e. Unzip the file by double-clicking on the file or using a zip utility. You will see 14 subfolders each containing a single HDF5 file and an associated metadata file. Open a file using an HDF5 reader (e.g. HDFView or Panoply) to see the data returned by the API.

Note that the maximum number of granules per request is set to 100. You may set this to the maximum value, but you run the risk of a connection timeout with large requests. 
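When a query returns more granules than fit in one request, you can work out how many pages you will need with a ceiling division of the hit count by the page size:

```shell
# How many page_num requests does a result set need?
# Example: 22 hits with the default page_size of 10 -> 3 pages.
hits=22
page_size=10

# Ceiling division using integer arithmetic.
pages=$(( (hits + page_size - 1) / page_size ))
echo "$pages"
```

Lesson 3 avoided this by raising page_size above the hit count; the EC5 example below instead loops over page_num.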

Extra Credit


EC1: Download ICESat-2 Global Geolocated Photon Data Files.

This example will return spatial- and time- subsetted ICESat-2 Global Geolocated Photon Data (ATL03) data over the Jakobshavn Glacier from October 13, 2018 through November 13, 2018 reformatted to NetCDF-4:

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=ATL03&version=001&time=2018-10-13T00:00:00,2018-11-13T23:59:59&bounding_box=-50.33333,68.56667,-49.33333,69.56667&bbox=-50.33333,68.56667,-49.33333,69.56667&format=NetCDF4-CF&email=yes&token=TOKEN-FROM-LESSON-2"

b. Replace TOKEN-FROM-LESSON-2 with your credentials from Lesson 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return. This command can be used across all operating systems. 

Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

EC2: Download ICESat-2 Land Ice Height Files over Inputted Polygon Region. 

This example will return variable- and spatial-subsetted ATLAS/ICESat-2 L3A Land Ice Height, Version 1 (ATL06) data over the Barnes Ice Cap using an uploaded shapefile of the ice cap boundary. The Barnes Ice Cap South Dome N Slope shapefile is available through the NSIDC Global Land Ice Measurements from Space (GLIMS) database.

Note that the curl -F option is needed to POST a shapefile to the API. While this shapefile will be used to subset (crop) the data to the glacier outline, we also need to specify a polygon or bounding box to initially filter the data. For simplicity, we will use coarse polygon coordinates to capture the boundary of our area of interest; however, this may lead to filtered granules without data contained within the uploaded shapefile region. In this case, a “No data found that matched subset constraints” error will be returned for some granules, indicating that the requested height data for those granules does not exist within the bounds of the glacier. For best results, the polygon coordinates should be equal or near-equal to the coordinates of the uploaded shapefile.

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J -F "shapefile=@FILE-NAME-OF-SHAPEFILE" "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=ATL06&version=209&polygon=-74.886,69.408,-71.681,69.408,-71.681,70.62,-74.886,70.62,-74.886,69.408&coverage=/gt1l/land_ice_segments/h_li,/gt1l/land_ice_segments/latitude,/gt1l/land_ice_segments/longitude,/gt1r/land_ice_segments/h_li,/gt1r/land_ice_segments/latitude,/gt1r/land_ice_segments/longitude,/gt2l/land_ice_segments/h_li,/gt2l/land_ice_segments/latitude,/gt2l/land_ice_segments/longitude,/gt2r/land_ice_segments/h_li,/gt2r/land_ice_segments/latitude,/gt2r/land_ice_segments/longitude,/gt3l/land_ice_segments/h_li,/gt3l/land_ice_segments/latitude,/gt3l/land_ice_segments/longitude,/gt3r/land_ice_segments/h_li,/gt3r/land_ice_segments/latitude,/gt3r/land_ice_segments/longitude,/quality_assessment&email=yes&token=TOKEN-FROM-LESSON-2"

b. Replace FILE-NAME-OF-SHAPEFILE with the name of your shapefile (in .zip format, including the .shp, .dbf, and .shx files) and TOKEN-FROM-LESSON-2 with your token from Lesson 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return, making sure that you are running the command within the same directory as your shapefile. 
Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

EC3: Find the customization services available for ICESat/GLAS files.


We can use the NSIDC DAAC's access and service API to perform a query of the customization services available for GLAH12 version 034:

https://n5eil02u.ecs.nsidc.org/egi/capabilities/GLAH12.034.xml

Using curl and either grep or findstr, depending on your operating system, we will request the reformatting and subsetting options available for this data set. 

Copy the entire command in gray into your command line window and hit return:

Mac OS X/UNIX:

curl "https://n5eil02u.ecs.nsidc.org/egi/capabilities/GLAH12.034.xml" | grep "Format value\|SubsetAgent"

Windows:

curl "https://n5eil02u.ecs.nsidc.org/egi/capabilities/GLAH12.034.xml" | findstr /C:"Format value" /C:"SubsetAgent"


The available subsetting and reformatting options are displayed in the command response:

<SubsetAgent id="GLAS" spatialSubsetting="true" temporalSubsetting="true" type="both" maxGransSyncRequest="100" maxGransAsyncRequest="2000">
<Format value="" label="No Reformatting" />
<Format value="TABULAR_ASCII" label="Tabular ASCII" />
</SubsetAgent>

Additionally, "SubsetVariable" can be grepped to determine all variables contained within each HDF5, which can also be subsetted individually. 
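If you save the capabilities XML to a file, the available format labels can also be pulled out with sed rather than read from the raw XML. This is a sketch; `capabilities.xml` is an assumed filename (add `-o capabilities.xml` to the curl command to create it), and the snippet written below is a stand-in for the real document:

```shell
# Stand-in for a saved capabilities document; normally created with:
#   curl "https://n5eil02u.ecs.nsidc.org/egi/capabilities/GLAH12.034.xml" -o capabilities.xml
cat > capabilities.xml <<'EOF'
<SubsetAgent id="GLAS" type="both">
<Format value="" label="No Reformatting" />
<Format value="TABULAR_ASCII" label="Tabular ASCII" />
</SubsetAgent>
EOF

# List the human-readable label of each available output format.
formats=$(sed -n 's/.*label="\([^"]*\)".*/\1/p' capabilities.xml)
echo "$formats"
```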

EC4: ICESat/GLAS and IceBridge Query
 

This example will combine an ICESat/GLAS and an Operation IceBridge data request in a single API command. We will request both GLAH12 and ILVIS2, using the agent=NO KVP as we did in Lesson 3 to specify that the data should be returned natively, without any subsetting or reformatting customizations.

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=ILVIS2&short_name=GLAH12&time=2009-10-10T06:50:53,2009-10-20T19:02:26&agent=NO&token=TOKEN-FROM-LESSON-2"

b. Replace TOKEN-FROM-LESSON-2 with your token from Lesson 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return. This command can be used across all operating systems. 

Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

Eight ILVIS2 and two GLAH12 granules are returned, totaling 10 granules delivered as a single zip file to the current working directory, e.g.:

curl: Saved to filename '5000000130002.zip'

d. Open a Finder or File Explorer window to locate the downloaded file, or you can search for the name of the file, e.g. 5000000130002.zip, on your computer. 

e. Unzip the file by double-clicking on the file or using a zip utility. You will see 10 subfolders each containing a single ILVIS2 TXT or GLAH12 HDF5 file with its corresponding metadata file. Open the files using either an HDF5 reader (e.g. HDFView or Panoply) or text file reader to see the data returned by the API.


EC5: Perform a loop on a large file request 

The following GLAH12 request will return 22 granules:

https://n5eil02u.ecs.nsidc.org/egi/request?short_name=GLAH12&version=034&time=2009-03-01T00:00:00,2009-03-31T23:59:59&bounding_box=-73.75,-72.75,-56.75,-59&bbox=-73.75,-72.75,-56.75,-59&token=TOKEN-FROM-LESSON-2

In Lesson 3, we used the page_size KVP to return more than 10 granules. We can also use the page_num KVP to request successive CMR page results. The following set of bash commands loops through the page numbers until all files are returned from the example above. This can only be performed using Mac OS X/UNIX command line:

i=1
while [ $? -eq 0 ]
do
    curl -O -J -L --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=GLAH12&version=034&time=2009-03-01T00:00:00,2009-03-31T23:59:59&bounding_box=-73.75,-72.75,-56.75,-59&bbox=-73.75,-72.75,-56.75,-59&token=TOKEN-FROM-LESSON-2&page_num=$i"
    i=$((i+1))
    grep -E 'Content-Disposition: attachment; filename=".*zip"' response-header.txt
done
 

Additional Resources


  1. Everything curl Documentation: "Everything curl is an extensive guide to everything there is to know about curl, the project, the command-line tool, the library, how everything started and how it came to be what it is today."
     
  2. CMR API Documentation
     
  3. How-to guide on our access and service API, including a table of all available customization service parameters
     
  4. HDFView download information 
     
  5. Panoply download information 
     
  6. HDF-EOS ICESat/GLAS Python, NCL, Matlab, and IDL code examples
     
  7. NSIDC's ICESat-2 Home Page


You can also contact NSIDC User Services with any questions or feedback on the material covered in this tutorial: nsidc@nsidc.org

Feedback form


Please take a moment to fill out this feedback form. Your responses are greatly appreciated and will help us improve our tutorials in the future.