Knowledge Base

AGU 2018 Fall Meeting: NSIDC DAAC Hands-on Tutorial

Programmatic Access to Data with Filtering and Customization Services


This tutorial was presented during the AGU 2018 Fall meeting and walks you through the steps needed to successfully access data and customization services programmatically from the National Snow and Ice Data Center Distributed Active Archive Center (NSIDC DAAC) using our Application Programming Interface, or API. By the end of this 45-minute tutorial, you will have successfully determined the number of files available for a given data set and study area, requested an Earthdata Login token, and received subsetted and reformatted data using spatial and temporal filters, all using the curl command line tool.

Why use this service?

This API enables you to enter spatial and temporal filters as well as customization options into a single access command, without the need to script against our data directory structure. While the same filtering and customization services are available through NASA Earthdata Search, you may want to access data in a more programmatic way outside of a web interface, especially if you frequently download NSIDC DAAC data and are interested in incorporating data access commands into your analysis or visualization code.

Before you get started


The prerequisites for this tutorial are an Earthdata Login account and a curl installation. If you already have both, skip down to Lesson 1.

1. Earthdata Login registration

An Earthdata Login account is required to access data from the NSIDC DAAC. Please visit the Earthdata Login registration page to register for an account. Keep your username and password handy for the lessons in this tutorial. 

2. curl installation

curl is a command line tool for getting or sending files using URL syntax. It supports a wide range of internet protocols including HTTPS, which is the data distribution protocol supported by the NSIDC DAAC. See the curl Documentation Overview page for more information. To install curl, visit the curl Download page. You may select the curl Download Wizard to be guided through their recommended download method, or you can choose from the list of packages sorted by operating system. If you are unsure whether or not you already have curl installed on your computer, follow these steps:

Mac OS X or Windows

From a Terminal or Command Prompt window (for Mac OS X or Windows, respectively), run the curl command:

curl

You should see help options displayed, indicating that curl is installed:

curl: try 'curl --help' or 'curl --manual' for more information

You can also check your curl version in a Terminal or Command Prompt window:

curl -V

The version information is displayed below:

curl 7.62.0 (x86_64-apple-darwin17.7.0) libcurl/7.62.0 OpenSSL/1.0.2q zlib/1.2.11

The most recent stable version is 7.62.0, but earlier recent versions should also work for this exercise (curl 7.54.0, for example, was tested successfully).

If you do not have curl or would like to upgrade your version, we recommend the following methods:


Mac OS X

Install Homebrew by pasting the following command in a Terminal window and following the script prompts. See the Homebrew documentation for more options required for OS X 10.8 Mountain Lion and below.  

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Once Homebrew is installed, paste the following command in a Terminal window to install curl with openSSL:

brew install curl --with-openssl

You can confirm your curl installation and version using the following commands:

brew search curl

This will display all partial or full matches. You should see a green check next to curl:

curl

To find version information using Homebrew:

brew info curl

This will display your version of curl:

curl: stable 7.62.0 (bottled), HEAD [keg-only]

To upgrade to the latest curl using homebrew:

brew upgrade curl --with-openssl

This should only take a few seconds to upgrade to the latest version.


Windows

Visit the curl Windows Downloads page to download the appropriate package for your Windows configuration. Follow the installation instructions provided in the docs folder of the zip file. Alternatively, if you are a Cygwin user, you can install either the Cygwin 64-bit or 32-bit curl package. 


3. Transport Layer Security (TLS) 

Transport Layer Security (TLS) is the protocol used to establish secure connections to web servers. NASA Earthdata systems are removing support for TLSv1.1 on 6 December 2018; therefore, TLSv1.2 is required to access NSIDC DAAC data. 

On either Mac OS X or Windows, you can test whether you can connect to a web server that supports TLSv1.2 using the following curl command:

curl --max-time 3 -v --silent https://google.com/ --tlsv1.2


You should see a line starting with * SSL connection using TLSv1.2. Alternatively, you can use the following curl command: 

curl --tlsv1.2 --silent https://www.howsmyssl.com/a/check

The end of the output string should include "tls_version":"TLS 1.2".
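If you prefer not to scan the full JSON output by eye, the field can be isolated with grep. This is only a sketch: the sample string below stands in for the live response from the curl command above.

```shell
# sample stands in for the live howsmyssl JSON response; pipe the real
# curl output through the same grep to isolate the field.
sample='{"given_cipher_suites":["TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"],"tls_version":"TLS 1.2","rating":"Probably Okay"}'
printf '%s' "$sample" | grep -o '"tls_version":"[^"]*"'
# -> "tls_version":"TLS 1.2"
```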

Curl relies on the underlying cryptographic library that it is built against for TLS support (e.g., OpenSSL or NSS). It may also be helpful to check that your cryptographic library contains the ciphers that NASA Earthdata supports:

ECDHE-RSA-AES128-GCM-SHA256
ECDHE-RSA-AES256-GCM-SHA384
DHE-RSA-AES128-GCM-SHA256
DHE-RSA-AES256-GCM-SHA384
ECDHE-RSA-AES128-SHA256
ECDHE-RSA-AES256-SHA384
ECDHE-RSA-AES128-SHA
ECDHE-RSA-AES256-SHA


For Mac OS X users, you can run the following command to get a list of ciphers supported in your library, which should include the ciphers supported by NASA Earthdata:

openssl ciphers -v ALL | grep TLSv1.2 | awk '{print $1}' | sort -u

Lesson 1: Query Data Set Information


The Common Metadata Repository (CMR) is a high-performance metadata system that provides search capabilities for data at NSIDC. During this lesson, we will perform an example search query of an ICESat/GLAS data set utilizing the CMR search API, which provides a wealth of metadata information that can be requested programmatically.

What is an API?

API stands for Application Programming Interface. You can think of it as a middle man between an application or end-user (in this case, us) and a data provider (in this case, the Common Metadata Repository and NSIDC). These APIs are essentially structured as a URL with a base plus individual key-value pairs (KVPs) separated by ‘&’. You can see an example of how the same API we're using in this lesson is utilized in NASA Earthdata Search to display data granules (files) filtered by space and time. We’ll also describe the syntax of NSIDC’s API structure in more detail in the next lesson.
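As a quick illustration of that base-plus-KVP structure, the request URL can be assembled from its parts (the values here are simply the ones used later in this lesson):

```shell
# Illustrative only: an API call is a base URL plus key=value pairs joined by '&'.
base="https://cmr.earthdata.nasa.gov/search/granules?"
kvps="short_name=GLAH12&version=034"
url="${base}${kvps}"
echo "$url"
# -> https://cmr.earthdata.nasa.gov/search/granules?short_name=GLAH12&version=034
```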


Step 1: Open a blank text document (e.g. TextEdit on Mac or Notepad on Windows).

Set this to plain text if possible.


Step 2: Open a command line window. 

Mac OS X: Open a Terminal window, which is a UNIX-based command line application. 

Windows: Open a Command Prompt window. On Windows 10 with an Ubuntu BASH shell enabled, you can also open Ubuntu and follow the Mac/UNIX commands in the following lessons.

Linux: You can also follow the Mac/UNIX commands if you are on a Linux machine.


Step 3. Find the number of ICESat/GLAS files available within a time range and location.

The base URL of the CMR API granule (file) search is:

https://cmr.earthdata.nasa.gov/search/granules?

We will append the following Key-Value-Pairs (KVPs) to the base URL, separated by ‘&’:

Key          | Value                                   | Description
short_name   | GLAH12                                  | GLAS/ICESat L2 Antarctic and Greenland Ice Sheet Altimetry Data (HDF5)
version      | 034                                     | Specifies the data set version number. If not specified, all available versions are returned. This can be either a 3-digit or 1-digit number.
bounding_box | -50.33333,68.56667,-49.33333,69.56667   | Specifies a spatial filter in decimal degrees, in WEST,SOUTH,EAST,NORTH format
temporal     | 2009-01-01T00:00:00,2009-12-31T23:59:59 | Specifies a temporal range filter in START_TIME,END_TIME format


Using curl and either grep or findstr, depending on your operating system, you can retrieve the number of granules returned by CMR.

Copy the entire command in gray into your command line window and hit return:

Mac OS X/UNIX:

curl "https://cmr.earthdata.nasa.gov/search/granules?short_name=GLAH12&version=034&bounding_box=-50.33333,68.56667,-49.33333,69.56667&temporal=2009-01-01T00:00:00,2009-12-31T23:59:59&pretty=true" | grep "hits"

Windows:

curl "https://cmr.earthdata.nasa.gov/search/granules?short_name=GLAH12&version=034&bounding_box=-50.33333,68.56667,-49.33333,69.56667&temporal=2009-01-01T00:00:00,2009-12-31T23:59:59&pretty=true" | findstr "hits"


In this case, CMR returned 6 GLAH12 Version 034 hits, or granules, over my area and time of interest, as displayed in my command line window:

<hits>6</hits>

The default page size is 10, meaning that only up to 10 granules (files) will be returned by CMR unless the page_size KVP is included, which we will demonstrate in Lesson 3.
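For reference, the numeric count can also be extracted from the <hits> tag programmatically. The sample line below stands in for the grep output shown above.

```shell
# sample stands in for the <hits> line grepped from the CMR response above.
sample='  <hits>6</hits>'
hits=$(printf '%s' "$sample" | sed -n 's/.*<hits>\([0-9]*\)<\/hits>.*/\1/p')
echo "$hits"
# -> 6
```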

Lesson 2: Download ICESat/GLAS data


During this lesson, we will generate a token, which is needed to access data with your Earthdata Login credentials, and we will apply that token to the API to download subsetted ICESat/GLAS data.

Step 1. Compile information required for Earthdata Login token generation.

You will need the following pieces of information:

a. Earthdata Login username and password

b. Your IP address: Go to WhatIsMyIp.Host and record your IP v4 address in your blank document.


Step 2. Request an Earthdata Login token.

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -X POST --header "Content-Type: application/xml" -d "<token><username>EARTHDATA_LOGIN_USER_NAME</username><password>EARTHDATA_LOGIN_PASSWORD</password><client_id>NSIDC_client_id</client_id><user_ip_address>YOUR_IP_ADDRESS</user_ip_address></token>" https://cmr.earthdata.nasa.gov/legacy-services/rest/tokens

b. Replace all items in CAPS (EARTHDATA_LOGIN_USER_NAME, EARTHDATA_LOGIN_PASSWORD, and YOUR_IP_ADDRESS) with your personal credentials that you collected in Step 1.

c. Copy the command into your command line window and hit return. This command can be used across all operating systems.

Alternatively, you can copy the entire command in gray above directly into your command line window. Use your cursor to move to each item in CAPS and replace with your personal credentials within your command line, and then hit return.

You should receive the token ID in the response; the token is valid for 30 days, even if you log out and back in to Earthdata in a browser. The value in the <id> tag is the token you will use in this lesson and beyond.

d. Copy the token (<id> string) into your Text Editor page to be used in the API examples below. If you are having technical difficulties creating a token, you can copy the following token into your Text Editor page, which was created for use during this tutorial.

<token>
<id>93EA0855-20FB-8A6B-4249-CB5EE2ED0D32</id>
<username>earthdata_login_user_name</username>
<client_id>NSIDC_client_id</client_id>
<user_ip_address>your_origin_ip_address</user_ip_address>
</token>
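If you would rather pull the <id> value out of the response programmatically than copy it by hand, a sed one-liner works. The variable below stands in for the XML returned by the token request; the sample uses the tutorial token shown above.

```shell
# response stands in for the XML returned by the token request above.
response='<token><id>93EA0855-20FB-8A6B-4249-CB5EE2ED0D32</id><username>earthdata_login_user_name</username></token>'
token=$(printf '%s' "$response" | sed -n 's/.*<id>\([^<]*\)<\/id>.*/\1/p')
echo "$token"
# -> 93EA0855-20FB-8A6B-4249-CB5EE2ED0D32
```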


Step 3. Download ICESat/GLAS data.

We will now download subsetted and reformatted GLAH12 data using the same spatial and temporal filters used in Lesson 1. The granules we queried above will be downloaded to the directory of your choosing as a single zip file. 

The base URL of the NSIDC DAAC API is:

https://n5eil02u.ecs.nsidc.org/egi/

We will append /request? to the URL, along with a series of Key-Value-Pairs (KVPs):

https://n5eil02u.ecs.nsidc.org/egi/request?CMR_kvp=CMR_kvp&service_kvp=service_kvp&token=token

We will use the following KVPs to request GLAH12 data. We are using the same CMR spatial and temporal filters that we used in Lesson 1, along with subsetting (cropping) and reformatting service values. In addition, we need to append the token that we created in Step 2 of Lesson 1:

Key          | Value                                   | Description
short_name   | GLAH12                                  | GLAS/ICESat L2 Antarctic and Greenland Ice Sheet Altimetry Data (HDF5)
version      | 034                                     | Specifies the data set version number. If not specified, all available versions are returned. This can be either a 3-digit or 1-digit number.
temporal     | 2009-01-01T00:00:00,2009-12-31T23:59:59 | Temporal filter CMR parameter, specified in START_TIME,END_TIME format
bounding_box | -50.33333,68.56667,-49.33333,69.56667   | Spatial filter CMR parameter, specified in decimal degrees of latitude and longitude in WEST,SOUTH,EAST,NORTH format
bbox         | -50.33333,68.56667,-49.33333,69.56667   | Spatial subset service parameter, specified in decimal degrees of latitude and longitude in WEST,SOUTH,EAST,NORTH format
format       | TABULAR_ASCII                           | Reformatting service parameter
token        | 93EA0855-20FB-8A6B-4249-CB5EE2ED0D32    | Token used to access data using your Earthdata Login credentials

(For reference, our full API documentation includes a table of all available service KVPs)

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=GLAH12&version=034&bounding_box=-50.33333,68.56667,-49.33333,69.56667&bbox=-50.33333,68.56667,-49.33333,69.56667&time=2009-01-01T00:00:00,2009-12-31T23:59:59&format=TABULAR_ASCII&token=TOKEN-FROM-STEP-2"

b. Replace TOKEN-FROM-STEP-2 with your token from Step 2.

c. Copy the command into your command line window and hit return. This command can be used across all operating systems. 

Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

Four GLAH12 granules will be downloaded as a single .zip file to the current working directory, e.g.:

curl: Saved to filename '5000000130000.zip'

d. Open a Finder or File Explorer window to locate the downloaded file, or you can search for the name of the file, e.g. 5000000130000.zip, on your computer. 

e. Unzip the file by double-clicking on the file or using a zip utility. You will see 4 subfolders each containing a single ASCII file. Open a file using any text or CSV reader to see the data returned by the API.

In the curl command, we used the --dump-header option to return an HTTP response status code along with file name and/or debugging info in the response-header.txt file. The -O option will save the result in a local file and the -J option will name the saved file according to the returned header. See the Resources section below for additional curl option documentation. 
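As a sketch of what to look for in that header file, the HTTP status line and the delivered file name can be pulled out as shown below. The sample file written here stands in for response-header.txt from the request above, and the zip file name is illustrative.

```shell
# Write a sample header file standing in for response-header.txt from the request above.
printf 'HTTP/1.1 200 OK\r\nContent-Disposition: attachment; filename="5000000130000.zip"\r\n' > sample-header.txt
head -n 1 sample-header.txt                   # the HTTP status line
grep -o 'filename="[^"]*"' sample-header.txt  # the delivered file name
```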

Lesson 3: Download more than 10 IceBridge files


Our last lesson will explore a request example that returns more than 10 granules (files). Because the default number of granules returned is set to 10, we will need to use an additional Key-Value-Pair (KVP) to designate the page size. Customization services are not currently available for Operation IceBridge data; therefore, we also need to specify a no-processing KVP:

Key       | Value | Description
page_size | 14    | Designates the number of files returned per page. The default is set to 10.
agent     | NO    | Specifies that no customization service processing will be performed.


Step 1. Download 14 IceBridge files. 

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=ILATM1B&version=2&time=2017-05-01T00:00:00,2017-05-01T23:59:59&bounding_box=-50.33333,68.56667,-49.33333,69.56667&agent=NO&page_size=14&token=TOKEN-FROM-LESSON-2"

b. Replace TOKEN-FROM-LESSON-2 with your credentials from Lesson 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return. This command can be used across all operating systems. 

Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

Fourteen ILATM1B granules will be downloaded as a single .zip file to the current working directory, e.g.:

curl: Saved to filename '5000000130001.zip'

d. Open a Finder or File Explorer window to locate the downloaded file, or you can search for the name of the file, e.g. 5000000130001.zip, on your computer. 

e. Unzip the file by double-clicking on the file or using a zip utility. You will see 14 subfolders each containing a single HDF5 file and an associated metadata file. Open a file using an HDF5 reader (e.g. HDFView or Panoply) to see the data returned by the API.

Note that the maximum number of granules per request is set to 100. You may set this to the maximum value, but you run the risk of a connection timeout with large requests. 

Extra Credit


EC1: Download ICESat-2 Global Geolocated Photon Data Files.

This example will return spatially and temporally subsetted ICESat-2 Global Geolocated Photon Data (ATL03) over the Jakobshavn Glacier, from October 13, 2018 through November 13, 2018, reformatted to NetCDF-4:

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=ATL03&version=001&time=2018-10-13T00:00:00,2018-11-13T23:59:59&bounding_box=-50.33333,68.56667,-49.33333,69.56667&bbox=-50.33333,68.56667,-49.33333,69.56667&format=NetCDF4-CF&email=yes&token=TOKEN-FROM-LESSON-2"

b. Replace TOKEN-FROM-LESSON-2 with your credentials from Lesson 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return. This command can be used across all operating systems. 

Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

EC2: Download ICESat-2 Land Ice Height Files over Inputted Polygon Region. 

This example will return ATLAS/ICESat-2 L3A Land Ice Height, Version 1 (ATL06) data subsetted by variable and space over the Barnes Ice Cap, using an uploaded shapefile of the ice cap boundary. The Barnes Ice Cap South Dome N Slope shapefile is available through the NSIDC Global Land Ice Measurements from Space (GLIMS) database.

Note that the curl -F option is needed to POST a shapefile to the API. While this shapefile will be used to subset (crop) the data to the glacier outline, we also need to specify a polygon or bounding box to be used to initially filter the data. For simplicity, we will use coarse polygon coordinates to capture the boundary of our area of interest; however, this may lead to filtered granules without data contained within the uploaded shapefile region. In this case, a “No data found that matched subset constraints” error will be returned for some granules, indicating that the requested height data of those granules does not exist within the bounds of the glacier. For best results, the polygon coordinates should be equal or near-equal to the coordinates of the uploaded shapefile.  

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J -F "shapefile=@FILE-NAME-OF-SHAPEFILE" "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=ATL06&version=209&polygon=-74.886,69.408,-71.681,69.408,-71.681,70.62,-74.886,70.62,-74.886,69.408&coverage=/gt1l/land_ice_segments/h_li,/gt1l/land_ice_segments/latitude,/gt1l/land_ice_segments/longitude,/gt1r/land_ice_segments/h_li,/gt1r/land_ice_segments/latitude,/gt1r/land_ice_segments/longitude,/gt2l/land_ice_segments/h_li,/gt2l/land_ice_segments/latitude,/gt2l/land_ice_segments/longitude,/gt2r/land_ice_segments/h_li,/gt2r/land_ice_segments/latitude,/gt2r/land_ice_segments/longitude,/gt3l/land_ice_segments/h_li,/gt3l/land_ice_segments/latitude,/gt3l/land_ice_segments/longitude,/gt3r/land_ice_segments/h_li,/gt3r/land_ice_segments/latitude,/gt3r/land_ice_segments/longitude,/quality_assessment&email=yes&token=TOKEN-FROM-LESSON-2"

b. Replace FILE-NAME-OF-SHAPEFILE with the name of your shapefile (in .zip format, including the .shp, .dbf, and .shx files) and TOKEN-FROM-LESSON-2 with your credentials from Lesson 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return, making sure that you are running the command within the same directory as your shapefile. 
Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

EC3: Find the customization services available for ICESat/GLAS files.


We can use the NSIDC DAAC's access and service API to perform a query of the customization services available for GLAH12 version 034:

https://n5eil02u.ecs.nsidc.org/egi/capabilities/GLAH12.034.xml

Using curl and either grep or findstr depending on operating systems, we will request the reformatting and subsetting options available for this data set. 

Copy the entire command in gray into your command line window and hit return:

Mac OS X:

curl "https://n5eil02u.ecs.nsidc.org/egi/capabilities/GLAH12.034.xml" | grep "Format value\|SubsetAgent"


The available subsetting and reformatting options are displayed in the command response:

<SubsetAgent id="GLAS" spatialSubsetting="true" temporalSubsetting="true" type="both" maxGransSyncRequest="100" maxGransAsyncRequest="2000">
<Format value="" label="No Reformatting" />
<Format value="TABULAR_ASCII" label="Tabular ASCII" />
</SubsetAgent>

Additionally, "SubsetVariable" can be grepped to determine all variables contained within each HDF5, which can also be subsetted individually. 
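For example, the variable names can be listed from the capabilities document with the same grep pattern. The sample fragment below stands in for the live GLAH12.034.xml response, and the variable paths shown are illustrative.

```shell
# sample stands in for a fragment of the live GLAH12.034.xml capabilities
# document; the variable paths here are illustrative.
sample='<SubsetVariable value="/Data_40HZ/Elevation_Surfaces/d_elev"/><SubsetVariable value="/Data_40HZ/Time/d_UTCTime_40"/>'
printf '%s' "$sample" | grep -o 'SubsetVariable value="[^"]*"'
```

Run against the live XML, the same pipeline (curl piped into grep) prints one line per subsettable variable.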

EC4: ICESat/GLAS and IceBridge Query
 

This example will combine an ICESat/GLAS and Operation IceBridge data request in a single API command. We will request both GLAH12 and ILVIS2, using the agent=NO KVP as we did in Lesson 3 to specify that the data be returned natively, without any subset or format customizations.

a. Copy the entire command in gray into your Text Editor page (make sure this is set to plain text):

curl -O -J --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=ILVIS2&short_name=GLAH12&time=2009-10-10T06:50:53,2009-10-20T19:02:26&agent=NO&token=TOKEN-FROM-LESSON-2"

b. Replace TOKEN-FROM-LESSON-2 with your credentials from Lesson 2. Make sure to retain the final ".

c. Copy the command into your command line window and hit return. This command can be used across all operating systems. 

Alternatively, you can copy the entire command in gray above directly into your command line window. Copy your token from your Text Editor page. Use your cursor to move to the token item in your command line, paste your token, and then hit return, making sure to retain the final ". 

Eight ILVIS2 and two GLAH12 granules are returned, totaling 10 granules delivered as a single zip file to the current working directory, e.g.:

curl: Saved to filename '5000000130002.zip'

d. Open a Finder or File Explorer window to locate the downloaded file, or you can search for the name of the file, e.g. 5000000130002.zip, on your computer. 

e. Unzip the file by double-clicking on the file or using a zip utility. You will see 10 subfolders each containing a single ILVIS2 TXT or GLAH12 HDF5 file with its corresponding metadata file. Open the files using either an HDF5 reader (e.g. HDFView or Panoply) or text file reader to see the data returned by the API.


EC5: Perform a loop on a large file request 

The following GLAH12 request will return 22 granules:

https://n5eil02u.ecs.nsidc.org/egi/request?short_name=GLAH12&version=034&time=2009-03-01T00:00:00,2009-03-31T23:59:59&bounding_box=-73.75,-72.75,-56.75,-59&bbox=-73.75,-72.75,-56.75,-59&token=TOKEN-FROM-LESSON-2

In Lesson 3, we used the page_size KVP to return more than 10 granules. We can also use the page_num KVP to request successive CMR page results. The following set of bash commands loops through the page numbers until all files from the example above are returned. This only works from a Mac OS X/UNIX (bash) command line:

i=1
# Loop until a page returns no zip attachment; the grep at the end of the
# body sets the exit status that the while condition tests.
while [ $? -eq 0 ]
do
    curl -O -J -L --dump-header response-header.txt "https://n5eil02u.ecs.nsidc.org/egi/request?short_name=GLAH12&version=034&time=2009-03-01T00:00:00,2009-03-31T23:59:59&bounding_box=-73.75,-72.75,-56.75,-59&bbox=-73.75,-72.75,-56.75,-59&token=TOKEN-FROM-LESSON-2&page_num=$i"
    i=$((i+1))
    grep -E 'Content-Disposition: attachment; filename=".*zip"' response-header.txt
done
 

Additional Resources


  1. Everything curl Documentation: "Everything curl is an extensive guide to everything there is to know about curl, the project, the command-line tool, the library, how everything started and how it came to be what it is today."
     
  2. CMR API Documentation
     
  3. How-to guide on our access and service API, including a table of all available customization service parameters
     
  4. HDFView download information 
     
  5. Panoply download information 
     
  6. HDF-EOS ICESat/GLAS Python, NCL, Matlab, and IDL code examples
     
  7. NSIDC's ICESat-2 Home Page


You can also contact NSIDC User Services with any questions or feedback on the material covered in this tutorial: nsidc@nsidc.org