Harmony API Quickstart Guide: Customizing NASA NSIDC DAAC data in Earthdata Cloud
The Harmony API provides programmatic access to the NASA NSIDC DAAC Earthdata Cloud archive. It enables users to request subsetted files with specific spatial and temporal filters applied for select data sets. This quickstart guide will walk you through the essential steps to start using the API.
For comprehensive Harmony documentation, please visit https://harmony.earthdata.nasa.gov. If you are interested in working with Harmony in Python, we recommend using the harmony-py library, designed to streamline the programmatic ordering experience: https://nasa-openscapes.github.io/earthdata-cloud-cookbook/tutorials/IS2_Harmony.html.
Prerequisites
- NASA Earthdata Login (EDL) for authentication
  - Create an account at: https://urs.earthdata.nasa.gov/users/new
  - An EDL is required to access NASA data and services
- curl installed on your computer
Authentication Setup
To simplify curl requests, store your EDL credentials in a .netrc file in your home directory:
echo 'machine urs.earthdata.nasa.gov login <your_username> password <your_password>' >> ~/.netrc
chmod 0600 ~/.netrc
The first command appends your credentials to the .netrc file (creating the file if it does not already exist), while the second restricts the file's permissions so that only your user account can read and write it.
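To confirm that the entry is readable before sending any requests, the file can be parsed with Python's standard netrc module. The following is a minimal sketch, not part of the Harmony tooling: the helper name read_edl_login and the demo credentials are hypothetical, and the demonstration runs against a throwaway file so your real ~/.netrc is not touched.

```python
import netrc
import os
import tempfile

EDL_HOST = "urs.earthdata.nasa.gov"


def read_edl_login(netrc_path=None):
    """Return the EDL username stored in a .netrc file, or None if absent.

    With netrc_path=None the standard library reads ~/.netrc and raises
    FileNotFoundError if that file does not exist.
    """
    auth = netrc.netrc(netrc_path).authenticators(EDL_HOST)
    # authenticators() returns (login, account, password) or None.
    return auth[0] if auth else None


# Demonstration on a temporary file with made-up credentials.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc")
    with open(demo_path, "w") as fh:
        fh.write(f"machine {EDL_HOST} login demo_user password demo_pass\n")
    os.chmod(demo_path, 0o600)
    demo_login = read_edl_login(demo_path)

print(demo_login)
```

Calling read_edl_login() with no argument checks your actual ~/.netrc; a return value of None means no entry for urs.earthdata.nasa.gov was found.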
Understanding the API Structure
The Harmony API follows the OGC Coverages API version 1.0.0 standard. The basic structure of an API endpoint URL is:
https://harmony.earthdata.nasa.gov/{collection-id}/ogc-api-coverages/1.0.0/collections/{variable}/coverage/rangeset?{query-parameters}
Where:
- {collection-id}: CMR collection concept ID or data product short name
- {variable}: Specifies the variables to retrieve. Not all data sets support variable subsetting; in those cases, use all.
- {query-parameters}: Key-value pairs appended to refine your query
For detailed information on required parameters and available query parameters, refer to the Harmony documentation: https://harmony.earthdata.nasa.gov/docs.
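The URL template above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function name build_harmony_url is hypothetical, and the choice to leave parentheses, colons, and commas unencoded is an assumption made so the output matches the URL style used in this guide.

```python
from urllib.parse import quote

HARMONY_ROOT = "https://harmony.earthdata.nasa.gov"


def build_harmony_url(collection_id, variable="all", params=()):
    """Assemble a Harmony OGC Coverages request URL.

    params is a sequence of (key, value) pairs rather than a dict so that
    a key such as 'subset' can appear more than once.
    """
    path = (
        f"{HARMONY_ROOT}/{collection_id}/ogc-api-coverages/1.0.0"
        f"/collections/{variable}/coverage/rangeset"
    )
    if not params:
        return path
    # Keep '(', ')', ':' and ',' literal; everything else unsafe is encoded.
    query = "&".join(
        f"{key}={quote(str(value), safe='():,')}" for key, value in params
    )
    return f"{path}?{query}"


example_url = build_harmony_url(
    "ATL08",
    "all",
    [
        ("granuleName", "ATL08_20210925185926_00511302_006_02.h5"),
        ("subset", "lat(40:40.25)"),
        ("subset", "lon(-105.5:-105)"),
    ],
)
print(example_url)
```

Passing the parameters as ordered pairs keeps the two subset entries separate, which a dictionary would collapse into one.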
Example: Spatial Subsetting of a Single Data File
Step 1: Build the URL
Let's construct a URL to spatially subset a single ATL08 granule:
We'll use these strings for the required URL parameters:
- {collection-id}: ATL08 (defaults to most recent version)
- {variable}: all (this retrieves all variables)
To refine our query, we'll append these key-value pairs:
granuleName=ATL08_20210925185926_00511302_006_02.h5
subset=lat(40:40.25)&subset=lon(-105.5:-105)
https://harmony.earthdata.nasa.gov/ATL08/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?granuleName=ATL08_20210925185926_00511302_006_02.h5&subset=lat(40:40.25)&subset=lon(-105.5:-105)
This URL subsets the data to a bounding box covering Boulder, CO, with coordinates (SW: 40, -105.5) and (NE: 40.25, -105).
Step 2: Use curl for Request
Execute the following curl command to download the spatially subsetted ATL08 granule:
curl -Lnbj 'https://harmony.earthdata.nasa.gov/ATL08/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?granuleName=ATL08_20210925185926_00511302_006_02.h5&subset=lat(40:40.25)&subset=lon(-105.5:-105)' -o output_file.h5
This command will save the subsetted granule as output_file.h5 in your current working directory.
The -Lnbj flags handle the passing of your EDL credentials. For more details on what each flag does, read https://harmony.earthdata.nasa.gov/docs#passing-credentials-with-curl.
Example: Spatial Subsetting of Multiple Data Files
This example builds upon the previous one, demonstrating how to spatially subset multiple data files instead of a single file. It also adds instructions on how to check the progress of the request and retrieve the subsetted files.
Step 1: Build the URL
To retrieve all ATL08 granules intersecting the defined bounding box, remove the granuleName key-value pair:
https://harmony.earthdata.nasa.gov/ATL08/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?subset=lat(40:40.25)&subset=lon(-105.5:-105)
Step 2: Use curl for Request
curl -Lnbj 'https://harmony.earthdata.nasa.gov/ATL08/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?subset=lat(40:40.25)&subset=lon(-105.5:-105)'
Step 3: Monitor Job Status
Multiple-file requests are asynchronous by default. To retrieve results, we'll check the job status JSON response in the terminal after running the curl command. Harmony processes requests as jobs, with the JSON response providing job details.
Example of the returned JSON response:
{"username":"username","status":"running","message":"The job is being processed","progress":0,"createdAt":"2024-10-29T19:55:44.737Z","updatedAt":"2024-10-29T19:55:44.737Z","dataExpiration":"2024-11-28T19:55:44.737Z","links":[{"title":"Cancels the job.","href":"https://harmony.earthdata.nasa.gov/jobs/675244e3-6f5f-4319-8fd6-6a1ee648a44f/cancel","type":"application/json","rel":"canceler"},{"title":"Pauses the job.","href":"https://harmony.earthdata.nasa.gov/jobs/675244e3-6f5f-4319-8fd6-6a1ee648a44f/pause","type":"application/json","rel":"pauser"},{"href":"https://harmony.earthdata.nasa.gov/jobs/675244e3-6f5f-4319-8fd6-6a1ee648a44f?page=1&limit=2000","title":"The current page","type":"application/json","rel":"self"}],"labels":[],"request":"https://harmony.earthdata.nasa.gov/C2613553260-NSIDC_CPRD/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?subset=lat(40%3A40.25)&subset=lon(-105.5%3A-105)","numInputGranules":103,"jobID":"675244e3-6f5f-4319-8fd6-6a1ee648a44f"}
Tip: Use an online JSON viewer for better readability.
The JSON response includes the job ID and the status of our Harmony job: running, running with errors, complete, or complete with errors. For descriptions of each field in the Harmony job response, refer to this table: https://harmony.earthdata.nasa.gov/docs#response-2.
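As an alternative to reading the raw JSON by eye, the fields of interest can be pulled out with Python's json module. This is a minimal sketch: the helper name summarize_job is hypothetical, and sample_response is an abbreviated copy of the example response above, not live output.

```python
import json


def summarize_job(response_text):
    """Extract the job ID, status, progress, and status URL from a job response."""
    job = json.loads(response_text)
    # Index the links by their 'rel' value, e.g. 'self', 'canceler', 'pauser'.
    links_by_rel = {link["rel"]: link["href"] for link in job.get("links", [])}
    return {
        "jobID": job["jobID"],
        "status": job["status"],
        "progress": job["progress"],
        "status_url": links_by_rel.get("self"),
    }


# Abbreviated copy of the example response shown in this guide.
sample_response = """
{"username": "username", "status": "running",
 "message": "The job is being processed", "progress": 0,
 "links": [
   {"title": "Cancels the job.",
    "href": "https://harmony.earthdata.nasa.gov/jobs/675244e3-6f5f-4319-8fd6-6a1ee648a44f/cancel",
    "type": "application/json", "rel": "canceler"},
   {"href": "https://harmony.earthdata.nasa.gov/jobs/675244e3-6f5f-4319-8fd6-6a1ee648a44f?page=1&limit=2000",
    "title": "The current page", "type": "application/json", "rel": "self"}
 ],
 "numInputGranules": 103,
 "jobID": "675244e3-6f5f-4319-8fd6-6a1ee648a44f"}
"""

summary = summarize_job(sample_response)
print(summary["jobID"], summary["status"], summary["status_url"])
```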
To check status, enter the Harmony jobs API endpoint (https://harmony.earthdata.nasa.gov/jobs/) plus the job ID from the JSON response in your browser.
Using the job ID from our example JSON response, the URL would look like this: https://harmony.earthdata.nasa.gov/jobs/675244e3-6f5f-4319-8fd6-6a1ee64…
Refresh periodically until the status indicates complete or complete with errors.
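The refresh-until-done step can also be scripted as a polling loop. The sketch below shows only the loop logic, under stated assumptions: fetch_job is a hypothetical callable that you supply to return the parsed job JSON (for example, by wrapping an authenticated request to the jobs endpoint), and the status strings in DONE_STATES are assumed values that should be confirmed against the Harmony job response table.

```python
import time

# Assumed terminal status strings; confirm against the Harmony documentation.
DONE_STATES = frozenset({"successful", "complete_with_errors", "failed", "canceled"})


def wait_for_job(fetch_job, interval_s=10.0, max_checks=360,
                 done_states=DONE_STATES, sleep=time.sleep):
    """Call fetch_job() repeatedly until the job reaches a finished state.

    fetch_job must return a dict with at least a 'status' key.
    Raises TimeoutError if the job is unfinished after max_checks calls.
    """
    if max_checks < 1:
        raise ValueError("max_checks must be at least 1")
    for attempt in range(max_checks):
        job = fetch_job()
        if job["status"] in done_states:
            return job
        if attempt < max_checks - 1:
            sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"job still {job['status']!r} after {max_checks} checks")


# Demonstration with canned responses instead of real network calls.
_fake_responses = iter([
    {"status": "running", "progress": 0},
    {"status": "running", "progress": 60},
    {"status": "successful", "progress": 100},
])
final_job = wait_for_job(lambda: next(_fake_responses), sleep=lambda s: None)
print(final_job["status"], final_job["progress"])
```

Injecting fetch_job and sleep keeps the loop testable without contacting Harmony; in real use you would leave sleep at its default and choose an interval that avoids hammering the service.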
Tip: The Harmony documentation provides instructions for managing your job, such as canceling or pausing it. You'll find URLs for these actions in the JSON response as well.
You can also check the status of your job using the Harmony Workflow UI: https://harmony.earthdata.nasa.gov/workflow-ui
Step 4: Retrieve Subsetted Files
Data transformed by Harmony services are staged in NASA Amazon Web Services (AWS) S3 buckets or in user-owned AWS S3 buckets. Data in NASA S3 buckets are accessed using signed URLs or temporary access credentials. These data can be downloaded to your local machine or accessed directly if you are working in an AWS cloud instance in AWS us-west-2.
Once processing is complete, the Harmony jobs API endpoint will display HTTPS links for downloading the subsetted files. Here's an example of a link:
https://harmony.earthdata.nasa.gov/service-results/harmony-prod-staging/public/675244e3-6f5f-4319-8fd6-6a1ee648a44f/79898776/ATL08_20190128053323_04700206_006_02_subsetted.h5
Users can parse the JSON response and use curl to download each file using the provided links. For helpful guidance on using curl to retrieve data, visit this resource. Subsetted files are available for 30 days.
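Parsing the download links out of a finished job response might look like the sketch below. It rests on an assumption not stated in this guide: that result files appear in the links array with a rel value of "data". The helper names and the sample_job dictionary are hypothetical; the sample reuses the example link shown above.

```python
import posixpath
from urllib.parse import urlparse


def data_links(job):
    """Return the hrefs of links assumed to point at output files (rel == 'data')."""
    return [
        link["href"]
        for link in job.get("links", [])
        if link.get("rel") == "data"
    ]


def local_name(url):
    """Derive a local file name from the last path segment of a URL."""
    return posixpath.basename(urlparse(url).path)


# Hypothetical finished-job response, trimmed to the relevant fields.
sample_job = {
    "status": "successful",
    "links": [
        {"rel": "self",
         "href": "https://harmony.earthdata.nasa.gov/jobs/675244e3-6f5f-4319-8fd6-6a1ee648a44f?page=1&limit=2000"},
        {"rel": "data",
         "href": "https://harmony.earthdata.nasa.gov/service-results/harmony-prod-staging/public/675244e3-6f5f-4319-8fd6-6a1ee648a44f/79898776/ATL08_20190128053323_04700206_006_02_subsetted.h5"},
    ],
}

for url in data_links(sample_job):
    # Each URL can then be passed to curl, e.g. curl -Lnbj '<url>' -o <name>
    print(local_name(url), url)
```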
Note: Python users can simplify this workflow with the highly recommended harmony-py library (https://github.com/nasa/harmony-py).
Additional Examples
Temporal Subsetting
This example demonstrates temporal subsetting of all ATL06 granules that fall within a specific time range: July 31, 2022, from 22:12:00 to 22:15:00.
- {collection-id}: ATL06
- {variable}: all
subset=time(%222022-07-31T22%3A12%3A00.000Z%22%3A%222022-07-31T22%3A15%3A00.000Z%22)
curl -Lnbj 'https://harmony.earthdata.nasa.gov/ATL06/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?subset=time(%222022-07-31T22%3A12%3A00.000Z%22%3A%222022-07-31T22%3A15%3A00.000Z%22)'
Note: The URL-encoded time range "%222022-07-31T22%3A12%3A00.000Z%22%3A%222022-07-31T22%3A15%3A00.000Z%22" decodes to "2022-07-31T22:12:00.000Z":"2022-07-31T22:15:00.000Z", that is, 22:12:00 to 22:15:00 UTC on July 31, 2022.
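The encoded string does not need to be typed by hand. As a minimal sketch, Python's urllib.parse.quote produces the same percent-encoding; the helper name time_subset is hypothetical.

```python
from urllib.parse import quote


def time_subset(start, stop):
    """Return a URL-encoded subset=time(...) parameter.

    start and stop are ISO 8601 UTC timestamps such as
    '2022-07-31T22:12:00.000Z'.
    """
    raw = f'"{start}":"{stop}"'
    # safe='' forces the double quotes and colons to be percent-encoded.
    return f"subset=time({quote(raw, safe='')})"


time_param = time_subset("2022-07-31T22:12:00.000Z", "2022-07-31T22:15:00.000Z")
print(time_param)
```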
Polygon Subsetting
This example shows how to subset a single ATL06 granule using a GeoJSON file and save the result as output_file.h5.
- {collection-id}: ATL06
- {variable}: all
- granuleName=ATL06_20220731220955_06111603_006_02.h5 (CMR granule ID for the granule to be retrieved)
curl -Lnbj 'https://harmony.earthdata.nasa.gov/ATL06/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?granuleName=ATL06_20220731220955_06111603_006_02.h5' -F "shapefile=@Iceland_example.geojson;type=application/geo+json" -o output_file.h5
Note: This example assumes you have the GeoJSON file "Iceland_example.geojson" in your current working directory. You can download this file from here.
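For readers unfamiliar with the file format, the sketch below builds a minimal GeoJSON polygon from a bounding box. It is an illustration of the format only, not the Iceland outline: the function name bbox_feature_collection is hypothetical, and the coordinates are the Boulder, CO bounding box used earlier in this guide.

```python
import json


def bbox_feature_collection(south, west, north, east):
    """Build a GeoJSON FeatureCollection holding one rectangular polygon.

    GeoJSON positions are [longitude, latitude]; the exterior ring is
    listed counterclockwise and must end on its starting point.
    """
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # close the ring
    ]
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            }
        ],
    }


demo_collection = bbox_feature_collection(40, -105.5, 40.25, -105)
demo_geojson = json.dumps(demo_collection, indent=2)
print(demo_geojson)
```

Writing demo_geojson to a file with a .geojson extension yields a polygon file of the same general shape as the one this example expects.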
Final Thoughts
This quickstart guide provides an introduction to using the Harmony API for accessing and subsetting data from the NASA NSIDC DAAC Earthdata Cloud archive. For more advanced usage and detailed information, refer to the complete documentation at https://harmony.earthdata.nasa.gov/.
If you have any questions or need assistance, please contact NSIDC User Services at nsidc@nsidc.org.