Accessing NSIDC DAAC S3 Data with AWS CLI

NSIDC DAAC Earthdata Cloud datasets are stored in protected, non-listable Amazon S3 buckets.

This means:

  • You must be working in the us-west-2 region to access them via S3.
  • Buckets cannot be browsed — only exact file paths work.
  • Commands that list bucket contents, such as aws s3 ls, aws s3 sync, or aws s3api list-objects, return AccessDenied (403) errors.
  • From outside AWS us-west-2 (e.g., on your laptop or in another AWS region), you must use HTTPS endpoints instead.

This guide shows you how to use AWS CLI from an authorized cloud instance to download NSIDC DAAC datasets.

What is an Authorized Cloud Instance?

An authorized cloud instance means:

  • An AWS EC2 instance (or other compute environment in AWS) running in the us-west-2 region.
  • Configured with temporary AWS credentials obtained through Earthdata Login (EDL) for direct S3 access (see the Temporary AWS Credentials guide for details).
  • Once configured, the instance can access NSIDC DAAC S3 buckets directly.

Local machines (e.g., laptops) are not authorized instances. Direct S3 access from outside AWS will fail — use HTTPS endpoints instead.

Why Use AWS CLI?

With AWS CLI on an authorized instance, you can:

  • Access protected NSIDC DAAC datasets directly from S3.
  • Automate multiple downloads with scripts.
  • Reproduce workflows securely and efficiently.

Key Points Before You Start

1.   Region restriction

  • All NSIDC DAAC buckets are in us-west-2.
  • Be sure to set this region in your CLI configuration or environment variables.

2.   Non-listable buckets

  • You cannot explore folders or use recursive commands.
  • Use exact file paths from Earthdata Search or CMR.

3.   Temporary credentials

  • Required values:
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
    • AWS_SESSION_TOKEN
  • Credentials expire after ~1 hour. Refresh them when starting a new session.
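
The key points above can be turned into a quick preflight check to run before any download. The following is a minimal sketch, assuming the region is set through the AWS_DEFAULT_REGION environment variable (a region set only in the CLI configuration file would not be detected). check_nsidc_env is a hypothetical helper name, not an AWS CLI command.

```shell
# Preflight check (hypothetical helper, not part of the AWS CLI).
# Confirms the region and the three temporary-credential variables are set.
# Values are never printed, only the names of missing variables.
check_nsidc_env() {
    nsidc_status=0
    if [ "${AWS_DEFAULT_REGION:-}" != "us-west-2" ]; then
        echo "AWS_DEFAULT_REGION should be us-west-2 (currently: '${AWS_DEFAULT_REGION:-unset}')" >&2
        nsidc_status=1
    fi
    for nsidc_name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN; do
        # Indirect lookup of the variable whose name is stored in nsidc_name
        eval "nsidc_value=\${$nsidc_name:-}"
        if [ -z "$nsidc_value" ]; then
            echo "$nsidc_name is not set" >&2
            nsidc_status=1
        fi
    done
    return "$nsidc_status"
}
```

Running check_nsidc_env && echo "ready" prints "ready" only when all four variables look correct. It does not confirm that the credentials are still valid, only that they are present.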

Setting Up Your Instance

If you don’t already have an AWS EC2 instance in us-west-2, follow Parts 1 and 2 in this guide: Launching and Connecting to a Free AWS EC2 Instance for NASA Earthdata Cloud Access.

In summary:

1.   Launch an EC2 instance in us-west-2.

2.   SSH into your instance from your local terminal.

Installing AWS CLI on Your Instance

Note: Installing AWS CLI is free, though downloading data may incur AWS transfer costs.

1.   Download the AWS CLI v2 installer (Linux x86_64):

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"

2.   Install unzip if needed (the apt commands below assume a Debian or Ubuntu instance) and extract the archive:

sudo apt update && sudo apt install -y unzip
unzip awscliv2.zip

3.   Run the installer:

sudo ./aws/install

4.   Verify installation:

aws --version

Expected output: aws-cli/2.x.x Python/3.x.x Linux/x86_64
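
In a setup script, the version check can be automated. The sketch below assumes the version string begins with aws-cli/ followed by the version number, as in the expected output above. aws_cli_major_version is a hypothetical helper name.

```shell
# Extract the major version from a string such as "aws-cli/2.x.x Python/3.x.x ..."
aws_cli_major_version() {
    cli_rest=${1#aws-cli/}     # drop the "aws-cli/" prefix
    echo "${cli_rest%%.*}"     # keep everything before the first dot
}

# Usage sketch on the instance (commented out so the block runs without the CLI):
#   major=$(aws_cli_major_version "$(aws --version 2>&1)")
#   [ "$major" = "2" ] || echo "Expected AWS CLI v2, found major version: $major" >&2
```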

Configuring Temporary Credentials

1.   Retrieve temporary AWS credentials. See this guide for details: https://nsidc.org/data/user-resources/help-center/accessing-nsidc-daac-s3-data-temporary-aws-credentials

2.   Inside your instance, set credentials as environment variables:

export AWS_ACCESS_KEY_ID=<your_access_key>
export AWS_SECRET_ACCESS_KEY=<your_secret_key>
export AWS_SESSION_TOKEN=<your_session_token>
export AWS_DEFAULT_REGION=us-west-2

Downloading Data via AWS CLI

Since NSIDC DAAC buckets are non-listable, only a limited set of commands will work.

Supported Commands

  • aws s3 cp (single file, exact path)
  • aws s3api get-object (exact key)

Learn more: S3 paths and keys
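
The two supported commands address the same object differently: aws s3 cp takes a single s3://bucket/key URL, while aws s3api get-object takes the bucket and key as separate arguments. The following sketch shows one way to split a URL into those two parts using shell parameter expansion. s3_bucket and s3_key are hypothetical helper names.

```shell
# Print the bucket name of an s3://bucket/key URL
s3_bucket() {
    s3_path=${1#s3://}       # strip the scheme
    echo "${s3_path%%/*}"    # text before the first slash
}

# Print the object key of an s3://bucket/key URL
s3_key() {
    s3_path=${1#s3://}
    echo "${s3_path#*/}"     # text after the first slash
}

url="s3://nsidc-cumulus-prod-protected/SMAP/SPL3SMP_E/006/2025/09/22/SMAP_L3_SM_P_E_20250922_R19240_001.h5"
echo "bucket: $(s3_bucket "$url")"
echo "key:    $(s3_key "$url")"
```

The printed values can be passed directly to the --bucket and --key options of aws s3api get-object.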

Unsupported Commands

  • aws s3 ls
  • aws s3 sync
  • Any aws s3api list-* command

Example: Copy a single file

aws s3 cp s3://nsidc-cumulus-prod-protected/SMAP/SPL3SMP_E/006/2025/09/22/SMAP_L3_SM_P_E_20250922_R19240_001.h5 .

Advanced: Using s3api get-object

Note: For most single-file downloads, aws s3 cp is simpler. Use s3api when you need the response metadata or finer control over the request.

When to use s3api

  • You want extra file details (JSON metadata such as size, type, last modified)
  • You only need part of a file (e.g., first few MB)
  • You want to skip a download if the file hasn’t changed
  • You prefer explicit bucket/key commands with clearer error messages (e.g., NoSuchKey, AccessDenied)

Basic example

aws s3api get-object \
  --bucket nsidc-cumulus-prod-protected \
  --key SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5 \
  ./SMAP_L3_SM_P_20250922_R19240_001.h5

  • Downloads the file SMAP_L3_SM_P_20250922_R19240_001.h5 from the protected NSIDC DAAC S3 bucket into the current directory
  • Returns JSON response metadata describing the retrieved object, something like:

    {
        "AcceptRanges": "bytes",
        "LastModified": "2025-09-22T14:35:00+00:00",
        "ContentLength": 39675432,
        "ETag": "\"abc123def456...\"",
        "ContentType": "application/x-hdf",
        "Metadata": {}
    }

Example: Download Part of a File with --range

aws s3api get-object \
  --bucket nsidc-cumulus-prod-protected \
  --key SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5 \
  ./SMAP_partial.h5 \
  --range bytes=0-1048575

  • Downloads only the first 1 MiB of the file (bytes 0-1048575, i.e., 1,048,576 bytes)
  • Useful for testing downloads, checking file headers, or (with a later start offset) resuming large files
  • The saved file will be incomplete, but can still be inspected

Example: Skip Download if the File Hasn’t Changed

aws s3api get-object \
  --bucket nsidc-cumulus-prod-protected \
  --key SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5 \
  ./SMAP_latest.h5 \
  --if-none-match "abc123def456..."

  • Only downloads if the file’s ETag (unique fingerprint) is different from the one you provide
  • If unchanged, S3 responds with 304 Not Modified and the command exits without downloading
  • Useful for automation to avoid re-downloading unchanged files

Tip: Check Metadata Without Downloading

aws s3api head-object \
  --bucket nsidc-cumulus-prod-protected \
  --key SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5

  • Verifies that the file exists and shows size/metadata without retrieving it
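
The size reported by head-object can also be used to confirm that a full download completed. The following is a minimal sketch, assuming the global --query and --output options are used to print only the ContentLength field. verify_size is a hypothetical helper name.

```shell
# Compare a local file's size with the ContentLength reported by head-object.
# Arguments: bucket, key, local file. Returns 0 on match, 1 on mismatch,
# 2 if the head-object call itself fails (e.g., expired credentials).
verify_size() {
    vs_bucket=$1; vs_key=$2; vs_file=$3
    vs_remote=$(aws s3api head-object --bucket "$vs_bucket" --key "$vs_key" \
        --query ContentLength --output text) || return 2
    vs_local=$(wc -c < "$vs_file" | tr -d ' ')   # tr strips padding some wc versions add
    if [ "$vs_remote" = "$vs_local" ]; then
        echo "OK: $vs_file ($vs_local bytes)"
    else
        echo "Size mismatch for $vs_file: local $vs_local, remote $vs_remote" >&2
        return 1
    fi
}
```

A file saved with --range is expected to fail this check, since it is intentionally shorter than the object in S3.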

Automating Multiple Downloads

1.   Upload a text file containing a list of S3 URLs to your EC2 instance (see Part 4.1 in https://nsidc.org/data/user-resources/help-center/launching-and-connecting-free-aws-ec2-instance-nasa-earthdata-cloud-access)

2.   Run the following script to download each file:

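mkdir -p ./local_folder
while IFS= read -r file; do
    # The trailing slash makes the destination a directory, not a file name
    aws s3 cp "$file" ./local_folder/
done < file_list.txt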

Each line in file_list.txt must be a fully qualified S3 path of the form s3://bucket/key.
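
For longer lists, it can help to skip malformed lines and keep going when a single download fails. The following is a sketch of one such variant, not a required approach. download_list is a hypothetical helper name, and treating lines that start with # as comments is an assumption of this example.

```shell
# Read S3 URLs from a file (one per line) and copy each into a directory,
# continuing past failures and reporting a count at the end.
download_list() {
    dl_list=$1
    dl_dest=$2
    dl_failed=0
    mkdir -p "$dl_dest"
    # The "|| [ -n ... ]" clause also handles a last line without a newline
    while IFS= read -r dl_url || [ -n "$dl_url" ]; do
        case "$dl_url" in
            ''|'#'*) continue ;;                 # skip blank lines and comments
            s3://*) ;;                           # looks like an S3 path
            *) echo "Skipping non-S3 line: $dl_url" >&2
               dl_failed=$((dl_failed + 1)); continue ;;
        esac
        # stdin is redirected so the command cannot consume the list being read
        if ! aws s3 cp "$dl_url" "$dl_dest/" < /dev/null; then
            echo "Failed: $dl_url" >&2
            dl_failed=$((dl_failed + 1))
        fi
    done < "$dl_list"
    echo "Finished with $dl_failed failure(s)"
    [ "$dl_failed" -eq 0 ]
}
```

Calling download_list file_list.txt ./local_folder returns a non-zero status if any line was skipped or any copy failed, which makes it easy to detect partial runs in a larger script.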

Quick Reference: AWS CLI Commands for NSIDC DAAC S3

Command | Example | Works with NSIDC DAAC S3? | Notes
aws s3 cp | aws s3 cp s3://bucket/path/file.h5 ./ | Yes | Single-file copy only. Must use exact S3 path; --recursive not supported.
aws s3api get-object | aws s3api get-object --bucket bucket --key path/file.h5 ./file.h5 | Yes | Exact file key required. Useful in scripts.
aws s3 ls | aws s3 ls s3://bucket/ | No | Buckets are non-listable → returns AccessDenied.
aws s3 sync | aws s3 sync s3://bucket/ ./local/ | No | Recursive sync fails due to non-listable buckets.
aws s3api list-objects | aws s3api list-objects --bucket bucket | No | Listing commands return AccessDenied.
aws s3api list-buckets | aws s3api list-buckets | No | Cannot enumerate NSIDC DAAC buckets.

Tips for Efficiency

  • Get file paths from Earthdata Search or CMR.
  • Plan ahead for the ~1 hour expiration of credentials.
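
One way to plan for the expiration is to record when credentials were last exported and check their age before each batch of downloads. The following is a minimal sketch. mark_creds_refreshed, creds_still_fresh, and the 55-minute threshold are all assumptions of this example, chosen to leave a margin before the ~1 hour limit.

```shell
# Hypothetical helpers for tracking credential age within a shell session.
CREDS_MAX_AGE=3300   # seconds (55 minutes), an assumed safety margin

# Call right after exporting fresh credentials
mark_creds_refreshed() {
    CREDS_SET_AT=$(date +%s)
}

# Succeeds while the credentials are younger than CREDS_MAX_AGE
creds_still_fresh() {
    [ -n "${CREDS_SET_AT:-}" ] || return 1
    creds_now=$(date +%s)
    [ $((creds_now - CREDS_SET_AT)) -lt "$CREDS_MAX_AGE" ]
}
```

For example, creds_still_fresh || echo "Refresh your credentials first" >&2 can be placed at the top of a download script. The check only tracks elapsed time in the current shell and does not query AWS.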

Summary

  • NSIDC DAAC buckets are non-listable.
  • S3 access works only from an authorized cloud instance with valid temporary credentials in us-west-2.
  • Use aws s3 cp or aws s3api get-object.
  • Listing and sync commands (ls, sync, list-*) will fail.
  • Outside AWS, always use HTTPS endpoints instead.