Accessing NSIDC DAAC S3 Data with AWS CLI
NSIDC DAAC Earthdata Cloud datasets are stored in protected, non-listable Amazon S3 buckets.
This means:
- You must be working in the us-west-2 region to access them via S3.
- Buckets cannot be browsed; only exact file paths work.
- Listing commands such as aws s3 ls, aws s3 sync, or aws s3api list-objects return AccessDenied (403) errors.
- From outside of AWS (i.e., on your laptop or in another region), you must use HTTPS endpoints instead.
This guide shows you how to use AWS CLI from an authorized cloud instance to download NSIDC DAAC datasets.
What is an Authorized Cloud Instance?
An authorized cloud instance means:
- An AWS EC2 instance (or other compute environment in AWS) running in the us-west-2 region.
- Configured with temporary EDL-based AWS credentials for direct S3 access (see the Temporary AWS Credentials guide for details).
- Once configured, the instance can access NSIDC DAAC S3 buckets directly.
Local machines (e.g., laptops) are not authorized instances. Direct S3 access from outside AWS will fail — use HTTPS endpoints instead.
Why Use AWS CLI?
With AWS CLI on an authorized instance, you can:
- Access protected NSIDC DAAC datasets directly from S3.
- Automate multiple downloads with scripts.
- Reproduce workflows securely and efficiently.
Key Points Before You Start
1. Region restriction
- All NSIDC DAAC buckets are in us-west-2.
- Be sure to set this region in your CLI configuration or environment variables.
2. Non-Listable buckets
- You cannot explore folders or use recursive commands.
- Use exact file paths from Earthdata Search or CMR.
3. Temporary Credentials
- Required values: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN
- Credentials expire after ~1 hour. Refresh them when starting a new session.
Setting Up Your Instance
If you don’t already have an AWS EC2 instance in us-west-2, follow Parts 1 and 2 in this guide: Launching and Connecting to a Free AWS EC2 Instance for NASA Earthdata Cloud Access.
In summary:
1. Launch an EC2 instance in us-west-2.
2. SSH into your instance from your local terminal.
Installing AWS CLI on Your Instance
Note: Installing AWS CLI is free, though downloading data may incur AWS transfer costs.
1. Download the AWS CLI v2 installer (Linux x86_64):
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
2. Install unzip (if needed) and extract:
sudo apt update && sudo apt install -y unzip
unzip awscliv2.zip
3. Run the installer:
sudo ./aws/install
4. Verify installation:
aws --version
Expected output: aws-cli/2.x.x Python/3.x.x Linux/x86_64
Configuring Temporary Credentials
1. Retrieve temporary AWS credentials. See this guide for details: https://nsidc.org/data/user-resources/help-center/accessing-nsidc-daac-s3-data-temporary-aws-credentials
2. Inside your instance, set credentials as environment variables:
export AWS_ACCESS_KEY_ID=<your_access_key>
export AWS_SECRET_ACCESS_KEY=<your_secret_key>
export AWS_SESSION_TOKEN=<your_session_token>
export AWS_DEFAULT_REGION=us-west-2
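Before making any requests, it can help to confirm all four variables are actually set in the current shell. The snippet below is a minimal sketch; check_aws_env is a hypothetical helper name, not part of any NSIDC or AWS tooling:

```shell
# Sanity check (sketch): print the name of any credential or region
# variable that is still unset. check_aws_env is a hypothetical name.
check_aws_env() {
  for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN AWS_DEFAULT_REGION; do
    eval "val=\${$var}"
    if [ -z "$val" ]; then
      echo "Missing: $var"
    fi
  done
}
check_aws_env
```

If the function prints nothing, all four values are set. Remember that the credentials themselves still expire after about an hour, so a passing check does not guarantee they are still valid.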
Downloading Data via AWS CLI
Since NSIDC DAAC buckets are non-listable, only a limited set of commands will work.
Supported Commands
- aws s3 cp (single file, exact path)
- aws s3api get-object (exact key)
Learn more: S3 paths and keys
Unsupported Commands
- aws s3 ls
- aws s3 sync
- Any aws s3api list-* command
Example: Copy a single file
aws s3 cp s3://nsidc-cumulus-prod-protected/SMAP/SPL3SMP_E/006/2025/09/22/SMAP_L3_SM_P_E_20250922_R19240_001.h5 .
Advanced: Using s3api get-object
Note: For most single-file downloads, aws s3 cp is simpler and easier. Use s3api if you need extra detail or more control.
When to use s3api
- You want extra file details (JSON metadata such as size, type, last modified)
- You only need part of a file (e.g., first few MB)
- You want to check or skip downloads if the file hasn’t changed
- You prefer explicit bucket/key commands with clearer error messages (e.g., NoSuchKey, AccessDenied)
Basic example
aws s3api get-object \
  --bucket nsidc-cumulus-prod-protected \
  --key SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5 \
  ./SMAP_L3_SM_P_20250922_R19240_001.h5
- Downloads the file SMAP_L3_SM_P_20250922_R19240_001.h5 from the protected NSIDC DAAC S3 bucket into the current directory
- Returns JSON metadata describing the retrieved object, something like:
{
  "AcceptRanges": "bytes",
  "LastModified": "2025-09-22T14:35:00+00:00",
  "ContentLength": 39675432,
  "ETag": "\"abc123def456...\"",
  "ContentType": "application/x-hdf",
  "Metadata": {}
}
Example: Download Part of a File with --range
aws s3api get-object \
  --bucket nsidc-cumulus-prod-protected \
  --key SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5 \
  ./SMAP_partial.h5 \
  --range bytes=0-1048575
- Downloads only the first 1 MB of the file (bytes 0-1048575)
- Useful for testing downloads, checking file headers, or resuming large files
- The saved file will be incomplete, but can still be inspected
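One simple way to inspect such a partial download is to compare its first eight bytes against the HDF5 format signature. This is a sketch; is_hdf5 is a hypothetical helper name:

```shell
# Sketch: HDF5 files begin with the 8-byte signature 89 48 44 46 0d 0a 1a 0a
# ("\x89HDF\r\n\x1a\n"). Compare the first bytes of the partial download
# against it. is_hdf5 is a hypothetical helper name.
is_hdf5() {
  sig=$(head -c 8 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
  [ "$sig" = "894844460d0a1a0a" ]
}
is_hdf5 ./SMAP_partial.h5 && echo "Looks like HDF5" || echo "Not HDF5 (or missing)"
```

A matching signature only confirms the file starts like HDF5; the rest of a ranged download is still truncated.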
Example: Skip Download if the File Hasn’t Changed
aws s3api get-object \
  --bucket nsidc-cumulus-prod-protected \
  --key SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5 \
  ./SMAP_latest.h5 \
  --if-none-match "abc123def456..."
- Only downloads if the file’s ETag (unique fingerprint) is different from the one you provide
- If unchanged, the command exits without downloading
- Useful for automation to avoid re-downloading unchanged files
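For automation, the ETag can be cached locally between runs. The sketch below assumes that when --if-none-match matches (the object is unchanged), the CLI exits non-zero, which the script treats as "skip"; etag.txt and fetch_if_changed are names chosen here, not standard tooling. The --query and --output flags are the CLI's built-in JMESPath output filters:

```shell
# Sketch: cache the last ETag in etag.txt (a hypothetical local file) and
# pass it back via --if-none-match on the next run. An unchanged object
# makes the CLI exit non-zero, which we treat as "skip".
fetch_if_changed() {
  bucket=$1; key=$2; out=$3; cache=$4
  etag=$(cat "$cache" 2>/dev/null || echo none)
  if new_etag=$(aws s3api get-object --bucket "$bucket" --key "$key" \
        --if-none-match "$etag" --query ETag --output text "$out" 2>/dev/null); then
    echo "$new_etag" > "$cache"   # remember the fingerprint for next time
    echo downloaded
  else
    echo skipped
  fi
}
fetch_if_changed nsidc-cumulus-prod-protected \
  SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5 \
  ./SMAP_latest.h5 etag.txt
```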
Tip: Check Metadata Without Downloading
aws s3api head-object \
  --bucket nsidc-cumulus-prod-protected \
  --key SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5
- Verifies that the file exists and shows size/metadata without retrieving it
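When only one field matters, head-object can be narrowed with the CLI's --query filter. A sketch, handy before deciding whether a ranged download is worthwhile; object_size is a hypothetical helper name:

```shell
# Sketch: return just the object size in bytes via head-object, or
# "unknown" if the request fails. object_size is a hypothetical name.
object_size() {
  aws s3api head-object --bucket "$1" --key "$2" \
    --query ContentLength --output text 2>/dev/null || echo unknown
}
object_size nsidc-cumulus-prod-protected \
  SMAP/SPL3SMP/009/2025/09/SMAP_L3_SM_P_20250922_R19240_001.h5
```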
Automating Multiple Downloads
1. Upload a text file containing a list of S3 URLs to your EC2 instance (see Part 4.1 in https://nsidc.org/data/user-resources/help-center/launching-and-connecting-free-aws-ec2-instance-nasa-earthdata-cloud-access)
2. Run the following script to download each file:
while IFS= read -r file; do
  aws s3 cp "$file" ./local_folder
done < file_list.txt
Each line in file_list.txt must be a fully qualified S3 path.
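A slightly more defensive version of the loop above can skip blank lines and log failures for a retry pass. This is a sketch; download_list and failed.txt are names chosen here:

```shell
# Sketch: download every S3 URL in a list file, skipping blank lines and
# logging failures to failed.txt (a hypothetical name) for a retry pass.
download_list() {
  list=$1; dest=$2
  [ -f "$list" ] || { echo "list not found: $list"; return 1; }
  mkdir -p "$dest"
  : > failed.txt
  while IFS= read -r url; do
    [ -z "$url" ] && continue          # ignore blank lines
    aws s3 cp "$url" "$dest"/ || echo "$url" >> failed.txt
  done < "$list"
}
download_list file_list.txt ./local_folder
```

After a run, failed.txt can be fed back in as the next list, which pairs well with refreshing the ~1 hour credentials between passes.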
Quick Reference: AWS CLI Commands for NSIDC DAAC S3
| Command | Example | Works with NSIDC DAAC S3? | Notes |
|---|---|---|---|
| aws s3 cp | aws s3 cp s3://bucket/path/file.h5 ./ | Yes | Single-file copy only. Must use exact S3 path. --recursive not supported. |
| aws s3api get-object | aws s3api get-object --bucket bucket --key path/file.h5 ./file.h5 | Yes | Exact file key required. Useful in scripts. |
| aws s3 ls | aws s3 ls s3://bucket/ | No | Buckets are non-listable; returns AccessDenied. |
| aws s3 sync | aws s3 sync s3://bucket/ ./local/ | No | Recursive sync fails due to non-listable buckets. |
| aws s3api list-objects | aws s3api list-objects --bucket bucket | No | Listing commands return AccessDenied. |
| aws s3api list-buckets | aws s3api list-buckets | No | Cannot enumerate NSIDC DAAC buckets. |
Tips for Efficiency
- Get file paths from Earthdata Search or CMR.
- Plan ahead for the ~1 hour expiration of credentials.
Summary
- NSIDC DAAC buckets are non-listable.
- S3 access works only from an authorized cloud instance with valid temporary credentials in us-west-2.
- Use aws s3 cp or aws s3api get-object.
- Listing and sync commands (ls, sync, list-*) will fail.
- Outside AWS, always use HTTPS endpoints instead.