Accessing NASA Earthdata in the Cloud with JupyterLab on EC2

This tutorial guides you through setting up JupyterLab on an AWS EC2 instance, uploading a sample notebook, and running it directly in your browser. By working in the cloud, you can access large datasets—such as NASA Earthdata—without having to download them to your local machine.

This approach offers a flexible and secure workflow for Earth science analysis and other data-intensive tasks. Once set up, your JupyterLab environment can be accessed from any computer with an internet connection, making it ideal for remote work and collaboration.

Although the setup takes a little extra work upfront, it pays off: your code runs next to the data, which speeds up data access, keeps local storage needs minimal, and lets you continue your work from anywhere.

Prerequisites

Before you begin, make sure you have the following in place:

1. An AWS Account

If you don’t have one, sign up at: https://aws.amazon.com/

2. Correct AWS Region

In the AWS Management Console (https://console.aws.amazon.com/), locate the Region selector dropdown in the upper-right corner.
Select United States (Oregon) — us-west-2.
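NASA Earthdata Cloud data is hosted in us-west-2, and direct in-region S3 access to that data is only available from compute resources in the same region. Choosing this region also ensures all your resources (VPC, EC2, Security Groups) are created in the same location, which avoids cross-region connection issues and extra data transfer costs.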

3. An EC2 instance

You will need an active EC2 instance before proceeding with this workflow.
Create one using the AWS EC2 launch wizard.
For detailed instructions, see: Getting Started with Amazon EC2 and our own guide on launching and connecting to a free AWS EC2 instance.
Recommended instance types:
- t3.micro: Free Tier eligible. Limited RAM (1 GiB), so expect slower performance and roughly 30 minutes of environment setup.
- t2.medium: Faster, 4 GiB RAM (may incur charges).
Use Amazon Linux or Ubuntu 22.04 LTS with at least 15 GiB of storage.
Download the .pem key file — you’ll need this for SSH access.

4. Terminal with SSH Support

Examples: macOS Terminal, Windows WSL, or any Linux shell.

Step 1: Configure a Security Group for Jupyter

1. Navigate to the AWS EC2 Console at https://console.aws.amazon.com/ec2/.

2. In the left navigation pane, go to Security Groups under the Network & Security section.

3. Click the Create security group button (orange button in the upper-right corner).

4. Fill in the Basic details:

Security group name: e.g. JupyterAccess
Description: e.g. Allow SSH and Jupyter Notebook access
VPC: Select the VPC where your EC2 instance will run (usually the default VPC).

5. Under Inbound rules, add:

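Rule 1 – Allow inbound HTTPS (optional)

Type: HTTPS
Port range: 443
Source: Anywhere-IPv4

Note: Inbound rules only control traffic coming into the instance. The instance's own connections to websites are governed by the outbound rule in step 6, and security groups are stateful, so replies to those connections are allowed automatically. Jupyter in this tutorial is served on port 8888 (Rule 3), so it does not depend on Rule 1.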

Rule 2 – Allow secure SSH access to the instance

Type: SSH
Port range: 22
Source: My IP (recommended for security — prevents public SSH access)

Rule 3 – Allow Jupyter Notebook access

Type: Custom TCP
Port range: 8888
Source: My IP (highly recommended; only your machine can connect)

6. Under Outbound rules, check that the following rule exists:

Type: All traffic → Destination: Custom, 0.0.0.0/0
New security groups include this rule by default. If it is missing, add it manually so that your instance can download packages and fetch external data.

7. Click Create security group (bottom right).
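
The console steps above can also be scripted. The following is a minimal sketch, assuming the AWS CLI is installed and configured on your local machine with EC2 permissions. The function names my_ip_cidr and create_jupyter_sg are hypothetical helpers, not part of the AWS CLI, and the VPC ID is a placeholder you supply. The sketch adds only the SSH and Jupyter rules, both restricted to a single address.

```shell
# Hypothetical helper: turn a bare IPv4 address into a single-host CIDR block.
my_ip_cidr() {
    printf '%s/32\n' "$1"
}

# Hypothetical helper: create the security group and open ports 22 and 8888
# to one CIDR block. Usage: create_jupyter_sg <vpc-id> <cidr>
create_jupyter_sg() {
    vpc_id=$1
    cidr=$2
    sg_id=$(aws ec2 create-security-group \
        --region us-west-2 \
        --group-name JupyterAccess \
        --description "Allow SSH and Jupyter Notebook access" \
        --vpc-id "$vpc_id" \
        --query GroupId --output text) || return 1
    for port in 22 8888; do
        aws ec2 authorize-security-group-ingress \
            --region us-west-2 \
            --group-id "$sg_id" \
            --protocol tcp --port "$port" --cidr "$cidr" > /dev/null || return 1
    done
    # Print the new group ID so it can be attached to the instance later.
    echo "$sg_id"
}

# Example (not run automatically):
#   cidr=$(my_ip_cidr "$(curl -s https://checkip.amazonaws.com)")
#   create_jupyter_sg <your_vpc_id> "$cidr"
```

Restricting both rules to a /32 block mirrors the "My IP" choice in the console; if your public IP changes, the rules need to be updated.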

Step 2: Assign Security Group to Instance

1. In the left navigation pane, click Instances.

2. Select your EC2 instance (check the box next to its name).

3. At the top right of the instance details page, click Actions (dropdown menu).

4. From the dropdown, choose Security → Change security groups.

5. Add the security group you just created (e.g., JupyterAccess).

6. Click Save to apply changes.

Tip: When launching a new instance, you can assign a security group in the Network settings step instead of changing it later.

Step 3: Start the Instance

Since your instance is already selected from Step 2:

1. At the top right of the Instances table, click the Instance state dropdown.

2. Select Start instance.

3. Wait until the Instance state column updates to Running before continuing.
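
If you prefer the command line, the same start-and-wait sequence can be sketched with the AWS CLI. This assumes the CLI is configured on your local machine; start_and_wait is a hypothetical helper name, and the instance ID is a placeholder.

```shell
# Hypothetical helper: start an instance, wait until it is running,
# then print its public DNS name. Usage: start_and_wait <instance-id>
start_and_wait() {
    instance_id=$1
    aws ec2 start-instances --region us-west-2 \
        --instance-ids "$instance_id" > /dev/null || return 1
    # Blocks until the instance state is "running".
    aws ec2 wait instance-running --region us-west-2 \
        --instance-ids "$instance_id" || return 1
    aws ec2 describe-instances --region us-west-2 \
        --instance-ids "$instance_id" \
        --query 'Reservations[0].Instances[0].PublicDnsName' \
        --output text
}

# Example (not run automatically):
#   start_and_wait <your_instance_id>
```

The printed DNS name is the value needed for the SSH command in Step 4 and the browser URL in Step 12.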

Step 4: Connect to the Instance

1. In the Instances table, ensure your EC2 instance is selected.

2. At the top right of the table, click the Connect button (located next to Instance state and Actions) and go to the SSH client tab.

3. Prepare your .pem file:

Make sure it is in your terminal's current working directory (or use the full path to the file in the commands below).
Set the correct permissions to secure the file:

chmod 400 "<your_key>.pem"

4. Copy the SSH command provided in the SSH client tab. It usually looks like this:

ssh -i "<your_key>.pem" ec2-user@<your_instance_public_dns>

Replace <your_key> with your .pem file name.
Replace <your_instance_public_dns> with your EC2 public DNS.

5. Run the SSH command in your terminal to connect to the EC2 instance. When prompted if you want to continue connecting, type yes.

Once connected, your terminal prompt will change to show the remote EC2 instance (e.g., ec2-user@ip-172-31-xx-xxx:~$).
If you’re using Ubuntu instead of Amazon Linux, the username is ubuntu instead of ec2-user.
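
To avoid retyping the full SSH command, you can optionally store the connection details in your local SSH configuration. The sketch below is one way to generate the entry; ssh_config_stanza and the alias earthdata-ec2 are hypothetical names chosen for illustration, and the key path is an assumption about where you keep the .pem file.

```shell
# Hypothetical helper: print an SSH config stanza for the instance.
# Usage: ssh_config_stanza <alias> <public-dns> <user> <path-to-pem>
ssh_config_stanza() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User %s\n' "$3"
    printf '    IdentityFile %s\n' "$4"
}

# Example (not run automatically); afterwards "ssh earthdata-ec2" connects:
#   ssh_config_stanza earthdata-ec2 <your_instance_public_dns> ec2-user \
#       ~/.ssh/<your_key>.pem >> ~/.ssh/config
```

The HostName value needs updating whenever the instance's public DNS changes, which typically happens after a stop and start.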

Step 5: Install Miniconda on the EC2 Instance

Run these commands in the same terminal:

# Create install directory
mkdir -p ~/miniconda3

# Download Miniconda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh

# Run installer silently
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3

# Clean up installer
rm -rf ~/miniconda3/miniconda.sh

# Initialize conda
~/miniconda3/bin/conda init bash

# Reload shell
source ~/.bashrc

After initializing conda, you might see the message ==> For changes to take effect, close and re-open your current shell. <==. If so, simply disconnect and reconnect via SSH to ensure conda is available.

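Note: These instructions are specific to x86_64 Linux EC2 instances, which include the t3.micro and t2.medium types recommended above. For other architectures (for example, ARM-based instances) or operating systems, use the appropriate installer by replacing the wget link. See: https://repo.anaconda.com/miniconda/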
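
As a rough illustration of picking the installer by architecture, the sketch below maps the output of uname -m to an installer URL. The function name miniconda_url is hypothetical, and only the two Linux architectures shown are handled; check the repository listing above for anything else.

```shell
# Hypothetical helper: map a machine architecture (as printed by `uname -m`)
# to the matching Miniconda installer URL for Linux.
miniconda_url() {
    base=https://repo.anaconda.com/miniconda
    case "$1" in
        x86_64)  echo "$base/Miniconda3-latest-Linux-x86_64.sh" ;;
        aarch64) echo "$base/Miniconda3-latest-Linux-aarch64.sh" ;;
        *)       echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Example (not run automatically):
#   wget "$(miniconda_url "$(uname -m)")" -O ~/miniconda3/miniconda.sh
```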

Step 6: Create a `.netrc` File for Earthdata Access

In the same terminal, run the following commands:

# Add your Earthdata credentials to a .netrc file
echo 'machine urs.earthdata.nasa.gov login <your_username> password <your_password>' >> ~/.netrc

# Secure the .netrc file so only your user can read it
chmod 600 ~/.netrc

Replace <your_username> and <your_password> with your Earthdata Login credentials. Do not share this file or your EC2 instance with others.

Tip: You can verify that the file exists and has the correct permissions by running:

ls -l ~/.netrc

You should see permissions like -rw-------.
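
Because the echo command above appends to the file, running it twice leaves two entries. The following sketch shows one way to write the entry so that repeated runs replace it instead. The function name write_netrc_entry is a hypothetical helper, and it assumes the Earthdata entry sits on a single line, as in the command above.

```shell
# Hypothetical helper: write (or replace) the Earthdata Login entry in a
# netrc file. Usage: write_netrc_entry <netrc-file> <username> <password>
# The body runs in a subshell so the umask change does not leak out.
write_netrc_entry() (
    netrc_file=$1
    user=$2
    pass=$3
    # Files created below are readable and writable by the owner only.
    umask 077
    touch "$netrc_file"
    # Drop any previous Earthdata line so repeated runs do not add duplicates.
    grep -v '^machine urs\.earthdata\.nasa\.gov ' "$netrc_file" \
        > "$netrc_file.tmp" || true
    printf 'machine urs.earthdata.nasa.gov login %s password %s\n' \
        "$user" "$pass" >> "$netrc_file.tmp"
    mv "$netrc_file.tmp" "$netrc_file"
    chmod 600 "$netrc_file"
)

# Example (not run automatically; read -s is a bash feature that hides input,
# which keeps the password out of your shell history):
#   read -r -p "Earthdata username: " edl_user
#   read -r -s -p "Earthdata password: " edl_pass
#   write_netrc_entry ~/.netrc "$edl_user" "$edl_pass"
```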

Step 7: Download the Example Files to Your EC2 Instance

This tutorial uses two example files from NSIDC’s GitHub repository:

ATL06-direct-access.ipynb — A sample Jupyter notebook for ICESat-2 data access.
environment.yml — Defines the Conda environment used in the tutorial.

Note: You can substitute these with your own notebooks or environment files if you wish to work with different data or projects. The workflow remains the same.

Download the files directly to your EC2 instance

1. On GitHub, copy the “Raw” URL of each file. Then, in your EC2 terminal, run:

# Download the sample Jupyter notebook
wget https://raw.githubusercontent.com/nsidc/NSIDC-Data-Tutorials/main/notebooks/ICESat-2_Cloud_Access/ATL06-direct-access.ipynb

# Download the environment file
wget https://raw.githubusercontent.com/nsidc/NSIDC-Data-Tutorials/main/notebooks/ICESat-2_Cloud_Access/environment/environment.yml

2. Verify the files downloaded successfully by listing your current directory:

ls

You should see the two downloaded files (the notebook and the environment file) and the newly created miniconda3 directory.

ATL06-direct-access.ipynb  environment.yml  miniconda3
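
Listing the directory only confirms that the files exist. If a regular GitHub page URL is used instead of the “Raw” URL, wget saves an HTML page under the notebook's name. The sketch below is a rough content check under that assumption; looks_like_notebook is a hypothetical helper, and it only tests that the file starts like JSON and mentions the nbformat key that notebook files carry.

```shell
# Hypothetical helper: succeed only if the file looks like a Jupyter notebook,
# i.e. it is non-empty, starts with "{" and contains the "nbformat" key.
looks_like_notebook() {
    [ -s "$1" ] || return 1
    [ "$(head -c 1 "$1")" = "{" ] || return 1
    grep -q '"nbformat"' "$1"
}

# Example (not run automatically):
#   looks_like_notebook ATL06-direct-access.ipynb && echo "notebook looks OK"
```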

Step 8: Add Swap Memory (Required for `t3.micro`)

If you’re using a t3.micro instance, enabling swap memory is essential. These instances have very limited RAM, which can cause Conda environment creation or package installation to fail.

Run the following commands in your EC2 terminal:

# Create a 5 GiB file to use as swap space
sudo fallocate -l 5G /swapfile

# Secure the file so only root can access it
sudo chmod 600 /swapfile

# Initialize the file as swap space
sudo mkswap /swapfile

# Activate the swap file
sudo swapon /swapfile

Running these commands:

- Reserves 5 GiB of disk storage to act as overflow memory when RAM runs out.
- Reduces the risk of the system crashing or freezing during heavy memory usage.
- Does not persist across restarts: the swap is deactivated when the instance is stopped or rebooted. The /swapfile itself stays on disk, so you can re-enable it with sudo swapon /swapfile.

Swap memory is slower than physical RAM, so operations like Conda environment creation may take 15-30 minutes.
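
Swap enabled with swapon is not turned back on automatically after a restart. A common way to make it persistent is an entry in /etc/fstab. The sketch below adds such an entry only if none exists yet; ensure_swap_entry is a hypothetical helper name, and the entry format shown is one widely used form rather than the only valid one.

```shell
# Hypothetical helper: append a swap entry to an fstab file unless one for
# /swapfile is already there. Defaults to /etc/fstab.
ensure_swap_entry() {
    fstab=${1:-/etc/fstab}
    entry='/swapfile swap swap defaults 0 0'
    if grep -q '^/swapfile[[:space:]]' "$fstab"; then
        echo "already present"
    elif [ -w "$fstab" ]; then
        echo "$entry" >> "$fstab" && echo "added"
    else
        # /etc/fstab is owned by root, so append through sudo.
        echo "$entry" | sudo tee -a "$fstab" > /dev/null && echo "added"
    fi
}

# Example (not run automatically):
#   ensure_swap_entry     # activate /swapfile automatically at boot
#   swapon --show         # list active swap areas
#   free -h               # show RAM and swap totals
```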

Step 9: Create and Activate Conda Environment

# Install mamba into base environment
conda install mamba -n base -c conda-forge

# Create new Conda environment from .yml
mamba env create -f environment.yml

# Activate newly created environment 
conda activate ICESat2_cloud_access

# Install additional packages for Jupyter notebook
pip install notebook ipywidgets widgetsnbextension

When prompted, accept the terms of service and confirm each installation by typing y (or yes).

Be patient while creating the Conda environment — the process may appear slow, especially on smaller instances, but do not interrupt it.

You can replace the example environment.yml with any other environment file if you want to customize the setup for your own projects or notebooks.
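
If you do swap in a different environment file, the name passed to conda activate changes with it. The sketch below reads the name from the file; env_name_from_yml is a hypothetical helper, and it assumes the file declares an unquoted top-level name: key.

```shell
# Hypothetical helper: print the environment name declared in a Conda
# environment file (the first top-level "name:" key).
env_name_from_yml() {
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Example (not run automatically):
#   conda activate "$(env_name_from_yml environment.yml)"
#   conda env list    # the active environment is marked with *
```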

Step 10: Set Jupyter Notebook Password

jupyter notebook password

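Enter a unique password when prompted, then confirm it by typing it a second time. This password protects your Jupyter session from unauthorized access. If JupyterLab later rejects the password, set it again with jupyter server password: recent JupyterLab releases run on Jupyter Server, which stores its password separately from the classic Notebook server.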

Step 11: Launch JupyterLab

jupyter lab --no-browser --ip=0.0.0.0 --port=8888

Keep this terminal open while you use JupyterLab. Closing it will stop the Jupyter server.
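
If you would rather not keep the SSH session open, one option is to run the server in the background with nohup. This is a minimal sketch, not part of the original workflow; start_jupyter_bg and the default log path are hypothetical choices.

```shell
# Hypothetical helper: start JupyterLab in the background so it keeps running
# after the SSH session closes; server output goes to the given log file.
start_jupyter_bg() {
    log_file=${1:-$HOME/jupyter.log}
    nohup jupyter lab --no-browser --ip=0.0.0.0 --port=8888 \
        > "$log_file" 2>&1 &
    echo "JupyterLab started with PID $! (log: $log_file)"
}

# Example (not run automatically):
#   start_jupyter_bg
#   tail -f ~/jupyter.log     # follow the server log
#   kill <PID>                # stop the server when you are done
```

Run it from the activated Conda environment so that the jupyter command resolves to the installation from Step 9.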

Step 12: Open JupyterLab in Your Browser

1. Open a web browser and navigate to:

http://<your-ec2-public-dns>:8888

Replace <your-ec2-public-dns> with the public DNS or IP address of your EC2 instance (available in the AWS Console under your instance details).

2. Enter the password you set in Step 10.

3. Once logged in, you’ll see the JupyterLab interface with your notebook displayed in the left sidebar. Double-click the notebook to open and run it.
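
If the page does not load, a quick check from your local terminal can help narrow down the cause. The sketch below assumes curl is installed locally; jupyter_status is a hypothetical helper name, and /login is the login page path Jupyter serves when a password is set.

```shell
# Hypothetical helper: print the HTTP status code returned by the Jupyter
# login page. curl reports 000 when no connection could be made in time.
jupyter_status() {
    curl -s -o /dev/null -w '%{http_code}\n' --max-time 5 \
        "http://$1:8888/login"
}

# Example (not run automatically):
#   jupyter_status <your-ec2-public-dns>
```

A regular HTTP status code suggests the server is reachable. A result of 000 means no connection was made, which usually points to the server not running or to the security group not allowing your current IP address on port 8888.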

Step 13: Stop or Terminate Your Instance

Leaving your instance running will continue to incur charges, even if you’re not using it. When you’re finished, you can stop or terminate it:

Stop = Shuts the instance down, but keeps your data and settings so you can restart it later. Compute charges end while the instance is stopped, but the attached EBS storage is still billed.
Terminate = Permanently deletes the instance and all data stored on it.

How to Stop or Terminate

1. In the Instances table, ensure your EC2 instance is selected.

2. Click the Instance state dropdown at the top right.

3. Choose:

Stop instance (to pause and save your setup)
Terminate instance (to completely delete it and avoid future costs)

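Tip: If you’re not ready to delete your work, choose Stop. You can always restart the instance later. After a restart, the public DNS name usually changes (unless you attached an Elastic IP), so copy the new SSH command from the Connect page and use the new address in Step 12.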

Final Thoughts

This tutorial guides you through a complete workflow for:

- Setting up a cloud-based environment on AWS EC2
- Installing Miniconda and configuring a Conda environment
- Configuring authentication for NASA Earthdata
- Preparing a reproducible environment for Earth science data analysis

Once configured, this workflow can be adapted to:

- Work with other NASA Earthdata data sets
- Develop and test your own analysis notebooks
- Scale up for larger processing tasks using different EC2 instance types

Whether you’re exploring ICESat-2, SMAP, or other Earthdata collections, this approach lays the groundwork for a wide range of Earth science applications in the cloud.

Last Updated: August 2025