Accessing NASA Earthdata in the Cloud with JupyterLab on EC2
This tutorial guides you through setting up JupyterLab on an AWS EC2 instance, uploading a sample notebook, and running it directly in your browser. By working in the cloud, you can access large datasets—such as NASA Earthdata—without having to download them to your local machine.
This approach offers a flexible and secure workflow for Earth science analysis and other data-intensive tasks. Once set up, your JupyterLab environment can be accessed from any computer with an internet connection, making it ideal for remote work and collaboration.
Although the setup takes a little extra work upfront, it pays off with faster performance, minimal local storage needs, and the flexibility to continue your work from anywhere.
Prerequisites
Before you begin, make sure you have the following in place:
1. An AWS Account
- If you don’t have one, sign up at: https://aws.amazon.com/
2. Correct AWS Region
- In the AWS Management Console (https://console.aws.amazon.com/), locate the Region selector dropdown in the upper-right corner.
- Select United States (Oregon), us-west-2.
- This ensures all your resources (VPC, EC2, Security Groups) are created in the same location, which avoids cross-region connection issues and extra data transfer costs. NASA Earthdata Cloud data is also hosted in us-west-2, so direct S3 access works only from this Region.
3. An EC2 instance
- You will need an active EC2 instance before proceeding with this workflow.
- Create one using the AWS EC2 launch wizard.
- For detailed instructions, see: Getting Started with Amazon EC2 and our own guide on launching and connecting to a free AWS EC2 instance.
- Recommended instance types:
  - t3.micro: Free Tier. Slower performance, limited RAM (~30 min setup).
  - t2.medium: Faster, 4 GiB RAM (may incur charges).
- Use Amazon Linux or Ubuntu 22.04 LTS with at least 15 GiB of storage.
- Download the .pem key file; you’ll need it for SSH access.
4. Terminal with SSH Support
- Examples: macOS Terminal, Windows WSL, or any Linux shell.
Step 1: Configure a Security Group for Jupyter
1. Navigate to the AWS EC2 Console at https://console.aws.amazon.com/ec2/.
2. In the left navigation pane, go to Security Groups under the Network & Security section.
3. Click the Create security group button (orange button in the upper-right corner).
4. Fill in the Basic details:
- Security group name: e.g. JupyterAccess
- Description: e.g. Allow SSH and Jupyter Notebook access
- VPC: Select the VPC where your EC2 instance will run (usually the default VPC).
5. Under Inbound rules, add:
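Rule 1 – Allow inbound HTTPS
- Type: HTTPS
- Port range: 443
- Source: Anywhere-IPv4
- Note: this rule only governs incoming connections on port 443. Requests that the instance itself makes to external websites are covered by the outbound rule in item 6, because security groups are stateful.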
Rule 2 – Allow secure SSH access to the instance
- Type: SSH
- Port range: 22
- Source: My IP (recommended for security; prevents public SSH access)
Rule 3 – Allow Jupyter Notebook access
- Type: Custom TCP
- Port range: 8888
- Source: My IP (highly recommended; only your machine can connect)
6. Under Outbound rules, confirm that the following rule is present:
- Type: All traffic → Destination: Custom
- This is the default for new security groups in all AWS Regions. If it is missing, add it manually so that your instance can download packages and fetch external data.
7. Click Create security group (bottom right).
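If you prefer the command line, the three inbound rules above can be sketched with the AWS CLI. This is a minimal sketch, not part of the original workflow: it assumes the AWS CLI is installed and configured on your local machine and that the security group already exists. The names add_jupyter_rules, run, and DRY_RUN are illustrative, and the group ID and IP address in the example are placeholders. With DRY_RUN=1 the commands are printed instead of executed, so you can review them first.

```shell
# Helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Add the tutorial's three inbound rules to an existing security group.
# Arguments: <security-group-id> <your-public-ipv4-address>
add_jupyter_rules() {
    sg_id="$1"
    my_cidr="$2/32"   # /32 limits the rule to a single address ("My IP")
    # Rule 1: HTTPS from anywhere
    run aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
        --protocol tcp --port 443 --cidr 0.0.0.0/0
    # Rule 2: SSH from your address only
    run aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
        --protocol tcp --port 22 --cidr "$my_cidr"
    # Rule 3: Jupyter from your address only
    run aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
        --protocol tcp --port 8888 --cidr "$my_cidr"
}

# Example (placeholder group ID and documentation-range IP address):
#   DRY_RUN=1 add_jupyter_rules sg-0123456789abcdef0 203.0.113.7
```

Keeping the SSH and Jupyter rules on a /32 CIDR mirrors the "My IP" choice in the console. If your public IP address changes, the rules need to be updated.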
Step 2: Assign Security Group to Instance
1. In the left navigation pane, click Instances.
2. Select your EC2 instance (check the box next to its name).
3. At the top right of the instance details page, click Actions (dropdown menu).
4. From the dropdown, choose Security → Change security groups.
5. Add the security group you just created (e.g., JupyterAccess).
6. Click Save to apply changes.
Step 3: Start the Instance
Since your instance is already selected from Step 2:
1. At the top right of the Instances table, click the Instance state dropdown.
2. Select Start instance.
3. Wait until the Instance state column updates to Running before continuing.
Step 4: Connect to the Instance
1. In the Instances table, ensure your EC2 instance is selected.
2. At the top right of the table, click the Connect button (located next to Instance state and Actions) and go to the SSH client tab.
3. Prepare your .pem file:
- Make sure it is in your terminal's current working directory (or use the full path to the file in the commands below).
- Set the correct permissions to secure the file:
chmod 400 "<your_key>.pem"
4. Copy the SSH command provided in the SSH client tab. It usually looks like this:
ssh -i "<your_key>.pem" ec2-user@<your_instance_public_dns>
- Replace <your_key> with your .pem file name.
- Replace <your_instance_public_dns> with your EC2 public DNS.
5. Run the SSH command in your terminal to connect to the EC2 instance. When asked whether you want to continue connecting, type yes.
- Once connected, your terminal prompt will change to show the remote EC2 instance (e.g., ec2-user@ip-172-31-xx-xxx:~$).
- If you’re using Ubuntu instead of Amazon Linux, the username may be ubuntu instead of ec2-user.
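To avoid retyping the key path, username, and host each time, the connection details can be stored as an entry in your local ~/.ssh/config. The sketch below only prints such an entry; the function name ssh_config_entry and the alias jlab are illustrative assumptions, while Host, HostName, User, and IdentityFile are standard OpenSSH client options.

```shell
# Print an OpenSSH client config entry for the instance.
# Arguments: <alias> <ec2-public-dns> <remote-user> <path-to-key.pem>
ssh_config_entry() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User %s\n' "$3"
    printf '    IdentityFile %s\n' "$4"
}

# Example (run on your local machine, with your own values):
#   ssh_config_entry jlab <your_instance_public_dns> ec2-user ~/keys/<your_key>.pem >> ~/.ssh/config
#   ssh jlab
```

The public DNS name of an instance normally changes after a stop and start unless an Elastic IP is attached, so the HostName line may need updating when you resume work.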
Step 5: Install Miniconda on the EC2 Instance
Run these commands in the same terminal:
# Create install directory
mkdir -p ~/miniconda3
# Download Miniconda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
# Run installer silently
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
# Clean up installer
rm -rf ~/miniconda3/miniconda.sh
# Initialize conda
~/miniconda3/bin/conda init bash
# Reload shell
source ~/.bashrc
After initializing conda, you might see the message "==> For changes to take effect, close and re-open your current shell. <==". If so, simply disconnect and reconnect via SSH to ensure conda is available.
Step 6: Create a .netrc File for Earthdata Access
In the same terminal, run the following commands:
# Add your Earthdata credentials to a .netrc file
echo 'machine urs.earthdata.nasa.gov login <your_username> password <your_password>' >> ~/.netrc
# Secure the .netrc file so only your user can read it
chmod 600 ~/.netrc
Replace <your_username> and <your_password> with your Earthdata Login credentials. Do not share this file or your EC2 instance with others.
To confirm that the file exists and is readable only by your user (permissions -rw-------), run:
ls -l ~/.netrc
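Because the echo command above appends with >>, running it twice leaves two entries in the file, and the password also ends up in your shell history. The following is a hedged alternative sketch, not the tutorial's own method: write_netrc_entry replaces any existing urs.earthdata.nasa.gov line, and prompt_netrc reads the password without echoing it. Both function names are illustrative.

```shell
# Write (or replace) the Earthdata Login entry in a netrc-style file.
# Arguments: <netrc-file> <username> <password>
write_netrc_entry() {
    netrc_file="$1"
    old_umask=$(umask)
    umask 077   # new files are created readable by the owner only
    if [ -f "$netrc_file" ]; then
        # Keep every line except a previous Earthdata entry
        grep -v '^machine urs\.earthdata\.nasa\.gov ' "$netrc_file" \
            > "$netrc_file.tmp" || true
    else
        : > "$netrc_file.tmp"
    fi
    printf 'machine urs.earthdata.nasa.gov login %s password %s\n' "$2" "$3" \
        >> "$netrc_file.tmp"
    mv "$netrc_file.tmp" "$netrc_file"
    chmod 600 "$netrc_file"
    umask "$old_umask"
}

# Interactive wrapper: prompts for credentials and hides the password.
prompt_netrc() {
    printf 'Earthdata username: '
    read -r ed_user
    printf 'Earthdata password: '
    stty -echo
    read -r ed_pass
    stty echo
    printf '\n'
    write_netrc_entry "$HOME/.netrc" "$ed_user" "$ed_pass"
}

# Example (in the EC2 terminal):
#   prompt_netrc
```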
Step 7: Download the Example Files to Your EC2 Instance
This tutorial uses two example files from NSIDC’s GitHub repository:
- ATL06-direct-access.ipynb — A sample Jupyter notebook for ICESat-2 data access.
- environment.yml — Defines the Conda environment used in the tutorial.
Download the files directly to your EC2 instance
1. On GitHub, copy the “Raw” URL of each file. Then, in your EC2 terminal, run:
# Download the sample Jupyter notebook
wget https://raw.githubusercontent.com/nsidc/NSIDC-Data-Tutorials/main/notebooks/ICESat-2_Cloud_Access/ATL06-direct-access.ipynb
# Download the environment file
wget https://raw.githubusercontent.com/nsidc/NSIDC-Data-Tutorials/main/notebooks/ICESat-2_Cloud_Access/environment/environment.yml
2. Verify the files downloaded successfully by listing your current directory:
ls
You should see the two downloaded files (the notebook and the environment file) and the newly created miniconda3 directory:
ATL06-direct-access.ipynb environment.yml miniconda3
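Listing the directory shows that the files exist, but not that they have content. As a small optional check, the sketch below reports each file as present and non-empty or not; check_downloads is an illustrative helper name, not part of the tutorial.

```shell
# Report whether each given file exists and is non-empty.
# Returns 0 if all files pass, 1 otherwise.
check_downloads() {
    result=0
    for path in "$@"; do
        if [ -s "$path" ]; then
            echo "ok: $path"
        else
            echo "missing or empty: $path"
            result=1
        fi
    done
    return "$result"
}

# Example (in the directory where you ran wget):
#   check_downloads ATL06-direct-access.ipynb environment.yml
```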
Step 8: Add Swap Memory (Required for t3.micro)
If you’re using a t3.micro instance, enabling swap memory is essential. These instances have very limited RAM, which can cause Conda environment creation or package installation to fail.
Run the following commands in your EC2 terminal:
# Create a 5 GiB file to use as swap space
sudo fallocate -l 5G /swapfile
# Secure the file so only root can access it
sudo chmod 600 /swapfile
# Initialize the file as swap space
sudo mkswap /swapfile
# Activate the swap file
sudo swapon /swapfile
Notes on these commands:
- They reserve 5 GiB of disk storage to act as overflow for RAM.
- They reduce the risk of the system crashing or freezing during heavy memory usage.
- The swap is temporary: it is not re-enabled automatically after the instance is stopped or rebooted. The /swapfile itself stays on disk, so you only need to run sudo swapon /swapfile again.
- Swap memory is slower than physical RAM, so operations like Conda environment creation may take 15-30 minutes.
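If you would rather have the swap file re-enabled automatically after a reboot, the usual approach on Linux is an entry in /etc/fstab. The sketch below is an optional addition under that assumption; add_swap_to_fstab is an illustrative name, and the function takes the file path as an argument so that it can be tried on a copy first.

```shell
# Append the swap entry to an fstab-style file unless it is already there.
# Argument: <fstab-path> (defaults to /etc/fstab, which requires root)
add_swap_to_fstab() {
    fstab_path="${1:-/etc/fstab}"
    swap_entry='/swapfile none swap sw 0 0'
    if ! grep -qxF "$swap_entry" "$fstab_path" 2>/dev/null; then
        printf '%s\n' "$swap_entry" >> "$fstab_path"
    fi
}

# Example on a scratch copy:
#   cp /etc/fstab /tmp/fstab.test && add_swap_to_fstab /tmp/fstab.test
# Equivalent one-liner on the instance (needs root, so it goes through sudo):
#   echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```

You can confirm that swap is active at any time with swapon --show or free -h.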
Step 9: Create and Activate Conda Environment
# Install mamba into base environment
conda install mamba -n base -c conda-forge
# Create new Conda environment from .yml
mamba env create -f environment.yml
# Activate newly created environment
conda activate ICESat2_cloud_access
# Install additional packages for Jupyter notebook
pip install notebook ipywidgets widgetsnbextension
When prompted, accept the terms of service and type yes to proceed.
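The environment name passed to conda activate comes from the name: field of environment.yml. If activation fails with an unknown-environment error, a quick way to check the expected name is sketched below; env_name is an illustrative helper that assumes the name: key starts at the beginning of a line, as is conventional for Conda environment files.

```shell
# Print the environment name declared in a Conda environment file.
# Argument: <path-to-environment.yml>
env_name() {
    # Strip the leading "name:" and any following whitespace; keep the first match
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Example:
#   conda activate "$(env_name environment.yml)"
```

Running conda env list also shows every environment that exists on the instance.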
Step 10: Set Jupyter Notebook Password
jupyter notebook password
Enter a unique password when prompted and confirm it by typing it a second time. This password will secure access to your Jupyter session, preventing unauthorized users from accessing your notebooks.
Step 11: Launch JupyterLab
jupyter lab --no-browser --ip=0.0.0.0 --port=8888
Keep this terminal open while you use JupyterLab. Closing it will stop the Jupyter server.
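If your SSH connection drops, the server stops with it. One common workaround, sketched here as an optional alternative, is to start JupyterLab inside a detached tmux session. This assumes tmux is installed on the instance (it may need to be installed with your distribution's package manager) and that you run it from the shell where the Conda environment is active. The names start_jupyter_tmux, run, and DRY_RUN are illustrative.

```shell
# Helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Start JupyterLab in a detached tmux session.
# Arguments: [session-name] [port]
start_jupyter_tmux() {
    session="${1:-jupyter}"
    port="${2:-8888}"
    # -d: do not attach; -s: session name; the last argument is the command to run
    run tmux new-session -d -s "$session" \
        "jupyter lab --no-browser --ip=0.0.0.0 --port=$port"
}

# Example (in the EC2 terminal, with the Conda environment activated):
#   start_jupyter_tmux
#   tmux attach -t jupyter     # view the server output; detach with Ctrl-b then d
#   tmux kill-session -t jupyter   # stop the server
```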
Step 12: Open JupyterLab in Your Browser
1. Open a web browser and navigate to:
http://<your-ec2-public-dns>:8888
- Replace <your-ec2-public-dns> with the public DNS or IP address of your EC2 instance (available in the AWS Console under your instance details).
2. Enter the password you set in Step 10.
3. Once logged in, you’ll see the JupyterLab interface with your notebook displayed in the left sidebar. Double-click the notebook to open and run it.
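The URL above uses plain HTTP, so traffic between your browser and the instance, including the password, is not encrypted. A hedged alternative, not part of the original steps, is to forward the port through your existing SSH connection and browse to http://localhost:8888 instead. The sketch assumes it is run on your local machine; open_jupyter_tunnel, run, and DRY_RUN are illustrative names, while -i, -N, and -L are standard OpenSSH options.

```shell
# Helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Forward local port 8888 to port 8888 on the instance over SSH.
# Arguments: <path-to-key.pem> <ec2-public-dns> [remote-user]
open_jupyter_tunnel() {
    key_file="$1"
    host="$2"
    remote_user="${3:-ec2-user}"
    # -N: run no remote command; -L: local 8888 -> localhost:8888 on the instance
    run ssh -i "$key_file" -N -L 8888:localhost:8888 "$remote_user@$host"
}

# Example (on your local machine; leave it running while you work):
#   open_jupyter_tunnel "<your_key>.pem" <your_instance_public_dns>
```

With a tunnel in place, the inbound rule for port 8888 is no longer needed for browser access, since only the SSH port is used.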
Step 13: Stop or Terminate Your Instance
Leaving your instance running will continue to incur charges, even if you’re not using it. When you’re finished, you can stop or terminate it:
- Stop = Shuts the instance down, but keeps your data and settings so you can restart it later. Compute charges stop, but storage for the attached EBS volume is still billed.
- Terminate = Permanently deletes the instance and all data stored on it.
How to Stop or Terminate
1. In the Instances table, ensure your EC2 instance is selected.
2. Click the Instance state dropdown at the top right.
3. Choose:
- Stop instance (to pause and save your setup)
- Terminate instance (to completely delete it and avoid future costs)
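The same stop action can be sketched with the AWS CLI, which is handy if you want to shut the instance down without opening the console. This assumes the AWS CLI is installed and configured on your local machine; stop_instance, run, and DRY_RUN are illustrative names, and the instance ID in the example is a placeholder.

```shell
# Helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Stop an instance and block until it has fully stopped.
# Argument: <instance-id>
stop_instance() {
    run aws ec2 stop-instances --instance-ids "$1"
    run aws ec2 wait instance-stopped --instance-ids "$1"
}

# Example (placeholder instance ID):
#   DRY_RUN=1 stop_instance i-0123456789abcdef0
```

Terminating is deliberately left out of the sketch because it is irreversible; the console flow above is a safer place to do it.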
Final Thoughts
This tutorial guided you through a complete workflow for:
- Setting up a cloud-based environment on AWS EC2
- Installing Miniconda and configuring a Conda environment
- Configuring authentication for NASA Earthdata
- Preparing a reproducible environment for Earth science data analysis
Once configured, this workflow can be adapted to:
- Work with other NASA Earthdata data sets
- Develop and test your own analysis notebooks
- Scale up for larger processing tasks using different EC2 instance types
Whether you’re exploring ICESat-2, SMAP, or other Earthdata collections, this approach lays the groundwork for a wide range of Earth science applications in the cloud.
Last Updated: August 2025