Using the NASA Openscapes Framework to enable open science
By Audrey Payne
Science is changing. Or rather, the way that scientific research is being conducted, shared, and replicated is changing. It is becoming more timely, accessible, and inclusive, thanks to the open science movement. Open science is the concept that scientific research should be accessible, transparent, and reproducible. It encourages diversity and access, makes science more global and collaborative, and accelerates discovery. The four pillars of open science are open data, open code, open access to papers, and open review.
In 2018, Openscapes sprang up from this movement, dedicated to “championing open practices in environmental science to help uncover data-driven solutions faster.” And as NASA moves its data, including data collections managed and housed at its 12 distributed active archive centers (DAACs), over to the Earthdata Cloud, the agency will lean on Openscapes. In 2021, NASA awarded Openscapes a three-year grant (#20-TWSC20-2-0003) for a project led by Julia Stewart Lowndes of the University of California, Santa Barbara and Erin Robinson of Metadata Game Changers, in collaboration with 2i2c and The Carpentries. This grant will support NASA DAAC researchers in migrating workflows to the cloud, with the end goal of enabling more open science.
“There is an efficiency that you get from doing open science by making sure your code is open and your data is findable, accessible, interoperable, and reproducible—FAIR,” said Andrew Barrett, a senior associate scientist and open science lead for the National Snow and Ice Data Center (NSIDC). “This saves a lot of time for scientists to do their work because they can share tools and how they work with data, so they can get to doing science quicker.”
Moving data to the cloud
The Earthdata Cloud is NASA’s archive of Earth observations, hosted in Amazon Web Services (AWS) alongside related tools and services from the NASA DAACs. As the sheer volume of data produced by NASA missions continues to expand, the need for an extensible storage system has become apparent. The cloud offers a scalable way to address current data storage issues, and also comes with multiple advantages for users.
For example, before the formation of Earthdata Cloud, data were housed entirely at the NASA DAACs. Scientists who needed data from different DAACs had to go to each individual DAAC, download data to their personal machines, and repeat. With data in the cloud, they will need to go to only one place. An added benefit is that analysis can be done directly in the cloud without downloading the data, because computing resources will sit alongside the data. The process will be much more efficient and will make analysis quicker. In addition, sharing computing resources will make research more reproducible, which has the potential to enable more open science. Completing the data migration to the cloud will likely take several years, but multiple data sets have already moved over.
The Openscapes Framework
To support NASA DAAC researchers during the data migration to the cloud, Openscapes developed the Openscapes Framework, a scalable leadership training and community-building framework with three components: engaging DAAC mentors, empowering science research teams, and amplifying open science leaders. Under this model, cohorts work together to solve problems that come up as they try to make their workflows more open. Several NASA DAACs, including the NSIDC DAAC, are involved in the initiative. DAAC scientists act as mentors, head workshops, and develop tutorials and guides within the Openscapes Framework on how to access and work with NASA data in the cloud.
In addition to the workshops and tutorials, DAAC scientists involved in the NASA Openscapes project are working on a NASA Earthdata Cloud Cookbook, a resource to help data users learn how to work with NASA Earthdata in the cloud and how to contribute to these resources. The Cookbook is a work in progress, and additional resources are being developed to ease scientists' transition to working with NASA data in the cloud.
“The model that Openscapes is using—this cohort, community-building model—and in general moving toward open science, is as much about the human side of working as much as it is about the technical side of working,” said Barrett. “To some extent, the human side of working is more important because in order to share code and share knowledge earlier in the process—not just when we publish a paper—that requires trust. One of the really powerful things I see coming out of Openscapes is this community building.”