Big data at NASA

Deputy Associate CIO for Technology and Innovation, NASA

Data is the new currency and sits at the heart of everything digital. I like to repeat the adage, “Data becomes information, becomes knowledge, becomes wisdom,” along with “It’s all about the data.” After all, why do we send up probes, sensors, and satellites? For the data.

I’d like to touch upon some mission critical projects showing how NASA is using Big Data in a variety of ways to accomplish its goals:

  • Quantum Artificial Intelligence Laboratory (QuAIL) and quantum computers
  • NASA's Pleiades Supercomputer performing modeling and simulation
  • Distributed Active Archive Centers (DAACs) and massive storage of Earth Science data
  • Network Activity Cybersecurity Risk Assessment (NACRA) and network Cyber Security
  • Exploration Medical Capabilities (ExMC) and Expert Medical Care.

I’d also like to identify an effort where IBM and NASA are partnering in utilization of the Watson Super Computer regarding a whole spectrum of projects.

At the 2017 Annual Data Warehouse Big Data and Enterprise Architecture Meeting on March 15th, sponsored by Data Management Forum, I pointed out a few of the tools we use to gather this data and what’s on the horizon for big data. The Quantum Artificial Intelligence Laboratory (QuAIL) is NASA’s hub for an experiment to assess the potential of quantum computers to perform calculations that are difficult or impossible using conventional supercomputers. “The D-Wave took about a hundredth of a second; with a classical computer it’d take about 100 days,” as recently quoted by Google’s Director of Engineering, Hartmut Neven.

Figure 1: Quantum Artificial Intelligence Laboratory (QuAIL) D-Wave Supercomputer. Photo by article author taken 1 May 2014, with Ed McLarney from Langley Research Center in the background.

NASA's Pleiades Supercomputer is one of the Agency’s state-of-the-art technologies for meeting supercomputing requirements, enabling scientists and engineers to conduct mission modeling and simulation. It is located at NASA Ames Research Center near Silicon Valley and is run by Dr. Piyush Mehrotra. As of November 2016 it ranked #13 in the world, sustaining 5.95 petaflops on the LINPACK benchmark.

Figure 2: Shown (reentry heat shield simulation) is the type of visualizations produced by the Pleiades Supercomputer. 

And where is all that data stored? One of the many places is NASA’s Distributed Active Archive Centers (DAACs). The DAACs archive, process, document, and distribute data for all of NASA’s Earth Science Division missions through 12 science-discipline-focused centers.

Figure 3: Example of visual data stored at one of NASA’s DAACs. Image captured on 29 January 2017 by the VIIRS instrument, on board the joint NASA/NOAA SNPP satellite.

What big data tools does NASA use? A simple internet search for “Big Data Landscape” turns up a graphic with hundreds of companies. In fact, every time I ask the NASA Big Data Working Group if someone is using a specific tool I just heard about, someone raises their hand and says yes. We use those tools to work on projects and prototypes like Network Activity Cybersecurity Risk Assessment (NACRA), which is entering phase II prototyping. It uses advanced technology to identify and visualize network flows associated with attacks or unauthorized data exfiltration. The goal is to give analysts an approach, and the appropriate technology, that make it easier to surface the actionable security events buried in all that big data.
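To make the idea concrete, here is a minimal sketch of flagging suspicious network flows by statistical outlier detection. Everything here is illustrative and hypothetical: the flow records, field layout, and threshold are invented for the example and do not reflect NACRA's actual design.

```python
from statistics import mean, stdev

# Hypothetical flow records: (source host, destination, bytes transferred).
flows = [
    ("hostA", "10.0.0.5", 1_200),
    ("hostA", "10.0.0.5", 1_450),
    ("hostB", "10.0.0.7", 900),
    ("hostA", "10.0.0.5", 1_300),
    ("hostC", "198.51.100.9", 250_000),  # unusually large outbound transfer
]

def flag_anomalies(flows, z_threshold=1.5):
    """Flag flows whose byte count sits far above the population mean."""
    sizes = [b for _, _, b in flows]
    mu, sigma = mean(sizes), stdev(sizes)
    return [f for f in flows if sigma and (f[2] - mu) / sigma > z_threshold]

print(flag_anomalies(flows))  # the hostC flow stands out
```

A production system would of course baseline per host and per protocol over time rather than use a single global threshold, but the principle is the same: the interesting events are statistical needles in a very large haystack.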

Another prototype is Exploration Medical Capabilities (ExMC) Data Analytics. Its purpose is to give the crew the best chance of accomplishing the mission and getting home healthy, and to provide a tool that supports informed decision making. Imagine an exploration mission system that could:

  • Provide for centralized medical care
  • Enhance the available knowledge base
  • Monitor supplies for the crew
  • Monitor the crew as needed
  • Streamline communication with ground flight surgeons
  • Decrease likelihood of medical errors.
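One of the capabilities above, monitoring supplies, can be sketched in a few lines: project consumption over the remaining mission and alert on any item that would run out. The item names, usage rates, and quantities below are hypothetical placeholders, not actual ExMC data.

```python
# Illustrative sketch of a "monitor supplies" check for a long-duration mission.

def supply_alerts(inventory, daily_use, days_remaining):
    """Return (item, on_hand, projected_need) for items that will run out."""
    alerts = []
    for item, on_hand in inventory.items():
        needed = daily_use.get(item, 0) * days_remaining
        if needed > on_hand:
            alerts.append((item, on_hand, needed))
    return alerts

inventory = {"saline_ml": 4000, "bandages": 120, "analgesic_doses": 60}
daily_use = {"saline_ml": 50, "bandages": 1, "analgesic_doses": 0.5}

print(supply_alerts(inventory, daily_use, days_remaining=100))
# Only saline is projected to fall short over the next 100 days.
```

The value of a data-analytics approach is that these projections can be updated continuously from actual usage telemetry rather than from pre-flight estimates.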

Figure 4: Shown is NASA astronaut Dan Burbank, Expedition 30 commander, as he prepares to use the Integrated Cardiovascular Resting Echo Scan on a crew member (out of frame) at the Human Research Facility rack in the Columbus laboratory of the International Space Station.

Partnering with IBM on big data

Many key scientific questions are now being advanced by NASA and partner IBM, particularly when it comes to pioneering use of the Watson Super Computer.  IBM's question-answering whiz, the Watson computer system, famously beat former winners on Jeopardy in 2011 — and now it's digging into aerospace research and data to help NASA answer questions on the frontier of spaceflight science and make crucial decisions in the moment during air travel.

More than 60 years after the first IBM computing machines showed up in the halls of NASA's Langley Research Center, new work at Langley will use IBM technology to help researchers sort through the huge volumes of data generated by aerospace research. (Image credit: NASA/David C. Bowman; reporting by Sarah Lewin, Staff Writer)

"There's so much data out there that consists of unstructured text that usually only humans can make sense of, but the challenge is that there's too much of it for any human being to read," Chris Codella, an IBM Distinguised Engineer who is working on Watson, told "The idea here is to have a Watson system that can be a research development advisor to people who work in the aerospace fields."

Watson operates with what IBM calls cognitive computing — essentially, it draws connections after examining huge volumes of data fed to it, and it is able to return highly relevant answers within the fields that data encompasses. The system has been used to analyze connections within medical and scientific research documents, make potential diagnoses, invent recipes, and analyze people's personality traits through social media posts.
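The core idea of returning relevant answers from a body of unstructured text can be illustrated with something far simpler than Watson's actual pipeline: a bag-of-words cosine similarity over a toy corpus. The documents and query below are invented for the example.

```python
import math
from collections import Counter

# Toy corpus of aerospace research snippets (illustrative only).
docs = [
    "heat shield ablation during atmospheric reentry",
    "crew health monitoring aboard the space station",
    "turbulence modeling for hypersonic flight vehicles",
]

def vectorize(text):
    """Turn text into a bag-of-words term-count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def most_relevant(query, docs):
    """Return the document most similar to the query."""
    q = vectorize(query)
    return max(docs, key=lambda d: cosine(q, vectorize(d)))

print(most_relevant("reentry heat shield design", docs))
```

A system like Watson layers natural-language parsing, evidence scoring, and domain training on top of this basic retrieval idea, which is what lets it act as an advisor rather than a search box.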

Moving from data to decisions is the new frontier. Machine learning, deep learning, and artificial intelligence are already starting to help us get there. It will take only a few more breakthroughs for some of the new sci-fi movies to become reality. All the technology hype aside, we need innovative technology and data solutions to very complex problems if we are to set foot on another planet, and soon.

Editor’s note: This article is offered for publication in association with the Big Data Seminar 2017, November 16-17, 2017 in New York City at Hotel Pennsylvania, sponsored by Data Management Forum. Additional information is available in the Big Data Seminar flyer.
