Disaster prediction research at NASA FDL with kdb+/q, ML and AI

27 Aug 2019

By Diane O’Donoghue and Conor McCarthy

Kx is again partnering with NASA FDL on its fourth annual summer research accelerator, which is in full swing. For three years, data scientists from Kx have collaborated with FDL at its research lab in Mountain View, California, applying our kdb+ time series database and q programming language to AI problems. The goal of NASA FDL is to bring together experts with backgrounds in AI, technology, and the space sciences to apply AI methods to a broad range of global and space-related challenges. The key research areas this year include:

  1. The Moon for Good: How AI can help with the establishment of a permanent human presence on the moon
  2. Astronaut Health: AI to help support the monitoring of astronaut health
  3. Living with our Star: AI for prediction of solar activity
  4. Mission Support: AI for optimizing spacecraft operations
  5. Disaster Prevention, Progress and Response: How AI can be used to help reduce the impact of flooding events in the US and the wider world

This year Kx has been working with teams on both sides of the Atlantic on two different disaster prevention projects. Both initiatives have the same underlying goal: to help the public, emergency services, and governmental and non-governmental organizations make decisions surrounding flooding events. Below is a brief outline of what the projects involve; more detailed updates will follow when they are completed.

NASA FDL in Mountain View, California:

The main objective of the Disaster Prevention, Progress and Response (Floods) team is to use AI to improve the ability to forecast a stream area’s susceptibility to flooding, based on model inputs such as orbital imagery, rainfall radar data, basin characteristics, and impervious surface information.

Working in collaboration with the U.S. Geological Survey (USGS), the team aims to determine where machine learning could assist flood prediction and so improve the agency's ability to prepare for and respond to natural disasters. One of the initial focus areas is how changes over time in impervious surfaces and other basin characteristics alter a region’s susceptibility to flooding. The approach is new to the organization: the USGS does not currently use machine learning for model calibration or stream height estimates, and the mathematical hydrological models that are in use are extremely costly and time-consuming to prepare.

The dataset comes from over a thousand stream gauge stations which have been collecting stream height data every 15 minutes for a period of 10 years. This data is combined with rainfall data over these locations collected from the airborne Portable Remote Imaging SpectroMeter (PRISM). Because kdb+ performs well on very large datasets, it is ideal for ingesting and manipulating this volume of data; partitioned tables enable the team to quickly extract aggregate features. The sources that make up the dataset are joined spatially and temporally, taking advantage of q’s column-oriented design.
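
As an illustration of the temporal join described above, the sketch below attaches to each stream gauge reading the most recent rainfall value recorded at or before it for the same station, then computes a daily maximum stream height as one example of an aggregate feature. It is written in plain Python rather than q so that it runs without a kdb+ installation. The function names, row layout, and sample values are assumptions made for illustration and are not taken from the project. In q, this kind of join is available natively as the as-of join `aj`, although this post does not say which operations the team used.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime, timedelta


def asof_join(gauge_rows, rain_rows):
    """Attach the latest rainfall value at or before each gauge reading.

    gauge_rows: iterable of (station, time, height)   # hypothetical layout
    rain_rows:  iterable of (station, time, rainfall) # hypothetical layout
    """
    # Group rainfall readings by station and sort each group by time,
    # so every lookup below is a binary search.
    grouped = defaultdict(list)
    for station, t, rainfall in rain_rows:
        grouped[station].append((t, rainfall))

    rain_times, rain_values = {}, {}
    for station, series in grouped.items():
        series.sort(key=lambda pair: pair[0])
        rain_times[station] = [t for t, _ in series]
        rain_values[station] = [v for _, v in series]

    joined = []
    for station, t, height in gauge_rows:
        times = rain_times.get(station, [])
        i = bisect_right(times, t)  # count of rainfall readings at or before t
        rainfall = rain_values[station][i - 1] if i else None
        joined.append((station, t, height, rainfall))
    return joined


def daily_max_height(joined_rows):
    """Aggregate feature: maximum stream height per station per day."""
    result = {}
    for station, t, height, _ in joined_rows:
        key = (station, t.date())
        result[key] = max(height, result.get(key, height))
    return result


# Made-up sample: one station, four 15-minute gauge readings, two rain readings.
start = datetime(2019, 1, 1)
heights = [1.0, 1.1, 1.4, 1.2]
gauge = [("A", start + timedelta(minutes=15 * i), h) for i, h in enumerate(heights)]
rain = [("A", start, 0.0), ("A", start + timedelta(minutes=20), 2.5)]

joined = asof_join(gauge, rain)
for row in joined:
    print(row)
print(daily_max_height(joined))
```

Gauge readings with no earlier rainfall reading for their station receive `None` instead of being dropped, which keeps the row count of the gauge series unchanged after the join.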

We are currently deploying machine learning models to investigate what information we can extract when tabular stream gauge and rainfall data is combined with satellite and radar information collected by a variety of scientific instruments on Landsat Earth observation satellites.

FDL Europe in Oxford, England:

The primary goal of the corresponding team in Europe is to produce a neural network-based flood mapping system which can be deployed onto a CubeSat satellite. The project builds on aspects of work completed during last year’s FDL Europe project and makes use of Intel Movidius neural compute VPU chips, which allow the neural network to be deployed onto a low-power system.

A complementary part of the project will examine social media data to extract information about flooding events, namely to find affected individuals and infrastructural damage. This information will help NGOs and other interested parties contact those affected by the flooding and coordinate volunteers and donations.

This part of the project uses Kx machine learning tools such as embedPy, a software component that simplifies the running of a Python interpreter alongside a kdb+ server. In addition to embedPy, Kx’s NLP library and Machine Learning Toolkit are being used to process Twitter data from previous flooding events and to train a machine learning model that sorts tweets into distinct classes, which assists in assessing the data.
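
To make the idea of tweet classification concrete, here is a minimal sketch of a bag-of-words multinomial naive Bayes classifier in plain Python. This post does not say which model or features the team used, so naive Bayes is a stand-in chosen for brevity. The class labels are hypothetical names based on the two categories mentioned above, and the training sentences are invented examples, not real tweets.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training examples with hypothetical label names.
train_texts = [
    "family trapped on roof need rescue",
    "people stranded without food or water",
    "bridge collapsed and road washed out",
    "power lines down and road closed",
]
train_labels = [
    "affected_individuals",
    "affected_individuals",
    "infrastructure_damage",
    "infrastructure_damage",
]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("road closed near the bridge"))
print(model.predict("need rescue from roof"))
```

A real pipeline would need far more labelled data, text cleaning suited to tweets (hashtags, mentions, URLs), and held-out evaluation; the sketch only shows the shape of the train-then-classify step.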

We will be providing in-depth updates on these projects, the techniques used, and lessons learned when they are completed at the end of September. In the meantime, if you want to read more about the work that has been done previously by Kx with NASA FDL, please follow some of the links below.

The Exploration of Space Weather at NASA FDL with kdb+

White Paper on Detection of Exoplanets at NASA FDL with kdb+

The exploration of solar storm data using JupyterQ

For any machine learning related questions within the Kx technology stack please do not hesitate to contact

 
