White Paper on NLP for Flood Disaster Management at FDL Europe with kdb+

16 Oct 2019

By Conor McCarthy

The European Space Agency (ESA) Frontier Development Lab (FDL) is an applied artificial intelligence (AI) research accelerator, hosted by both the ESA Centre for Earth Observation (ESRIN) and Oxford University. The program brings non-commercial and private partners together with academic researchers to solve challenges in the space science sector using AI techniques and cutting-edge technologies.

2019 marks the third year of collaboration between Kx and FDL. I had the privilege of being involved in this year’s FDL project, and worked at both ESRIN and at Oxford as a consultant and visiting data scientist. This year’s program focused on three main research areas relating to space and the use of AI for social good:

  1. Atmospheric Phenomena and Climate Variability
  2. Disaster Prevention Progress and Response
  3. Ground Station Pass Optimization for Constellations

This year in both the US and Europe, Kx focused on the second of these challenges, and in particular, prevention and response as it applies to areas affected by flooding.

Flooding events affect on average 82.6 million people worldwide annually, across all social classes and geographic locations. As a result of global warming and its associated side effects, this number is likely to increase in the future.

The work presented in my white paper, and the associated JupyterQ notebook, highlights the use of natural language processing in a kdb+ architecture to classify tweets relating to flooding events. This could allow both governmental and non-governmental institutions to reach out to individuals touched by flooding or to highlight infrastructural damage reported by individuals on the ground. This ability is vital during emergency situations in order to ensure the damage caused by such events is minimized. 

In addition to the production of an LSTM recurrent neural network to classify the tweets into appropriate categories, a triaging system based on the kdb+ tickerplant architecture was built to show how such a model could be deployed in the real world.
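
To make the triaging idea concrete, here is a minimal Python sketch of the publish/subscribe pattern that a tickerplant-style design relies on. It is not the code from the white paper, which is written in q/kdb+. The category names, the keyword rules and the `Hub`, `classify` and `triage` names are all hypothetical, and the keyword classifier is only a placeholder for the trained LSTM.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical categories and keyword rules. This is a stand-in for the
# trained LSTM described in the paper, not a reproduction of it.
KEYWORDS = {
    "affected_individuals": ("trapped", "stranded", "rescue"),
    "infrastructure_damage": ("bridge", "road", "power"),
}


def classify(tweet: str) -> str:
    """Return the first category with a keyword in the tweet, else 'other'."""
    text = tweet.lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return "other"


class Hub:
    """Toy publish/subscribe hub: handlers register per topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        # Forward the message to every handler registered for this topic.
        for handler in self._subscribers[topic]:
            handler(message)


def triage(hub: Hub, tweet: str) -> str:
    """Classify a tweet, publish it under its category, return the category."""
    topic = classify(tweet)
    hub.publish(topic, tweet)
    return topic


# Each downstream consumer keeps its own queue of relevant tweets.
rescue_queue: List[str] = []
damage_queue: List[str] = []

hub = Hub()
hub.subscribe("affected_individuals", rescue_queue.append)
hub.subscribe("infrastructure_damage", damage_queue.append)

triage(hub, "Family stranded on a rooftop, water still rising")
triage(hub, "The main bridge into town has collapsed")
triage(hub, "Lovely sunny day here")
```

In this sketch the hub only routes messages, and classification happens before publishing, so the placeholder classifier could be swapped for a real model without touching the subscribers.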

This work makes use of the varied functionality released by the machine learning team over the last number of years, including the machine learning toolkit for data preprocessing and model evaluation, the NLP library for sentiment analysis, and embedPy for the training and fitting of models.

The full paper and JupyterQ notebooks explaining the data origins, data-preprocessing, feature engineering and the ML models used can be found here.

Additional information about Kx’s previous work with FDL can be found below:

The Detection of Exoplanets at NASA FDL with kdb+

The Exploration of Space Weather at NASA FDL with kdb+

Case study: Kdb+ Used at NASA Frontier Development Lab in Predictive AI tool

The Exploration of Solar Storm Data Using JupyterQ

VIDEO: The Exploration of Solar Storms at NASA FDL

Conor and Kx gratefully acknowledge the support of all those involved with FDL Europe, in particular James Parr, Sarah McGeehan, Jodie Hughes and all those on the European Floods Team.
