
Kdb+ Use Case: Machine Learning Water System Maintenance Application

6 Dec 2017

Kdb+ is increasingly used in machine learning applications. Its ability to quickly ingest and process data, particularly large, fragmented datasets, is one reason developers are adding kdb+ to their stack of artificial intelligence and machine learning tools.

For Australian kdb+ developer Sherief Khorshid, who also develops machine learning systems, incorporating kdb+ into a predictive maintenance application gave him the edge in a hackathon win that landed him a cash prize and a contract with the Water Corporation of Western Australia.

Background

The Ministry of Data in Perth is on a mission to promote an innovation ecosystem in Western Australia. To that end, it has started sponsoring hackathons where the brightest technologists and start-ups in the state take on some of the toughest problems facing Western Australian government agencies.

The first hackathon was held over a weekend in August 2017. Dozens of engineers gathered in a competition to solve problems posed by two government agencies: Main Roads, which oversees road access, and the Water Corporation, which manages water, wastewater and drainage.

A team led by Khorshid tackled the Water Corporation problem: find a way to identify events and triggers of potential problems in the water system early enough to reduce asset failure and enable timely maintenance within the network.

The challenge

The Water Corporation gave the hackers 20 years of performance history from three water plants. There were about 1,000 physical files for each plant, and the data consisted of over 300 columns and 1.5 million rows. The format of the data had changed several times over the years, so it was extremely messy and inconsistent. Although the data was technically considered “computer output,” it was not really computerized: it was not in a single format that could simply be loaded into a database and analyzed as is.

The solution

Up against a ticking clock, Khorshid chose to use kdb+ rather than Python to ingest and pre-process the data because he knew he could save himself many hours in the process. He quickly saw that he would need to use kdb+ bulk pattern matching to extract the data into consistent data representations in CSV format ready to re-ingest and analyze.
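
The normalization step described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the team's code, which used kdb+. The three line layouts, the column names, and the helpers `normalise` and `to_csv` are all hypothetical, since the article does not describe the actual file formats.

```python
import csv
import io
import re
from datetime import datetime

# Hypothetical legacy line layouts, each paired with its timestamp format.
# The real Water Corporation formats are not given in the article.
PATTERNS = [
    # e.g. "1998-03-07 14:00,PUMP01,12.5"
    (re.compile(r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}),(?P<asset>\w+),"
                r"(?P<value>-?\d+(?:\.\d+)?)$"), "%Y-%m-%d %H:%M"),
    # e.g. "07/03/1998 15:00;PUMP01;13"
    (re.compile(r"^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2});(?P<asset>\w+);"
                r"(?P<value>-?\d+(?:\.\d+)?)$"), "%d/%m/%Y %H:%M"),
    # e.g. "PUMP02 19980307 1600 7.25"
    (re.compile(r"^(?P<asset>\w+)\s+(?P<ts>\d{8} \d{4})\s+"
                r"(?P<value>-?\d+(?:\.\d+)?)$"), "%Y%m%d %H%M"),
]


def normalise(line):
    """Return (iso_timestamp, asset, value), or None if no layout matches."""
    text = line.strip()
    for pattern, ts_format in PATTERNS:
        match = pattern.match(text)
        if match:
            ts = datetime.strptime(match.group("ts"), ts_format)
            return (ts.isoformat(timespec="minutes"),
                    match.group("asset"),
                    float(match.group("value")))
    return None


def to_csv(lines):
    """Write every recognised line as one consistent CSV row; skip the rest."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["timestamp", "asset", "value"])
    for line in lines:
        row = normalise(line)
        if row is not None:
            writer.writerow(row)
    return out.getvalue()
```

Trying each known layout in turn and emitting a single schema means the downstream load only ever sees one format. Lines that match no layout are skipped here; a real pipeline would more likely log them for review.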

Khorshid’s team used a number of neural network and XGBoost regression models to make real-time forecasts of asset failure. His team also created a web-based GUI connecting directly to a kdb+ instance, allowing for two-way communication and data transfer.

The application that Khorshid built had a kdb+ process powering the back-end, “pumping out” the historical water network data. The back-end pushed out one hour’s worth of data every second. In the final presentation at the hackathon, Khorshid was able to show that multiple clients from around the world could run the app and query the data simultaneously.
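
The replay behaviour, one hour of history published per second, can be sketched as follows. This is a Python illustration under stated assumptions, not the kdb+ back-end itself. The record shape `(timestamp, asset, value)`, the function names, and the `publish` callback are hypothetical stand-ins for whatever the application used to send data to its clients.

```python
import time
from datetime import datetime
from itertools import groupby


def hourly_batches(records):
    """Yield (hour, batch) pairs from (timestamp, asset, value) records."""
    def hour_of(record):
        # Truncate the timestamp to the start of its hour.
        return record[0].replace(minute=0, second=0, microsecond=0)

    ordered = sorted(records, key=lambda record: record[0])
    for hour, group in groupby(ordered, key=hour_of):
        yield hour, list(group)


def replay(records, publish, tick_seconds=1.0, sleep=time.sleep):
    """Publish one hour of historical data per tick (one second by default)."""
    for hour, batch in hourly_batches(records):
        publish(hour, batch)   # hypothetical hook: push the batch to clients
        sleep(tick_seconds)    # wait before releasing the next hour


if __name__ == "__main__":
    # Made-up sample records, for demonstration only.
    sample = [
        (datetime(1998, 3, 7, 14, 30), "PUMP01", 12.5),
        (datetime(1998, 3, 7, 14, 0), "PUMP02", 7.25),
        (datetime(1998, 3, 7, 15, 5), "PUMP01", 13.0),
    ]
    replay(sample, lambda hour, batch: print(hour, len(batch)), tick_seconds=0)
```

Passing `sleep` as a parameter keeps the pacing logic testable without real delays. At one hour per second, 20 years of history would take roughly two days to replay in full.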

The results

Khorshid, who originally learned kdb+ as a developer at an investment bank, found that kdb+ was the best tool for pre-processing the water plant data. “When doing a lot of data munging, it is a lot easier in kdb+ than in Python,” Khorshid said.

Converting previously inaccessible data into usable datasets can be a significant business benefit for organizations like the Water Corporation. The agency has been collecting SCADA data from thousands of pumps and pipes for two decades, then simply storing it away in a historical database that it has never been able to use.

Once cleansed and normalized, that data can be incorporated into improved applications for detecting leaks, measuring the effectiveness of the water system, forecasting demand, or carrying out preventative maintenance, all of which can meaningfully improve water operations and service to customers.

© 2017 Kx Systems
Kx® and kdb+ are registered trademarks of Kx Systems, Inc., a subsidiary of First Derivatives plc.
