Kx Insights: Machine learning and the value of historical data

2 Aug 2018

Data is being generated at a faster rate now than ever before. IDC has predicted that in 2025, there will be 163 zettabytes of data generated each year—a massive increase from the 16.1 zettabytes created in 2016. These high rates of data generation are partially an outcome of the multitude of sensors found on Internet of Things (IoT) devices, the majority of which are capable of recording data many times per second. IHS estimates that the number of IoT devices in use will increase from 15.4 billion devices in 2015 to 75.4 billion in 2025, indicating that these immense rates of data generation will continue to grow even higher in the years to come.

Machine learning techniques featured in JupyterQ notebooks

19 Jul 2018

Machine learning with kdb+ has been a theme of the Kx blog over the past couple of months because of the release of a series of JupyterQ notebooks on the Kx ML GitHub. As a wider range of developers work with ML techniques, the uses for kdb+ in ML applications are growing. The release of embedPy has been a catalyst for this trend: it loads Python into kdb+, so that Python variables and objects become q variables and either language can act upon them. With embedPy, Python code and files can be embedded within q code, and Python functions can be called as q functions.

Random Forests in kdb+

12 Jul 2018

The Random Forest algorithm is an ensemble method commonly used for both classification and regression problems that combines multiple decision trees and outputs an average prediction. It can be considered a collection of decision trees (a forest), so it offers the same advantages as an individual tree: it can manage a mix of continuous, discrete and categorical variables; it does not require either data normalization or pre-processing; it is not complicated to interpret; and it automatically performs feature selection and detects interactions between variables. In addition to these, random forests solve some of the issues presented by decision trees: they reduce variance and overfitting and provide more accurate and stable predictions. This is all achieved by making use of...
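As a rough illustration of the idea of combining several decision trees and averaging their predictions, here is a minimal sketch in plain Python. It is not the kdb+ implementation the post describes. It assumes a regression setting, and all function names (`build_tree`, `fit_forest`, `predict_forest`) and the toy data are hypothetical. Each tree is grown on a bootstrap sample of the rows and considers only a random subset of features at each split, which are the standard textbook ingredients of a random forest.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors of values around their mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, depth, n_features, rng):
    """Grow a small regression tree; a leaf is a number, a node is a tuple."""
    if depth == 0 or len(rows) < 2:
        return mean(targets)
    best = None
    # Consider only a random subset of the features at this node.
    for f in rng.sample(range(len(rows[0])), n_features):
        # Skipping the smallest value guarantees both sides are non-empty.
        for threshold in sorted({r[f] for r in rows})[1:]:
            left = [i for i, r in enumerate(rows) if r[f] < threshold]
            right = [i for i, r in enumerate(rows) if r[f] >= threshold]
            err = sum(sse([targets[i] for i in side]) for side in (left, right))
            if best is None or err < best[0]:
                best = (err, f, threshold, left, right)
    if best is None:  # the chosen features were constant, so no split exists
        return mean(targets)
    _, f, threshold, left, right = best
    children = [
        build_tree([rows[i] for i in side], [targets[i] for i in side],
                   depth - 1, n_features, rng)
        for side in (left, right)
    ]
    return (f, threshold, children[0], children[1])


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] < threshold else right
    return node


def fit_forest(rows, targets, n_trees=25, depth=3, n_features=1, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx],
                                 depth, n_features, rng))
    return forest


def predict_forest(forest, row):
    """The forest's prediction is the average of its trees' predictions."""
    return mean([predict_tree(tree, row) for tree in forest])


# Toy data: the target is a step in feature 0; feature 1 is uninformative.
rows = [[x, (x * 7) % 5] for x in range(20)]
targets = [0.0 if x < 10 else 10.0 for x in range(20)]

forest = fit_forest(rows, targets)
low = predict_forest(forest, [2, 4])
high = predict_forest(forest, [17, 4])
print(low, high)
```

An individual tree in this sketch can be misled when it happens to split on the uninformative feature, but averaging over many trees pulls the prediction towards the true step, which is the variance reduction the excerpt refers to.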

Kx and NASA FDL: Space Weather, GNSS and Exoplanets

10 Jul 2018

By Robert Hill

Kx is delighted to once more be partnering with the NASA Frontier Development Laboratory (NASA FDL) team on two exciting challenges facing the space sector. This follows from last year’s successful solar activity detection work, which resulted in the ‘FlareNet’ tool (supported by Kx and Lockheed Martin) that demonstrated the potential for

Kx Insights: Machine learning subject matter experts in semiconductor manufacturing

9 Jul 2018

Subject matter experts are needed for ML projects since generalist data scientists cannot be expected to be fully conversant with the context, details, and specifics of problems across all industries. The challenges are often domain-specific and require considerable industry background to fully contextualize and address. For that reason, successful projects are typically those that adopt a teamwork approach, bringing together the strengths of data scientists and subject matter experts. While data scientists bring generic analytics and coding capabilities, subject matter experts provide specialized insights in three crucial areas: identifying the right problem, using the right data, and getting the right answers.