Decision Trees in kdb+
5 Jul 2018
The open-source notebook outlined in this blog post describes the use of a common machine learning technique called decision trees. We focus here on a decision tree that classifies whether a cancerous tumor is malignant or benign. The notebook uses both q and Python, leveraging the areas where they respectively provide advantages: data manipulation and visualization.
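To give a flavour of the approach, here is a minimal sketch of training a decision tree through embedPy. It assumes embedPy (p.q) and scikit-learn are installed, and uses scikit-learn's bundled breast-cancer dataset rather than the notebook's exact code or data.

    \l p.q                                                     / load embedPy
    data:.p.import[`sklearn.datasets][`:load_breast_cancer][]
    X:data[`:data]`                                            / feature matrix, converted to q
    y:data[`:target]`                                          / 0 = malignant, 1 = benign
    clf:.p.import[`sklearn.tree][`:DecisionTreeClassifier][]   / default decision tree
    clf[`:fit;X;y]                                             / train on the full dataset
    preds:clf[`:predict;X]`                                    / predictions back as a q vector

A real workflow would hold out a test set before fitting; the notebook covers that step along with visualizing the results in Python.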
Feature Engineering in kdb+
28 Jun 2018
Feature engineering is an essential part of the machine learning pipeline. In this blog post, Fionnuala Carr discusses the feature-engineering JupyterQ notebook, which includes an investigation of four different scaling methods, their impact on k-Nearest Neighbors classifiers, and the impact of using one-hot encoding.
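For illustration, below is a sketch of two common scaling methods and one-hot encoding written in pure q. These are generic implementations, not necessarily the four methods the notebook compares.

    minmax:{(x-m)%max[x]-m:min x}     / rescale to the [0;1] interval
    zscore:{(x-avg x)%dev x}          / standardize to zero mean, unit variance
    onehot:{x=\:distinct x}           / boolean matrix, one column per distinct value

    minmax 3 1 4 1 5f                 / 0.5 0 0.75 0 1
    onehot `a`b`a`c                   / (100b;010b;100b;001b)

Scaling matters for k-Nearest Neighbors in particular because the classifier is distance-based: features on large scales would otherwise dominate the distance calculation.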
Kx on the Google Cloud Platform
26 Jun 2018
At Kx25, the international kdb+ user conference held in New York City on May 18th, Kx announced that kdb+ is now available on the Google Cloud Launcher. Antonio Zurlo of Google Cloud Platform (GCP) gave a presentation about Google Cloud and described an example of how to use kdb+ on GCP.
Classification using K-Nearest Neighbors in kdb+
21 Jun 2018
As part of Kx25, the international kdb+ user conference held May 18th in New York City, a series of seven JupyterQ notebooks were released and are now available at https://code.kx.com/q/ml/. Each notebook demonstrates how to implement a different machine learning technique in kdb+, primarily using embedPy, to solve problems ranging from feature extraction to fitting and testing a model.
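To give a sense of the technique, here is a minimal pure-q sketch of a k-Nearest Neighbors classifier. The notebook itself works primarily through embedPy, so treat this as an illustration rather than the notebook's code.

    / X: list of training feature vectors; y: labels; p: query point; k: neighbor count
    knn:{[X;y;p;k]
      d:{sqrt sum x*x}each X-\:p;               / Euclidean distance to each training point
      first idesc count each group y k#iasc d}  / majority label among the k nearest

    X:(1 1f;1 2f;8 8f;9 8f)
    y:`benign`benign`malignant`malignant
    knn[X;y;2 2f;3]                             / `benign

The classifier needs no training phase: it simply stores the data and, at query time, votes among the labels of the k closest points.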
Machine Learning and AI: The Future Possibilities in Business
20 Jun 2018
At Kx25, the international kdb+ user conference held in New York City on May 18th, Alan Rozet, from Kx partner Brainpool.ai, gave a presentation about how machine learning (ML) is being used in business today. Alan is a Principal Machine Learning Engineer at Capital One with wide-ranging ML experience spanning the financial, healthcare, and consumer sectors.
Dimensionality Reduction in kdb+
14 Jun 2018
Dimensionality reduction methods have been the focus of much interest within the statistics and machine learning communities for a range of applications, and have a long history as data pre-processing techniques. Dimensionality reduction maps data to a lower-dimensional space so that uninformative variance is discarded, retaining only the data that is meaningful to the machine learning problem. In addition, finding a lower-dimensional representation of a dataset can improve the efficiency and accuracy of machine learning models.
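As a quick illustration, the following sketch projects a toy dataset onto its first two principal components using embedPy and scikit-learn's PCA. It assumes embedPy (p.q) and scikit-learn are available, and uses random data in place of a real dataset.

    \l p.q                                             / load embedPy
    X:100 5#500?1f                                     / toy dataset: 100 samples, 5 features
    pca:.p.import[`sklearn.decomposition][`:PCA][`n_components pykw 2]
    Xr:pca[`:fit_transform;X]`                         / 100 x 2 reduced representation

Each row of the result describes a sample by its coordinates along the two directions of greatest variance, which is often enough to feed downstream models or to plot the data in two dimensions.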