GitHub: Machine learning project for kdb+/q

14 Dec 2016

Software engineer and kdb+ programmer Juan Lasheras recently added a kdb+/q machine learning project to GitHub.

Juan’s ml.q repository aims to be a multi-purpose machine learning toolkit for kdb+/q. It provides methods that practitioners can use for data analysis and predictive modeling, and is comparable to the scikit-learn toolkit for Python. The project currently implements the following three algorithms:

K-nearest neighbors: The user specifies a point and the algorithm finds the k points in the dataset that are closest to it.

K-means clustering: This partitions a dataset into k clusters, assigning each point to the cluster whose mean is nearest. This is particularly useful because the clusters can reveal groups of related data points.

Decision Tree (ID3): This scans a labeled dataset and constructs a tree of questions about attribute values, which can then be used to classify future data points.
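
To make the three ideas above concrete, here is a minimal sketch in plain Python (chosen because the post compares the toolkit to scikit-learn). It is not Juan's ml.q code, which is written in q. The function names, the toy data, and the choice of Euclidean distance are all assumptions made for illustration. For ID3 it shows only the information-gain calculation used to choose which question to ask, not the full tree construction.

```python
import math
from collections import Counter


def k_nearest(points, query, k):
    """Return the k points closest to `query` (Euclidean distance assumed)."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]


def k_means(points, centroids, iterations=10):
    """Alternate nearest-centroid assignment and mean update (Lloyd's algorithm).

    The caller supplies the initial centroids; that is an assumption of this sketch.
    """
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction from splitting on `attribute`.

    ID3 asks about the attribute with the highest gain at each node.
    """
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data, invented for this example.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
print(k_nearest(points, (1.2, 1.0), 2))  # [(1.0, 1.0), (1.0, 0.5)]

centroids, clusters = k_means(points, [(1.0, 1.0), (8.0, 8.0)])
print(clusters)

rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
print(information_gain(rows, labels, "outlook"))  # 1.0
print(information_gain(rows, labels, "windy"))    # 0.0
```

In the toy decision-tree data, "outlook" fully determines the label while "windy" carries no information, so ID3 would ask about "outlook" first.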

Juan’s ml.q project is available on GitHub.

To learn more about the scikit-learn toolkit for Python, see http://scikit-learn.org.
