Web Scraping – A Kdb+ Use case

24 Jan 2019

By Abin Saju

Web scraping is a method through which human-readable content is extracted from a web page using an automated system. The system can be implemented using a bot/web crawler which traverses through domains, or through a web browser which mimics human interaction with a page. There are many use cases for the...

The Exploration of Space Weather at NASA FDL with kdb+

4 Dec 2018

Our society is dependent on GNSS services for navigation in everyday life, so it is critically important to know when signal disruptions might occur. Physical models have struggled to predict astronomic scintillation events. One method for making predictions is to use machine learning (ML) techniques. This article describes how kdb+ and embedPy were used in the ML application.

Random Forests in kdb+

12 Jul 2018

The Random Forest algorithm is an ensemble method, commonly used for both classification and regression problems, that combines multiple decision trees and outputs an average prediction. It can be considered a collection of decision trees (a forest), so it offers the same advantages as an individual tree: it can manage a mix of continuous, discrete and categorical variables; it does not require either data normalization or pre-processing; it is not complicated to interpret; and it automatically performs feature selection and detects interactions between variables. In addition to these, random forests solve some of the issues presented by decision trees: they reduce variance and overfitting and provide more accurate and stable predictions. This is all achieved by making use of...
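As a rough illustration of the "combine many trees and average their predictions" idea described above, here is a minimal Python sketch. It is not the article's kdb+ implementation: it assumes one-dimensional regression data, uses single-split trees (stumps) trained on bootstrap samples, and omits the random feature selection a full random forest performs. The names `fit_stump`, `fit_forest` and `predict`, and the toy data, are hypothetical.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (a 'stump') on 1-D data."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:
        # Every sampled x was identical: fall back to predicting the mean.
        m = mean(ys)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict(trees, x):
    """The ensemble prediction is the average of the individual trees."""
    return mean(tree(x) for tree in trees)

# Toy data (assumed for illustration): a step from about 1 to about 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]

forest = fit_forest(xs, ys)
low, high = predict(forest, 2), predict(forest, 7)
print(low, high)
```

Because each stump sees a different resample of the data, individual trees disagree slightly, and averaging them smooths out that variation, which is the variance reduction the paragraph refers to.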

Classification using K-Nearest Neighbors in kdb+

21 Jun 2018

As part of Kx25, the international kdb+ user conference held May 18th in New York City, a series of seven JupyterQ notebooks was released and is now available at https://code.kx.com/q/ml/. Each notebook demonstrates how to implement a different machine learning technique in kdb+, primarily using embedPy, to solve all kinds of machine learning problems, from feature extraction to fitting and testing a model.