Kdb+ Transitive Comparisons

6 Jun 2018

By Hugh Hyndman

Last summer, I wrote a blog post about my experiences running kdb+ on a Raspberry Pi, in particular making use of published benchmark content from InfluxData to generate test data, perform ingestion, and run a set of benchmark queries. Given kdb+'s excellent performance, I concluded that it would be a perfect fit for small-platform or edge computing.

I felt that I owed it to the Kx community to take things a step further: to run performance tests against all of the products that InfluxData documented, including Cassandra, Elasticsearch, MongoDB, and OpenTSDB, and to go beyond the Raspberry Pi by using a variety of other server configurations.

The difficulty was that I didn't have time to install and configure all of these technologies (let alone on a Raspberry Pi), so I decided to take a different approach and exploit the old transitivity argument: if a is greater than b, and b is greater than c, then it follows that a is greater than c.

So, using this logic and taking InfluxData’s benchmark results at face value, I concluded that all I had to do was run the tests on my hardware and compare my results with theirs to get a broad comparison across all the other technologies. Moreover, as InfluxDB had pretty much outperformed all the other databases in their tests, I reckoned that if kdb+ outperformed InfluxDB, then by transitivity, kdb+ was the fastest of them all!
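As a rough sketch of the arithmetic behind that argument, relative speed-ups compose by multiplication. The q session below uses invented ratios purely for illustration; the variable names and figures are not taken from either benchmark:

    q)ratios:`kdbVsInflux`influxVsCassandra`influxVsMongo!1.5 2 3  / illustrative ratios only
    q)kdbVsCassandra:ratios[`kdbVsInflux]*ratios`influxVsCassandra  / compose the two speed-ups
    q)kdbVsCassandra
    3f
    q)kdbVsCassandra>1  / kdb+ beats InfluxDB and InfluxDB beats Cassandra, so kdb+ beats Cassandra
    1b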

To read more about the data, queries, and hardware environment that I used, and the resulting performance figures, please click here.

 
