By Przemek Tomczak
The relational and columnar database kdb+ is well known for exceptionally fast analytics on large-scale datasets, both in motion and at rest. This has made kdb+ the technology of choice for capital markets applications and industrial IoT applications involving large amounts of time-series data. Its speed comes from the way data is stored, in a layout optimized for manipulating and querying time-series and relational data within the programming platform. Although there are other columnar databases on the market, none combines all of these aspects.
This optimization enables kdb+ to deliver orders of magnitude better performance when working with sensor and other types of time-series data compared to alternative technologies. Some performance snippets from various customer implementations of kdb+ running on a single server are listed below.
Why is kdb+ so fast for data ingestion?
What makes kdb+ unique is that as an in-memory, time-series database it enables data to be ingested and made immediately available for queries. This makes it ideal for industrial IoT applications for ingesting, storing, processing, and analyzing time-series data – including IoT sensor data used in manufacturing and financial market data.
To achieve this level of performance, data is first placed in in-memory table(s) using a prescribed schema and protected through an on-disk log. Because data goes to memory first and is immediately available for query, kdb+ can sustain much higher ingestion rates on a single server than other technologies: many millions of readings per second, hundreds of megabytes per second, and many terabytes per day.
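The log-then-memory pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how kdb+ is implemented: the LoggedTable class, the JSON log format, and the column names are all assumptions made for the example.

```python
import json
import os
import tempfile


class LoggedTable:
    """Toy in-memory columnar table protected by an append-only log file.

    Hypothetical illustration only; names and log format are assumptions.
    """

    def __init__(self, columns, log_path):
        self.columns = {name: [] for name in columns}  # one list per column
        self.log_path = log_path

    def insert(self, row):
        # 1. Append the row to the on-disk log (a sequential write).
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(row) + "\n")
        # 2. Apply it to memory, where it can be queried straight away.
        self._apply(row)

    def _apply(self, row):
        for name, values in self.columns.items():
            values.append(row[name])

    @classmethod
    def recover(cls, columns, log_path):
        """Rebuild the in-memory table by replaying the log from disk."""
        table = cls(columns, log_path)
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as log:
                for line in log:
                    table._apply(json.loads(line))
        return table


def demo_log_recovery():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "readings.log")
        cols = ["time", "sensor", "value"]
        table = LoggedTable(cols, path)
        table.insert({"time": 1, "sensor": "a", "value": 20.5})
        table.insert({"time": 2, "sensor": "b", "value": 21.0})
        # Simulate a restart: the in-memory copy is lost, the log survives.
        restored = LoggedTable.recover(cols, path)
        return table.columns, restored.columns
```

Calling demo_log_recovery() returns the live and the recovered column data, which should match. The sketch reopens the log on every insert for simplicity; a real system would keep the log handle open and batch its writes.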
As memory is consumed, data is migrated from the in-memory database, called the Real-Time Database (RDB), to queryable temporary table(s) on disk, called the Intraday Database (IDB). The IDB is partitioned by a configurable time interval, commonly 5, 10, 30, or 60 minutes, depending on the data volume and available RAM. The data is then further organized, sorted, and migrated to more permanent on-disk database tables that we call the Historical Database (HDB). The IDB and HDB can use various tiered storage media, such as solid state drives (SSD), hard disk drives (HDD), storage area networks (SAN), network attached storage (NAS), and parallel file systems, giving customers options to optimize the performance and cost of storing their data.
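To make the flow between the three tiers concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not kdb+ itself: the TieredStore class is hypothetical, plain dictionaries stand in for on-disk storage, the row-count threshold stands in for memory pressure, and sorting by sensor and then time is just one plausible way to organize the historical tier.

```python
from collections import defaultdict


class TieredStore:
    """Toy model of rdb (memory) -> idb (interval buckets) -> hdb (sorted by day)."""

    def __init__(self, interval_seconds=600, max_rdb_rows=3):
        self.interval = interval_seconds
        self.max_rdb_rows = max_rdb_rows   # stand-in for "memory is consumed"
        self.rdb = []                      # newest rows, held in memory
        self.idb = defaultdict(list)       # interval start -> rows (stands in for disk)
        self.hdb = {}                      # day label -> rows sorted by (sensor, time)

    def insert(self, time, sensor, value):
        self.rdb.append((time, sensor, value))
        if len(self.rdb) >= self.max_rdb_rows:
            self.flush_to_idb()

    def flush_to_idb(self):
        # Move in-memory rows into the bucket for their time interval.
        for row in self.rdb:
            bucket = row[0] - row[0] % self.interval
            self.idb[bucket].append(row)
        self.rdb = []

    def end_of_day(self, day):
        # Gather every intraday bucket, sort, and store as one historical partition.
        self.flush_to_idb()
        rows = [row for bucket in sorted(self.idb) for row in self.idb[bucket]]
        rows.sort(key=lambda r: (r[1], r[0]))
        self.hdb[day] = rows
        self.idb.clear()

    def query(self, sensor):
        """Search all three tiers so callers see one logical table."""
        tiers = list(self.hdb.values())
        tiers += [self.idb[bucket] for bucket in sorted(self.idb)]
        tiers.append(self.rdb)
        return [row for rows in tiers for row in rows if row[1] == sensor]
```

For example, with a 600-second interval and a three-row threshold, the third insert triggers a flush into interval buckets, while query still returns rows from every tier in one result.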
This ingestion process exploits both the performance advantage of sequential writes to disk and the immediate availability of data from memory, thereby delivering orders-of-magnitude better performance than other technologies. In addition, the columnar structure of the database tables allows bulk writes to tables on disk, which makes ingestion more efficient.
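As a rough illustration of why a columnar layout lends itself to bulk, sequential writes, the Python sketch below stores each column as its own contiguous binary file and reads a single column back without touching the others. The file layout, function names, and sample values are assumptions made for this example and do not describe the actual kdb+ on-disk format.

```python
import array
import os
import tempfile


def write_columns(directory, columns):
    """Write each column as one contiguous binary file in a single bulk write."""
    for name, (typecode, values) in columns.items():
        with open(os.path.join(directory, name), "wb") as f:
            array.array(typecode, values).tofile(f)


def read_column(directory, name, typecode):
    """Load one column without reading any of the others."""
    path = os.path.join(directory, name)
    col = array.array(typecode)
    count = os.path.getsize(path) // col.itemsize
    with open(path, "rb") as f:
        col.fromfile(f, count)
    return col


def demo_column_files():
    with tempfile.TemporaryDirectory() as tmp:
        write_columns(tmp, {
            "time": ("q", [1, 2, 3, 4]),                 # 64-bit integers
            "value": ("d", [20.5, 21.0, 19.5, 22.0]),    # 64-bit floats
        })
        # An average over "value" only needs the "value" file.
        values = read_column(tmp, "value", "d")
        return sorted(os.listdir(tmp)), sum(values) / len(values)
```

Because each column is a single homogeneous array, writing it is one sequential operation, and a query that needs only one column reads only that column's file.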
With this approach, we are able to support large data volumes with less infrastructure, particularly where the daily volume exceeds RAM on a single server, while delivering exceptional query performance. The other added benefit is that organizations can avoid making copies of data for analysis when a single system can support both real-time and historical analytics applications.
Why is kdb+ so fast for queries?
The three primary reasons why kdb+ is so fast are:
Each of these three factors makes kdb+ fast, but combined, they make it even more powerful. Although there are other time-series, columnar, or vector databases on the market, none combines all of these aspects. What are the specific advantages?
Measuring the results with kdb+
As we have shown, kdb+ comes with a programming system optimized for high-performance manipulation and querying of time-series and relational data. This optimization enables kdb+ to deliver orders-of-magnitude better performance when working with sensor and related data compared to alternative technologies.
Transitive comparisons to other database technologies
By running a series of performance tests against another solution and comparing the results against its benchmarks we were able to assess the relative performance of kdb+ versus other database technologies. The results are illustrated below.
[Figure: Normalized Queries per Second]
A full paper on our transitive comparison is also available.
For completely independent and audited performance benchmarks, the Securities Technology Analysis Center (STAC) Benchmark Council has a number of tests comparing low-latency, high-volume technologies; kdb+ features well in STAC's results. You can visit STAC at https://stacresearch.com.
The velocity and volume of data continue to grow, along with the need to perform analyses ever faster, challenging traditional approaches and databases that were never designed to support these demands. For example, we are seeing data volumes and data rates increase by 10x to 100x across a wide range of industries. In manufacturing facilities, higher frequency sensors (100kHz to 1MHz) are capturing vastly more granular data. In the automobile industry, more sensors are being deployed (thousands to millions) throughout individual vehicles. Organizations like these are analyzing significantly more data, and doing so faster, so that they can deliver better products and user experiences to their customers.
Kdb+ is ideally suited for these demands because of its unique combination of a high-performance in-memory, columnar, and relational database with an integrated vector-oriented programming system. Our customers are using kdb+ to significantly improve the performance and scalability of their applications in the face of these data volumes, particularly for supervisory control and data acquisition, data historians, fault detection and prediction, advanced data warehouses, and capital markets trading and surveillance systems.
Przemek Tomczak is Senior Vice-President of Internet of Things and Utilities at Kx. For over twenty-five years, Kx has been providing the world's fastest database technology and business intelligence solutions for high-velocity and large data sets. Previously, Przemek held senior roles at the Independent Electricity System Operator in Ontario, Canada, and at top-tier consulting firms and systems integrators. Przemek also holds CPA and CISA designations and has a background in business, technology, and risk management.