By Glenn Wright
Energy consumption by computers is now recognized as a serious issue with long-term implications. The Semiconductor Industry Association (SIA) recently reported that in roughly 20 years today’s computer chips will require more power than global energy production can provide. As a result, the SIA is looking for new technologies to analyze, store and enable decision-making based on increasingly massive data streams.
Source: SIA report: “Rebooting the IT Revolution, A Call to Action”
Companies looking to improve their computing energy efficiency are evaluating both their hardware and software choices. At Kx, our kdb+ database platform has been designed from the start to be extremely efficient. Kdb+ sets records for speed at performing complex analytics due to its vector-based algorithms, which have been optimized to make the best use of the hardware they run on. As a result, kdb+ makes fewer demands on the hardware and requires less electricity. This lowers its total cost of ownership while earning high marks for green computing.
The most significant cost of running a computer is the cost of the energy to power it and the energy to cool it. As a result, companies with on-premises servers juggling power, heat and space constraints are turning to cloud or remote equipment options, which bring further costs and limitations of their own. These physical realities are among the reasons that the Green500 List was developed: it ranks the world’s supercomputers not just by speed but by energy efficiency, in support of sustainable computing.
Choosing efficient software, especially for the most demanding computing problems, is another effective, yet often overlooked, way to manage energy consumption. But how do you compare the energy efficiency of different database platforms?
We came up with a simple method based on recent benchmarks that tech blogger Mark Litwintschik ran on our software and on a dozen other combinations of database software and hardware. Earlier this year he published an article in which he tested kdb+ on a set of standard queries: “1.1 Billion Taxi Rides on kdb+/q & 4 Xeon Phi CPUs”. Mark’s article focused on the time taken to compute the results of several queries, each of which has to scan every one of the taxi ride records.
Kdb+/q not only came out with the fastest query times by far of any Intel platform. When we looked at the underlying energy required to calculate these results, we found that it was also the most energy-efficient solution by far. We completed these queries at an average of 14.67 queries per watt of energy, with each query operating against the full set of taxi ride records.
The closest competitor came in at 0.67 queries per watt of energy. In other words, we used a fraction (35%) of the power they needed to perform a query and, at the same time, completed it in a fraction (12%) of the time!
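As a rough cross-check, the figures quoted above can be compared with a few lines of Python. This is only a sketch: it assumes that the energy used per query is the power drawn multiplied by the time taken, and the function names are ours, not part of any benchmark tooling. The quoted percentages are rounded, so the two estimates should agree only approximately.

```python
# Figures quoted in the article.
KDB_QUERIES_PER_UNIT = 14.67    # kdb+/q result
RIVAL_QUERIES_PER_UNIT = 0.67   # closest competitor
POWER_FRACTION = 0.35           # kdb+ power as a share of the competitor's
TIME_FRACTION = 0.12            # kdb+ query time as a share of the competitor's


def relative_energy_per_query(power_fraction, time_fraction):
    """Energy per query relative to the competitor.

    Assumption: energy per query = power drawn x time taken.
    """
    return power_fraction * time_fraction


def efficiency_ratio(queries_per_unit_a, queries_per_unit_b):
    """How many times more queries A completes per unit of energy than B."""
    return queries_per_unit_a / queries_per_unit_b


if __name__ == "__main__":
    rel = relative_energy_per_query(POWER_FRACTION, TIME_FRACTION)
    print(f"Relative energy per query: {rel:.1%}")
    print(f"Implied efficiency gain:   {1 / rel:.1f}x")
    ratio = efficiency_ratio(KDB_QUERIES_PER_UNIT, RIVAL_QUERIES_PER_UNIT)
    print(f"Reported efficiency gain:  {ratio:.1f}x")
```

Under that assumption, 35% of the power for 12% of the time works out to roughly 4% of the energy per query, or a gain of about 24 times, which is in line with the roughly 22-fold gap between 14.67 and 0.67.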
These results are based on averaging all four query times in the tests that Mark Litwintschik ran on the taxi ride dataset from the New York City Taxi and Limousine Commission. Energy consumption for this exercise was calculated using maximum power for each of the server platforms, which included CPU, GPU, memory and all ancillary equipment. So these are true apples-to-apples comparisons, using real-world numbers.
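The calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the exact procedure we used: the function name, the choice of watt-hours as the energy unit, and the sample wattage and query times are all hypothetical. The watt is strictly a unit of power, so the sketch multiplies the platform's maximum power by the average query time to get an energy figure.

```python
from statistics import mean

SECONDS_PER_HOUR = 3600.0


def queries_per_watt_hour(max_power_watts, query_times_seconds):
    """Estimate how many queries complete per watt-hour of energy.

    Assumptions (not taken from the benchmark itself):
    - the platform draws its maximum rated power for the whole query;
    - the query times are averaged, as in the article;
    - energy is expressed in watt-hours.
    """
    if max_power_watts <= 0:
        raise ValueError("max_power_watts must be positive")
    if not query_times_seconds:
        raise ValueError("at least one query time is required")

    avg_seconds = mean(query_times_seconds)
    if avg_seconds <= 0:
        raise ValueError("average query time must be positive")

    # Energy per query = power x time, converted from watt-seconds to watt-hours.
    watt_hours_per_query = max_power_watts * avg_seconds / SECONDS_PER_HOUR
    return 1.0 / watt_hours_per_query


if __name__ == "__main__":
    # Hypothetical platform: 1200 W maximum power, four query times in seconds.
    sample_times = [1.0, 2.0, 3.0, 6.0]
    print(queries_per_watt_hour(1200.0, sample_times))
```

Using maximum rated power rather than measured draw overstates energy use for every platform, but it does so consistently, which keeps the comparison between platforms on the same footing.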
In today’s increasingly demanding business environment, operational efficiency is being measured more carefully in terms of energy efficiency. These benchmarks show the ability of a range of database platforms and hardware combinations to deliver results while taking their power demands into account. Kdb+ is not only faster; it also uses significantly less power than solutions that run queries on a fixed number of compute instances, such as Spark, and than solutions that use a cloud “per-query” billing model, such as Google BigQuery and Amazon Redshift.