by Steve Wilcockson
Ferenc Bodon, Head of Benchmarking at KX, recently published a LinkedIn post on his research into executing functions simultaneously across multiple kdb+ processes.
The work stemmed from an analysis of the speed and scalability profile of coordinating multiple workers from a central controller in storage efficiency tests using the KX Nano performance tool.
The analysis looked at two methods for coordinating the worker processes: one using inter-process communication and one using file operations. For the former, he examined several approaches that use the each and peach functions to manage process coordination over multiple connections. Variations included an async flush to address blocking, broadcasts to reduce message serialization cost, and a timer to coordinate simultaneous starts. For the file-based approach, Linux’s inotify file-monitoring facility was used.
Analysis of results over multiple runs shows that the timer-based approach delivers the best and most consistent results. It should be noted, however, that the optimal trigger offset depends on the hardware and network environment in which it runs.
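To make the timer-based idea concrete, here is a minimal Python sketch of the general technique. It is not the q code from the original research: threads stand in for kdb+ worker processes, and the names `worker`, `run_timed_start` and `offset_s` are hypothetical. The controller picks a trigger time slightly in the future, hands it to every worker, and each worker waits until that moment before starting.

```python
import threading
import time


def worker(trigger_at: float, starts: list, idx: int) -> None:
    """Wait until the agreed wall-clock time, then record when work began."""
    delay = trigger_at - time.time()
    if delay > 0:
        time.sleep(delay)
    starts[idx] = time.time()  # a real workload would start here


def run_timed_start(n_workers: int = 4, offset_s: float = 0.2):
    """Controller: choose a trigger time, notify every worker, collect start times."""
    # Assumption: offset_s is long enough for all workers to be notified
    # before the trigger fires; otherwise late workers start late.
    trigger_at = time.time() + offset_s
    starts = [0.0] * n_workers
    threads = [
        threading.Thread(target=worker, args=(trigger_at, starts, i))
        for i in range(n_workers)
    ]
    for t in threads:  # workers are notified one after another
        t.start()
    for t in threads:
        t.join()
    return trigger_at, starts


if __name__ == "__main__":
    trigger_at, starts = run_timed_start()
    print("start spread (s):", max(starts) - min(starts))
```

The offset is the tuning knob in this sketch. If it is too short, some workers receive the trigger time after it has passed. If it is too long, the run simply waits. That is consistent with the observation that the best offset depends on the environment.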
The peach handlers approach, which manages the synchronous connections automatically, offered comparable performance but carries the overhead of dual operation (and maintenance) on both the controller and the worker processes. The peach one-shot approach, which manages available connections asynchronously, offers an expedient balance between the two, delivering speed and resilience without the overhead of complex configuration, implementation and maintenance.