PyKX open source, a year in review

Conor McCarthy, Technical Lead for PyKX
17 May 2024 | 4 minutes

Initially conceived as a comprehensive integration between Python and kdb+/q, PyKX has gone from strength to strength in both user adoption and functionality since the first lines of code were written in January 2020. During the first years of development, PyKX was a closed-source library, available to clients only via closely guarded credentials unique to individual organisations.

In May of last year this changed when we announced PyKX as an open-source offering available via PyPI and Anaconda, with the source code available on GitHub. Since then, the library has been downloaded more than 200,000 times across these two distribution channels.

In this blog we will run through how the central library has evolved since that initial release in response to user feedback.

What’s remained consistent?

Since the initial release, the development team has taken pride in providing a seamless developer experience for users working at the interface between Python and q. Our intention is to keep the barrier to entry as low as possible by providing Python-first methods without hiding the availability and power of q. Alongside this, the team has been striving to make PyKX the go-to library for all Python integrations with q/kdb+.

These dual goals are exemplified by the following additions/changes since the open-source release:

  • PyKX under q, a replacement for embedPy, moved to production usage in Release 2.0.0.
  • License management functionality was added for manual license interrogation and management, alongside a new user-driven license installation workflow outlined here. Additionally, expired or invalid licenses now prompt users to install an updated license.
  • When using Jupyter Notebooks, tables, dictionaries and keyed tables now have HTML representations, increasing legibility.
  • PyKX now supports kdb+ 4.1, following its release on February 13th, 2024.

What’s changed?

Since being open-sourced, PyKX has seen a significant boost in development efforts. This increase has been driven by user feedback on features and by our responses to the issues users commonly face. A central theme behind requests from clients and users has been to expand the Python-first functionality provided when interacting with PyKX objects and performing tasks common for users of kdb+/q.

This can be seen through additions in the following areas:

  • Increased availability of Pandas-like API functionality for “pykx.Table” objects, including the following methods: merge_asof, set_index, reset_index, apply, groupby, agg, add_suffix, add_prefix, count, skew and std.
  • Interactions with PyKX vectors and lists have been improved considerably.
  • PyKX atomic values now contain null/infinity representations, making it easier to develop analytics that depend on them. In a similar vein, we have added functionality for retrieving current date/time information. See here for an example.
  • In cases where PyKX does not support a type when converting from Python to q, we’ve added a register class which allows users to specify custom data translations.
  • The addition of beta features for the following in PyKX:
    • The ability for users to create and manage local kdb+ Partitioned Databases using a Python-first API, documented here.
    • The ability for users with existing q/kdb+ server infrastructure to remotely execute Python code on a server from a Python session. Full documentation can be found here.
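
For readers less familiar with as-of joins, the sketch below illustrates in plain Python what a backward as-of match computes, which is the behaviour that pandas' merge_asof uses by default. It is not PyKX's implementation and does not call PyKX: the function name asof_join and the trades/quotes sample data are hypothetical, chosen only to show the matching rule.

```python
from bisect import bisect_right


def asof_join(left, right, on):
    """Backward as-of join over lists of dicts (illustrative only).

    For each row in `left`, attach the columns of the most recent row in
    `right` whose `on` value is less than or equal to the left row's value.
    Left rows with no earlier match get None for the right-hand columns.
    """
    right_sorted = sorted(right, key=lambda row: row[on])
    right_keys = [row[on] for row in right_sorted]
    # Right-hand columns to attach (everything except the join key).
    right_cols = [c for c in (right_sorted[0] if right_sorted else {}) if c != on]

    joined = []
    for row in left:
        # Index of the last right key that is <= the left key, or -1 if none.
        idx = bisect_right(right_keys, row[on]) - 1
        merged = dict(row)
        for col in right_cols:
            merged[col] = right_sorted[idx][col] if idx >= 0 else None
        joined.append(merged)
    return joined


# Hypothetical sample data: each trade picks up the prevailing quote.
trades = [{"time": 1, "size": 100}, {"time": 5, "size": 200}, {"time": 9, "size": 50}]
quotes = [{"time": 2, "bid": 10.0}, {"time": 4, "bid": 10.5}, {"time": 9, "bid": 11.0}]

result = asof_join(trades, quotes, on="time")
for row in result:
    print(row)
```

The trade at time 1 has no earlier quote and so receives None, while the trade at time 5 picks up the quote from time 4. Users of q will recognise this as the same idea as the aj join.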

What’s next?

The central principles behind development of the library haven’t changed since our initial closed-source release, and the direction of development will continue to be driven by user feedback and the usage patterns we see repeated in client interactions. On our roadmap for 2024 are developments in the following areas:

  1. Streaming use cases, allowing users to get up and running with data ingestion and persistence workflows in a Python-first manner.
  2. Enhancements to the performance of the library in data conversions to and from our supported Python conversion types.
  3. Expansion in the complexity of the data-science functions and methods supported by PyKX, to allow for efficient data analysis.

Conclusion

Over the last year, significant progress in the library has greatly enhanced PyKX’s usability. This progress was largely influenced by daily interactions with the development community using PyKX. By making the library available on PyPI, Anaconda, and the KX GitHub, we’ve accelerated development and deepened our understanding of what users want to see next with PyKX.

If you would like to be involved in development of the library, I encourage you to open issues for the features that you would like to see or to open a pull request to have your work incorporated in an upcoming release.
