Back to Blog

PyKX open source, a year in review

Author

Daniel Tovey

Senior Content Marketing Manager

Published

17 May, 2024

Reading Time

Initially conceived as a comprehensive integration between Python and kdb+/q PyKX has gone from strength-to-strength both in terms of user adoption and functionality since the first lines of code were written in January 2020. During the first years of development, PyKX was a closed source library available to clients via closely guarded credentials unique to individual organisations.

In May of last year this changed when we announced PyKX as an open source offering available via PyPi and Anaconda, with the source code available on Github. Since then, across these two distribution channels the library has been downloaded more than 200,000 times.

In this blog we will run through how the central library has evolved since that initial release in response to user feedback.

What’s remained consistent?

Since initial release the development team have taken pride in providing a seamless developer experience for users working at the interface between Python and q. Our intention is to provide as minimal a barrier for entry as possible by providing Python first methods without hiding the availability and power of q. Coupled with this the team has been striving to make PyKX the go-to library for all Python integrations to q/kdb+.

These dual goals are exemplified by the following additions/changes since the open-source release:

PyKX under q as a replacement of embedPy moved to production usage in Release 2.0.0
Addition of License Management functionality for manual license interrogation/management alongside a new user driven licenses installation workflow outlined here. Additionally expired or invalid licenses will now prompt users to install an updated license.
When using a Jupyter Notebooks, tables, dictionaries and keyed tables now have HTML representations increasing legibility.
PyKX now supports kdb+ 4.1 following the release on February 13^th 2024

What’s changed?

Since being open-sourced, PyKX has seen a significant boost in development efforts. This increase is due to feedback on features from users and responses to the issues they commonly face. A central theme behind requests from clients and users has been to expand the Python first functionality provided when interacting with PyKX objects and performing tasks common for users of kdb+/q.

This can be seen through additions in the following areas:

Increases in the availability of Pandas Like API functionality against “pykx.Table” objects including the following methods: merge_asof, set_index, reset_index, apply, groupby, agg, add_suffix, add_prefix, count, skew and std.
Interactions with PyKX vectors and lists have been improved considerably allowing users to:
- Append and Extend new elements to their vectors using Python familiar syntax
- Run analytics and apply functions to their vectors in a Python first manner
PyKX atomic values now contain null/infinity representations making the development of analytics dependent on them easier. In a similar vein we have added functionality for the retrieval of current date/time information. See here for an example.
In cases where PyKX does not support a type when converting from Python to q we’ve added a register class which allows users to specify custom data translations.
The addition of beta features for the following in PyKX
- The ability for users to create and manage local kdb+ Partitioned Databases using a Python first API here.
- The ability for users with existing q/kdb+ server infrastructure to remotely execute Python code on a server from a Python session. Full documentation can be found here.

What’s next?

The central principles behind development of the library haven’t changed since our initial closed source release and the direction of development will continue to be driven by user feedback and usage patterns that we see repeated in client interactions. On our roadmap for 2024 are developments in the following areas:

Streaming use-cases allowing users in a Python first manner to get up and running with data ingestion and persistence workflows.
Enhancements to the performance of the library in data conversions to and from our supported Python conversion types.
Expansion to the complexity of data-science related functions and methods supported by PyKX to allow for efficient data analysis.

Conclusion

Over the last year, significant progress in the library has greatly enhanced PyKX’s usability. This progress was largely influenced by daily interactions with the development community using PyKX. By making the library available on PyPi, Anaconda, and the KX GitHub, we’ve accelerated development and deepened our understanding of what users want to see next with PyKX.

If you would like to be involved in development of the library, I encourage you to open issues for the features that you would like to see or to open a pull request to have your work incorporated in an upcoming release.

Demo the world’s fastest database for vector, time-series, and real-time analytics

Start your journey to becoming an AI-first enterprise with 100x* more performant data and MLOps pipelines.

Process data at unmatched speed and scale
Build high-performance data-driven applications
Turbocharge analytics tools in the cloud, on premise, or at the edge

*Based on time-series queries running in real-world use cases on customer environments.

Book a demo with an expert

"*" indicates required fields

First Name*

Last Name*

Company*

Job Title*

Business Telephone*

Business Email*

Industry*

Country*

How can KX help you?*

How did you hear about us?*

By submitting this form, you will also receive sales and/or marketing communications on KX products, services, news and events. You can unsubscribe from receiving communications by visiting our Privacy Policy. You can find further information on how we collect and use your personal data in our Privacy Policy.

CAPTCHA

This field is for validation purposes and should be left unchanged.

Modernizing infrastructures that mix Python and q

The ultimate guide to choosing embedding models for AI applications

Apex innovators: How hedge funds can evolve analytics at speed and scale

Outrun the competition: Winning the digital assets race

PyKX 3.0: Easier to use and more powerful than ever

Supercharging your quants with real-time analytics

Apex innovators: How hedge funds can evolve analytics at speed and scale

Developer

What’s remained consistent?

What’s changed?

What’s next?

Conclusion

Demo the world’s fastest database for vector, time-series, and real-time analytics

Start your journey to becoming an AI-first enterprise with 100x* more performant data and MLOps pipelines.

Book a demo with an expert