At our 2024 Capital Markets Summit, Conor McCarthy, Lead Architect of PyKX at KX, shared insights into how PyKX is redefining Python integration with kdb+, expanding access and enabling more efficient workflows for quant teams and data engineers.
For firms relying on Python and kdb+ for high-performance analytics, PyKX offers seamless interoperability, powerful database management capabilities, and an accessible pathway to leveraging q’s speed and efficiency.
Whether you’re a data scientist, quant developer, or IT leader, PyKX’s latest enhancements are designed to streamline your workflows and maximize your infrastructure’s potential. Read on to explore the key takeaways from Conor’s session.
Unlocking the full potential of Python and q
PyKX is the most comprehensive bridge between kdb+ and Python to date, replacing older integration methods like qPython, embedPy, and PyQ with a unified, high-performance solution. It serves as an entry point for Python users to leverage the efficiency of q, making it easier to integrate kdb+ within Python-centric environments.
“It’s a gateway,” Conor explained. “Users of PyKX get the ability to see the performance benefits of kdb+ and q but in a way that’s familiar to them.”
A new standard for Python interoperability
One of PyKX’s primary goals is to offer fast, seamless conversions between kdb+ and the most commonly used Python data formats, including Pandas, PyArrow, and NumPy. By removing restrictions on which formats users can exchange with q, PyKX addresses limitations that hindered earlier Python-q integrations.
“Conversions of data to and from the most common Python data formats (Pandas, PyArrow, NumPy) was pretty much a key initial requirement,” Conor said. “In previous iterations of integrations between kdb+ and Python, there have been limitations on what data formats users could interact with. We wanted to try and limit that as much as possible.”
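As a rough illustration of the conversions Conor describes, the sketch below moves a small dataset from pandas into q and back out as pandas, NumPy, and PyArrow objects. It assumes PyKX, pandas, and PyArrow are installed and that a valid kdb+ license is available; the `SAMPLE` data and the `round_trip` helper are hypothetical names used only for this example.

```python
# Illustrative sketch; assumes pykx, pandas and pyarrow are installed
# and that a valid kdb+ license is available to PyKX.
SAMPLE = {"sym": ["AAPL", "MSFT", "AAPL"], "price": [189.5, 410.2, 190.1]}


def round_trip(data):
    """Move one small dataset from pandas into q and back out in three formats."""
    # Imports are local so this file can be loaded even where PyKX is absent.
    import pandas as pd
    import pykx as kx

    df = pd.DataFrame(data)
    qtab = kx.toq(df)          # pandas.DataFrame -> pykx.Table held in q memory
    return {
        "q": qtab,
        "pandas": qtab.pd(),   # pykx.Table -> pandas.DataFrame
        "numpy": qtab.np(),    # pykx.Table -> NumPy record array
        "arrow": qtab.pa(),    # pykx.Table -> pyarrow.Table
    }
```

In a licensed environment, calling `round_trip(SAMPLE)` should return the same rows in each representation.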
Democratizing access to high-performance analytics
A major barrier to leveraging kdb+ has historically been the specialized knowledge required to work with q. PyKX lowers this barrier by providing a Python-first interface, allowing a broader range of users—including data scientists, engineers, and analysts—to interact with kdb+ without needing deep q expertise.
“One of the things with kdb+ that some people get scared of is when they start to get to the point that they need to modify a column on an on-disk database or add a new partition of data or add a new table to a database,” Conor noted. “So, running database maintenance operations—deleting columns, applying functions, those things that you expect to be able to do on your database—you can do now, Python-first.”
With PyKX, organizations can extend the capabilities of their existing teams, allowing Python developers to work with time-series data more effectively and reducing reliance on q specialists for routine tasks. This democratization not only improves efficiency but also fosters greater collaboration between Python and q teams, ensuring seamless workflow integration.
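The maintenance tasks Conor mentions (adding a partition, modifying a column, deleting a column) might look roughly like the following with PyKX's database management class. This is a minimal sketch rather than a recommended workflow: it assumes a PyKX version that provides `kx.DB` and a valid kdb+ license, and the table contents, column names, and the `maintain_database` helper are invented for illustration.

```python
# Illustrative sketch; assumes a PyKX release that includes the kx.DB class
# and a valid kdb+ license. All table and column names are made up.
import tempfile

TRADES = {"sym": ["AAPL", "MSFT"], "px": [189.5, 410.2], "qty": [100, 250]}


def maintain_database(db_path=None):
    """Create a small date-partitioned database, then alter it from Python."""
    import pykx as kx  # local import so the file loads without PyKX

    db_path = db_path or tempfile.mkdtemp(prefix="pykx_db_")
    trades = kx.Table(data=TRADES)

    db = kx.DB(path=db_path)
    # Write the table as two date partitions (the second call adds a partition).
    db.create(trades, "trades", kx.q("2024.01.02"))
    db.create(trades, "trades", kx.q("2024.01.03"))

    # Routine maintenance applied across every partition on disk.
    db.rename_column("trades", "px", "price")
    db.delete_column("trades", "qty")
    return db
```

Writing to a temporary directory keeps the example away from any real database; pass an explicit `db_path` only for a location you are happy to modify.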
Simplifying database management
Managing kdb+ databases has historically required specialized expertise, but PyKX changes that by introducing Python-first database management. Python users can now create, modify, and maintain kdb+ databases using familiar syntax.
“PyKX provides an API for integration with a pandas-like syntax,” Conor explained. “So, users that are dealing with a PyKX table, which is ultimately a kdb+ table under the hood, are in a position to run pandas-like syntax against it—running iloc commands, running max commands, running aggregations—basically doing the basic data science tasks that they expect to do in a Python-first way.”
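The pandas-like calls mentioned in the quote (`iloc`, `max`, aggregations) can be sketched as below. The example assumes PyKX is installed with a valid kdb+ license; the `QUOTES` data and the `summarise` helper are hypothetical, and the table is kept numeric so that every aggregation applies to every column.

```python
# Illustrative sketch; assumes pykx is installed with a valid kdb+ license.
QUOTES = {"price": [189.5, 410.2, 190.1, 411.0], "qty": [100, 250, 75, 300]}


def summarise(data):
    """Run a few pandas-style operations directly against a q table."""
    import pykx as kx  # local import so the file loads without PyKX

    tab = kx.Table(data=data)      # a kdb+ table under the hood
    return {
        "first_two": tab.iloc[:2],  # positional row slicing
        "max": tab.max(),           # per-column maximum
        "mean": tab.mean(),         # per-column mean
    }
```

Each result stays a PyKX object, so it can be passed to further q or PyKX operations or converted with `.pd()` when a pandas object is needed.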
Flexibility in deployment
PyKX extends the reach of kdb+ analytics by allowing q to run anywhere Python runs, including cloud environments like Databricks and Snowflake. It supports flexible deployment models, making it easier for firms to integrate high-performance q analytics within their existing Python-based infrastructures.
“You can run q anywhere where Python would run,” Conor noted. “So if you’re licensed to do so, you can run it in Databricks or Snowflake. You can run it on any cloud environment that Python would normally run.”
Real-world impact: How firms are using PyKX
Several major financial institutions and enterprises are already leveraging PyKX to modernize their data workflows. Some key examples include:
- Market position tracking: A global bank replaced its Pandas-based position tracking system with PyKX, significantly improving performance through multithreading.
- High-performance joins: A pharmaceutical firm replaced Spark with PyKX for large-scale outer joins, achieving a 9x speed improvement.
- Optimized DAG workflows: A leading hedge fund cut execution times from minutes to milliseconds by replacing NumPy and Pandas operations with PyKX, reducing CPU load and enhancing efficiency.
The power of Python-first streaming workflows
One of the most exciting advancements in PyKX 3.0 is its support for Python-first streaming workflows. Python traditionally struggles with real-time data processing, but PyKX enables firms to integrate Python analytics within high-performance streaming infrastructures.
“Python’s very bad at streaming, q is very good at streaming,” Conor said. “So what we’ve provided with the 3.0 release is a new way to initialize ticker plant infrastructures and extend them.”
Looking ahead: The future of PyKX
The PyKX roadmap is packed with innovations designed to further enhance performance, usability, and integration within the broader KX ecosystem. Key developments on the horizon include:
- Expanded database management: Future updates will introduce enhanced support for splayed tables, making it easier to maintain and manipulate smaller datasets while improving query efficiency.
- Significant performance gains: Future releases of PyKX will introduce major performance optimizations, including a 4x speed improvement for Pandas-to-q conversions. This will be achieved by migrating the conversion stack from Cython to C and implementing multi-threaded operations.
- Broader data format support: Integrations with additional data formats will be introduced, ensuring that PyKX remains the most versatile bridge between Python and kdb+.
- Deeper integration with KX’s ML toolkit: The roadmap includes enhancements that will bring KX’s open-source ML toolkit into the PyKX environment, allowing users to leverage machine learning capabilities directly within their Python workflows.
“We don’t work in isolation,” Conor said. “Lots and lots of users are using this every day, and lots of people provide us feedback. That accounts for about 70% of what defines our roadmap.”
Explore PyKX
PyKX is redefining how Python and q coexist, offering a seamless, high-performance bridge that empowers both Python developers and q engineers to work more efficiently.
Learn more about PyKX and how it can transform your data workflows, or read our whitepaper: Transforming data science with PyKX.