Analytic development using PyKX – Part 1

作者

Daniel Baker

Head of Builder Content

Common Python practices for analytical development mirror what has made q an industry leader in solving high-performance, computationally intensive vector-oriented analytical problems. The modern paradigm for most data scientists and engineers is to use dynamically typed languages like Python and incorporate vector operations from the likes of NumPy and table types from libraries like Pandas and PyArrow. This has always been q’s approach to solving problems, allowing developers to effectively express their ideas and make their analytics more efficient.

PyKX maintains the functionality and performance of q while offering developers a Pythonic medium to design their vector-oriented analytics more elegantly and efficiently. When using PyKX for creating functions and analytics developers can leverage several different feature APIs that harness the power of q. In this blog, we will explore how data can be queried and transformed using both the qSQL and SQL PyKX APIs as well as native q functions applied using the context interface and Pandas-like API.

Post-trade analytics walkthrough

To highlight the features listed above we will develop an analytic regularly used across capital markets, a calculation of slippage.

Slippage is an important metric used when doing post-trade analysis, it can be used to track order and execution performance against the market. To do this we usually compare the execution price or the average executed price to a benchmark price. In this example, we will use the midpoint as the benchmark price and compare that to the average price (avgPX) immediately before or at the same time of the trade.

To begin, we will load a historical database containing quote and execution data using the PyKX database API.

Python
>>> import os
>>> os.environ['PYKX_BETA_FEATURES']='True'
>>> import pykx as kx
>>> db=kx.DB(path='database')

To complete our final slippage calculation later we first need to join our data from both the executions table and the quote table, as well as calculate the midpoint.

There are various ways that data can be queried within PyKX, for this example we will use the SQL and qSQL APIs.

Python
>>> e = kx.q.sql("SELECT time,sym,side,avgPrice from execsUS WHERE date=2023.11.10 and sym = 'AAPL'")
>>> e.head(3)
pykx.Table(pykx.q('
time                          sym  side avgPrice
------------------------------------------------
2023.11.10D14:30:03.462500000 AAPL 2    183.99
2023.11.10D14:30:03.470000000 AAPL 2    183.99
2023.11.10D14:30:03.470000000 AAPL 2    183.99
'))
>>> q = kx.q.qsql.select(db.quoteUS,
...     columns = {'time': 'time',
...                'sym' : 'sym',
...                'midpoint': 'fills (askPrice + bidPrice)%2'},
...     where = ['date=2023.11.10',
...              'sym=`AAPL',
...              'time>=14:30'])
>>> q.head(3)
pykx.Table(pykx.q('
time                          sym  midpoint
-------------------------------------------
2023.11.10D14:30:00.000400000 AAPL 183.68
2023.11.10D14:30:00.000400000 AAPL 183.595
2023.11.10D14:30:00.007600000 AAPL 183.875
'))

Now that we have both the execution and quote tables for a specified date and symbol we can join the data temporally using an asof join. In the cell below using the q native aj function via the context interface.

Python
>>> res = kx.q.aj('time', e, q)
>>> res.head(3)
pykx.Table(pykx.q('
time                          sym  side avgPrice midpoint
---------------------------------------------------------
2023.11.10D14:30:03.462500000 AAPL SELL 183.99   183.965
2023.11.10D14:30:03.470000000 AAPL SELL 183.99   183.965
2023.11.10D14:30:03.470000000 AAPL SELL 183.99   183.965

Alternatively, if you’re more familiar with Python, you can use the pandas-like API within PyKX to run the asof join using merge_asof. This API allows you to apply pandas-like syntax to PyKX objects and reap the performance benefits of kdb+

Python
>>> res = e.merge_asof(q, on="time")

We can then use some further elements of the pandas-like API to calculate slippage, this is done in the following steps:

  1. Add a new dummy column ‘slippage’ to contain data after final analysis
  2. Using iloc select buy and sell data subsets
  3. Update the content of the ‘slippage’ column with the calculated difference between midpoint and average price depending on side
    • a. avgPrice – midpoint for all BUY executions
    • b. midpoint – avgPrice for all SELL executions
  4. Calculate the result in basis points as 10000*(calculated difference / midpoint)
Q(kdb+データベース)
>>> res['slippage'] = kx.FloatAtom.null
>>> buytab = res.iloc[res['side'] == 'BUY']
>>> selltab = res.iloc[res['side'] == 'SELL']
>>> res.iloc[res['side'] == 'BUY', 'slippage'] = buytab['avgPrice']-buytab['midpoint']
>>> res.iloc[res['side'] == 'SELL', 'slippage'] = selltab['midpoint'] - selltab['avgPrice']
>>> res['slippage'] = 10000 * res['slippage'] / res['midpoint']

By combining the above analytics, we create a new analytic and use a variable input for analysis of any security.

Python
>>> def calculateSlippage(symbol):
...     e = kx.q.sql("SELECT time,sym,side,avgPrice from execsUS WHERE date=2023.11.10 and sym = $1", symbol)
...     kx.q["s"] = symbol
...     q = kx.q.qsql.select(db.quoteUS,
...         columns = {'time': 'time',
...                    'sym': 'sym',
...                    'midpoint' : 'fills (askPrice + bidPrice)%2'},
...         where = ['date=2023.11.10', 'sym=s', 'time>=14:30'])
...     res = kx.q.aj('time', e, q)
...     res['slippage'] = kx.FloatAtom.null
...     buytab = res.iloc[res['side'] == 'BUY']
...     selltab = res.iloc[res['side'] == 'SELL']
...     res.iloc[res['side'] == 'BUY', 'slippage'] = buytab['avgPrice']-buytab['midpoint']
...     res.iloc[res['side'] == 'SELL', 'slippage'] = selltab['midpoint'] - selltab['avgPrice']
...     res['slippage'] = 10000 * res['slippage'] / res['midpoint']
...     return res
>>> calculateSlippage('AAPL')

PyKX combines the syntactical quality and familiarity of Python with the performance and efficiency of kdb+ and q. Many users of PyKX will want to blend kdb+ with the Python libraries that they already use, and PyKX makes this integration seamless by providing a Python-first approach to all aspects of q’s functional expressive programming language.

To learn more, please visit https://code.kx.com/pykx/

AIによるイノベーションを加速する、KXのデモをお客様に合わせてご提供します。

当社のチームが以下の実現をサポートします:

  • ストリーミング、リアルタイム、および過去データに最適化された設計
  • エンタープライズ向けのスケーラビリティ、耐障害性、統合性、そして高度な分析機能
  • 幅広い開発言語との統合に対応する充実したツール群

専門担当者によるデモをリクエスト

*」は必須フィールドを示します

本フォームを送信いただくと、KXの製品・サービス、お知らせ、イベントに関する営業・マーケティング情報をお受け取りいただけます。プライバシーポリシーからお手続きいただくことで購読解除も可能です。当社の個人情報の収集・使用に関する詳しい情報については、プライバシーポリシーをご覧ください。

このフィールドは入力チェック用です。変更しないでください。