kdb+ kafka interface

Kdb+ interface to Kafka

20 Jun 2017


By Sergey Vidyuk

Kx recently open-sourced, under the Apache 2 license, a Kafka interface to kdb+ on GitHub that eases integration of applications using Kafka with kdb+. Here is a description of the interface.

kdb+ and Kafka = kfk library

Libkfk is a comprehensive interface to Apache Kafka for kdb+ based on the librdkafka library – similar to 20+ other language bindings.

This interface provides yet another way of combining the ingest, real-time and historical data processing capabilities of kdb+ with external data streams and consumers.

Notable features include:
* Multiple clients, publishers and subscribers in the same process
* Minimal library overhead to utilize the multi-threaded C API
* Message data decoded and delivered
* Metadata, statistics and error list exposed for full production control and monitoring

The kfk library strives to expose as much information as possible, giving users control over, and insight into, the data flow within and beyond the application. Remember that events within your application are themselves a data feed worth storing and analyzing, for example to spot optimization opportunities.

What is Kafka?

Tibco RV, RabbitMQ, MQ, 29West LBM (UMS), JMS… If any of these sounds familiar, you can simply append another one to the list: Kafka.

At a high level, Kafka is just another messaging system that uses brokers to decouple producers and consumers of data. Such systems have existed in the financial world for a long time. Kafka differs in that it delivers a production-quality, performant system as free, open-source code.

[Diagram: kdb+ interface to Kafka. Illustration by Sergey Vidyuk]


Kafka Terminology

If you are unfamiliar with enterprise messaging systems, it is useful to review a summary of the main concepts. The library should then be easy to pick up, as its API maps almost directly onto Kafka's main concepts.

  • Maintains feeds of messages in categories called topics
  • Processes that publish messages to topics are called producers
  • Processes that subscribe to topics and process the feed of published messages are called consumers
  • Runs as a cluster of one or more servers, each of which is called a broker
  • Topics are split into ordered commit logs called partitions; ordering is guaranteed only within a partition of a topic
  • Multiple consumers can subscribe to a topic; each is responsible for managing its own offset within a partition
  • Consumers can be organized into consumer groups
  • Delivery is at least once
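This terminology maps almost directly onto the kfk API. As a minimal sketch (assuming a broker on localhost:9092 and the kfk function names used in this post):

```q
\l kfk.q
// the broker list is the entry point to the Kafka cluster
cfg:(!) . flip(
  (`metadata.broker.list;`localhost:9092);
  (`group.id;`0));                                   // join consumer group 0
consumer:.kfk.Consumer[cfg]                          // a consumer process
.kfk.Sub[consumer;`test;enlist .kfk.PARTITION_UA]    // subscribe to topic `test;
                                                     // the broker assigns partitions
```

Each concept from the list above (broker, topic, consumer, consumer group, partition) appears as one argument or config entry.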


Here are a few minimal examples of a producer and a consumer, together with the kinds of messages a user can expect to receive in their application.

Simple Producer

\l kfk.q
// setup config dictionary
kfk_cfg:(!) . flip(
  (`metadata.broker.list;`localhost:9092);    // mandatory
  (`queue.buffering.max.ms;`1));              // optional tuning
producer:.kfk.Producer[kfk_cfg]               // create producer
test_topic:.kfk.Topic[producer;`test;()!()]   // create topic for producer

show "Publishing on topic:",string .kfk.TopicName test_topic;
.kfk.Pub[test_topic;.kfk.PARTITION_UA;string .z.p;""];
show "Published 1 message";
producer_meta:.kfk.Metadata[producer];        // broker and topic metadata
show producer_meta`topics;

Simple Consumer

\l kfk.q

kfk_cfg:(!) . flip(
    (`metadata.broker.list;`localhost:9092);  // broker connection
    (`group.id;`0));                          // consumer group

client:.kfk.Consumer[kfk_cfg];                // create consumer

// override the data callback (an empty function by default);
// your dispatch and processing logic goes here (and runs on the main thread)
data:();
.kfk.consumecb:{[msg]
    msg[`rcvtime]:.z.p;
    data,::enlist msg;}

// subscribe to topic `test with automatic partition assignment
.kfk.Sub[client;`test;enlist .kfk.PARTITION_UA];

client_meta:.kfk.Metadata[client];            // broker and topic metadata
show client_meta`topics;

Normal data message

mtype    | `
topic    | `test
partition| 0i
offset   | 1065
msgtime  | 0Np
data     | "2017.06.07D16:08:51.805544000"
key      | `byte$()
rcvtime  | 2017.06.07D16:08:51.866241000
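Note that the data field arrives as a char vector: the producer in the example above published string .z.p, so the consumer can cast it back to a timestamp with "P"$:

```q
q)"P"$"2017.06.07D16:08:51.805544000"
2017.06.07D16:08:51.805544000
```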

End of batch message

mtype    | `_PARTITION_EOF
topic    | `test
partition| 0i
offset   | 1076
msgtime  | 0Np
data     | ""
key      | `byte$()
rcvtime  | 2017.06.07D16:08:52.881488000
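A consumer callback can use the mtype field to separate normal data messages from control messages such as this end-of-batch marker. A sketch, where process_data and on_batch_end are hypothetical handlers, not part of the library:

```q
// dispatch on mtype: null symbol ` = normal data,
// `_PARTITION_EOF = end of the current batch for a partition
upd:{[msg]
  $[msg[`mtype]~`_PARTITION_EOF;
    on_batch_end msg`partition;    // hypothetical: flush or aggregate the batch
    process_data "P"$msg`data]}    // hypothetical: handle one parsed message
```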


In this blog post you have learned some of the main features of libkfk for interfacing with Kafka, a little about what Kafka is, and the main terminology needed to feel comfortable using the library. The minimal producer and consumer examples are a good starting point for trying libkfk in your own setup. We encourage you to do so, and both to file issues and to suggest changes on GitHub. It is easy to get started with a local test Kafka broker by following the “setup testing instance” instructions.

Code improvements, examples, documentation and steps to make setup easier are very welcome!



Sergey Vidyuk is a senior software developer and expert in data management. He works in the Kx R&D team on the next generation of kdb+/q. Prior to joining Kx, Sergey was the CTO at Saxo Bank in London and previously worked for many years as a senior developer in other financial institutions.

© 2018 Kx Systems
Kx® and kdb+ are registered trademarks of Kx Systems, Inc., a subsidiary of First Derivatives plc.

