Kdb+ and the history of the Q phrasebook

17 Sep 2019 | , ,
Share on:

By Stephen Taylor, Kx Librarian

 

The Q Phrasebook is a collection of q expressions that solve common problems when programming in q/kdb+. It is useful as a reference library for coders, and as a set of problems for those learning the language to deepen their understanding of the underlying principles of array programming.

The phrasebook also has an interesting genealogy that goes back to the original array programming language, APL, invented by Kenneth E. Iverson.

History

Iverson was a mathematician at  Harvard in the 1950s and 1960s who devised a mathematical notation for describing computational processes, which he called APL. This was the first array-programming language, and introduced the use of vectors instead of scalars, creating an entirely new approach to computer programming.

Subsequent array-programming languages inspired by APL include A+, J, k, q and Niall. These languages all share the expressiveness of APL and its implicit iterations. Iversonian languages use short expressions for many tasks handled in other programming languages by auxiliary libraries. Learning those expressions is an important part of acquiring fluency in them. An early catalog of such expressions – which is a library unto itself – was the FinnAPL Idiom Library, a legendary resource for APL programmers.

Iverson was fascinated by the importance of “Notation as a Tool of Thought,” which he described in his legendary Turing Award speech. Iverson later created the J programming language, a development of APL, with Roger Hui. Iverson and Hui used the American Heritage Dictionary rigorously to align their description of the J language with common English usage. That led them to prefer terms like “phrase“ to “idiom” because, generally, an idiom cannot be understood by analyzing its individual words.

Building on the concept of phrases vs idioms, Eugene McDonnell created a phrasebook for J. This phrasebook is not only an excellent reference for learning and programming in J, but it is also an example of the evolution of the array-programming culture. This culture draws on the original precepts established by the Iversonian languages.

When McDonnell later began learning k, he translated his J phrasebook into k. In later years, when he began learning q, Jay Han ported the k phrasebook to q which existed as a text file on the Kx site as Q Idioms. In my role as librarian for Kx, I have expanded the q Phrasebook and made it more accessible by categorizing and indexing the phrases and providing more examples on the Kx developers’ site, code.kx.com.

Here are two extended examples of the use of phrases to solve interesting problems in q/kdb+.

The ripple effect

A perfect ripple shuffle (strictly riffle shuffle) splits a deck of 52 cards into two packs of 26 cards and interleaves alternate cards from each pack.

This is the famous expression for a ripple shuffle that used to be featured on an APL t-shirt.

      d←⍳52 ⍝ deck
      d[⍋⍒(⍴d)⍴0 1]⍝ ripple shuffle
27 1 28 2 29 3 30 4 31 5 32 6 33 7 34 8 35 9 36 10 37 11 38 12 39 13 40 14 41 ..

Keywords in q replace APL’s distinctive glyphs.

q)d:til 52                      / deck
q)d iasc idesc count[d]#0 1     / ripple shuffle
26 0 27 1 28 2 29 3 30 4 31 5 32 6 33 7 34 8 35 9 36 10 37 11 38 12 39 13 40 ..

Having an even-length vector to split and interleave in this way is an uncommon problem in commercial programming. More common is interleaving two equal-length vectors, for which the q is simple: x,'y But back before APL had (its equivalent of) the Each iterator, you could use ripple shuffle to the same effect.

(x,y) iasc idesc (count[x]+count y)#0 1

Undoing the interleaving can be seen as reversing the shuffle.

q)s:d iasc idesc count[d]#0 1       / shuffled deck
q)s
26 0 27 1 28 2 29 3 30 4 31 5 32 6 33 7 34 8 35 9 36 10 37 11 38 12 39 13 40 ..
q)s idesc count[s]#0 1              / reverse the shuffle
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 ..

Of course, there are other ways to imagine the shuffle and its inverse.

Here, the initial (2,arf)#d cuts the deck into two packs, which are then reversed, flipped and razed.

q)arf:(count d)div 2
q)(2,arf)#d
0  1 2  3 4 5  6 7 8 9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
q)show s:raze flip reverse (2,arf)#d
26 0 27 1 28 2 29 3 30 4 31 5 32 6 33 7 34 8 35 9 36 10 37 11 38 12 39 13 40 ..
q)raze reverse flip 2 cut s
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 ..

All very well with d an integer vector. But with a vector of dictionaries we might prefer to avoid all that taking, reversing and flipping of the data.

Better to manipulate the indexes.

q)show s:d raze (arf,0)+/:til arf
26 0 27 1 28 2 29 3 30 4 31 5 32 6 33 7 34 8 35 9 36 10 37 11 38 12 39 13 40 ..
q)s raze flip 1 0+/:2*til arf
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 ..

For short vectors such as these, this is all just elegant variation.

But when you are trying to trim your execution time, you want variations you can test. Studying the q phrases and their variations expands your repertoire.

Way to go

There is fascinating new material still to add. The Finance section does not yet reflect decades of experience with q in financial markets. And applying arrays with iterators permits representations of finite-state machines not possible in the ancestor languages.

q)yrp                               / a European tour
from   to wp
----------------
London Paris  0
Paris  Genoa 1
Genoa  Milan 1
Milan  Vienna 1
Vienna Berlin 1
Berlin London 0
q)show route:yrp[`from]!yrp[`to]
London| Paris
Paris | Genoa
Genoa | Milan
Milan | Vienna
Vienna| Berlin
Berlin| London

The dictionary route is a finite-state machine: its values are also valid keys.

q)(route\)`Genoa                           / a circular tour
`Genoa`Milan`Vienna`Berlin`London`Paris
q)3 route\`London                          / 3 legs of the tour
`London`Paris`Genoa`Milan
q)(`Berlin<>)route\`Paris                  / Paris to Berlin
`Paris`Genoa`Milan`Vienna`Berlin
q)waypoints:(!/)yrp`from`wp
q)waypoints route\`Paris                   / Paris to the end
`Paris`Genoa`Milan`Vienna`Berlin

Fibonacci numbers

The phrases contain the elegant q lambda for Fibonacci numbers.

q)10{x,sum -2#x}/0 1   / first ten Fibonacci numbers
0 1 1 2 3 5 8 13 21 34 55 89

That much comes from the earlier array-programming languages, and is satisfyingly close to how the series is usually defined. But the notation also makes explicit a relationship between the Fibonacci numbers and matrix multiplication:

q)m:(0 1f;1 1f)
q)10(m mmu)\1 1f
1  1
1  2
2  3
3  5
5  8
8  13
13 21
21 34
34 55
55 89
89 144

Looking forward

Each of the antecedents of the Q Phrasebook have included new powerful abstractions using notations, including this last version. Please consider making your own contribution to the next version of the Q Phrasebook. Send your comments, suggestions for extensions, clarifications and new notations to .

We look forward to what the Q Phrases will become!

SUGGESTED ARTICLES

Kdb+ Utilities: Essential utility for identifying performance problems

28 Feb 2018 |

If you are a kdb+/q developer, you will find the utilities created by Kx Managing Director and Senior Solution Architect Leslie Goldsmith to be a valuable resource. The “Kdb+ Utilities” series of blog posts gives a quick introduction to the utilities, available at Leslie Goldsmith’s GitHub. In this third part of the series we look at Leslie’s qprof, which allows a programmer to drill down into q functions or applications to inspect performance and CPU usage in a fine-grained fashion.

kdb+ utility to search codebase

Kdb+ Utilities: Q code Workspace Utilities

6 Feb 2018 |

If you are a kdb+/q developer, you will find the workspace utilities created by Kx Managing Director and Senior Solution Architect Leslie Goldsmith to be a valuable resource. This is the first in a series of blog posts that give a quick introduction to several utilities available at Leslie Goldsmith’s GitHub. In this part of the series we look at an essential tool which contains routines for summarizing and searching the contents of a workspace, ws.