Nick Psaris’ kdb+ coding contest

21 Jul 2015 | ,
Share on:

Nick Psaris, author of Q Tips: Fast, Scalable and Maintainable Kdb+, stopped by the July Kx Community NYC Meetup last week and picked the winner of his latest kdb+ coding contest.

Congratulations go to Anjani Singh!

Here is Nick’s account of the contest results. The rules are below.

The implementation used the following techniques to win:

  1. placing an attribute on the key of the character mapping dictionary,
  2. vectorizing the conversion from character to number by first razing the list of VINs, and
  3. reducing the number of computations by skipping invalid check digits.
m:(`u#"0123456789ABCDEFGHJKLMNPRSTUVWXYZ")!0 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 1 2 3 4 5 7 9 2 3 4 5 6 7 8 9f;
x:$[t:type x;enlist x;x];
if[count x@:k:where j:x[;8]in c:"0123456789X";j[k]:x0[8+17*til count x]=c"i"$mod[;11f](0N 17#m x0:raze x)$8 7 6 5 4 3 2 10 0 9 8 7 6 5 4 3 2f];
$[t;first j;j]}
q)vins:(1000000#17)?:.Q.nA except "IOQ"
q)ts validvin vins

Another technique that could make this implementation even faster is to use a lookup table instead of the ‘mod’ operator.

A cleaned-up solution using all these techniques:

if[type x;:first .z.s enlist x];
m:(`u#.Q.nA except "IOQ")!"f"$(40#til 10) _/ 31 30 28 26 20 19 10;
w:8 7 6 5 4 3 2 10 0 9 8 7 6 5 4 3 2f;
v:x[;8]in c;
if[count k:where v;v[k]:r[8+17*til count x]=(802#c)"j"$(17 cut m r:raze x@:k)$w];

Kostyantyn Oliynyk came in a close second by flipping the list of VINs before performing the weighted sum.

Nick’s Coding Contest Rules

The participants of the Kx Community Hong Kong Meetup recently competed to write the fastest computation of an HKID check digit, see Computing ‘check digits’ contest with kdb+.

There are many more identifiers that contain check digits.  One such identifier is the North American Vehicle Identifier Number (or VIN).

The goal for the Kx Community NYC Meetup will be to write a function that validates a list of VINs as fast as possible.

This differs from the Hong Kong Meetup in two ways.  First, the submitted function should not return the check digit, but just ensure that the supplied identifier has the proper check digit in the 9th position.

q)validvin "11111111111111111"

And secondly, the function should work for both single identifiers and a list of identifiers:

q)validvin ("5GZCZ43D13S812715";"SGZCZ43D13S812715";"WP0ZZZ99ZTS392124"; "KLATF08Y1VB363636";"1M8GDM9AXKP042788")

The performance will be based on running the function on a list of 1 million randomly generated VINs:

q)vins:(1000000#17)?:.Q.nA except "IOQ"
q)ts validvin vins
1796 520388816

Three extra rules:

  1. User-generated global variables can not be used
  2. Code can not be generated from text by using ‘parse’, ‘eval’, ‘get’, etc…
  3. Assume q will be run without slaves and using ‘peach’ will therefore not help

The winner will receive a signed copy of Q Tips: Fast, Scalable and Maintainable Kdb+.