Kdb+/q Insights: Scripting with q

20 Jun 2019

by David Crossey

Introduction

A script is a program, or sequence of instructions, that another program interprets one instruction at a time before executing it. A q script contains a set of instructions written in q (or k, for the adventurous) that a q session interprets and then executes in linear sequence at runtime. These instructions are written in the same manner as those executed interactively in a q session.

In kdb+/q, we can interactively execute q commands in our command prompt:

q)a:10;
q)b:3.2;
q)f:{x+y};
q)f[a;b]
13.2

Here we have defined some variables and a function. However, if we close our q session and start another, these variables and functions are lost. Variables and functions are the fundamental building blocks of complex q programs and algorithms: initialized variables provide the input to an entry function, which in turn calls a tree of further functions to generate some end result.

Instead of manually executing instructions for each instance, scripting allows us to automate the execution of elaborate instruction sets as a program, a process that is significantly faster and less error-prone than entering each instruction by hand.

I will continue to make reference to interactive and non-interactive sessions throughout this blog, so let me loosely define these now as:

  • Interactive – user interacting with the q session through standard input i.e. entering q commands
  • Non-interactive – no user interaction with the q session; excluding session initialization

For the purposes of this blog I will be demonstrating the application of q scripts using a Linux distribution, Ubuntu 18.04.1 LTS installed via the Windows Store, along with Anaconda to install q on the Linux environment.

Creating q scripts

To create a q script, open your preferred editor and create a new file with the .q file extension. The following q script will print “hello world” to the console and exit the q session with a “success” status code of 0:

$ cat helloworld.q
0N!"hello world";
exit 0;

We can either load a script from the command line (non-interactive)[1][2]:

$ q helloworld.q -q
"hello world"

Or load the script from within a q session (interactive) using either form of the load system command, depending on your needs:

$ q -q
\l helloworld.q
"hello world"
$ q -q
system "l helloworld.q"
"hello world"

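One reason to prefer the system form is that the file name can be built at runtime, whereas \l takes its argument literally. A minimal sketch, assuming a throwaway script demo.q and a variable answer, neither of which is part of the examples above:

```q
/ write a one-line script to disk so the example is self-contained
/ (demo.q and answer are hypothetical names used only for illustration)
`:demo.q 0: enlist "answer:42";

/ the file name is held in a variable and joined to the l command
file:"demo.q";
system "l ",file;

answer    / 42
```
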
Passing parameters to a script

Scripts can use parameters to influence their output. These can be defined as variables within the interactive session prior to script execution, or passed to the script dynamically from the command line.

Here we edit our script to say hello to a single person, instead of the world:

$ cat helloworld.q
0N!"hello ",name;
exit 0;

$ q
q)name:"david";
q)\l helloworld.q
"hello david"

Note – it is generally recommended to avoid writing scripts which depend on variables being manually declared in the session beforehand.

A better approach using the command line would be to pass in the variable and have our script parse this value:

d:.Q.opt .z.x;
0N!"hello ",first d[`name];
exit 0;

Here we introduce .z.x, which holds the command line arguments as a list of strings.

$ q -name "david"
q).z.x
"-name"
"david"

.Q.opt takes this list of strings and builds a dictionary for us to easily access the command line parameters using key/value pairs which are assigned to variable d:

q)d:.Q.opt .z.x;
q)d
name| "david"
q)d[`name]
"david"

Each value in our dictionary d is a list of strings, so we use the first keyword to extract the single string associated with a key.

q)d:.Q.opt .z.x
q)d[`name]
"david"
q)type d[`name]
0h
q)type first d[`name]
10h
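
The same behaviour can be tried in any session by handing .Q.opt a literal list of strings in place of .z.x. This is only a sketch: the -port parameter is a hypothetical addition, used to show that every value arrives as a string and has to be parsed by the script.

```q
/ stand-in for .z.x so this can be run without command line arguments
/ (-port is a hypothetical parameter, not used elsewhere in this post)
args:("-name";"david";"-port";"5000");

d:.Q.opt args;

first d`name        / "david"
"J"$first d`port    / 5000, parsed from its string form
```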

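Taking this a step further, we can introduce typed command line params using .Q.def. This function takes a dictionary mapping parameter names (as symbols) to default values and applies it to the dictionary produced by .Q.opt, so that each supplied argument takes the type of its default.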

$ q -q
q)d:.Q.def[enlist[`name]!enlist enlist "Joe"] .Q.opt .z.x;
q)d[`name]
"Joe"
$ q -q -name "David"
q)d:.Q.def[enlist[`name]!enlist enlist "Joe"] .Q.opt .z.x;
q)d[`name]
"David"

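With atom defaults, .Q.def returns typed atoms directly, so no first is needed. A short sketch, again using a literal list in place of .z.x; the parameter names port, timeout and env are assumptions made for illustration:

```q
/ stand-in for .z.x: only port is supplied on the "command line"
args:("-port";"6000");

/ the defaults also fix the type of each parameter
/ (port, timeout and env are hypothetical parameter names)
defaults:`port`timeout`env!(5000;1.5;`dev);

d:.Q.def[defaults] .Q.opt args;

d`port       / 6000, a long like its default
d`timeout    / 1.5, the default
d`env        / `dev, the default
```
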
Hiding Source code

We may wish to hide the contents of our code when sharing scripts with other parties, for example to protect intellectual property. This can be achieved at runtime with the system command \_ which will prevent the q code from being read or serialized.

A new file will be created on our filesystem with an underscore suffix:

$ q -q
\_ helloworld.q
`helloworld.q_

Let’s edit our helloworld.q script to contain a function which only prints “hello world” when we call the function and protect our code:

$ cat helloworld.q
sayhello:{0N!"hello world"};

Next we will start a q session calling our private script. This will load our function definition for sayhello. This function is locked (i.e. we cannot view its definition), but generates the correct result when executed:

$ q helloworld.q_ -q
q)\f
,`sayhello
q)sayhello
locked
q)sayhello[];
"hello world"

Script Layout

To help yourself, and others who may read the script you write, it’s good practice to have a common layout of various components of your scripts.

A typical script outline may be broken down as follows:

  • Capture command line args
  • Define variables
  • Define functions
  • Define main function
  • Call main to start

$ cat helloworld.q
/command line args
d:first each .Q.def[enlist[`name]!enlist enlist "Joe"] .Q.opt .z.x;
/variable definitions
currentTime:.z.t;
/function definitions
sayhello:{0N!"Hello ",x};
telltime:{0N!"It is currently ",string x};

/main function definition
main:{
  sayhello d[`name];
  telltime currentTime;
  exit 0;
 };

/start script
@[main;`;{0N!"Error running main: ",x;exit 1}];

$ q helloworld.q -q
"Hello Joe"
"It is currently 11:17:57.966"

The above is only a suggestion of how to lay out a script; personal preference usually takes precedence in the real world.

You can find some more examples on layout here:

Adding logging to scripts

It is generally good practice to log as much detail as possible in scripts for reference and debugging.

In the following script loghello.q we will define a namespace .log to logically organize our logging functions. We will define one function that logs normal messages to stdout (handle -1) and another that logs error messages to stderr (handle -2). See File Handles for more info.

d:first each .Q.def[enlist[`name]!enlist enlist "Joe"] .Q.opt .z.x;

\d .log
out:{-1 string[.z.p]," ### INFO ### ",x};
err:{-2 string[.z.p]," ### ERROR ### ",x};
\d .
sayhello:{.log.out["Hello ",x]};

main:{
  sayhello d[`name];
  exit 0;
 };

@[main;`;{.log.err "Error running main: ",x;exit 1}];

$ q helloworld.q -q -name "David"
2019.06.04D10:08:45.426552000 ### INFO ### Hello David
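
As more log levels are added, the message format can be kept in one place. The following is only a sketch of one way to do that: fmt and warn are hypothetical names that do not appear in the script above.

```q
\d .log
/ hypothetical shared formatter: timestamp, level, then the message
fmt:{[lvl;msg] string[.z.p]," ### ",lvl," ### ",msg};
/ normal and warning messages go to stdout, errors to stderr
out:{-1 fmt["INFO";x];};
warn:{-1 fmt["WARN";x];};
err:{-2 fmt["ERROR";x];};
\d .

.log.warn "disk space is low";
```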

If the script throws an error during the main function, you will see the following exception handling:

$ q helloworld.q -q
2019.06.12D03:10:40.434211000 ### INFO ### Hello Joe
2019.06.12D03:10:40.434306000 ### ERROR ### Error running main: type
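
This handling comes from the three-argument form of @ on the last line of the script: @[function;argument;handler] calls the function with the argument and, if an error is signalled, passes the error message to the handler as a string instead of stopping the script. A small sketch, with safe as a hypothetical helper name:

```q
/ wrap any unary function so that errors come back as a message
/ (safe is a hypothetical helper, not part of the script above)
safe:{[f;arg] @[f;arg;{"Error: ",x}]};

safe[{x+1};41]    / 42
safe[{x+1};`a]    / "Error: type"
```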

Extending q scripts to OS scripts

In the following section we are going to explore how a Linux-based environment can interact with our q scripts and expand their capabilities using the Bash shell.

Bash commands can be grouped together and executed collectively in a Bash script, conventionally given the .sh file extension.

With this knowledge, we can extend OS functionality and utilities to our non-interactive q scripts by wrapping our sessions in shell scripts.

This is particularly useful when we want to limit the number of cores a q session can run on, copy data using OS utilities, or send data across servers using rsync etc.

To demonstrate, let’s revisit our ‘helloworld.q’ script. Instead of calling it directly, we will pass our parameters to Bash, which will invoke our q script but limit its execution to the third core (CPU index 2, since taskset numbers cores from zero).

$ cat helloworld.sh
#!/bin/bash
taskset -c 2 q ./helloworld.q -q -name "$1"
exit 0;

To run helloworld.sh directly (as ./helloworld.sh) we need to make the file executable; invoking it through bash, as below, also works without this step:

$ chmod +x helloworld.sh
$ ls -l helloworld.sh
-rwxrwxrwx 1 dcrossey dcrossey 87 Jun  2 18:00 helloworld.sh
$ bash helloworld.sh "David"
2019.06.04D11:27:22.712672000 ### INFO ### Hello David
2019.06.04D11:27:22.713419000 ### INFO ### Exiting

If we edit the q script and comment out the exit command, the q session stays running. We can then open another terminal and use the taskset utility to view which core it is assigned to:

First terminal

$ bash helloworld.sh "David"
2019.06.12D03:02:47.074146000 ### INFO ### Hello David

Second terminal

$ ps -ef | grep helloworld.q | grep -v "grep"
dcrossey   131   130  1 13:02 tty1     00:00:00 q ./helloworld.q -q -name David
$ taskset -pc 131
pid 131's current affinity list: 2
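
The same check can also be made from inside the running q session, without a second terminal. This is a sketch that assumes a Linux host with the taskset utility on the PATH; the variable names pid and affinity are chosen here for illustration.

```q
/ .z.i is the process id of the current q session
pid:string .z.i;

/ run the same taskset query as above and capture its output lines
/ (assumes Linux with taskset installed)
affinity:system "taskset -pc ",pid;

/ print the captured lines to stdout
-1 affinity;
```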

 

[1] This q session has been started with the -q command line option to hide the startup banner

[2] Scripts should always be listed before command line args i.e. q [scriptname] [command line options…]
