Axis labels in Kap

Last weekend I was listening to episode 99 of Array Cast. The topic of the fortnight was array indexing, or the way to read values out of arrays.

A programmer unfamiliar with array languages may wonder how such a simple concept can fill an entire hour-and-a-half episode (and they may even have to extend it to another episode). For the full answer I would recommend listening to the episode, but the short version is that array languages focus on organising data in arrays and then performing actions on entire arrays in one operation. This means that reading data from and writing data to arrays is probably the most important thing one does when using an array language.

In the episode, Stephen Taylor talks about how Q allows for indexing the columns of a table by name rather than by number. This is a nice feature that it shares with R.

Kap has this functionality too. It's not well documented, and I'll try to improve the documentation, but in the meantime I hope this blog post will serve as an introduction to the feature.

Creating labelled arrays

In Kap, arrays can carry some metadata that describes their content. This metadata includes the axis labels, which are updated or accessed using the function labels (docs).

Here's how you can assign labels to the columns of an array:

    content ← 4 3 ⍴ ⍳12
┌→──────┐
↓0  1  2│
│3  4  5│
│6  7  8│
│9 10 11│
└───────┘
    content ← "foo" "bar" "abc" labels content
┌───┬───┬───┐
│foo│bar│abc│
├→──┴───┴───┤
↓  0   1   2│
│  3   4   5│
│  6   7   8│
│  9  10  11│
└───────────┘

The labels are additional metadata and do not change the shape or the content of the array:

    ⍴ content
┌→──┐
│4 3│
└───┘

The labels function accepts an array of strings on the left, and assigns them to the given axis. The number of elements has to match the size of that axis. By default this function assigns labels to the last axis, but an axis argument can be used to specify which axis to use:

    content ← "Satu" "Dua" "Tiga" "Empat" labels[0] content
┌───┬───┬───┐
│foo│bar│abc│
├→──┴───┴───┤
↓  0   1   2│
│  3   4   5│
│  6   7   8│
│  9  10  11│
└───────────┘

The above call assigned a label to each row, but the row labels are not displayed. They are there, but the current version of the text renderer does not include them. However, we can transpose the array to show that they are there:

    ⍉ content
┌────┬───┬────┬─────┐
│Satu│Dua│Tiga│Empat│
├→───┴───┴────┴─────┤
↓   0   3    6     9│
│   1   4    7    10│
│   2   5    8    11│
└───────────────────┘

Calling labels monadically will return the labels for a given axis:

    labels content
┌→────────────────┐
│"foo" "bar" "abc"│
└─────────────────┘
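
Combined with the transpose shown earlier, this gives one way to read the row labels back. As a small sketch that relies only on the behaviour seen so far (transposing moves the row labels to the last axis, which is the axis monadic labels reads by default), the following should return the four row labels "Satu" "Dua" "Tiga" "Empat":

    labels ⍉ content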

Using labels

Some functions return arrays with labels where appropriate. The main one is the sql:query (reference) function, which adds the column names as labels.

When reading CSV data, the first row is often a set of labels:

    csv ← io:readCsv "file.csv"
┌→──────────────────────────┐
↓"col1" "col2" "col3" "col4"│
│     0      1      2      3│
│     4      5      6      7│
│     8      9     10     11│
│    12     13     14     15│
│    16     17     18     19│
└───────────────────────────┘

When loading a CSV, instead of just dropping the first row, it can be assigned as column labels:

    csv ← (>1↑)«labels»↓ csv
┌────┬────┬────┬────┐
│col1│col2│col3│col4│
├→───┴────┴────┴────┤
↓   0    1    2    3│
│   4    5    6    7│
│   8    9   10   11│
│  12   13   14   15│
│  16   17   18   19│
└───────────────────┘
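
That expression is compact, so it may help to spell it out. The following is only a sketch of a long form, applied to the freshly read csv in place of the line above. It assumes that A«B»C is a fork which, given a single argument, evaluates (A csv) B (C csv), that 1↑ keeps the first row, that monadic > demotes the resulting one-row matrix to a plain list, and that monadic ↓ drops the first row. The names header and rows are made up for illustration:

    header ← >1↑ csv
    rows ← ↓ csv
    csv ← header labels rows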

Selection by label

The Kap standard library contains the s namespace, which holds a few utility functions. They are not documented yet, since they are still a work in progress.

One of these functions is the s:cols function, which returns columns from a table based on the column labels:

    "col3" "col4" s:cols csv
┌────┬────┐
│col3│col4│
├→───┴────┤
↓   2    3│
│   6    7│
│  10   11│
│  14   15│
│  18   19│
└─────────┘

There is also the s:col function if you only want to read one column:

    "col2" s:col csv
┌→──────────┐
│1 5 9 13 17│
└───────────┘
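
Going by the output above, the result is an ordinary one-dimensional array, so it should be usable with the usual primitives. As a sketch, a plain reduction sums the column, which for the values shown adds up to 45:

    +/ "col2" s:col csv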

Chart functions

If present, axis labels will automatically be used as labels in charts:

    temperatures ← `
        "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" labels "Singapore" "London" labels[0] 2 5 ⍴ 31 29 30 31 31 5 7 10 13 13
    chart:line temperatures

This will display the following chart:

(Image: labelled axis in chart)