An example of lazy evaluation in Kap

March 12, 2022

Kap is built around the observation that a lot of straightforward APL code ends up expressing the solution to a problem in a very clear and concise way, but has the drawback of performing excessive computation.

Let's say you have a number of products, let's say mobile applications, made by a wide range of authors. Each application has a rating between 0 and 5. For the purpose of this example, we only care about the ratings of the applications and nothing else, so we can represent this information using a simple 2-dimensional matrix. Each row represents an application, with the first column being the name of the developer and the second the score.

    applist ← ("John" "Lars" "Peter" "John" "John" "Lars" "Lars") ,[0.5] 3 4 1 5 0 3 2
┏━━━━━━━━━┓
┃ "John" 3┃
┃ "Lars" 4┃
┃"Peter" 1┃
┃ "John" 5┃
┃ "John" 0┃
┃ "Lars" 3┃
┃ "Lars" 2┃
┗━━━━━━━━━┛

You are now tasked to find a developer who has had no applications rated 5 (perhaps the purpose is to sell them developer training). For the example above, that would be Lars.

A straightforward way to think about this problem is to break it up into three separate steps:

Collect the the scores for each developer
Filter the results to remove all developers with applications with a score of 5
Return the first result

In APL (using Dyalog for this example), this can be achieved as follows:

names ← applist[;0]
scores ← applist[;1]
⊃ {⍵[⍵[;1] ⍳ 1 ; 0]} names {⍺ , ~5∊⍵}⌸ scores

The above is a pretty much one-to-one mapping to the simple solution. It is definitely not the best solution, and the astute reader may note that this code checks whether all the developers had a score-5 application, even though we're only interested in the first one. In this particular case we're only calling Member, and it doesn't matter much from a performance perspective, but if the operation was much more complicated, this could result in a rather substantial amount of data being calculated which are immediately thrown away.

A detour: Implementing Key in Kap

At the time of this writing, Kap does not implement the Dyalog Key operator. The reason for this is that the precise syntax to use hasn't been decided, but we can easily implement it ourselves:

∇ (keys) key (values) {
	a ← ∪keys
	a ,[0.5] {values ⫽⍨ keys≡¨⊂⍵}¨ a
}

This function implements a variant of the dyadic form of Key which has the following properties: Given a vector of keys on the left, group the corresponding values on the right and return a matrix with two columns where the first column is the key and the second is a vector of all values mapped to that key.

Thus, the following:

1 2 1 1 key "foo" "bar" "abc" "def"

Returns:

┏━━━━━━━━━━━━━━━━━━━━━┓
┃1 ┏━━━━━━━━━━━━━━━━━┓┃
┃  ┃"foo" "abc" "def"┃┃
┃  ┗━━━━━━━━━━━━━━━━━┛┃
┃2             ┏━━━━━┓┃
┃              ┃"bar"┃┃
┃              ┗━━━━━┛┃
┗━━━━━━━━━━━━━━━━━━━━━┛

The way this implementation works should be reasonably simple to understand. The variable a is assigned a vector of the unique values, then for each unique key, select all values mapped to that key and collect it into a vector. Finally, concatenate the list of keys with the results.

Porting the original problem to Kap

With this new function in our toolbox, we can now create a version of the previous program that works in Kap:

⊃ {⍵[1 ⍳⍨ ~5 ∊¨ ⍵[;1] ; 0]} names key scores

At this point, we have two very similar programs in APL and Kap. The method used may not be the best for either language, but it's a very simple implementation of the original idea.

The effect of lazy evaluation

In Kap, the function io:println is used to print its argument, followed by a newline. The function then returns its argument. Thus the following will print all the values in scores while computing its average:

    (+/÷≢) io:println¨ scores
3
4
1
5
0
3
2

2.5714285714285716

The above is not surprising, but let's do the same thing to the previous example:

    ⊃ {⍵[1 ⍳⍨ ~5 ∊¨ ⍵[;1] ; 0]} names key io:println¨ scores
3
5
4
3
2

"Lars"

Note how it only printed 5 values from the scores vector. What happened to the other two values?

Whenever possible, Kap will avoid performing any computations until the result of that computation is needed. In the case above, io:println¨ scores does not evaluate the function, but instead returns an object that will at some point in the future return the value you need. If that value was never requested, the function is never called.

There are some exceptions to the rule above with the main one being that variable assignment forces what is referred to as “collapse”. The collapse operation ensures that all results are computed. Thus, the following prints all 7 scores:

    ⊃ {⍵[1 ⍳⍨ ~5 ∊¨ ⍵[;1] ; 0]} names key a←io:println¨ scores
3
4
1
5
0
3
2

"Lars"

It is also possible to use the function comp (for “compress”) to force evaluation.

Benefits and drawbacks of lazy evaluation

The benefits in cases like the above should be obvious. The more simple approach to solving a problem will often result in fewer computations than the APL solution.

The drawback is that evaluation is more complicated which precludes Kap from implementing low level optimisations that Dyalog and BQN are able to do. These implementations have highly optimised code paths for performing vector calculations on very large arrays of numbers. Doing this in Kap would be difficult, as the implementation would need to determine, at runtime, whether or not it is more beneficial to perform an optimised full-array operation which can use vector instructions in the CPU, or whether it's better to delay the computation in case not all results will be needed.

I can be reached on the Fediverse at @loke@functional.cafe