Elias Mårtenson

@loke@functional.cafe

Kap is built around the observation that a lot of straightforward APL code expresses the solution to a problem in a very clear and concise way, but has the drawback of performing excessive computation.

Let's say you have a number of products, mobile applications for example, made by a wide range of authors. Each application has a rating between 0 and 5. For the purpose of this example we only care about the ratings of the applications and nothing else, so we can represent this information using a simple 2-dimensional matrix. Each row represents an application, with the first column being the name of the developer and the second the score.

    applist ← ("John" "Lars" "Peter" "John" "John" "Lars" "Lars") ,[0.5] 3 4 1 5 0 3 2
┏━━━━━━━━━┓
┃ "John" 3┃
┃ "Lars" 4┃
┃"Peter" 1┃
┃ "John" 5┃
┃ "John" 0┃
┃ "Lars" 3┃
┃ "Lars" 2┃
┗━━━━━━━━━┛

You are now tasked with finding a developer who has had no applications rated 5 (perhaps the purpose is to sell them developer training). In the example above, that would be Lars.

A straightforward way to think about this problem is to break it up into three separate steps:

  • Collect the scores for each developer
  • Filter the results to remove every developer who has an application with a score of 5
  • Return the first result

In APL (using Dyalog for this example), this can be achieved as follows:

names ← applist[;0]
scores ← applist[;1]
⊃ {⍵[⍵[;1] ⍳ 1 ; 0]} names {⍺ , ~5∊⍵}⌸ scores

The above is pretty much a one-to-one mapping of the simple solution. It is definitely not the best solution, and the astute reader may note that this code checks whether every developer had a score-5 application, even though we're only interested in the first one. In this particular case we're only calling Member, so it doesn't matter much from a performance perspective, but if the operation were much more complicated, this could result in a rather substantial amount of data being calculated which is immediately thrown away.

A detour: Implementing Key in Kap

At the time of this writing, Kap does not implement the Dyalog Key operator. The reason for this is that the precise syntax to use hasn't been decided, but we can easily implement it ourselves:

∇ (keys) key (values) {
	⍝ The unique keys, in order of first appearance
	a ← ∪keys
	⍝ For each unique key, keep the values whose key matches it,
	⍝ then laminate the keys with their value vectors into a two-column matrix
	a ,[0.5] {values ⫽⍨ keys≡¨⊂⍵}¨ a
}

This function implements a variant of the dyadic form of Key which has the following properties: Given a vector of keys on the left, group the corresponding values on the right and return a matrix with two columns where the first column is the key and the second is a vector of all values mapped to that key.

Thus, the following:

1 2 1 1 key "foo" "bar" "abc" "def"

Returns:

┏━━━━━━━━━━━━━━━━━━━━━┓
┃1 ┏━━━━━━━━━━━━━━━━━┓┃
┃  ┃"foo" "abc" "def"┃┃
┃  ┗━━━━━━━━━━━━━━━━━┛┃
┃2             ┏━━━━━┓┃
┃              ┃"bar"┃┃
┃              ┗━━━━━┛┃
┗━━━━━━━━━━━━━━━━━━━━━┛

The way this implementation works should be reasonably simple to understand. The variable a is assigned a vector of the unique keys; then, for each unique key, all values mapped to that key are selected and collected into a vector. Finally, the list of keys is laminated with the results.

Porting the original problem to Kap

With this new function in our toolbox, we can now create a version of the previous program that works in Kap:

⊃ {⍵[1 ⍳⍨ ~5 ∊¨ ⍵[;1] ; 0]} names key scores
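
To make the indexing clearer, it may help to look at the intermediate result of names key scores: a two-column matrix pairing each developer with the vector of their scores. Schematically (the real output uses the boxed rendering shown earlier):

    "John"   3 5 0
    "Lars"   4 3 2
    "Peter"  1

The lambda then searches column 1 for the first scores vector that does not contain a 5, and returns the corresponding name from column 0.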

At this point, we have two very similar programs in APL and Kap. The method used may not be the best for either language, but it's a very simple implementation of the original idea.

The effect of lazy evaluation

In Kap, the function io:println is used to print its argument, followed by a newline. The function then returns its argument. Thus the following will print all the values in scores while computing its average:

    (+/÷≢) io:println¨ scores
3
4
1
5
0
3
2

2.5714285714285716

The above is not surprising, but let's do the same thing to the previous example:

    ⊃ {⍵[1 ⍳⍨ ~5 ∊¨ ⍵[;1] ; 0]} names key io:println¨ scores
3
5
4
3
2

"Lars"

Note how it only printed 5 values from the scores vector. What happened to the other two values?

Whenever possible, Kap will avoid performing any computations until the result of that computation is needed. In the case above, io:println¨ scores does not call the function immediately, but instead returns an object that will, at some point in the future, produce the values when they are needed. If a value is never requested, the function is never called.

There are some exceptions to the rule above with the main one being that variable assignment forces what is referred to as “collapse”. The collapse operation ensures that all results are computed. Thus, the following prints all 7 scores:

    ⊃ {⍵[1 ⍳⍨ ~5 ∊¨ ⍵[;1] ; 0]} names key a←io:println¨ scores
3
4
1
5
0
3
2

"Lars"

It is also possible to use the function comp (for “compress”) to force evaluation.
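
For example, wrapping the lazy array in a call to comp in the previous expression should force all seven prints as well, without introducing an assignment (a sketch, assuming comp simply collapses its argument and returns it):

    ⊃ {⍵[1 ⍳⍨ ~5 ∊¨ ⍵[;1] ; 0]} names key comp io:println¨ scores

The output should match the assignment version above: all seven scores, followed by "Lars".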

Benefits and drawbacks of lazy evaluation

The benefits in cases like the above should be obvious. A simple, direct formulation of a problem will often end up performing fewer computations in Kap than the equivalent eager APL solution.

The drawback is that evaluation is more complicated, which precludes Kap from implementing the kind of low-level optimisations that Dyalog and BQN are able to perform. These implementations have highly optimised code paths for performing vector calculations on very large arrays of numbers. Doing this in Kap would be difficult, as the implementation would need to determine, at runtime, whether it is more beneficial to perform an optimised full-array operation that can use the CPU's vector instructions, or to delay the computation in case not all results will be needed.

I can be reached on the Fediverse at @loke@functional.cafe

It will chew you up and spit you out, and you'll not even complain.

I often start new projects at the beginning of the year, around December/January. This is not because I particularly plan it that way, but probably more because the holidays give me time to actually experiment with new ideas that I've spent the previous year thinking about.

Around the time of the start of the pandemic (January 2020) I started working on designing my own programming language. It seemed like a great idea at the time: I wanted to explore some language ideas I felt had never really been tried before, and it should have been easy enough to build a proof-of-concept before going back to more rewarding projects.

Someone should have told me that once you design a programming language you fall down a rabbit hole so deep you can't even see the light when you look up anymore.

Programming language design is at once the most interesting technical challenge you can take on and, for the most part, an utter waste of time.

My own project has exactly 1 user (myself). There are a handful of other people who have an academic interest in what I do, but they are also language designers, working on their own related languages.

Even though you are aware that very few people, if any, will benefit from your work, it is impossible not to keep thinking about ways to improve your project. You keep adding new features that are really cool, and eventually everything programming-related you think about gets focused on how that particular topic can be applied to your language.

It's also difficult to go back to working on other languages. You end up thinking about how certain features or ideas could be applied to your language, and as soon as you have some spare time, it's back to the drawing board to implement these ideas in your project.

You are fully aware that your work is almost guaranteed to be of little to no value, but once you've fallen down the rabbit hole, you're stuck. Even when working on something else, it's impossible to stop thinking about all the small things you want to add to your project. I think this is the most devious part of language design. It covers so many different areas that you are never “done”:

  • Compiler techniques
  • Code optimisation
  • Maths
  • GUI programming (your language needs an IDE, right?)
  • Graphics programming
  • Operating system integration (standard library implementation)
  • Mobile development (programming on mobile devices is a thing these days)
  • Networking (again, needed for the standard library)
  • Writing (your language needs documentation)

In fact, I can't think of another programming project that literally touches on every aspect of software engineering. This is the main reason designing a programming language is a project that you can never finish.

It also affects your communication with the rest of the programming community. I'm sure you've all seen the person, that guy who can't stop himself from answering programming questions with an explanation of how it's done in his own language. The person who discusses software optimisation techniques in terms of how they implemented them in their own compiler.

I have to bite my tongue every time, and I still can't stop myself completely.

This is the destiny of most people who design programming languages. A few get lucky and stumble across something other people start to use.

That said, I'm not entirely sure they are lucky. Getting actual users is probably a recipe for never being able to leave the project. Perhaps I should be happy that I am the only user of my language.

@loke@functional.cafe

When mentioning APL to someone who has never used it, the reaction is often along the lines of “isn't that the unreadable language with weird symbols?”.

The classic example of APL code is Conway's Game of Life which can be written as such:

{⊃1 ⍵ ∨.∧ 3 4 = +/ +⌿ ¯1 0 1 ∘.⊖ ¯1 0 1 ⌽¨ ⊂⍵}

This suggests that the answer to the question above is yes, but the statement also carries an implication that the symbols are what makes the language different.

Another language in the same family as APL is J. This language uses many of the same principles as APL, and Game of Life in J looks like this:

n1dim=:3 +/\ (0,,&0)
neighbors=: -~ [: n1dim"1 n1dim
life=: [: (0.5 >: 3 |@- ]) -:+neighbors

For someone without knowledge of either APL or J, both of these examples are probably inscrutable. The point of the example above is to emphasise that it's not the symbols that make APL seem alien to a newcomer.

The question then becomes: Why does APL look the way it does?

To answer this question, we need to look into the history of the language and the requirements that led to its design.

The designer of both APL and J was Ken Iverson. He was a mathematician, and looked at problem solving from a mathematical perspective. APL was originally a notation for array manipulation, and it was only later that it was actually implemented as a programming language for computers.

Thus, to understand the APL notation, it's important to think about it as a concise way to describe mathematical operations, especially operations working on arrays of values.

The first step is to look at the syntax of maths, not programming languages. Let's take a simple example of summing all the elements of an array \(b\):

\[ \sum_i b_i \]

The above syntax is a bit imprecise and we are assuming that \(i\) will take on all the valid indexes in \(b\).

If we want to take the product of all the elements in \(b\), then we can replace the summation sign with the product symbol:

\[ \prod_i b_i \]

What if we want to create a chain of exponentiations? There is no dedicated symbol for this, so we need to write:

\[ b_0^{b_1^{b_2^{\cdot^{\cdot^{\cdot^{b_{n-1}}}}}}} \]

In the example above, we're assuming that the reader will understand that \(n\) represents the number of elements in the array.

The observation here is that these operations are all variations of the same underlying operation. In functional programming this is usually referred to as a reduction.

To perform the sum in APL, one would write the following:

+/b

The + is the operation to perform, and / is the reduction operator. The natural extension to the product is:

×/b

And for exponentiation:

*/b
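
Note that reduction in APL folds from the right, which is exactly what the exponentiation tower above requires. As a quick sanity check on a three-element vector:

*/2 3 2

evaluates as 2*(3*2), that is, 2 to the power of 9, which is 512.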

The argument in favour of APL notation is that it's simpler and more concise than traditional mathematical notation. That's no small feat, as maths is already supposed to be both simple and concise. Not only that, the APL notation is free of the imprecision in the mathematical examples discussed above.

The APL syntax also allows you to use any other function as part of a reduction. For example, the function ⌊ returns the smaller of its two arguments, so the expression ⌊/b can be used to find the smallest value in \(b\). The way to write this in mathematical notation is as follows:

\[ \min_i(b_i) \]

It is assumed that \(\min\) is well known, since in APL we make the same assumption concerning ⌊.

The reason why APL syntax looks the way it does, is because it's a mathematical notation that happens to be suitable for evaluation by a computer, rather than a set of instructions for a computer to perform.

Common Lisp is one of the few languages which is both dynamic and gives you a full native compiler, along with the ability to declare types so that the optimiser can produce more efficient code.

The following examples are with full optimisation enabled and safety flags off, just to get rid of the explicit type checks in the generated code.
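
Concretely, something along the following lines was in effect (a sketch of the standard way to request this; the exact qualities are a matter of taste):

(declaim (optimize (speed 3) (safety 0) (debug 0)))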

Let's take a small function that simply adds 100 to its argument:

(defun foo (x)
  (+ x 100))

After defining the function (no need to run a separate compiler or anything like that, this is all in the REPL), I can see the generated machine code by calling (disassemble #'foo):

CL-USER> (disassemble #'foo)
; disassembly for FOO
; Size: 18 bytes. Origin: #x52BF4B06                          ; FOO
; 06:       BFC8000000       MOV EDI, 200
; 0B:       FF142500010052   CALL QWORD PTR [#x52000100]      ; GENERIC-+
; 12:       488BE5           MOV RSP, RBP
; 15:       F8               CLC
; 16:       5D               POP RBP
; 17:       C3               RET

OK, that's a bit much, and notice how it calls a library function to perform the actual addition. This is because Common Lisp supports things like rational numbers and bignums, and since there are no declarations, the compiler has to assume that the argument can be any type of number.
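
We can see this genericity in action: on SBCL/x86-64, where most-positive-fixnum is 4611686018427387903, the undeclared version happily overflows into a bignum:

CL-USER> (foo most-positive-fixnum)
4611686018427388003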

Let's help the compiler a bit by telling it that the argument is in fact a FIXNUM (Lisp terminology for an integer that fits in a register):

(defun foo (x)
  (declare (type fixnum x))
  (+ x 100))

This is what the disassembly looks like now:

CL-USER> (disassemble #'foo)
; disassembly for FOO
; Size: 25 bytes. Origin: #x52BBBDC9                          ; FOO
; C9:       4883C264         ADD RDX, 100
; CD:       48D1E2           SHL RDX, 1
; D0:       710A             JNO L0
; D2:       48D1DA           RCR RDX, 1
; D5:       FF142560010052   CALL QWORD PTR [#x52000160]      ; ALLOC-SIGNED-BIGNUM-IN-RDX
; DC: L0:   488BE5           MOV RSP, RBP
; DF:       F8               CLC
; E0:       5D               POP RBP
; E1:       C3               RET

Interesting. It now does a regular machine addition, but there is more stuff going on here. What really happens is that even though we know the input will fit in a FIXNUM, we don't know if the result will. After all, the input could have been right at the maximum value for a FIXNUM. The code checks for an overflow and, if that happens, allocates a bignum and returns it.

Let's change the declaration to say that the input is simply an integer between 0 and 10000000. This way the compiler can be sure the result will fit in a FIXNUM:

(defun foo (x)
  (declare (type (integer 0 10000000) x))
  (+ x 100))

And now let's disassemble the code:

CL-USER> (disassemble #'foo)
; disassembly for FOO
; Size: 13 bytes. Origin: #x52BFC186                          ; FOO
; 86:       4881C2C8000000   ADD RDX, 200
; 8D:       488BE5           MOV RSP, RBP
; 90:       F8               CLC
; 91:       5D               POP RBP
; 92:       C3               RET

Nice. Now we can see that it performs a single ADD instruction and returns the result.

For completeness' sake, I want to explain two things that may be confusing:

First of all, why is it adding 200 and not 100? This is because SBCL (which I am using for these examples) uses the lowest bits of a register for type tags. If the least significant bit is 0, then the rest of the bits represent a FIXNUM. In other words, a fixnum n is stored as the machine word 2n, so adding 100 at the Lisp level means adding 200 to the tagged word. Since the value to add is a constant, the compiler simply adds 200 instead of emitting instructions to do bit shifting.

The second question may be why there is a CLC instruction there. This is because Common Lisp supports multiple return values (a function can return any number of values, although returning a single one is the typical case). If the carry flag is clear after a function call, the function returned a single value, which is stored in RDX. If the carry flag is set, there are multiple return values. When returning multiple values, the primary return value is still in RDX, which means that a caller who doesn't care about the other values can simply ignore the carry flag. It's pretty clever if you ask me.
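
To illustrate the multiple-value mechanism at the Lisp level (a quick sketch; the function name is made up for this example):

(defun square-and-cube (x)
  ;; VALUES returns both numbers; the first one is the primary value.
  (values (* x x) (* x x x)))

;; A caller that doesn't ask for the extra values only sees the primary one.
(+ 1 (square-and-cube 3))        ; => 10

;; MULTIPLE-VALUE-BIND picks up both.
(multiple-value-bind (sq cube) (square-and-cube 3)
  (list sq cube))                ; => (9 27)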

What's really cool about all of this is that I can do these things without ever restarting my Lisp runtime. I get the benefit of a high level language where I can work with the code without having to restart anything, but I can still take advantage of a native compiler and really good performance.

There are a lot of declarations one can apply. Not just type declarations, but several others that can be used to control the compiler. All the details can be found in the reference: http://clhs.lisp.se/Body/s_declar.htm#declare

I can be reached on the Fediverse @loke@functional.cafe

With all the discussions about how we're now entering a new recession, it may be tempting for people to compare it to earlier recessions. A long time has passed, and there are now adults who were not yet working during the dot-com bust, so they are not fully aware of the events leading up to it.

This is a slightly edited version of a post I made on the Fediverse where I try to highlight this.

In the months leading up to the crash, there were a lot of regular workers who invested everything they had in tech stocks because they had been going up so much. The expectation was that this would keep going up because the economic system had fundamentally changed. People honestly believed that this was the next step after capitalism: a new world where everything was available to everyone.

And it wasn't just regular people. I have told this story on the fediverse before, but this is clearly a good time to repeat it:

Not so long before the bust, I was working at Sun, and I was at some IT expo, working on the hardware that was being shown. As is common at these events, there was a small stage where different people did presentations. One of the speakers at the stand was the former Minister of Finance of Sweden.

I was behind the screen at the back of the stage, probably trying to get one of the servers running, and was listening to what he was saying.

At one point, he said: “one might think it's strange how there is a stream of money flowing into these companies, that does not make any profit”.

That's a good question, so what was his answer to this?

“That's the old way of thinking about things. This is the new economy. It doesn't work like the old one”.

The former Minister of Finance said that, and I believe it was just before the bust.

In fact, now that I think about it, the term “the new economy” was being used a lot during the dot-com bubble. People really believed it. Even ministers did. It was crazy.

I remember sighing a bit when I heard it, since I didn't believe it. Having worked the previous several years in the middle of the dot-com bubble gave you a bit of perspective, since you knew just how far away from the hype the tech actually was.

I can be reached on the Fediverse @loke@functional.cafe

In this post I would like to explain what KAP is, the reasons for implementing it, and what the current state of the project is.

It all starts with APL. APL is a programming language designed in 1966 by Kenneth E. Iverson. To many it's mostly known for its unusual syntax, which tends to be very compact and uses a large number of special symbols, making APL code look very different from most other languages.

The language itself can be extremely terse, allowing you to describe very complex computations with very little code. A classic example of this is the routine that computes the next iteration of Conway's Game of Life:

{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}

However, as powerful as APL is, it also suffers from some limitations. In particular, while array-based computations are extremely simple, traditional imperative code is very ugly. I could probably spend a lot of time explaining all the details, but suffice it to say that if you want to construct a simple loop in APL, you have to use what essentially amounts to GOTO. Anything more complex, and the code quickly becomes very painful.

There is one commercial software vendor that produces an APL dialect: Dyalog. The Dyalog product is really nice. It's fast, it has a lot of extensions and it has support for structured programming constructs such as regular while-loops, and also object orientation.

The drawbacks of Dyalog are that it's not free and that, due to APL's history, a lot of the syntax for its extensions tends to be rather ugly (in this author's opinion, of course).

If you want to play with APL yourself, I recommend using GNU APL. It's a free implementation which implements everything that is part of standard APL as well as a number of extensions (some of which are Dyalog-compatible).

Experimenting with new concepts

GNU APL aligns very strongly with the ISO APL specification and IBM's APL2 implementation. Extensions are only implemented when they are seen as a natural part of what APL is. This makes for a very stable product, and there are a lot of users who rely on these features.

The stability of GNU APL does mean that it's difficult to experiment with new language features, and because of this I decided to implement a new language that takes some features from APL and then adds new features that are very different from traditional APL, to the point that fundamental low-level functionality can be changed.

The result of this project is KAP. KAP is a project that has been in development for a few months, and at its core it's mostly APL. It implements a large subset of APL functionality, but not all. It is also not compatible with APL for some key features. My intention is to write separate articles about these features.

Main features of KAP

So what are these new features that KAP provides? Well, the most important ones are listed here:

  • First-class functions (see the sketch after this list)
  • Closures
  • Hash tables
  • Namespaces
  • Custom syntax (like a light version of macros)
  • Tables can have labels
  • First-class symbols (taken from Lisp)
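
To give a flavour of the first item, here is a minimal sketch of first-class functions in KAP. The λ/⍞ syntax here is written from memory and may differ in detail from the current implementation:

f ← λ{⍵+1}     ⍝ λ turns the function into a value that can be assigned and passed around
⍞f 10          ⍝ ⍞ turns the value back into a callable function; this returns 11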

There are also a number of features that make backward compatibility difficult or impossible:

  • All arrays are immutable (this allows for data sharing)
  • Computation is lazy (a function may or may not be called, depending on whether the data is needed)

The language is implemented in Kotlin as a multiplatform project. It can currently be compiled for the JVM or as a native binary. At some point it may also be possible to compile to Javascript, but currently the Javascript compiler does not support reflection. Once this has been implemented, it should be possible to generate Javascript code.

This article is long enough already, so I will conclude by providing a link to the Github project page: https://github.com/lokedhs/array

I also have a link to a video that illustrates what the graphics capabilities look like: https://peertube.mastodon.host/videos/watch/4a19ca9e-7ca6-4142-bda6-c353915bfe23

If you have any questions, I can be reached on the Fediverse under @loke@functional.cafe

This is a new test of write.as.

As with the previous posts here, this post does not contain any useful information.

I will, however, at some point post some real content here.

This is a test post.

This post is not supposed to contain any useful information. I'm just testing the platform.

Here is some source code:

(defun render-op-list (stream maxima-sym displayed-sym exprs)
  (with-aligned-rendering (stream)
    (iterate-exprs (expr exprs maxima-sym :first-sym first)
      (unless first
        (aligned-spacing 0.2)
        (render-aligned-string "~a" displayed-sym)
        (aligned-spacing 0.2))
      (render-aligned () (render-maxima-expression stream expr)))))

Just a test of write.as

This is not an actual blog post, but rather just a test of the platform.

I'd like to write actual blog posts, but I don't know if I have the time or dedication to do so. But, if I actually do it, I'll probably try this one.