# Factor: a stack-based programming language

,

As you may already know, I’m a big fan of stack-based languages such as Forth, functional languages such as Haskell and reflexive languages such as Smalltalk. You can imagine how happy I was when I discovered Factor a few days ago: it combines all those aspects.

Today, a friend sent me someone email signature and asked me if I could decipher it:

``````01101001001000000110011001110101011000110110101101100101011001000010
00000111100101101111011101010111001000100000011011010110111101101101
00001101000010100110000101101110011001000010000001101110011011110111
01110010000001100110011011110111001000100000011110010110111101110101``````

As any programmer on the Earth would have, I immediately assumed that those were ASCII codes printed in binary format. I had a Factor interactive shell opened in one of my windows, so I cut and pasted the whole string (surrounded by quotes) and entered:

``8 group [ 2 base> ] map >string print``

and the cleartext version appeared instantly. All in all, it took me around 20 seconds to uncover the original text using Factor.

How does this work? Factor is a stack-based language, meaning that data are put onto a stack and words (equivalent of functions in other languages) use the data on the stack and put results there. Factor is a (dynamically) typed language: complex data can be pushed onto the stack, while untyped languages such as Forth can only push numbers there.

Writing the string pushes it on the stack. Using

``8 group``

takes the string on the top of the stack, considers it as a sequence (a succession of characters), group them by eight and returns an array of strings of length 8. At this point, there is only one element on the stack: an array of eight-characters-long strings.

Then

``[ 2 base> ]``

pushes another element on the top of the stack: a quotation (the equivalent of a lambda expression in functional languages), which is a block containing code. The `base&gt;` word consumes two elements from the stack, a string `S` and a number `B`, and pushes back a number which is the value represented by `S` in base `B`. For example, the expression

``"01101001" 2 base>``

will let the value `105` on the stack, as `01101001` in binary represents `105` in decimal.

``map``

takes a sequence and a quotation. It represents the classical `map` operation in functional languages: it applies the quotation to every element of the sequence and gathers the results in a new sequence. As a consequence, each eight-characters-long string gets transformed into its decimal representation. At the end, we end up with a single element on the stack, which is an array containing all the ASCII codes of the sentence.

``>string``

transforms a sequence of ASCII codes into the corresponding string. Then

``print``

is similar to C’s `puts` and prints the string on the standard output and adds a newline at the end.

The content of the unencrypted text itself is not important; my point is that Factor is very compact and its stack-oriented nature helps writing concise and clear programs. For example, here is one of the many possible implementations of the reverse functionality: it takes a string from the stack and lets its ASCII representation in binary onto the stack.

``[ 2 >base 8 CHAR: 0 pad-head ] [ append ] map-reduce``

I’m sure that you’re thrilled to know that “Hello, world!” encodes as

``````0100100001100101011011000110110001101111001011000010
0000011101110110111101110010011011000110010000100001``````

(edited on 2009-06-13 to use `pad-head` and `map-reduce`)

(edited on 2016-12-08 to clarify details about `print`, thanks to Jon Harper)

If you like this post, you can send some bitcoin dust to 1Bo78aNzJvkmeLTw8aptaFipvRWyNQP2WF (or click here).