• Type:

CH Show HN: Writing a VM for a Small Stack-Base Language in Pointless

Tutorial: Writing a VM for a Small Stack-Base Language in Pointless

- Avery N. Nortonsmith

The age of functional programming will soon be over.

Object-oriented programming has come and gone. Functional programming — the wise, quiet observer — has waited patiently for its turn. But FP, like OOP, will fade away in time as software yields to the inexorable pull of the one True Paradigm: stack-based programming!


Joking aside, stack-based programming is pretty cool.

This tutorial covers the implementation of a virtual-machine for a small, stack-based language in Pointless. We’ll take an incremental approach to building the VM and improving its capabilities until it can run a short prime-factorization program.

This project seeks to demonstrate:

  • How an imperative language can be implemented in a purely functional context
  • How to structure and refactor code that uses persistent data-structures
  • How Pointless can — even as a young language — be used to solve non-trivial tasks

Before we tackle prime-factorization, let’s take a look at a small example of the stack-based code that our VM will evaluate. The following Pointless code defines a list of ‘instructions’, where labeled-tuples are used to represent instructions with an argument, and labels are used for those without. A list of instructions like this represents a program in our stack-based language:

program = [
  Const(7),
  Const(2),
  Mod,
  Const(0),
  Eq,
  Print,
  Exit,
]

Let’s take a minute to fancy things up by writing a function to pretty-print these instruction lists — this will come in handy later when we work with more complex programs. The function showInst below takes an instruction and returns a string showing the instruction’s type (left-justified to 5 characters) and argument. The function argOrBlank returns an instruction’s argument if there is one, or an empty string otherwise.

showInst(inst) =
 format("[ {
      

In the code above, format, getLabel, is, and unwrap are defined in the Pointless prelude (standard-library).

To see our new functions in action, we’ll define an output variable for our program; in Pointless, a program’s output is defined by its output variable. The prelude contains functions like printLines which generate sequences of IO commands that can be assigned to a program’s output variable.

The output definition below takes our list of instructions, generates a new list of instruction-strings by calling showInst on each instruction, and produces IO commands to print each string on a separate line. Take a look at the docs for more info on the function pipe operator |>, and for partial application.


Note that the name ‘main.ptls’ is chosen arbitrarily; it doesn’t have any special meaning

output =
  program
  |> map(showInst)
  |> printLines
$ ./bin/pointless main.ptls

[ Const ] 7
[ Const ] 2
[ Mod   ] 
[ Const ] 0
[ Eq    ] 
[ Print ] 
[ Exit  ]

Seeing is believing. Time to start making our VM.

We’ll represent the VM as an object with the following fields:

  • slots: used to store “heap” variables (discussed later on)
  • insts: an array containing the program’s instructions
  • index: the position in instruction array of the current instruction
  • stack: a linked-list serving as the stack for the VM
  • outVal: used to get output from VM as program runs

The function vmFromInsts sets up a new VM object for a given list of instructions. The code also adds the label VM to the VM object (objects and tuples can be labeled in Pointless). Object labels can be used to write code that tells different types of objects apart, or simply for documentation (as we do here).

vmFromInsts(program) = VM {
  slots  = zeroArray(8)
  insts  = toArray(program)
  index  = 0
  stack  = []
  outVal = None
}

To check that vmFromInsts works, we’ll modify our output definition to create and print a new VM object. Note that we’ve replaced printLines with printlnprintLines expects a sequence of values (and can print these values on-the-fly as they’re computed in the case of lists!), whereas println can handle sequence and non-sequences types.

output =
  program
  |> vmFromInsts
  |> println
$ ./bin/pointless main.ptls

VM {stack = []; insts = [Const(7) Const(2) Mod Const(0) Eq Print Exit]; slots = [0 0 0 0 0 0 0 0]; index = 0; outVal = None}

We run the program, and see that println gives us a nice visualization of our initial VM object, fields and all.

Back to our list of instructions:

program = [
  Const(7),
  Const(2),
  Mod,
  Const(0),
  Eq,
  Print,
  Exit,
]

The instructions above do the following — (once we’ve implemented them, that is):

  • Const(arg) pushes arg onto the VM stack
  • Mod pops values v0 and v1 from the stack, and pushes v1 % v0
  • Eq pops values v0 and v1 from the stack, and pushes v1 == v0
  • Print pops a value from the stack and prints it
  • Exit signals the end of the program

Additionally, each of these instructions will advance the VM to the next instruction after performing the behaviors above. Nice and imperative 😉

With this in mind, we see that the program above does the following: loads the values 7 and 2 onto the stack, computes 7 % 2, loads the value 0, compares 0 with the result of 7 % 2, prints the result of this comparison, and exits. In other words, this program tells us whether or not 7 is even.

Now that we’ve defined some instructions, let’s start evaluating them.

eval(vm) = cond {
  case is(Const, vm.insts[vm.index])
    vm with $.stack = [unwrap(vm.insts[vm.index])] ++ vm.stack
}

Our eval function uses a cond conditional — eventually this will include cases for all instructions, but we’ll start with just Const for now. The code above calls the function is to check whether the current instruction has the label Const (where the current instruction is the instruction in the vm.insts array at index vm.index). If is returns true, then conditional evaluates and returns the body of the case expression.

Structures in Pointless persistent, and thus immutable. As such, our program will never mutate the state of a VM object; rather, it will produce new VM objects. The code above uses with syntax (which can handle both shallow and nested updates — one of Pointless’s little innovations) to create a new VM object with a new stack value: the old stack with the instruction argument pushed to the top. The unwrap function is used to get the argument from the Const instruction.

The code above is dense and repetitive! We can improve it by defining a couple of variables. In the new version, inst is the current instruction, and the arg is the argument of that instruction.

eval(vm) = cond {
  case is(Const, inst)
    vm with $.stack = [arg] ++ vm.stack

} where {
  inst = vm.insts[vm.index]
  arg = unwrap(inst)
}

We modify the program’s output definition to call eval on the VM to evaluate the first instruction, Const(7). Looking at output, we see that the value 7 has been pushed to the stack.


insts field abbreviated for readability

output =
  program
  |> vmFromInsts
  |> eval
  |> println
$ ./bin/pointless main.ptls

VM {stack = [7]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 0; outVal = None}

But wait! We need our Const handler to do one more thing — advance the VM to the next instruction. To do this, we’ll make a new helper function advance that takes a VM object and returns a new VM object with the instruction index incremented. We’ll also move the code for pushing a value onto the VM stack into the new function pushVal.

advance(vm) = vm with $.index += 1

pushVal(arg, vm) = vm with $.stack = [arg] ++ vm.stack

Using these two new functions, the code for eval becomes:

eval(vm) = cond {
  case is(Const, inst)
    vm
    |> pushVal(arg)
    |> advance

} where {
  inst = vm.insts[vm.index]
  arg = unwrap(inst)
}
$ ./bin/pointless main.ptls

VM {stack = [7]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 1; outVal = None}

We call the program again and get the result seen above: the new VM object returned by eval has the value 7 on its stack and its index incremented.

At this point, the entire program looks something like this:


Note that the comment bars are not necessary, but are helpful for dividing code visually — it’s a habit of mine, like using fake em-dashes. They can also be used to mark API documentation

output =
  program
  |> vmFromInsts
  |> eval
  |> println

------------------------------------------------------------------------------

program = [
  Const(7),
  Const(2),
  Mod,
  Const(0),
  Eq,
  Print,
  Exit,
]

------------------------------------------------------------------------------

showInst(inst) =
 format("[ { pushVal(arg)
    |> advance

} where {
  inst = vm.insts[vm.index]
  arg = unwrap(inst)
}

------------------------------------------------------------------------------

advance(vm) = vm with $.index += 1

pushVal(arg, vm) = vm with $.stack = [arg] ++ vm.stack

Let’s keep going.

The Const instruction pushes values onto the stack; other instructions will need to access and pop stack values. We introduce two new functions to help with this: valAt(n, vm) returns the value at depth n on the stack; popVals(n, vm) returns a VM object with the top n stack values removed.

valAt(n, vm) = at(n, vm.stack)

popVals(n, vm) = vm with $.stack = drop(n, vm.stack)

We use these two functions to create a case to handle our first numerical operation, Mod:

case is(Mod, inst)
  vm
  |> popVals(2)
  |> pushVal(valAt(1, vm) % valAt(0, vm))
  |> advance

The Mod instruction pops the top two values from the stack — v0 and v1 — and pushes the value v1 % v0 to the stack (Mod expects the divisor to be at the top of the stack, and the dividend beneath it). The VM is then advanced to the next instruction.

Hold up! How can we access the top two stack values after we’ve already popped them? To see how this works, remember that popVals does not modify the VM object, but rather creates a new one. Thus valAt(1, vm) and valAt(0, vm) refer to the original VM object, whose stack still contains the two values in question.

To test our implementation for Mod, we’ll comment some instructions out:

program = [
  Const(7),
  Const(2),
  Mod,
  -- Const(0),
  -- Eq,
  -- Print,
  Exit,
]

Instead of single calling eval once as before, we now call eval repeatedly using the iterate function. This generates a sequence of VM objects where the first element is the initial VM object, and the following elements are the VM objects produced by each subsequent application of eval. We use takeUntil to get these updated VM objects until we reach the VM object with Exit as its current instruction. Node that the Exit instruction is not evaluated — it’s just there to mark the end of the program.

output =
  program
  |> vmFromInsts
  |> iterate(eval)
  |> takeUntil(vm => vm.insts[vm.index] == Exit)
  |> printLines
$ ./bin/pointless main.ptls

VM {stack = []; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 0; outVal = None}
VM {stack = [7]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 1; outVal = None}
VM {stack = [2, 7]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 2; outVal = None}
VM {stack = [1]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 3; outVal = None}

The lines in the result above show VM objects:

  1. In the initial VM state
  2. After running Const(7)
  3. After running Const(2)
  4. After running Mod

The stack of the final VM objects holds the value 1, the result of 7 % 2, as expected.

We can do some copy/pasting to get a case for Eq that’s almost identical to the handler for Mod — the only difference is the == operator!

case is(Eq, inst)
  vm
  |> popVals(2)
  |> pushVal(valAt(1, vm) == valAt(0, vm))
  |> advance

Unfortunately this leaves us with some duplicated code between handlers for Mod and Eq. We can improve things by factoring these handlers out into a helper function for generic binary operations. This new function, binaryOp, takes a binary function argument op, which is used to calculate the result that gets pushed to the stack.

binaryOp(op, vm) =
 vm
 |> popVals(2)
 |> pushVal(op(valAt(0, vm), valAt(1, vm)))
 |> advance

Note that v0 comes before v1 in binaryOp. This is done because the prelude functions that we’ll pass binaryOp take their arguments in reversed order (for example the function mod which wraps the operator %). These numerical prelude functions are designed this way with partial application in mind: it’s more useful for mod(2) to mean n => n % 2 instead of n => 2 % n. This happens to match the order that the operands will popped from the stack, so no reversal is needed.

We can now rewrite the cases for Mod and Eq like this:

case is(Mod, inst)
  vm |> binaryOp(mod)

case is(Eq, inst)
  vm |> binaryOp(eq)

At this point the eval function can handle all instructions in our example program except for Print. We’ll comment out the Print instruction and run the program.

program = [
  Const(7),
  Const(2),
  Mod,
  Const(0),
  Eq,
  -- Print,
  Exit,
]
$ ./bin/pointless main.ptls

VM {stack = []; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 0; outVal = None}
VM {stack = [7]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 1; outVal = None}
VM {stack = [2, 7]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 2; outVal = None}
VM {stack = [1]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 3; outVal = None}
VM {stack = [0, 1]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 4; outVal = None}
VM {stack = [false]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 5; outVal = None}

Our evaluation calculates 7 % 2 as before, and loads the number 0, and compares this to the result of the modulo operation.

We need one more helper function to implement the case for the Print instruction. Our new function setOutput takes an argument value and sets the .outVal field on the updated VM object to this value.

setOutput(val, vm) = vm with $.outVal = val

The handler for the Print instruction pops a value from the stack, sets the VM .outVal field to this value, and advances the VM:

case is(Print, inst)
  vm
  |> popVals(1)
  |> setOutput(valAt(0, vm))
  |> advance

Now we can update our output definition to get the output from our VM. Instead of printing the value of the VM object on each evaluation loop, we instead get the value of the .outVal field of each VM object using map(vm => vm.outVal), and print each of these values on a new line using printLines. We also want to clear the .outVal field before each instruction runs, so we modify our call to iterate to call setOutput(None) on each VM object before calling eval. As before, we run the iteration loop until the Exit instruction is reached.

output =
  program
  |> vmFromInsts
  |> iterate(compose(setOutput(None), eval))
  |> takeUntil(vm => vm.insts[vm.index] == Exit)
  |> map(vm => vm.outVal)
  |> printLines
$ ./bin/pointless main.ptls

None
None
None
None
None
None
false

We see that numerical instructions yield no print output (None), while the Print instruction yields the output value false — the result of 7 % 2 == 1.

To keep the definition for the output variable simple, we’ll refactor our output code into a new function, runGetOutput, which takes a list of instructions, creates a new VM object, evaluates the instructions, and get the VM output. runGetOutput also uses the filter function to discard the None outputs that come from the non-Print instructions.

runGetOutput(insts) =
  vmFromInsts(insts)
  |> iterate(compose(setOutput(None), eval))
  |> takeUntil(vm => vm.insts[vm.index] == Exit)
  |> map(vm => vm.outVal)
  |> filter(notEq(None))

Our new output definition calls runGetOutput and produces the result below, which tells us that, unsurprisingly, 7 is not even.

output =
  program
  |> runGetOutput
  |> printLines
$ ./bin/pointless main.ptls

false

Here’s what our current program looks like:

output =
  program
  |> runGetOutput
  |> printLines

runGetOutput(insts) =
  vmFromInsts(insts)
  |> iterate(compose(setOutput(None), eval))
  |> takeUntil(vm => vm.insts[vm.index] == Exit)
  |> map(vm => vm.outVal)
  |> filter(notEq(None))

------------------------------------------------------------------------------

program = [
  Const(7),
  Const(2),
  Mod,
  Const(0),
  Eq,
  Print,
  Exit,
]

------------------------------------------------------------------------------

showInst(inst) =
 format("[ { pushVal(arg)
    |> advance

  case is(Print, inst)
    vm
    |> popVals(1)
    |> setOutput(valAt(0, vm))
    |> advance

  case is(Mod, inst) vm |> binaryOp(mod)
  case is(Eq, inst)  vm |> binaryOp(eq)

} where {
  inst = vm.insts[vm.index]
  arg = unwrap(inst)
}

------------------------------------------------------------------------------

advance(vm)        = vm with $.index += 1
pushVal(arg, vm)   = vm with $.stack = [arg] ++ vm.stack
popVals(n, vm)     = vm with $.stack = drop(n, vm.stack)
setOutput(val, vm) = vm with $.outVal = val

valAt(n, vm) = at(n, vm.stack)

binaryOp(op, vm) =
 vm
 |> popVals(2)
 |> pushVal(op(valAt(0, vm), valAt(1, vm)))
 |> advance

Maybe not so Pointless after all…

Let’s look at our prime-factorization program:

n = 80122

program = [
  Const(2),
  Store(0),            -- store value 2 in slot 0 (initial divisor)
  Const(n),
  Store(1),            -- store value n in slot 1 (initial dividend)
  Label("checkDone"),
  Load(1),             -- load dividend
  Const(1),
  Eq,                  -- compare dividend to value 1
  JmpIf("end"),        -- jump to end if dividend == 1
  Label("checkDiv"), 
  Load(1),             -- load dividend
  Load(0),             -- load divisor
  Mod,
  Const(0),
  Eq,                  -- compare dividend % dividor == 0 
  JmpIf("divisible"),
  Load(0),             -- load divisor
  Const(1),
  Add,                 -- increment divisor
  Store(0),            -- store new divisor
  Jmp("checkDiv"),     -- check end condition
  Label("divisible"),
  Load(0),             -- load divisor
  Print,               -- print divisor (current factor)
  Load(1),             -- load dividend
  Load(0),             -- load divisor
  Div,
  Store(1),            -- store dividend / divisor as new dividend
  Jmp("checkDone"),    -- check end condition
  Label("end"),
  Const("done"),
  Print,               -- print "done"
  Exit,
]

This program is more complex than the last; it makes use of the VM’s variable slots for “long term” storage. Slots serve as the VM’s “heap” and are accessed by integer “addresses” (indices). In this program, the current factor being tested (the divisor) lives in slot 0, and the current number whose factor is being calculated (the dividend) lives in slot 1.

The prime-factorization program also uses some new instructions:

Add, Div, Load, Store, Jmp, JmpIf

These instructions behave as follows:

  • Add pops values v0 and v1 from the stack, pushes v1 + v0, and advances
  • Div pops values v0 and v1 from the stack, pushes v1 / v0, and advances
  • Load(arg) pushes the value in the slot at index arg to stack, and advances
  • Store(arg) pops value v0, stores v0 in the slot at index arg, and advances
  • Jmp(arg) sets VM instruction index to arg
  • JmpIf(arg) pops value v0, sets VM instruction index to arg if v0 == true, otherwise advances

The instructions Jmp and JmpIf take numerical arguments, but in the program above they have string arguments! The program contains Label instructions whose argument strings mark the target locations of the corresponding jump instructions. Before evaluating our new program, we’ll need to write code that converts these string jump targets into numerical indices and removes the Label instructions. That code starts with the following functions:

getLabelInds(insts) =
  insts
  |> reduce(scanInst, (0, {})) -- keep track of (current index, index map)
  |> at(1) -- return index map

scanInst(pair, inst) = 
  if not is(Label, inst)
  then (ind + 1, inds)
  else (ind, newInds)
  where {
    newInds = inds with $[unwrap(inst)] = ind
    (ind, inds) = pair -- (current index, index map)
  }

getLabelInds uses the reduce function to iterate through the given instructions list while keeping track of the current index (not counting Label instructions)
and maintaining a dict matching jump target strings with instruction indices. After iterating through all of the instructions, getLabelInds returns this dict.

scanInst (called for each instruction by getLabelInds) takes the pair (current index, index dict) and the current instruction in the list and adds the instruction to the index dict if the instruction is a Label, or increments the current index otherwise.

The output definition below displays the index dict calculated for the prime-factorization program instructions:

output =
  program
  |> getLabelInds
  |> println
$ ./bin/pointless main.ptls

{"checkDone": 4, "divisible": 19, "checkDiv": 8, "end": 26}

To produce the instructions that the VM will run, we’ll need to filter out the Label instructions and replace the target strings of jump instructions with indices. The function convertJumps does both of these things, using the index dict calculated with getLabelInds. To do this, convertJumps calls the helper function convertJump, which translates jump instructions using the index dict obtained from getLabelInds.

convertJumps(insts) =
  insts
  |> filter(notIs(Label))
  |> map(convertJump(inds))
  where inds = getLabelInds(insts)

convertJump(inds, inst) = cond {
  case is(Jmp, inst) Jmp(inds[unwrap(inst)])
  case is(JmpIf, inst) JmpIf(inds[unwrap(inst)])
  else inst
}

To make sure our code works, we’ll modify our output definition to call convertJumps and use the function showInst from before to get string representations for each resulting instruction. In addition to this, we’ll call enumerate to get (index, value) pairs for each instruction string, and call format("{>2} {}") on each pair to print each instruction along with its index (this helps us see where the jump instructions now point).

output =
  program
  |> convertJumps
  |> map(showInst)
  |> enumerate
  |> map(format("{>2} {}"))
  |> printLines
$ ./bin/pointless main.ptls

 0 [ Const ] 2
 1 [ Store ] 0
 2 [ Const ] 80122
 3 [ Store ] 1
 4 [ Load  ] 1
 5 [ Const ] 1
 6 [ Eq    ] 
 7 [ JmpIf ] 26
 8 [ Load  ] 1
 9 [ Load  ] 0
10 [ Mod   ] 
11 [ Const ] 0
12 [ Eq    ] 
13 [ JmpIf ] 19
14 [ Load  ] 0
15 [ Const ] 1
16 [ Add   ] 
17 [ Store ] 0
18 [ Jmp   ] 8
19 [ Load  ] 0
20 [ Print ] 
21 [ Load  ] 1
22 [ Load  ] 0
23 [ Div   ] 
24 [ Store ] 1
25 [ Jmp   ] 4
26 [ Const ] done
27 [ Print ] 
28 [ Exit  ] 

Things are looking good: convertJumps replaced the old jump targets with indices, and removed the old Label instructions. Now we can update vmFromInsts to call convertJumps.

vmFromInsts(program) = VM {
  slots  = zeroArray(8)
  insts  = toArray(convertJumps(program))
  index  = 0
  stack  = []
  outVal = None
}

Time to implement the last few instruction handlers. We’ll write the following helper methods for accessing and writing to the VM’s slots: load pushes the value at slot index arg to the stack, and store saves the value at the top of the stack to slot index arg.

load(arg, vm) = pushVal(vm.slots[arg], vm)

store(arg, vm) = vm with $.slots[arg] = head(vm.stack)

We implement the handlers for the Load and Store instructions like this:

case is(Load, inst)
  vm
  |> load(arg)
  |> advance

case is(Store, inst)
  vm
  |> store(arg)
  |> popVals(1)
  |> advance

Don’t partial application and function piping go together so nicely?

We can see this code in action using one of our previous output definitions (which shows intermediate VM states), along with a short example program:

output =
  program
  |> vmFromInsts
  |> iterate(eval)
  |> takeUntil(vm => vm.insts[vm.index] == Exit)
  |> printLines

program = [
  Const(1),
  Store(0), -- store 1 in slot 0
  Load(0),  -- load 1 from slot 0
  Const(2),
  Mul,      -- 2 * 1 = 2
  Store(1), -- store 2 in slot 1
  Load(1),  -- load 2 from slot 1
  Const(2),
  Mul,      -- 2 * 2 = 4
  Store(0), -- store 4 in slot 1
  Exit,
]
$ ./bin/pointless main.ptls

VM {stack = []; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 0; outVal = None}
VM {stack = [1]; insts = ...; slots = [0 0 0 0 0 0 0 0]; index = 1; outVal = None}
VM {stack = []; insts = ...; slots = [1 0 0 0 0 0 0 0]; index = 2; outVal = None}
VM {stack = [1]; insts = ...; slots = [1 0 0 0 0 0 0 0]; index = 3; outVal = None}
VM {stack = [2, 1]; insts = ...; slots = [1 0 0 0 0 0 0 0]; index = 4; outVal = None}
VM {stack = [2]; insts = ...; slots = [1 0 0 0 0 0 0 0]; index = 5; outVal = None}
VM {stack = []; insts = ...; slots = [1 2 0 0 0 0 0 0]; index = 6; outVal = None}
VM {stack = [2]; insts = ...; slots = [1 2 0 0 0 0 0 0]; index = 7; outVal = None}
VM {stack = [2, 2]; insts = ...; slots = [1 2 0 0 0 0 0 0]; index = 8; outVal = None}
VM {stack = [4]; insts = ...; slots = [1 2 0 0 0 0 0 0]; index = 9; outVal = None}
VM {stack = []; insts = ...; slots = [4 2 0 0 0 0 0 0]; index = 10; outVal = None}

The results above show the Load and Store reading from and writing to the .slots fields of the VM objects.

Our last helper function, jumpIf, updates the VM index to the value arg if the value pred is true, or advances the VM otherwise.

jumpIf(pred, arg, vm) =
  if pred
  then vm with $.index = arg
  else advance(vm)

We use jumpIf to implement handlers for both the Jmp and JmpIf instructions. JmpIf pops a value from the stack and passes it as the predicate to jumpIf, along with its target index (arg). Jmp also passes its target index to jumpIf, and passes true as its predicate, since Jmp instructions always jump.

case is(JmpIf, inst)
  vm
  |> popVals(1)
  |> jumpIf(valAt(0, vm), arg)

case is(Jmp, inst)
  vm
  |> jumpIf(true, arg)

The following output definition and example program show JmpIf in action:

output =
  program
  |> vmFromInsts
  |> iterate(eval)
  |> takeUntil(vm => vm.insts[vm.index] == Exit)
  |> printLines

program = [
  Const(false),
  JmpIf("skip"), -- top-of-stack is false, don't jump 
  Const("A"),    -- loads "A"
  Const(true),
  JmpIf("skip"), -- top-of-stack is true, jump to "skip"
  Const("B"),    -- (skipped)
  Label("skip"),
  Exit,          -- exits with only "A" on the stack
]
$ ./bin/pointless main.ptls

VM {stack = []; insts = ...; vars = [0 0 0 0 0 0 0 0]; index = 0; outVal = None}
VM {stack = [false]; insts = ...; vars = [0 0 0 0 0 0 0 0]; index = 1; outVal = None}
VM {stack = []; insts = ...; vars = [0 0 0 0 0 0 0 0]; index = 2; outVal = None}
VM {stack = ["A"]; insts = ...; vars = [0 0 0 0 0 0 0 0]; index = 3; outVal = None}
VM {stack = [true, "A"]; insts = ...; vars = [0 0 0 0 0 0 0 0]; index = 4; outVal = None}
VM {stack = ["A"]; insts = ...; vars = [0 0 0 0 0 0 0 0]; index = 6; outVal = None}

We’ll wrap things up by adding handlers for the remaining binary operations using the binaryOp function from before. Handlers for Sub and Mul and included for completeness.

case is(Eq,  inst) vm |> binaryOp(eq)
case is(Add, inst) vm |> binaryOp(add)
case is(Sub, inst) vm |> binaryOp(sub)
case is(Mul, inst) vm |> binaryOp(mul)
case is(Div, inst) vm |> binaryOp(div)
case is(Mod, inst) vm |> binaryOp(mod)

We can now run the entire prime-factorization program using the output definition from before:

output =
  program
  |> runGetOutput
  |> printLines
$ ./bin/pointless main.ptls

2
7
59
97
done

We did it! We implemented a VM in Pointless. And we never even mentioned monads.

You can run the final prime-factorization program and VM implementation online. There are many numerical programs that could now be written for our VM — see this collatz sequence example.

Visit the dev page for instructions on installing Pointless locally. Contributions to the language and example code are welcome!

Our entire program (split into several file) is shown below:

main.ptls
import "stackVM.ptls" as stackVM

------------------------------------------------------------------------------

output =
  program
  |> stackVM.runGetOutput
  |> printLines

------------------------------------------------------------------------------

n = 80122

program = [
  Const(2),
  Store(0),            -- store value 2 in slot 0 (initial divisor)
  Const(n),
  Store(1),            -- store value n in slot 1 (initial dividend)
  Label("checkDone"),
  Load(1),             -- load dividend
  Const(1),
  Eq,                  -- compare dividend to value 1
  JmpIf("end"),        -- jump to end if dividend == 1
  Label("checkDiv"), 
  Load(1),             -- load dividend
  Load(0),             -- load divisor
  Mod,
  Const(0),
  Eq,                  -- compare dividend % dividor == 0 
  JmpIf("divisible"),
  Load(0),             -- load divisor
  Const(1),
  Add,                 -- increment divisor
  Store(0),            -- store new divisor
  Jmp("checkDiv"),     -- check end condition
  Label("divisible"),
  Load(0),             -- load divisor
  Print,               -- print divisor (current factor)
  Load(1),             -- load dividend
  Load(0),             -- load divisor
  Div,
  Store(1),            -- store dividend / divisor as new dividend
  Jmp("checkDone"),    -- check end condition
  Label("end"),
  Const("done"),
  Print,               -- print "done"
  Exit,
]

stackVM.ptls
export {
  runGetOutput
}

------------------------------------------------------------------------------

import "instructions.ptls" as instructions

------------------------------------------------------------------------------

vmFromInsts(program) = VM {
  slots  = zeroArray(8)
  insts  = toArray(instructions.convertJumps(program))
  index  = 0
  stack  = []
  outVal = None
}

------------------------------------------------------------------------------

runGetOutput(insts) =
  vmFromInsts(insts)
  |> iterate(compose(setOutput(None), eval))
  |> takeUntil(vm => vm.insts[vm.index] == Exit)
  |> map(vm => vm.outVal)
  |> filter(notEq(None))

------------------------------------------------------------------------------

eval(vm) = cond {
  case is(Const, inst)
    vm
    |> pushVal(arg)
    |> advance

  case is(Print, inst)
    vm
    |> popVals(1)
    |> setOutput(valAt(0, vm))
    |> advance

  case is(Load, inst)
    vm
    |> load(arg)
    |> advance

  case is(Store, inst)
    vm
    |> store(arg)
    |> popVals(1)
    |> advance

  case is(JmpIf, inst)
    vm
    |> popVals(1)
    |> jumpIf(valAt(0, vm), arg)

  case is(Jmp, inst)
    vm
    |> jumpIf(true, arg)

  case is(Eq,  inst) vm |> binaryOp(eq)
  case is(Add, inst) vm |> binaryOp(add)
  case is(Sub, inst) vm |> binaryOp(sub)
  case is(Mul, inst) vm |> binaryOp(mul)
  case is(Div, inst) vm |> binaryOp(div)
  case is(Mod, inst) vm |> binaryOp(mod)

} where {
  inst = vm.insts[vm.index]
  arg = unwrap(inst)
}

------------------------------------------------------------------------------

advance(vm)        = vm with $.index += 1
pushVal(arg, vm)   = vm with $.stack = [arg] ++ vm.stack
popVals(n, vm)     = vm with $.stack = drop(n, vm.stack)
setOutput(val, vm) = vm with $.outVal = val
store(arg, vm)     = vm with $.slots[arg] = head(vm.stack)
load(arg, vm)      = pushVal(vm.slots[arg], vm)

jumpIf(pred, arg, vm) =
  if pred
  then vm with $.index = arg
  else advance(vm) 

valAt(n, vm) = at(n, vm.stack)

binaryOp(op, vm) =
 vm
 |> popVals(2)
 |> pushVal(op(valAt(0, vm), valAt(1, vm)))
 |> advance

instructions.ptls
export {
  showInst, makeInstArray
}

------------------------------------------------------------------------------

showInst(inst) =
 format("[ { filter(notIs(Label))
  |> map(convertJump(inds))
  where inds = getLabelInds(insts)

------------------------------------------------------------------------------

convertJump(inds, inst) = cond {
  case is(Jmp, inst) Jmp(inds[unwrap(inst)])
  case is(JmpIf, inst) JmpIf(inds[unwrap(inst)])
  else inst
}

------------------------------------------------------------------------------

getLabelInds(insts) =
  insts
  |> reduce(scanInst, (0, {})) -- keep track of (current index, index map)
  |> at(1) -- return index map

------------------------------------------------------------------------------

scanInst(pair, inst) = 
  if not is(Label, inst)
  then (ind + 1, inds)
  else (ind, newInds)
  where {
    newInds = inds with $[unwrap(inst)] = ind
    (ind, inds) = pair -- (current index, index map)
  }

Read More

Previous Post

CH Show HN: Remember More From Reading – an automatic reflection tool

Next Post

CH Show HN: Quik – Internet Explorer as a Service

Leave a Reply

Your email address will not be published. Required fields are marked *

Scroll to top