x-machine.p

The X-machine
H. James Hoover
Version 2.1
2022-04-08

When studying what it means to compute, the so-called theory of computation, we often model the real hardware with abstractions called "models of computation". A model of computation captures the machine aspects that we want to explore, and simplifies away details that do not interest us at the time.

Today we are going to look at a very simple model of computation, called the X-Machine. I invented it back in my early days (1988) as a prof to help students to understand the concepts of addresses and pointers in CMPUT 330 (the earlier version of CMPUT 229, Dept. of Computing Science, University of Aleberta). I found that programming the X-Machine helped clarify the notions of memory, addresses, indexing, and linkage conventions in preparation for the more complicated assembly language of the IBM 370 series machine.

The X-machine is very simple. It consists of memory, and a 2-instruction CPU.

Memory

Memory consists of an infinite sequence of cells, each cell contains an integer. We will use decimal notation for writing our integers. The range of values in a cell is unlimited, in theory. But in practice we would have some idea of the largest magnitude that a cell would ever need. doc-mem.src

000: 1

001: 1001

002: 4

003: 15

004: 42

We refer to the contents of a memory location using the notation [ ]. If the address of a location is A, then the contents of the location is [A]. We allow addresses to be any integer value, including negative. The notation "expr -> [ addr ]" means "set the contents of location addr to the value of expr".

For example

[1] is the contents of memory location 1, and [1] = 1001 in our example above.

[10] + 3 -> [11] means take the contents of location 10, add 3 to that value, and store it as the new contents of location [11]

[1] + [[2]] -> [1] means increment the contents of location 1 by the contents of the location whose address comes from the contents of location 2. So in the above example [1] is 1001, [2] is 4, so [[2]] is 42. So the new value of location 1, i.e. [1], is 1043.

Instruction Format

Each instruction takes 4 memory cells. The first cell is the opcode, the next 3 cells are the operands. There are two instructions, SUB and BGE, with opcodes 1001 and 1002 respectively.

Opcode Op1 Op2 Op3 Action

SUB (1001) A B C [A] - [B] -> [C]

BGE (1002) A B T if [A] >= [B] branch to T

So the instruction at address 1 above is: SUB, 4, 15, 42.

Fetch Execute Cycle

The advantage of this simple machine is that we can completely specify its "operational semantics". Memory location 0 is special, it holds the current value of the program counter, which is the address in memory of where the next instruction to be executed is located. We refer to it with the symbol PC, i.e. PC = 0. We also define the symbols SUB = 1001 and BGE = 1002

All memory cells with positive addresses contain 0 on boot, and are set to the values of the program before the machine is started. This requires some magic - called the boot loader.

Negative memory addresses are used for memory mapped IO. Specific addresses have special behaviour. The symbols IN and CIN are read only, and for input. The symbols OUT and COUT are write only, and for output. In general IN and CIN are attached to stdin, and OUT and COUT are attached to stdout. Integer IO: Reading from IN interprets the next characters from stdin as a signed integer. Writing to OUT outputs the signed integer value to stdout as if it was printed using the %d format. Character IO: Reading from CIN reads a single character from stdin and returns the 8bit ASCII character code corresponding input character. Writing a value in [1, 255] to COUT converts the integer into an 8bit ASCII character and sends it to stdout. This operation has extended behaviour. If the number is bigger than 255, then it is broken into 8bit chunks, and the output as individual characters from left to right.

while running
    if [PC] == 0: halt

    SUB, A, B, C i.e. [A] - [B] -> [C]
    if [ [PC] ] == SUB:
        [ [ [PC]+1 ] ] - [ [ [PC]+2 ] ] -> [ [ [PC]+3 ] ]
        [PC] + 4 -> [PC]

    BGE, A, B, T i.e. if [A] >= [B] branch to T
    if [PC] == BGE:
        if [ [ [PC]+1 ] ] >= [ [ [PC]+2 ] ]
             then [ [PC]+3 ] -> [PC]
             else [PC]+4 -> [PC]

    otherwise: halt

Note the layers of fetching actual values to operate on!
[PC]+1 is the location of the first operand of the instruction.
[ [PC] + 1 ] is the value of the operand, i.e. the address A.
and finally [ [ [PC] + 1 ] ] is the value [A] at address A.

Simple Sets of Instructions

We now look at some simple sets of instructions.

Copy from location A to location B
Increment location X by 1, i.e. [X] + 1 -> [X]
Branch to location T
Test if location X and location Y have the same values. I.e. if [X] = [Y] then do something, otherwise do something else.
Compute the min of location X and Y and store in location Z, i.e. min( [X], [Y]) -> [Z]
Relative branch forward 4 instructions
Load the ith element of an array L

doc-simple-sets.xm

#

# Start loading into memory at location 12

#

@ = 12;

#

# Some simple useful constants

#

Zero: 0;

NegOne: -1;

# Arithmetic on symbols to initialize constants

_far_delta: here - far;

# Instead of naming constants, we can adopt the convention of

# _x for the constant x, and _nx for the constant -x

_n1: -1;

_0: 0;

_1: 1;

_2: 2;

#

# Some variables

#

A: 10;

B: 0;

X: 42;

Y: -42;

Z: 0;

#

# Copy from location A to location B, i.e. [A] -> [B]

#

SUB, A, Zero, B;

#

# Increment location T by 1, i.e. [T] + 1 -> [T]

#

SUB, T, NegOne, T;

#

# Branch to location Dest: Dest -> [PC]

#

BGE, Zero, Zero, Dest;

# ...

Dest:

#

# Test if location X and location Y have the same values. I.e.

# if [X] == [Y] then do something, otherwise do something else.

#

BGE, X, Y, ge;

BGE, Zero, Zero, ne;

ge: BGE, Y, X, eq;

BGE, Zero, Zero, ne;

eq:

# do stuff if equal, branch when done

BGE, Zero, Zero, done;

ne:

# do stuff if not equal, branch when done

BGE, Zero, Zero, done;

# ...

done:

#

# Compute the min of location X and Y and store in location Z,

# i.e. min( [X], [Y] ) -> [Z]

#

BGE, Y, X, xmin;

ymin: SUB, Y, _0, Z;

BGE, _0, _0, next;

xmin: SUB, X, _0, Z;

next:

# ...

#

# Using the @ symbol, which has the value of the address of the

# beginning of the instruction, you can eliminate having to

# invent unique names for intermediate symbols. Best when

# using a macro-processor like M4.

#

BGE, Y, X, @+16;

SUB, Y, _0, Z;

BGE, _0, _0, @+8;

SUB, X, _0, Z;

#

# Relative branch 2 instructions after "here" by modifying PC.

# During instruction execution PC is not changed until instruction

# complete, which is why [_far_delta] is -12 and not -8.

#

SUB, _0, _0, OUT;

here: SUB, PC, _far_delta, PC;

near: SUB, _1, _0, OUT;

SUB, PC, PC, PC;

far: SUB, _2, _0, OUT;

SUB, PC, PC, PC;

#

# Indexed load from element L[i] of an array: [L+i] -> [V]

#

SUB, _0, i, T;

SUB, L_addr, T, load+1;

load: SUB, 0, _0, V;

# ...

# index

i: 0;

# temp

T: 0;

# value to work with

V: 0;

# pointer to array L

L_addr: L;

# number of elements in L

L_len: L_end - L;

# start of array L

L:

@ = @ + 10; # 10 un-initialized locations for L

# 1 past end of array L, so L_end - L is length of array

L_end: ;

doc-simple-sets.txt

Code listing:

#

# Start loading into memory at location 12

#

000: @ = 12

#

# Some simple useful constants

#

012: 0 Zero: 0

013: -1 NegOne: -1

# Arithmetic on symbols to initialize constants

014: -12 _far_delta: here - far

                             # Instead of naming constants, we can adopt the convention of

                             # _x for the constant x, and _nx for the constant -x

015: -1 _n1: -1

016: 0 _0: 0

017: 1 _1: 1

018: 2 _2: 2

#

# Some variables

#

019: 10 A: 10

020: 0 B: 0

021: 42 X: 42

022: -42 Y: -42

023: 0 Z: 0

#

                             # Copy from location A to location B, i.e. [A] -> [B]

#

024: 1001 19 12 20 SUB, A, Zero, B

#

# Increment location T by 1, i.e. [T] + 1 -> [T]

#

028: 1001 129 13 129 SUB, T, NegOne, T

#

# Branch to location Dest: Dest -> [PC]

#

032: 1002 12 12 36 BGE, Zero, Zero, Dest

# ...

036: Dest:

#

                             # Test if location X and location Y have the same values.  I.e.

                             # if [X] == [Y] then do something, otherwise do something else.

#

036: 1002 21 22 44 BGE, X, Y, ge

040: 1002 12 12 56 BGE, Zero, Zero, ne

044: 1002 22 21 52 ge: BGE, Y, X, eq

048: 1002 12 12 56 BGE, Zero, Zero, ne

052: eq:

# do stuff if equal, branch when done

052: 1002 12 12 60 BGE, Zero, Zero, done

056: ne:

# do stuff if not equal, branch when done

056: 1002 12 12 60 BGE, Zero, Zero, done

# ...

060: done:

#

                             # Compute the min of location X and Y and store in location Z,

# i.e. min( [X], [Y] ) -> [Z]

#

060: 1002 22 21 72 BGE, Y, X, xmin

064: 1001 22 16 23 ymin: SUB, Y, _0, Z

068: 1002 16 16 76 BGE, _0, _0, next

072: 1001 21 16 23 xmin: SUB, X, _0, Z

076: next:

# ...

#

                             # Using the @ symbol, which has the value of the address of the

                             # beginning of the instruction, you can eliminate having to

                             # invent unique names for intermediate symbols.  Best when

# using a macro-processor like M4.

#

076: 1002 22 21 92 BGE, Y, X, @+16

080: 1001 22 16 23 SUB, Y, _0, Z

084: 1002 16 16 92 BGE, _0, _0, @+8

088: 1001 21 16 23 SUB, X, _0, Z

#

                             # Relative branch 2 instructions after "here" by modifying PC.

                             # During instruction execution PC is not changed until instruction

                             # complete, which is why [_far_delta] is -12 and not -8.

#

092: 1001 16 16 -2 SUB, _0, _0, OUT

096: 1001 0 14 0 here: SUB, PC, _far_delta, PC

100: 1001 17 16 -2 near: SUB, _1, _0, OUT

104: 1001 0 0 0 SUB, PC, PC, PC

108: 1001 18 16 -2 far: SUB, _2, _0, OUT

112: 1001 0 0 0 SUB, PC, PC, PC

#

                             # Indexed load from element L[i] of an array: [L+i] -> [V]

#

116: 1001 16 128 129 SUB, _0, i, T

120: 1001 131 129 125 SUB, L_addr, T, load+1

124: 1001 0 16 130 load: SUB, 0, _0, V

# ...

# index

128: 0 i: 0

129:

# temp

129: 0 T: 0

130:

# value to work with

130: 0 V: 0

131:

# pointer to array L

131: 133 L_addr: L

132:

# number of elements in L

132: 10 L_len: L_end - L

133:

# start of array L

133: L:

133: @ = @ + 10

# 10 un-initialized locations for L

                             # 1 past end of array L, so L_end - L is length of array

143: L_end:

143:

Symbol Table

143: @

19: A

20: B

1002: BGE

-3: CIN

-4: COUT

36: Dest

-1: IN

133: L

131: L_addr

143: L_end

132: L_len

13: NegOne

-2: OUT

0: PC

1001: SUB

129: T

130: V

21: X

22: Y

23: Z

12: Zero

16: _0

17: _1

18: _2

14: _far_delta

15: _n1

60: done

52: eq

108: far

44: ge

96: here

128: i

124: load

56: ne

100: near

76: next

72: xmin

64: ymin

Example Programs

Minimum of 2 integers
demo/min.txt

Code listing:

# Program to:

# read integer and store in A,

# read integer and store in B,

# put the minimum of [A] and [B] into C, i.e.

# min{ [A], [B] } -> [C]

# Idiom to initialize PC with START addr

000: 1 _:START

# Read in two integers

001: START:

001: 1001 -1 33 35 SUB, IN, _0, A

005: 1001 -1 33 36 SUB, IN, _0, B

009: 1002 36 35 21 BGE, B, A, MinIsA

# B is smaller

013: 1001 36 33 37 SUB, B, _0, C

017: 1002 33 33 25 BGE, _0, _0, Continue

# A is smaller

021: 1001 35 33 37 MinIsA: SUB, A, _0, C

025: Continue:

# Print the min

025: 1001 37 33 -2 SUB, C, _0, OUT

# Halt by setting PC to 0

029: 1001 33 33 0 SUB, _0, _0, PC

# Some constants

033: 0 _0: 0

034: 1 _1: 1

# Variables

035: 0 A: 0

036: 0 B: 0

037: 0 C: 0

Symbol Table

37: @

35: A

36: B

1002: BGE

37: C

-3: CIN

-4: COUT

25: Continue

-1: IN

21: MinIsA

-2: OUT

0: PC

1: START

1001: SUB

0: _

33: _0

34: _1

Sum of 1+2+...+n
demo/sum.txt

Code listing:

# Program to

# read in n

# sum the first n terms of 1+2+3+ ... and

# put the result in sum

# output sum

                             # Idiom to initialize PC with START addr, so the constants

# at the top are skipped.

000: 7 _:START

# Constants

001: 0 _0: 0

002: 1 _1: 1

003: -1 _m1: -1

# Variables

004: 0 n: 0

005: 0 sum: 0

# Working variables

006: 0 i: 0

007: START:

# Read n

007: 1001 -1 1 4 SUB, IN, _0, n

# initialize sum

011: 1001 1 1 5 SUB, _0, _0, sum

# loop i = 1 .. n

015: 1001 4 3 4 SUB, n, _m1, n

019: 1001 2 1 6 SUB, _1, _0, i

023: loop_1_begin:

023: 1002 6 4 47 BGE, i, n, loop_1_end

# Sum i negatively

027: 1001 5 6 5 SUB, sum, i, sum

# Debug - print i and sum

031: 1001 6 1 -2 SUB, i, _0, OUT

035: 1001 5 1 -2 SUB, sum, _0, OUT

# Increment i

039: 1001 6 3 6 SUB, i, _m1, i

# Back to top of loop

043: 1002 1 1 23 BGE, _0, _0, loop_1_begin

047: loop_1_end:

# Flip sign of sum, and outpu

047: 1001 1 5 5 SUB, _0, sum, sum

051: 1001 5 1 -2 SUB, sum, _0, OUT

# Halt by setting PC to 0

055: 1001 0 0 0 SUB, PC, PC, PC

Symbol Table

55: @

1002: BGE

-3: CIN

-4: COUT

-1: IN

-2: OUT

0: PC

7: START

1001: SUB

0: _

1: _0

2: _1

3: _m1

6: i

23: loop_1_begin

47: loop_1_end

4: n

5: sum

Convert Non-negative to Binary
demo/binrep.txt

Code listing:

                             # Program to convert a non-negative integer x into its binary

# represention

                             # Reads an integer n and extracts its binary representation

# Pre: integer n must be non-negative and < 2^nbits

# Post: bit i of n will be placed into bits[i]

# idiom to initialize PC with START addr

000: 1 _:START

001: START:

# bits posn 0 1 ... (nbits-1)

# bits value 2^0 2^1 ... 2^(nbits-1)

# Maximum number of bits

001: nbits = 8

                             # Begin by computing a table of the power of 2 associated with each

# bit position.

# Pointer p_pow2 to bit 0, incremented on each bit

001: 1001 139 127 136 SUB, pow2_start_addr, _0, p_pow2

# Current power of 2

005: 1001 128 127 131 SUB, _1, _0, two_to_i

# loop i = 0 .. n-1 times, using i as the index

009: 1001 127 127 135 SUB, _0, _0, i

013: 1002 135 134 45 loop_1_begin: BGE, i, n, loop_1_end

# indirect store: [two_to_i] -> [ [p_pow2] ]

# build instruction to store pow2 into that address

017: 1001 136 127 24 SUB, p_pow2, _0, @+7

021: 1001 131 127 0 SUB, two_to_i, _0, 0

# double the power of 2

025: 1001 127 131 138 SUB, _0, two_to_i, tmp

029: 1001 131 138 131 SUB, two_to_i, tmp, two_to_i

# increment i and address of next pow2 table entry

033: 1001 135 126 135 SUB, i, _n1, i

037: 1001 136 126 136 SUB, p_pow2, _n1, p_pow2

# back to top of loop

041: 1002 127 127 13 BGE, _0, _0, loop_1_begin

045: loop_1_end:

# Now read in a value and determine its bin rep

045: 1001 -1 127 130 SUB, IN, _0, value

049: 1001 130 127 132 SUB, value, _0, remain

                             # end address is 1 beyond actual table, and we want to start

# at last entries and work down

053: 1001 140 128 136 SUB, pow2_end_addr, _1, p_pow2

057: 1001 142 128 137 SUB, bits_end_addr, _1, p_bits

# Work in decreasing powers of 2

# loop i = 0 .. n-1 times, using i as the index

061: 1001 127 127 135 SUB, _0, _0, i

065: 1002 135 134 121 loop_2_begin: BGE, i, n, loop_2_end

# indirect load: [ [p_pow2] ] -> [ two_to_i ]

# build instruction to load 2^(n-1-i) into two_to_i

069: 1001 136 127 74 SUB, p_pow2, _0, @+5

073: 1001 0 127 131 SUB, 0, _0, two_to_i

                             # remain < pow => remain does not contain that power so bit is 0

# otw bit is 1

077: 1001 127 127 133 SUB, _0, _0, bit

081: 1002 132 131 89 BGE, remain, two_to_i, bigger

085: 1002 127 127 97 BGE, _0, _0, smaller

089: bigger:

089: 1001 128 127 133 SUB, _1, _0, bit

# remove that power from remainder

093: 1001 132 131 132 SUB, remain, two_to_i, remain

097: smaller:

# debug

# SUB, i, _0, OUT;

# SUB, a, _0, OUT;

# SUB, b, _0, OUT;

# SUB, two_to_i, _0, OUT;

# SUB, bit, _0, OUT;

# SUB, remain, _0, OUT;

# SUB, _n1, _0, OUT;

                             # build instruction to store the bit into the bits table.

097: 1001 137 127 104 SUB, p_bits, _0, @+7

101: 1001 133 127 0 SUB, bit, _0, 0

                             # increment i and decrement address of current entry in power

# table and bit value

105: 1001 135 126 135 SUB, i, _n1, i

109: 1001 136 128 136 SUB, p_pow2, _1, p_pow2

113: 1001 137 128 137 SUB, p_bits, _1, p_bits

# back to top of loop

117: 1002 127 127 65 BGE, _0, _0, loop_2_begin

121: loop_2_end:

# Halt by setting PC to 0

121: 1001 127 127 0 SUB, _0, _0, PC

                             # Some constants.  Note that if you put them in this order

                             # then you can index off _0, for example, this stores 1 - (-1)

# into tmp.

# SUB, _0+1, _0-1, tmp;

125: -2 _n2: -2

126: -1 _n1: -1

127: 0 _0: 0

128: 1 _1: 1

129: 2 _2: 2

# Input value to be converted

130: 0 value: 0

# Current power 2^n

131: 1 two_to_i: 1

                             # Remaining bits of value to be processed, should be 0 when

# finished.

132: 0 remain: 0

# Current bit value

133: 0 bit: 0

# Variables

# Number of bits expected

134: 8 n: nbits

135:

# Index

135: 0 i: 0

# Pointer to pow2 entry

136: 0 p_pow2: 0

# Pointer to bits entry

137: 0 p_bits: 0

# Volatile tmp value

138: 0 tmp: 0

                             # the idiom to reserve space for a table of size k is

#

# table_start_addr: table - addr of element 0

# table_end_addr: table_end - 1 past element k

#

# table: ;

# @ = @ + size;

# table_end: ;

                             # power table is [pow2_start_addr ... pow2_end_addr] inclusive

139: 143 pow2_start_addr: pow2

140: 151 pow2_end_addr: pow2_end

                             # bits table is [bits_start_addr ... bits_end_addr - 1]

141: 151 bits_start_addr: bits

142: 159 bits_end_addr: bits_end

                             # powers of 2 table starts here, pow2[i] = 2^i 0 <= i < nbits

143: pow2:

143: @=@+nbits

151: pow2_end:

                             # bits of binary representation start here, bits[i] = value of bit i,

# in little endian ordering.

# bit[i] is value of bit representing 2^i

151: bits:

151: @=@+nbits

159: bits_end:

Symbol Table

159: @

1002: BGE

-3: CIN

-4: COUT

-1: IN

-2: OUT

0: PC

1: START

1001: SUB

0: _

127: _0

128: _1

129: _2

126: _n1

125: _n2

89: bigger

133: bit

151: bits

159: bits_end

142: bits_end_addr

141: bits_start_addr

135: i

13: loop_1_begin

45: loop_1_end

65: loop_2_begin

121: loop_2_end

134: n

8: nbits

137: p_bits

136: p_pow2

143: pow2

151: pow2_end

140: pow2_end_addr

139: pow2_start_addr

132: remain

97: smaller

138: tmp

131: two_to_i

130: value

Compute x div y and x mod y
demo/divide.txt

Code listing:

# Division Algorithm

# Given a >= 0, b > 0 compute

# a = b * q + r, 0 <= r < b

# q = a div b

# r = a rem b

                             # Based on the binary representation algorithm which in essence

                             # computes a div 1. The bin rep algorithm reduces the remainder by the

                             # largest possible power of 2 at each step. Instead of reducing by

                             # 2^i, reducing by b * 2^1 extracts out 2^i multiples of b.  The resulting

                             # sequence of bits is the representation of q, and the remaining value

# that is smaller than 2^0 * b is r.

# idiom to initialize PC with START addr

000: 1 _:START

001: START:

# bits posn 0 1 ... (nbits-1)

# bits value 2^0 2^1 ... 2^(nbits-1)

                             # Maximum number of bits - note this is an asm symbol not a mem cell

001: nbits = 8

# Read in a and b

001: 1001 -1 227 230 SUB, IN, _0, a

005: 1001 -1 227 231 SUB, IN, _0, b

# Make sure a is >= 0, b is > 0

009: 1002 230 227 17 BGE, a, _0, aok

013: 1001 227 230 230 SUB, _0, a, a

017: aok:

017: 1002 231 228 29 BGE, b, _1, bok

021: 1001 227 231 231 SUB, _0, b, b

025: 1002 227 231 221 BGE, _0, b, error

029: bok:

                             # Begin by computing a table of the multiples of b scaled by powers of

# 2 associated.

# Pointer p_b2n to bit 0, incremented on each bit

029: 1001 242 227 239 SUB, b2n_start_addr, _0, p_b2n

# Current value of b * 2^1

033: 1001 231 227 234 SUB, b, _0, two_to_i

# loop i = 0 .. n-1 times, using i as the index

037: 1001 227 227 238 SUB, _0, _0, i

041: 1002 238 237 73 loop_1_begin: BGE, i, n, loop_1_end

# indirect store: [two_to_i] -> [ [p_b2n] ]

# build instruction to store b2n into that address

045: 1001 239 227 52 SUB, p_b2n, _0, @+7

049: 1001 234 227 0 SUB, two_to_i, _0, 0

# double the power of 2

053: 1001 227 234 241 SUB, _0, two_to_i, tmp

057: 1001 234 241 234 SUB, two_to_i, tmp, two_to_i

# increment i and address of next b2n table entry

061: 1001 238 226 238 SUB, i, _n1, i

065: 1001 239 226 239 SUB, p_b2n, _n1, p_b2n

# back to top of loop

069: 1002 227 227 41 BGE, _0, _0, loop_1_begin

073: loop_1_end:

# Initialize the remainder

073: 1001 230 227 235 SUB, a, _0, remain

                             # end address is 1 beyond actual table, and we want to start

# at last entries and work down

077: 1001 243 228 239 SUB, b2n_end_addr, _1, p_b2n

081: 1001 245 228 240 SUB, bitsq_end_addr, _1, p_bitsq

# Work in decreasing powers of 2

# loop i = 0 .. n-1 times, using i as the index

085: 1001 227 227 238 SUB, _0, _0, i

089: 1002 238 237 145 loop_2_begin: BGE, i, n, loop_2_end

# indirect load: [ [p_b2n] ] -> [ two_to_i ]

# build instruction to load 2^(n-1-i) into two_to_i

093: 1001 239 227 98 SUB, p_b2n, _0, @+5

097: 1001 0 227 234 SUB, 0, _0, two_to_i

                             # remain < pow => remain does not contain that power so bit is 0

# otw bit is 1

101: 1001 227 227 236 SUB, _0, _0, bit

105: 1002 235 234 113 BGE, remain, two_to_i, bigger

109: 1002 227 227 121 BGE, _0, _0, smaller

113: bigger:

113: 1001 228 227 236 SUB, _1, _0, bit

# remove that power from remainder

117: 1001 235 234 235 SUB, remain, two_to_i, remain

121: smaller:

                             # build instruction to store the bit into the bitsq table.

121: 1001 240 227 128 SUB, p_bitsq, _0, @+7

125: 1001 236 227 0 SUB, bit, _0, 0

                             # increment i and decrement address of current entry in power

# table and bit value

129: 1001 238 226 238 SUB, i, _n1, i

133: 1001 239 228 239 SUB, p_b2n, _1, p_b2n

137: 1001 240 228 240 SUB, p_bitsq, _1, p_bitsq

# back to top of loop

141: 1002 227 227 89 BGE, _0, _0, loop_2_begin

145: loop_2_end:

145: 1001 235 227 233 SUB, remain, _0, r

# Now reconstruct q from its bits

149: 1001 245 228 240 SUB, bitsq_end_addr, _1, p_bitsq

153: 1001 227 227 238 SUB, _0, _0, i

157: 1001 227 227 232 SUB, _0, _0, q

161: 1002 238 237 197 loop_3_begin: BGE, i, n, loop_3_end

# we want to compute q <- 2 * q + bits[n-i]

# tmp <- 0 - bits[n-i] - q - q

# q <- 0 - tmp

# compute -2q, i.e.shift left 1 bit

165: 1001 227 232 241 SUB, _0, q, tmp

169: 1001 241 232 241 SUB, tmp, q, tmp

# build instruction to subtract bit[n-i] from tmp

173: 1001 240 227 179 SUB, p_bitsq, _0, @+6

177: 1001 241 0 241 SUB, tmp, 0, tmp

# and then q <- -tmp

181: 1001 227 241 232 SUB, _0, tmp, q

                             # increment i and decrement address of current entry in power

# table and bit value

185: 1001 238 226 238 SUB, i, _n1, i

189: 1001 240 228 240 SUB, p_bitsq, _1, p_bitsq

# back to top of loop

193: 1002 227 227 161 BGE, _0, _0, loop_3_begin

197: loop_3_end:

197: 1001 230 227 -2 SUB, a, _0, OUT

201: 1001 231 227 -2 SUB, b, _0, OUT

205: 1001 232 227 -2 SUB, q, _0, OUT

209: 1001 233 227 -2 SUB, r, _0, OUT

# Keep asking for input until get error

213: 1002 227 227 1 BGE, _0, _0, START

# Halt by setting PC to 0

217: 1001 227 227 0 SUB, _0, _0, PC

# Error

221: 1001 227 227 0 error: SUB, _0, _0, PC

                             # Some constants.  Note that if you put them in this order

                             # then you can index off _0, for example, this stores 1 - (-1)

# into tmp.

# SUB, _0+1, _0-1, tmp;

225: -2 _n2: -2

226: -1 _n1: -1

227: 0 _0: 0

228: 1 _1: 1

229: 2 _2: 2

# Inputs a and b

230: 0 a: 0

231: 0 b: 0

# Results q and r

232: 0 q: 0

233: 0 r: 0

# Current power 2^n

234: 1 two_to_i: 1

                             # Remainder bits to be reduced, should be < b when finished.

235: 0 remain: 0

# Current bit value

236: 0 bit: 0

# Variables

# Number of bits expected

237: 8 n: nbits

# Index

238: 0 i: 0

# Pointer to b2n entry

239: 0 p_b2n: 0

# Pointer to bitsq entry

240: 0 p_bitsq: 0

# Volatile tmp value

241: 0 tmp: 0

                             # the idiom to reserve space for a table of size k is

#

# table_start_addr: table - addr of element 0

# table_end_addr: table_end - 1 past element k

#

# table: ;

# @ = @ + size;

# table_end: ;

                             # power table is [b2n_start_addr ... b2n_end_addr] inclusive

242: 246 b2n_start_addr: b2n

243: 254 b2n_end_addr: b2n_end

                             # bitsq table is [bitsq_start_addr ... bitsq_end_addr - 1]

244: 254 bitsq_start_addr: bitsq

245: 262 bitsq_end_addr: bitsq_end

                             # b * 2^i table starts here, b2n[i] = b * 2^i 0 <= i < nbits

246: b2n:

246: @=@+nbits

254: b2n_end:

                             # bits of binary rep of q start here, bitsq[i] = value of bit i,

# in little endian ordering.

# bitq[i] is value of bit representing b * 2^i

254: bitsq:

254: @=@+nbits

262: bitsq_end:

Symbol Table

262: @

1002: BGE

-3: CIN

-4: COUT

-1: IN

-2: OUT

0: PC

1: START

1001: SUB

0: _

227: _0

228: _1

229: _2

226: _n1

225: _n2

230: a

17: aok

231: b

246: b2n

254: b2n_end

243: b2n_end_addr

242: b2n_start_addr

113: bigger

236: bit

254: bitsq

262: bitsq_end

245: bitsq_end_addr

244: bitsq_start_addr

29: bok

221: error

238: i

41: loop_1_begin

73: loop_1_end

89: loop_2_begin

145: loop_2_end

161: loop_3_begin

197: loop_3_end

237: n

8: nbits

239: p_b2n

240: p_bitsq

232: q

233: r

235: remain

121: smaller

241: tmp

234: two_to_i

Recursive computation of 2 ** n
demo/recursion.txt

Code listing:

# Program to show stack-based calling conventions

# Read in n

# Compute the following recursive function fn(n)

# fn(0) = 1;

# fn(n) = 2 * fn(n-1);

# that computes fn(n) = 2 ** n

# loc 0 is always PC by hardware

# loc 1 is stack pointer sp by convention

# loc 2 is address of bottom of stack,

# grows upward to increasing addrs

# Idiom to initialize PC with START addr

000: 16 _:START

001: 180 sp: stack

002: 180 addr_stack: stack

# Constants, convention is _ddddd or _nddddd.

# e.g. _12 for 12, _n2 for -2

003: 0 _0: 0

004: 1 _1: 1

005: 2 _2: 2

006: 3 _3: 3

007: -1 _n1: -1

008: -2 _n2: -2

009: -3 _n3: -3

010: -8 _n8: -8

# We compute m = fn(n) = 2 ** n

011: 0 n: 0

012:

012: 0 m: 0

# count the number of calls

013: 0 n_calls: 0

                             # Temp variables, be careful not to use across a fn call

# just like a register

014: 0 tmp1: 0

015: 0 result: 0

# Always reset the stack pointer when you start

016: 1001 2 3 1 START: SUB, addr_stack, _0, sp

# A stack frame looks like this:

# [[sp]-2] - function param

# [[sp]-1] - return result

# [[sp]] - return address

# Read n

020: 1001 -1 3 11 SUB, IN, _0, n

# Call fn(n),

                             # Allocate space on the stack for param n, result m, return addr

024: 1001 1 9 1 SUB, sp, _n3, sp

# Put n into the param position at [sp]-2

# i.e. [ n ] -> [ [sp]-2 ]

028: 1001 1 5 35 SUB, sp, _2, @+7

032: 1001 11 3 0 SUB, n, _0, 0

# Put 0 into the result position at [sp]-1 to 0

# i.e. 0 -> [ [sp]-1 ]

036: 1001 1 4 43 SUB, sp, _1, @+7

040: 1001 3 3 0 SUB, _0, _0, 0

                             # Put return address (next instruction after the call) into

# the return position at [sp].

044: 1001 1 3 51 SUB, sp, _0, @+7

048: 1001 0 10 0 SUB, PC, _n8, 0

# Call the function

052: 1002 3 3 80 BGE, _0, _0, fn

# Return here

# Fetch result

056: 1001 1 4 61 SUB, sp, _1, @+5

060: 1001 0 3 12 SUB, 0, _0, m

# Pop stack frame: param, result, and return address

064: 1001 1 6 1 SUB, sp, _3, sp

# Print the result

068: 1001 12 3 -2 SUB, m, _0, OUT

# Print the the number of recursive calls

072: 1001 13 3 -2 SUB, n_calls, _0, OUT

# Halt by setting PC to 0

076: 1001 0 0 0 SUB, PC, PC, PC

# Body of function fn

# fn(0) = 1;

# fn(n) = 2 * fn(n-1);

080: fn:

# Count calls

080: 1001 13 7 13 SUB, n_calls, _n1, n_calls

                             # Fetch the param n into tmp1, the global copy of n, not

# safe over a function call

084: 1001 1 5 89 SUB, sp, _2, @+5

088: 1001 0 3 14 SUB, 0, _0, tmp1

# Don't recurse if n <= 0;

092: 1002 3 14 156 BGE, _0, tmp1, base_case

# Recursive call, put new param on stack

                             # Allocate space on the stack for param n, result m, return addr

096: 1001 1 9 1 SUB, sp, _n3, sp

# Put n-1 into the param position.

100: 1001 1 5 107 SUB, sp, _2, @+7

104: 1001 14 4 0 SUB, tmp1, _1, 0

# Clear result location

108: 1001 1 4 115 SUB, sp, _1, @+7

112: 1001 3 3 0 SUB, _0, _0, 0

# Return address

116: 1001 1 3 123 SUB, sp, _0, @+7

120: 1001 0 10 0 SUB, PC, _n8, 0

# Call fn.

124: 1002 3 3 80 BGE, _0, _0, fn

# return here

# Fetch result

128: 1001 1 4 133 SUB, sp, _1, @+5

132: 1001 0 3 14 SUB, 0, _0, tmp1

# Pop stack frame

136: 1001 1 6 1 SUB, sp, _3, sp

# Multiply result by 2

# result is unsafe over function calls

140: 1001 3 14 15 SUB, _0, tmp1, result

144: 1001 15 14 15 SUB, result, tmp1, result

148: 1001 3 15 15 SUB, _0, result, result

152: 1002 3 3 160 BGE, _0, _0, done

156: base_case:

# Result is 1

156: 1001 4 3 15 SUB, _1, _0, result

# now return to the caller

160: done:

                             # Return, stack to be cleaned up by caller who is only

                             # one that knows how many things were put on the stack.

                             # Put returned result into stack, just below return address

160: 1001 1 4 167 SUB, sp, _1, @+7

164: 1001 15 3 0 SUB, result, _0, 0

# Get the return address from top of the stack

# and install it into the branch

168: 1001 1 3 173 SUB, sp, _0, @+5

172: 1001 0 3 179 SUB, 0, _0, @+7

176: 1002 3 3 0 BGE, _0, _0, 0

# Stack starts here

180: 0 stack: 0

                             # Advance a bit so that the trace routine shows the stack cells

# with labels

181: @ = @ + 32

213: 0 stack_end: 0

Symbol Table

213: @

1002: BGE

-3: CIN

-4: COUT

-1: IN

-2: OUT

0: PC

16: START

1001: SUB

0: _

3: _0

4: _1

5: _2

6: _3

7: _n1

8: _n2

9: _n3

10: _n8

2: addr_stack

156: base_case

160: done

80: fn

12: m

11: n

13: n_calls

15: result

1: sp

180: stack

213: stack_end

14: tmp1

Horrible recursive computation Fibonacci
demo/fib.txt

Code listing:

                             # Program to show more complex stack-based calling conventions

# Read in n

# Compute fib(n) very inefficiently by

# fib(0) = 0;

# fib(1) = 1;

# fib(n) = fib(n-2) + fib(n-1) for n >=2

#

# n: 0 1 2 3 4 5 6 7

# Fib: 0 1 1 2 3 5 8 13

# loc 0 is always PC by hardware

# loc 1 is stack pointer sp by convention

# loc 2 is address of bottom of stack,

# grows upward to increasing addrs

# Idiom to initialize PC with START addr

000: 17 _:START

001: 285 sp: stack

002: 285 addr_stack: stack

# Constants, convention is _ddddd or _nddddd.

# e.g. _12 for 12, _n2 for -2

003: 0 _0: 0

004: 1 _1: 1

005: 2 _2: 2

006: 3 _3: 3

007: -1 _n1: -1

008: -2 _n2: -2

009: -3 _n3: -3

010: -8 _n8: -8

# We compute m = fib(n)

011: 0 n: 0

012:

012: 0 m: 0

# count the number of calls

013: 0 n_calls: 0

                             # Temp variables, be careful not to use across a function call

# just like a register

014: 0 tmp_r: 0

015: 0 tmp_n: 0

016: 0 result: 0

# Always reset the stack pointer when you start

017: 1001 2 3 1 START: SUB, addr_stack, _0, sp

# A stack frame on call looks like this:

# [[sp]-2] - function param n

# [[sp]-1] - return result fib(n)

# [[sp]] - return address

                             # During execution after entry and allocating local variables

# frame, the stack looks like this::

# [[sp]-4] - function param n

# [[sp]-3] - return result

# [[sp]-2] - return address

# [[sp]-1] - value of n on initial entry

# [[sp]] - value of fib(n-2)

# Read n

021: 1001 -1 3 11 SUB, IN, _0, n

# Call fib(n),

                             # Allocate space on the stack for param n, result m, return addr

025: 1001 1 9 1 SUB, sp, _n3, sp

# Put n into the param position at [sp]-2

# i.e. [ n ] -> [ [sp]-2 ]

029: 1001 1 5 36 SUB, sp, _2, @+7

033: 1001 11 3 0 SUB, n, _0, 0

# Put 0 into the result position at [sp]-1 to 0

# i.e. 0 -> [ [sp]-1 ]

037: 1001 1 4 44 SUB, sp, _1, @+7

041: 1001 3 3 0 SUB, _0, _0, 0

                             # Put return address (next instruction after the call) into

# the return position at [sp].

045: 1001 1 3 52 SUB, sp, _0, @+7

049: 1001 0 10 0 SUB, PC, _n8, 0

# Call the function

053: 1002 3 3 85 BGE, _0, _0, fib

# Return here

# Fetch result

057: 1001 1 4 62 SUB, sp, _1, @+5

061: 1001 0 3 12 SUB, 0, _0, m

# Pop stack frame: param, result, and return address

065: 1001 1 6 1 SUB, sp, _3, sp

# Print n

069: 1001 11 3 -2 SUB, n, _0, OUT

# Print the result

073: 1001 12 3 -2 SUB, m, _0, OUT

# Print the the number of recursive calls, negated

077: 1001 3 13 -2 SUB, _0, n_calls, OUT

# Halt by setting PC to 0

081: 1001 0 0 0 SUB, PC, PC, PC

#

# Body of function fib

#

085: fib:

# Count calls

085: 1001 13 7 13 SUB, n_calls, _n1, n_calls

                             # Fetch the param into tmp_n, a global temp not safe over

# function calls

089: 1001 1 5 94 SUB, sp, _2, @+5

093: 1001 0 3 15 SUB, 0, _0, tmp_n

# Only recurse if n > 2

097: 1002 3 15 257 BGE, _0, tmp_n, basecase_0

101: 1002 4 15 249 BGE, _1, tmp_n, basecase_1

                             # We need to store n and fib(n-2) across function calls.  So we

                             # need to allocate two local variables on the stack to hold

# n

# fib(n-2)

105: 1001 1 8 1 SUB, sp, _n2, sp

109: 1001 1 3 116 SUB, sp, _0, @+7

113: 1001 3 3 0 SUB, _0, _0, 0

117: 1001 1 4 124 SUB, sp, _1, @+7

121: 1001 15 3 0 SUB, tmp_n, _0, 0

# Recursive call, put new param on stack

                             # Allocate space on the stack for param n, result m, return addr

125: 1001 1 9 1 SUB, sp, _n3, sp

# Put n-2 into the param position.

129: 1001 1 5 136 SUB, sp, _2, @+7

133: 1001 15 5 0 SUB, tmp_n, _2, 0

# Clear result location

137: 1001 1 4 144 SUB, sp, _1, @+7

141: 1001 3 3 0 SUB, _0, _0, 0

# Return address

145: 1001 1 3 152 SUB, sp, _0, @+7

149: 1001 0 10 0 SUB, PC, _n8, 0

# Call fib.

153: 1002 3 3 85 BGE, _0, _0, fib

# return here

# Fetch result fib(n-1)

157: 1001 1 4 162 SUB, sp, _1, @+5

161: 1001 0 3 14 SUB, 0, _0, tmp_r

# Pop stack frame

165: 1001 1 6 1 SUB, sp, _3, sp

# Save fib(n-2) in local stack variable

169: 1001 1 3 176 SUB, sp, _0, @+7

173: 1001 14 3 0 SUB, tmp_r, _0, 0

# Get n-1

177: 1001 1 4 182 SUB, sp, _1, @+5

181: 1001 0 4 15 SUB, 0, _1, tmp_n

# Call fib(n-1)

                             # Allocate space on the stack for param n, result m, return addr

185: 1001 1 9 1 SUB, sp, _n3, sp

# Put n-1 into the param position.

189: 1001 1 5 196 SUB, sp, _2, @+7

193: 1001 15 3 0 SUB, tmp_n, _0, 0

# Clear result location

197: 1001 1 4 204 SUB, sp, _1, @+7

201: 1001 3 3 0 SUB, _0, _0, 0

# Return address

205: 1001 1 3 212 SUB, sp, _0, @+7

209: 1001 0 10 0 SUB, PC, _n8, 0

# Call fib.

213: 1002 3 3 85 BGE, _0, _0, fib

# return here

# Fetch result of fib(n-1)

217: 1001 1 4 222 SUB, sp, _1, @+5

221: 1001 0 3 16 SUB, 0, _0, result

# Pop stack frame

225: 1001 1 6 1 SUB, sp, _3, sp

# Fetch saved fib(n-2), negated

229: 1001 1 3 235 SUB, sp, _0, @+6

233: 1001 3 0 14 SUB, _0, 0, tmp_r

# fib(n-1) + fib(n-2)

237: 1001 16 14 16 SUB, result, tmp_r, result

# Pop local variables

241: 1001 1 5 1 SUB, sp, _2, sp

245: 1002 3 3 265 BGE, _0, _0, done

249: basecase_1:

# n=1 Result is 1

249: 1001 4 3 16 SUB, _1, _0, result

253: 1002 3 3 265 BGE, _0, _0, done

257: basecase_0:

# n=0 Result is 0

257: 1001 3 3 16 SUB, _0, _0, result

261: 1002 3 3 265 BGE, _0, _0, done

# now return to the caller

265: done:

                             # Return, stack to be cleaned up by caller who is only

                             # one that knows how many things were put on the stack.

                             # Put returned result into stack, just below return address

265: 1001 1 4 272 SUB, sp, _1, @+7

269: 1001 16 3 0 SUB, result, _0, 0

# Get the return address from top of the stack

# and install it into the branch

273: 1001 1 3 278 SUB, sp, _0, @+5

277: 1001 0 3 284 SUB, 0, _0, @+7

281: 1002 3 3 0 BGE, _0, _0, 0

# Stack starts here

285: 0 stack: 0

                             # Advance a bit so that the trace routine shows the stack cells

# with labels

286: @ = @ + 32

318: 0 stack_end: 0

Symbol Table

318: @

1002: BGE

-3: CIN

-4: COUT

-1: IN

-2: OUT

0: PC

17: START

1001: SUB

0: _

3: _0

4: _1

5: _2

6: _3

7: _n1

8: _n2

9: _n3

10: _n8

2: addr_stack

257: basecase_0

249: basecase_1

265: done

85: fib

12: m

11: n

13: n_calls

16: result

1: sp

285: stack

318: stack_end

15: tmp_n

14: tmp_r

The Z-machine

We could introduce a new 4-address instruction, SAB (subtract and branch), that combines the SUB and BGE instructions.

Opcode Op1 Op2 Op3 Op4 Action

SAB (1003) A B C T [A] - [B] -> [C]
if [C] >= 0 branch to T

SAB can implement SUB and BGE. For SUB, just set the branch target T of SAB to be the next instruction (T = @+5). For BGE, throw away the result into a TMP location.

SUB, A, B, C => SAB, A, B, C, @+5

BGE, A, B, T => SAB, A, B, TMP, T

But that means that we only need the SAB instruction. We could build a new machine, the Z-machine that has only one instruction. So it does not require an opcode - thus the Z for Zero opcode machine. So a program is just a string of addresses, every sequence of 4 address being a legitimate instruction! The SUB and BGE instructions in the Z-machine are implemented as:

SUB, A, B, C => A, B, C, @+4

BGE, A, B, T => A, B, TMP, T

This makes for an interesting code generation and optimization exercise.

Opcode	Op1	Op2	Op3	Action
SUB (1001)	A	B	C	[A] - [B] -> [C]
BGE (1002)	A	B	T	if [A] >= [B] branch to T

SUB, A, B, C	=>	SAB, A, B, C, @+5
BGE, A, B, T	=>	SAB, A, B, TMP, T