This project comes from a compilers class in which I created and optimized a compiler in C++ using flex and bison. The compiler was written for a simplified language known as Tiny.
Tiny is a very simple assembly code interpreter.
The executable runs on Sun architectures.
an example Tiny program (a longer program is attached at the end):
var i
str prompt "enter a number: "
str announce "\nthe square is"
label myloop ; main loop
sys writes prompt
sys readi i
move i r3
muli i r3
; some more comment
sys writes announce
sys writei r3 ;
cmpi 1 r3
jne myloop
sys halt ; optional if at end
end
Tiny simulates an architecture that has 4 data registers, a stack pointer (sp),
a frame pointer (fp) and both integer and floating-point arithmetic. All data
elements have size 1. The data representation is unknown to the user.
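To make the register set and the sp/fp pair concrete, here is a minimal sketch of the machine state in Python, together with the push, pop, link and unlnk operations described in the instruction list. The class name Machine, the constant STACK_TOP, the helper stack_var and the downward-growing stack are assumptions made for illustration; this document does not specify the real simulator's memory layout.

```python
from dataclasses import dataclass, field

STACK_TOP = 64  # hypothetical stack size; the real layout is not documented


@dataclass
class Machine:
    """Hypothetical model of Tiny's state: r0..r3, sp, fp and a cell-addressed stack."""
    regs: list = field(default_factory=lambda: [0, 0, 0, 0])
    stack: list = field(default_factory=lambda: [None] * STACK_TOP)
    sp: int = STACK_TOP  # assumption: the stack grows downward, sp points at the top item
    fp: int = STACK_TOP

    def push(self, value=None):
        # All data elements have size 1, so a push moves sp by exactly one cell.
        # Pushing without an operand stores an empty cell (None here).
        self.sp -= 1
        self.stack[self.sp] = value

    def pop(self):
        value = self.stack[self.sp]
        self.sp += 1
        return value

    def link(self, x):
        # link x: push fp onto the stack, copy sp into fp, push x empty cells
        self.push(self.fp)
        self.fp = self.sp
        for _ in range(x):
            self.push()

    def unlnk(self):
        # unlnk: copy fp into sp, pop fp from the stack
        self.sp = self.fp
        self.fp = self.pop()

    def stack_var(self, offset):
        # $offset names the stack cell at address fp + offset
        return self.fp + offset
```

Under these assumptions, a value pushed just before link x is reachable as $1 inside the frame, and unlnk restores sp and fp to their values before the link.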
Tiny accepts the following assembly codes:
var id ; reserves and names a memory cell. first letter alphanum,
; then alphanum with punctuation, case sensitive
; both integer and real (float) have the size of one memory cell
str sid "a string constant" ; the only operation on string constants is sys writes sid
; strings can include \n for end-of-line
; var and str declarations must precede all code and labels (during debugging,
; enforcement of this rule can be disabled. See the "mix" command line option)
label target ; a jump target
move opmrl opmr ; only one operand can be a memory id or stack variable
addi opmrl reg ; integer addition, reg = reg + op1
addr opmrl reg ; real (i.e. floating-point) addition
subi opmrl reg ; computes reg = reg - op1
subr opmrl reg
muli opmrl reg ; computes reg = reg * op1
mulr opmrl reg
divi opmrl reg ; computes reg = reg / op1
divr opmrl reg
inci reg ; increment the (integer) register value by 1
deci reg ; decrement the (integer) register value by 1
cmpi opmrl reg ; integer comparison; must precede a conditional jump;
; it compares the first operand with the second operand and
; sets the "processor status". (The status remains the
; same until the next cmp instruction is executed.)
; E.g., a subsequent jgt will jump if op1 > op2
push opmrl ; push a data item onto the stack. Operand can be
; omitted, in which case an empty element is pushed.
pop opmr ; pops an element from the stack. If an operand is
; given, the element is moved there
jsr target ; jump to target and push the current pc onto the stack
ret ; pop an address from the stack and jump there
link x ; push frame pointer (fp) onto stack, copy sp into fp,
; push x empty cells onto stack
unlnk ; copy fp into sp, pop fp from stack
cmpr opmrl reg ; real comparison
jmp target ; unconditional jump
jgt target ; jump if (op1 of the preceding cmp was) greater (than op2)
jlt target ; jump if less than
jge target ; jump if greater or equal
jle target ; jump if less or equal
jeq target ; jump if equal
jne target ; jump if not equal
sys readi opmr ; system call for reading an integer from input
sys readr opmr ; system call for reading a real value
sys writei opmr ; system call for outputting an integer
sys writer opmr ; system call for outputting a real value
sys writes sid ; system call for outputting a string constant
sys halt ; system call to end the execution
end ; end of the assembly code (not an opcode)
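The operand order of cmpi/cmpr is easy to get backwards, so here is a small Python sketch of how a compare and the conditional jumps could interact. The names Status, JUMP_TESTS and taken are hypothetical; the sketch simply remembers both operands of the last compare, which is one possible way to model the "processor status" and not necessarily how the interpreter stores it.

```python
import operator

# Each conditional jump tests op1 against op2 of the most recent cmpi/cmpr.
JUMP_TESTS = {
    "jgt": operator.gt,
    "jlt": operator.lt,
    "jge": operator.ge,
    "jle": operator.le,
    "jeq": operator.eq,
    "jne": operator.ne,
}


class Status:
    """Hypothetical processor status: keeps the operands of the last compare."""

    def __init__(self):
        self.op1 = None
        self.op2 = None

    def cmp(self, op1, op2):
        # 'cmpi op1 reg' compares the first operand with the register value.
        # The status stays unchanged until the next cmp is executed.
        self.op1, self.op2 = op1, op2

    def taken(self, jump):
        # jmp is unconditional; the others consult the stored status.
        if jump == "jmp":
            return True
        return JUMP_TESTS[jump](self.op1, self.op2)
```

For example, "cmpi 4 r2" followed by "jge outerloop" jumps while 4 >= r2, that is, while r2 is at most 4.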
notation used for the operands:
id stands for the name of a memory location
sid stands for the name of a string constant
x stands for an integer number
target stands for the name of a jump target
$offset stands for a stack variable at address fp+offset
reg stands for a register, named r0,r1,r2, or r3, case insensitive
opmrl stands for a memory id, stack variable, register or a number (literal),
the format for real is digit*[.digit*][E[+|-]digit*]
opmr stands for a memory id, stack variable, or a register
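The operand classes above can be checked mechanically. The following Python sketch classifies a single operand token and tests whether it fits reg, opmr or opmrl. The function names classify and fits are hypothetical. The sketch also makes some assumptions: a signed stack offset and a leading minus on literals are accepted, the number pattern is tightened to require at least one digit so that an empty token does not match, and a token that looks like a register or number is never treated as a memory id.

```python
import re

REGISTER = re.compile(r"[rR][0-3]")        # r0..r3, case insensitive
STACK_VAR = re.compile(r"\$[+-]?\d+")      # assumption: offsets may carry a sign
# Based on digit*[.digit*][E[+|-]digit*], tightened to require at least one digit;
# the optional leading minus is an assumption.
NUMBER = re.compile(r"-?(\d+(\.\d*)?|\.\d+)(E[+-]?\d+)?")


def classify(token):
    """Return 'reg', 'stack', 'literal' or 'id' for one operand token."""
    if REGISTER.fullmatch(token):
        return "reg"
    if STACK_VAR.fullmatch(token):
        return "stack"
    if NUMBER.fullmatch(token):
        return "literal"
    return "id"  # anything else is taken to be a memory id


def fits(token, operand_class):
    """Check a token against the operand classes reg, opmr and opmrl."""
    kind = classify(token)
    if operand_class == "reg":
        return kind == "reg"
    if operand_class == "opmr":
        return kind in ("id", "stack", "reg")
    if operand_class == "opmrl":
        return kind in ("id", "stack", "reg", "literal")
    raise ValueError("unknown operand class: " + operand_class)
```

With this sketch, fits("4", "opmrl") holds but fits("4", "opmr") does not, which matches the rule that only opmrl operands may be literals.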
; a semicolon introduces a comment (which is ignored by the interpreter). It can
be at the beginning of a line or after an assembly code
data representation:
No assumption can be made about the representations. Real and integer cannot be
mixed. Using an integer where a real is expected (and vice versa) leads to
undefined results.
Running tiny
syntax : tiny sourcefile [d1|d2|d3 [mix]]
the second argument generates debug output
d1: print a program listing
d2: d1 + print each line as it gets interpreted
d3: d2 + print machine status and variable content at each step
mix: allow declarations between code
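The command line grammar above can be read as follows: the debug level is optional, and mix may only appear after a debug level. The Python sketch below parses arguments under that reading. The function name parse_args, the returned tuple and the error handling are assumptions for illustration; how the real tiny executable reacts to malformed arguments is not documented here.

```python
USAGE = "usage: tiny sourcefile [d1|d2|d3 [mix]]"


def parse_args(argv):
    """Parse the arguments after the program name.

    Returns (sourcefile, debug_level, mix), where debug_level is 0 when no
    debug option is given.
    """
    if not 1 <= len(argv) <= 3:
        raise ValueError(USAGE)
    source = argv[0]
    debug_level = 0
    mix = False
    if len(argv) >= 2:
        if argv[1] not in ("d1", "d2", "d3"):
            raise ValueError(USAGE)
        debug_level = int(argv[1][1])  # "d2" -> 2
    if len(argv) == 3:
        if argv[2] != "mix":
            raise ValueError(USAGE)
        mix = True
    return source, debug_level, mix
```

For instance, parse_args(["prog.tiny", "d2", "mix"]) yields ("prog.tiny", 2, True), while ["prog.tiny", "mix"] is rejected because mix needs a preceding debug level under this reading of the syntax line.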
A longer sample program. The program asks for a number and prints five asterisk
triangles; each triangle's rows grow from one star up to the entered number.
var length
str star "*"
str prompt "enter number: "
str eol "\n"
move 0 r2
sys writes prompt
sys readi length
move 1 r3
label outerloop
move r3 r0
label starloop
sys writes star
subi 1 r0
cmpi 0 r0
jne starloop
sys writes eol
addi 1 r3
cmpi length r3
jge outerloop
move 1 r3
addi 1 r2
cmpi 4 r2
jge outerloop
sys halt
end
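As a cross-check of the control flow, here is a Python sketch that mirrors the sample program instruction by instruction and returns the text it would print (the prompt is left out). The function name triangles is hypothetical, and the sketch assumes the compare and jump semantics described in the instruction list.

```python
def triangles(length):
    """Mirror the sample program's loops; return the printed stars and newlines."""
    out = []
    r2 = 0                       # move 0 r2 (triangle counter)
    r3 = 1                       # move 1 r3 (current row length)
    while True:                  # label outerloop
        r0 = r3                  # move r3 r0
        while True:              # label starloop
            out.append("*")      # sys writes star
            r0 -= 1              # subi 1 r0
            if 0 == r0:          # cmpi 0 r0 / jne starloop
                break
        out.append("\n")         # sys writes eol
        r3 += 1                  # addi 1 r3
        if length >= r3:         # cmpi length r3 / jge outerloop
            continue
        r3 = 1                   # move 1 r3
        r2 += 1                  # addi 1 r2
        if 4 >= r2:              # cmpi 4 r2 / jge outerloop
            continue
        break                    # sys halt
    return "".join(out)
```

Because starloop tests its condition only after printing, an input below 1 still produces one star per triangle in this model.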