ElfC Internal Information

Build Walkthrough

ElfC creates an Elf/OS binary from a C source file using the build process shown below, illustrated with the traditional "Hello World" example file hello.c.

hello.c

#include <stdio.h>

int main() {
  printf("Hello, World!");
}

The build process consists of the following three major steps:

  1. The ElfC program compiles the source file hello.c and the included header file <stdio.h> into the assembly code file hello.asm.
  2. The ElfC program invokes the Asm/02 assembler to assemble the hello.asm file into the hello.prg object file. The assembler creates the assembly list file hello.lst and the assembly build file hello.build.
  3. The ElfC program invokes the Link/02 linker to link the hello.prg object file with the runtime startup module file crt0.prg and the default C library files stdlib.lib and stdio.lib to create the Elf/OS binary file hello.elfos. The linker outputs the hello.lkb file containing the linker build number, which serves as the version number for the binary program, and the symbol map hello.sym with the addresses of the linked object modules and library routines.

| Step | Input Files | Description | Program | Output Files | Description |
|---|---|---|---|---|---|
| 1 | hello.c | C Source | ElfC compiler | hello.asm | Assembly Code |
| | stdio.h | C Header File | | | |
| 2 | hello.asm | Assembly Code | Asm/02 assembler | hello.prg | Object File |
| | | | | hello.lst | Assembly Listing |
| | | | | hello.build | Assembler Build Number |
| 3 | hello.prg | Object File | Link/02 linker | hello.elfos | Elf/OS binary |
| | crt0.prg | Runtime Startup Module | | hello.lkb | Version Number (Linker Build Number) |
| | stdlib.lib | C Library | | | |
| | stdio.lib | C Library | | hello.sym | Symbol Map |

Notes:

  • The option -S stops after step one. The command elfc -S will compile, but not assemble or link, the code.
  • The option -c stops after step two. The command elfc -c will compile and assemble, but not link, the code.
  • The option -N will link the program code without the default standard C libraries, stdlib and stdio.
  • The option -o sets the name of the generated binary program file. The default name is the same as the C source file.
  • The linker build number is used to set the binary program Elf/OS version number.
  • The assembly list file and linker symbol map contain useful information for debugging.
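
The file naming used in the walkthrough above can be sketched in C. This is only an illustration of how the output names relate to the source name hello.c; the helpers with_extension and print_build_outputs are hypothetical and are not taken from the ElfC source code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Replace the extension of src (e.g. "hello.c") with ext (e.g. "asm").
 * Hypothetical helper for illustration only, not part of ElfC.
 * Returns 0 on success, -1 if the result does not fit in out. */
int with_extension(const char *src, const char *ext, char *out, size_t outlen)
{
    const char *dot;
    size_t stem;

    assert(src != NULL && ext != NULL && out != NULL);
    dot = strrchr(src, '.');
    stem = dot ? (size_t)(dot - src) : strlen(src);

    /* stem + '.' + ext + terminating null must fit */
    if (stem + 1 + strlen(ext) + 1 > outlen)
        return -1;
    memcpy(out, src, stem);
    out[stem] = '.';
    strcpy(out + stem + 1, ext);
    return 0;
}

/* Print the files that each build step is described as producing. */
void print_build_outputs(const char *src)
{
    static const char *const exts[] = {
        "asm",   /* step 1: assembly code */
        "prg",   /* step 2: object file */
        "lst",   /* step 2: assembly listing */
        "build", /* step 2: assembler build number */
        "elfos", /* step 3: Elf/OS binary */
        "lkb",   /* step 3: linker build number */
        "sym"    /* step 3: symbol map */
    };
    char name[64];
    size_t i;

    for (i = 0; i < sizeof exts / sizeof exts[0]; i++)
        if (with_extension(src, exts[i], name, sizeof name) == 0)
            printf("%s\n", name);
}
```

Calling print_build_outputs("hello.c") lists the seven output file names from the table, one per line.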

Stack Frame

When calling a function, ElfC pushes the function arguments onto the stack from right to left. ElfC then sets the register RB to the current value of R7, the expression stack pointer, to define the base of the stack frame. Character arguments are promoted to integers when passed as arguments, and pointers are passed as integer values. This means that the stack size of every argument is 2 bytes.

Inside the function, local (auto) variables are allocated on the stack. The stack size for integers and pointers is two bytes. Characters occupy one byte of the two byte stack slot. Arrays, structures and unions are expanded to an even number of bytes when allocated on the stack.

If the first local (auto) variable in a function is a structure or union, then a two byte padding element is placed before it. This preserves the structure or union data on the stack in case that data is used in the return value of the function.
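
The sizing rules above can be sketched as a small calculation. This is a model of the description only, not code from ElfC: the names stack_size and local_offsets are hypothetical, and the sketch assumes locals are allocated in declaration order and ignores the extra padding element for a leading structure or union.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SLOT 2 /* stack size of an int or pointer, per the text */

/* Round an object size up to an even number of bytes. */
size_t stack_size(size_t nbytes)
{
    return (nbytes + (STACK_SLOT - 1)) & ~(size_t)(STACK_SLOT - 1);
}

/* Given the object sizes of the locals in declaration order, fill
 * offsets[] with the (negative) base offset of each local.
 * Returns the total number of stack bytes used by the locals. */
size_t local_offsets(const size_t *sizes, size_t count, int *offsets)
{
    size_t used = 0;
    size_t i;

    assert(sizes != NULL && offsets != NULL);
    for (i = 0; i < count; i++) {
        used += stack_size(sizes[i]); /* stack grows downwards */
        offsets[i] = -(int)used;
    }
    return used;
}
```

For locals of 2, 1 and 3 bytes (an int, a char and a char[3]), this yields base offsets of -2, -4 and -8 and a total of 8 bytes.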

Example 1:

int fn1(int n, char c, int *p) {
  int  i1 = 4;
  char c1 = c+3;
  char a[3] = {'x','y','z'};

  /* Example 1 Breakpoint */
  BRKPT

  return i1;
}

Stackframe for Example 1:

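| Base Offset | Object | Stack Size | Description | Note | Example Address |
|---|---|---|---|---|---|
| | | | R7 points to the Top of Expression Stack | | 2344 + 1 |
| -8 | a[0] | 4 | Array characters 'x','y','z' | Lowest Memory Address (TOS) | 2345 |
| | a[1] | | | Array padded to even byte size | 2346 |
| | a[2] | | | | 2347 |
| | xx | | padding byte | | 2348 |
| -4 | c1 | 2 | character | Character 'd' padded to even byte size | 2349 |
| | xx | | padding byte | | 234A |
| -2 | i1 | 2 | integer (LSB) | i1 = 4 | 234B |
| | | | integer (MSB) | | 234C |
| | | | RB points to the Base of the Stack Frame | | 234C + 1 |
| 0 | n | 2 | integer | n = 42 | 234D |
| +2 | c | 2 | character | Char 'a' promoted to int | 234F |
| +4 | p | 2 | pointer | Highest Memory Address | 2351 |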

Running the example program stackframe.c with _STGROM_ defined yields the following data at the Breakpoint for Example 1:

BREAK AT XP=23 D=78 DF=0
R0=0000 R1=0000 R2=2260 R3=23A0
R4=FFC6 R5=FFD8 R6=24D6 R7=2344
R8=2352 R9=2101 RA=2363 RB=234C
RC=0001 RD=234F RE=0100 RF=236A

>>>E 2340 2350
        0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
2340>  00 00 00 00 00 78 79 7A 00 64 00 04 00 2A 00 61  .....xyz.d...*.a
2350>  00 63 23 00 00 00 00 00 00 D6 EE 04 00 D6 EE D6  .c#......Vn..VnV
>>>
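
As a cross-check, the dump above can be decoded with a few lines of C. The byte values and the RB value 234C are copied from the breakpoint output; the helper names peek8, peek16 and frame_addr are hypothetical. The sketch assumes, as the table states, that integers are stored LSB first and that RB points one byte below the frame data.

```c
#include <assert.h>
#include <stdint.h>

#define DUMP_START 0x2340u

/* The 32 bytes shown in the Example 1 dump, copied verbatim. */
static const uint8_t dump[32] = {
    0x00, 0x00, 0x00, 0x00, 0x00, 0x78, 0x79, 0x7A,
    0x00, 0x64, 0x00, 0x04, 0x00, 0x2A, 0x00, 0x61,
    0x00, 0x63, 0x23, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0xD6, 0xEE, 0x04, 0x00, 0xD6, 0xEE, 0xD6
};

/* Read one byte of the dump by its memory address. */
uint8_t peek8(uint16_t addr)
{
    assert(addr >= DUMP_START && addr < DUMP_START + sizeof dump);
    return dump[addr - DUMP_START];
}

/* Read a 16-bit value stored LSB first. */
uint16_t peek16(uint16_t addr)
{
    return (uint16_t)(peek8(addr) | (peek8((uint16_t)(addr + 1)) << 8));
}

/* Address of the object at a signed base offset. RB points one byte
 * below the data, so offset 0 is at RB + 1. */
uint16_t frame_addr(uint16_t rb, int offset)
{
    return (uint16_t)(rb + 1 + offset);
}
```

With rb = 0x234C, peek16(frame_addr(rb, 0)) reads n = 42, peek16(frame_addr(rb, -2)) reads i1 = 4, and peek8(frame_addr(rb, -4)) reads the character 'd' stored in c1.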

Example 2:

struct scrabble_tile {
  char letter;
  int  score;
};

/* get the tile from a player's rack */
struct scrabble_tile rack(int pos) {
    struct scrabble_tile tile;

    /* get the scrabble tile from rack */
    tile.letter = 'D';  /* Pretend it is 'D' for demo */
    tile.score = 2;

    /* Example 2 Breakpoint */
     BRKPT

    /* return tile at position in rack */
    return tile;
}

Stackframe for Example 2:

| Base Offset | Object | Stack Size | Description | Note | Example Address |
|---|---|---|---|---|---|
| | | | R7 points to the Top of Expression Stack | | 2348 + 1 |
| -6 | scrabble_tile.letter | 4 | char | Lowest Memory Address (TOS), letter = 'D' | 2349 |
| | xx | | padding byte | field padded to even byte size | 234A |
| | scrabble_tile.score | | integer (LSB) | score = 2 | 234B |
| | | | integer (MSB) | | 234C |
| -2 | _pad | 2 | 2 byte padding | padded to preserve structure data | 234D |
| | | | RB points to the Base of Stack Frame | | 234E + 1 |
| 0 | pos | 2 | integer | Highest Memory Address, pos = 3 | 234F |

Running the example program stackframe.c with _STGROM_ defined yields the following data at the Breakpoint for Example 2:

BREAK AT XP=23 D=00 DF=0
R0=0000 R1=0000 R2=2260 R3=23F8
R4=FFC6 R5=FFD8 R6=2502 R7=2348
R8=2348 R9=2101 RA=0002 RB=234E
RC=0001 RD=0002 RE=01FF RF=EDDF

>>>E 2340 2350
        0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
2340>  00 13 40 0B 40 4B 23 02 00 44 EE 02 00 53 23 03  ..@.@K#..Dn..S#.
2350>  00 5B 23 00 00 00 00 00 00 D6 EE 04 00 D6 EE 04  .[#......Vn..Vn.
>>>
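
The -6 base offset in Example 2 follows from the padding rules: each field is padded to the 2 byte stack size, and a 2 byte padding element sits between the frame base and a structure that is the first local. A minimal sketch of that arithmetic, with hypothetical helper names that do not come from the ElfC source:

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SLOT 2 /* fields are padded to this size */
#define STRUCT_PAD 2 /* padding element for a first-local struct/union */

/* Stack size of a structure whose fields are each padded to 2 bytes. */
size_t struct_stack_size(const size_t *field_sizes, size_t count)
{
    size_t total = 0;
    size_t i;

    assert(field_sizes != NULL);
    for (i = 0; i < count; i++)
        total += (field_sizes[i] + STACK_SLOT - 1) / STACK_SLOT * STACK_SLOT;
    return total;
}

/* Base offset of a structure that is the first local in a function. */
int first_local_struct_offset(size_t struct_size)
{
    return -(int)(STRUCT_PAD + struct_size);
}
```

For struct scrabble_tile, with a 1 byte char and a 2 byte int, the stack size is 4 bytes and the base offset is -6. With RB = 234E this places letter at 234E + 1 - 6 = 2349, where the dump shows 44 ('D').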

Example 3:

struct point {
  int x;
  int y;
};

/* scale a point by a value */
struct point scale(int n, struct point p) {
    int factor;

    /* negative or zero is not valid */
    factor = (n < 1) ? 1 : n;

    /* multiply (x,y) values by scaling factor */
    p.x = factor * (p.x);
    p.y = factor * (p.y);

    /* Example 3 Breakpoint */
     BRKPT

    /* return point */
    return p;
}

Stackframe for Example 3:

| Base Offset | Object | Stack Size | Description | Note | Example Address |
|---|---|---|---|---|---|
| | | | R7 points to the Top of Expression Stack | | 2346 + 1 |
| -2 | factor | 2 | integer | Lowest Memory Address (TOS), factor = 2 | 2347 |
| | | | RB points to the Base of Stack Frame | | 2348 + 1 |
| 0 | n | 2 | integer | n = 2 | 2349 |
| +2 | point.x | 4 | integer | x = -2 | 234B |
| | point.y | | integer | y = 2000 | 234D |
| +6 | _pad | 2 | 2 byte padding to preserve structure data | Highest Memory Address | 234F |

Running the example program stackframe.c with _STGROM_ defined yields the following data at the Breakpoint for Example 3:

BREAK AT XP=23 D=07 DF=0
R0=0000 R1=0000 R2=2260 R3=248A
R4=FFC6 R5=FFD8 R6=2572 R7=2346
R8=2346 R9=2101 RA=07D0 RB=2348
RC=0FA0 RD=07D0 RE=0100 RF=07D0

>>>E 2340 2350
        0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
2340>  40 02 00 4D 23 D0 07 02 00 02 00 FE FF D0 07 00  @..M#P.....~.P..
2350>  00 53 23 00 00 00 00 FF FF E8 03 44 EE 02 00 04  .S#......h.Dn...
>>>

Notes:

  • The Stack grows downwards in memory.
  • The Bottom of Stack is at the highest memory address.
  • The Top of Stack (TOS) is at the lowest memory address.
  • The default stack size is 2 bytes, the integer size.
  • Arguments are pushed on the stack from right to left, so that the first argument is at offset 0.
  • After the arguments are pushed to the expression stack, RB is set to the value of R7.
  • The base pointer RB defines the base address of the stack frame.
  • Local variables are then allocated on the expression stack.
  • Arguments are referenced by positive offsets from the base address.
  • Local variables are referenced by negative offsets from the base address.
  • RB is used as a reference to access a function's arguments and local variables on the stack frame.
  • Since they are stack pointers, R7 and RB point to the address one below the data on the stack.
  • Character arguments are promoted to int on the expression stack.
  • Arrays are padded to an even byte size on the expression stack.
  • Structures and unions have their fields padded to the stack size (2 bytes).
  • If a struct/union is the first local (auto) variable in the function, a 2-byte padding element is added so that the struct/union data is preserved in case it is needed for a return value.
  • If a structure/union is the last argument in the function call, a 2-byte padding element is pushed on the stack before the struct/union so that the argument data is preserved in case it is needed for a return value.
  • At the end of a function, R7 is moved back by the size of the local variables, and the value of R7 is checked with RB to validate that the expression stack has returned to its base address.
  • If R7 does not equal RB when checked, then a Stack Creep Error is issued, and the program terminates.
  • Otherwise the function returns to the caller, with R7 and R8 restored to their previous values, and the program continues.

Registers Used

| Register | Description | Owner | Availability |
|---|---|---|---|
| R0 | DMA Pointer | OS | Reserved |
| R1 | Interrupt Handler | OS | Reserved |
| R2 | System Stack Pointer (SP) | OS | Reserved |
| R3 | Program Instruction and Subroutine Argument Pointer | OS | Reserved |
| R4 | SCRT Call Routine | OS | Reserved |
| R5 | SCRT Return Routine | OS | Reserved |
| R6 | SCRT Argument and Return Point | OS | Reserved |
| R7 | Expression Stack Pointer (ESP) | ElfC | Reserved |
| R8 | Expression Temp Value | ElfC | General Use |
| R9 | Subroutine Pointer | ElfC | Reserved |
| RA | Accumulator and Return Value | ElfC | Reserved |
| RB | Caller Stack Frame Base Pointer | ElfC | Reserved |
| RC | Counter | User | General Use |
| RD | Destination Pointer or Data Value | User | General Use |
| RE.1 | Baud Rate Byte | OS | Reserved |
| RE.0 | SCRT Scratch Byte | OS | General Use |
| RF | Buffer Pointer | User | General Use |

Notes:

  • SCRT stands for the "Standard Call and Return" routine.
  • 'Reserved' means that the value of the register should not be changed, even when it is not in use by its owner.
  • 'General Use' means that the value of the register may be changed when it is not directly in use by its owner.

Code Generation Interface

The following subroutines are invoked by the ElfC code generation code in the cgelf.c source file.

Arithmetic Subroutines

| Name | Description |
|---|---|
| add16 | Add 2 signed 16-bit numbers on expression stack |
| div16 | Divide 2 signed 16-bit numbers on expression stack |
| mdsgn16 | Prepare 2 signed numbers on expression stack for Multiplication or Division |
| mod16 | Compute the Modulo of 2 numbers on expression stack |
| mul16 | Multiply 2 signed 16-bit numbers on expression stack |
| neg16 | Negate 16-bit integer on expression stack |
| shl16 | Left Shift 16-bit integer on expression stack one bit |
| shr16 | Right Shift 16-bit integer on expression stack one bit |
| sub16 | Subtract 2 signed 16-bit numbers on expression stack |

Boolean and Comparison Subroutines

| Name | Description |
|---|---|
| and16 | And two 16-bit numbers on expression stack |
| bool16 | Convert 16-bit value on expression stack to its boolean value (1 or 0) |
| eq16 | Compare 2 16-bit values for equality |
| false16 | Push boolean false value (0) onto expression stack |
| gt16 | Compare 2 signed 16-bit values for SOS greater than TOS |
| gte16 | Compare 2 signed 16-bit values for SOS greater or equal TOS |
| inv16 | Invert a 16-bit integer on expression stack |
| lt16 | Compare 2 signed 16-bit values for SOS less than TOS |
| lte16 | Compare 2 signed 16-bit values for SOS less or equal TOS |
| ne16 | Compare 2 16-bit values for non-equality |
| not16 | Boolean Not value of a 16-bit number on the expression stack |
| or16 | Or two 16-bit numbers on expression stack |
| true16 | Push boolean true value (1) onto expression stack |
| uge16 | Compare 2 unsigned 16-bit values for SOS greater or equal TOS |
| ugt16 | Compare 2 unsigned 16-bit values for SOS greater than TOS |
| ule16 | Compare 2 unsigned 16-bit values for SOS less or equal TOS |
| ult16 | Compare 2 unsigned 16-bit values for SOS less than TOS |
| xor16 | Exclusive Or (xor) two 16-bit numbers on expression stack |

Stack Manipulation Subroutines

| Name | Description |
|---|---|
| deref16 | Replace a pointer on the expression stack with the 16-bit value it references |
| deref8 | Replace a pointer on the expression stack with the 8-bit value it references |
| derefm | Replace a pointer on the expression stack with the struct/union memory block it references |
| dget16 | Get a 2-byte value from the expression stack (ESP is unchanged) |
| dpop16 | Pop a 2-byte value from the expression stack |
| dpush16 | Push a 2-byte value onto the expression stack |
| epush16 | Push a 2-byte constant onto expression stack |
| epush8 | Push 1-byte char value onto expression stack |
| esmove | Move the expression stack pointer by a signed offset |
| mcopy | Copy the contents of a structure or union referenced by the pointer at the SOS into the structure or union referenced by the pointer at the TOS |
| sclsos2n | Scale a 16-bit pointer offset at the SOS of the expression stack by a power of 2 |
| scltos2n | Scale a 16-bit pointer offset at the TOS of the expression stack by a power of 2 |
| stkchk | Check the expression stack for a stack creep error and return DF = 1 if the stack pointer has not returned to the base pointer |
| swap16 | Swap the two 16-bit numbers at TOS and SOS on expression stack |
| unscl2n | Unscale a 16-bit pointer difference at the TOS of the expression stack by a power of 2 |

Local Variable Subroutines

| Name | Description |
|---|---|
| laddr16 | Put the address of a local variable onto the expression stack |
| ldec16 | Decrement a 16-bit local variable value |
| ldec8 | Decrement an 8-bit local variable value |
| lget16 | Get a 2-byte local variable (ESP is unchanged) |
| linc16 | Increment a 16-bit local variable value |
| linc8 | Increment an 8-bit local variable value |
| linit16 | Initialize a local variable to a 16-bit value (ESP is unchanged) |
| lpdec16 | Decrement a local pointer variable by size bytes |
| lpinc16 | Increment a local pointer variable by size bytes |
| lpush16 | Push a 16-bit local variable value onto the expression stack |
| lpush8 | Push an 8-bit local variable value onto the expression stack |
| lset16 | Set a 2-byte local variable with value in the RA register (ESP is unchanged) |
| lstor16 | Copy a 2-byte value from the expression stack into a local variable (ESP is unchanged) |
| lstor8 | Copy a 1-byte value from the expression stack into a local variable (ESP is unchanged) |

Pointer Reference Subroutines

| Name | Description |
|---|---|
| pdec16 | Decrement a 2-byte value referenced by the pointer value in the RA register (ESP is unchanged) |
| pdec8 | Decrement a 1-byte value referenced by the pointer value in the RA register (ESP is unchanged) |
| pdecptr | Decrement a pointer value referenced by the pointer value in the RA register (ESP is unchanged) |
| pinc16 | Increment a 2-byte value referenced by the pointer value in the RA register (ESP is unchanged) |
| pinc8 | Increment a 1-byte value referenced by the pointer value in the RA register (ESP is unchanged) |
| pincptr | Increment a pointer value referenced by the pointer value in the RA register (ESP is unchanged) |
| psave | Get a pointer value from the expression stack and store it in the RA register (ESP is unchanged) |
| pstor16 | Get a 2-byte value from the expression stack and store it via a pointer value in RA (ESP is unchanged) |
| pstor8 | Get a 1-byte value from the expression stack and store it via a pointer value in RA (ESP is unchanged) |

Global and Static Variable Subroutines

| Name | Description |
|---|---|
| vdec16 | Decrement 2-byte global or static variable |
| vdec8 | Decrement 1-byte global or static variable |
| vinc16 | Increment 2-byte global or static variable |
| vinc8 | Increment 1-byte global or static variable |
| vpdec16 | Decrement a global or static pointer to a variable value of size bytes |
| vpinc16 | Increment a global or static pointer to a variable value of size bytes |
| vpop16 | Pop 2-byte global or static variable from expression stack |
| vpush16 | Push 2-byte global or static variable onto expression stack |
| vpush8 | Push 1-byte global or static variable onto expression stack |
| vstor16 | Get a 2-byte value from the expression stack and store it in a global or static variable (ESP is unchanged) |
| vstor8 | Get a 1-byte value from the expression stack and store it in a global or static variable (ESP is unchanged) |

Notes:

  • TOS stands for Top Of Stack.
  • SOS stands for Second On Stack.
  • ESP stands for Expression Stack Pointer.
  • Take care when using the Stack Manipulation routines to keep the stack 2-byte aligned.
  • All the Pointer Reference Subroutines leave the ESP unchanged.
  • All other subroutines with names containing get, init, set or stor leave the ESP unchanged.
  • Local subroutine names begin with the letter l.
  • Pointer subroutine names begin with the letter p.
  • Global and Static Variable subroutine names begin with the letter v.
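
To make the TOS and SOS terminology concrete, here is a small C model of a downward-growing expression stack. It is a sketch only: the real subroutines are assembly routines that work through R7, and the names es_init, es_push16, es_pop16, es_add16, es_gt16 and es_swap16 are hypothetical. The sketch assumes that binary operations replace both operands with a single result, and that values are stored LSB first with the pointer one byte below the data, as in the stack frame examples.

```c
#include <assert.h>
#include <stdint.h>

#define ES_BYTES 64

typedef struct {
    uint8_t mem[ES_BYTES];
    int esp; /* index one below the TOS byte, like R7 */
} expr_stack;

void es_init(expr_stack *s)
{
    s->esp = ES_BYTES - 1;
}

/* Push a 16-bit value; the stack grows downwards. */
void es_push16(expr_stack *s, uint16_t v)
{
    assert(s->esp >= 1);
    s->mem[s->esp--] = (uint8_t)(v >> 8);   /* MSB at the higher address */
    s->mem[s->esp--] = (uint8_t)(v & 0xFF); /* LSB at the lower address */
}

/* Pop the 16-bit value at TOS. */
uint16_t es_pop16(expr_stack *s)
{
    uint16_t lo, hi;

    assert(s->esp <= ES_BYTES - 3);
    lo = s->mem[++s->esp];
    hi = s->mem[++s->esp];
    return (uint16_t)(lo | (hi << 8));
}

/* Model of add16: replace SOS and TOS with their 16-bit sum. */
void es_add16(expr_stack *s)
{
    uint16_t tos = es_pop16(s);
    uint16_t sos = es_pop16(s);
    es_push16(s, (uint16_t)(sos + tos));
}

/* Model of gt16: signed test for SOS greater than TOS, leaving 1 or 0. */
void es_gt16(expr_stack *s)
{
    int16_t tos = (int16_t)es_pop16(s);
    int16_t sos = (int16_t)es_pop16(s);
    es_push16(s, (uint16_t)(sos > tos ? 1 : 0));
}

/* Model of swap16: exchange the values at TOS and SOS. */
void es_swap16(expr_stack *s)
{
    uint16_t tos = es_pop16(s);
    uint16_t sos = es_pop16(s);
    es_push16(s, tos);
    es_push16(s, sos);
}
```

Pushing 5 and then 3 makes 5 the SOS and 3 the TOS, so es_gt16 leaves 1 on the stack. Every operation moves the pointer by a multiple of 2 bytes, which keeps the stack 2-byte aligned.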

ElfC File Descriptor

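| Byte Index | Description | Size |
|---|---|---|
| 0-3 | Current Offset | 4 |
| 4-5 | DTA Pointer | 2 |
| 6-7 | EOF | 2 |
| 8 | Flags Byte | 1 |
| 9-12 | Directory Sector | 4 |
| 13-14 | Directory Offset | 2 |
| 15-18 | Current Sector | 4 |
| 19 | Drive Number (Elf/OS v5) | 1 |
| 20 | Drive FS Type (Elf/OS v5) | 1 |
| 21 | 1 Byte Padding | 1 |
| 22-533 | DTA | 512 |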

Notes:

  • The total size of the ElfC File Descriptor is 534 bytes.
  • The DTA begins at an offset of 22 bytes from the start of the FD.
  • This FD format is valid for Mini/DOS and Elf/OS v5.
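
The layout in the table can be written down as byte offsets in C. This is a sketch of the table only: the FD_* constant names, the elfc_fd type and the accessor functions are hypothetical and are not taken from the ElfC headers. The byte order of the multi-byte fields is not stated here, so the sketch does not decode them.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets copied from the File Descriptor table. */
enum {
    FD_OFFSET     = 0,   /* current offset, 4 bytes */
    FD_DTA_PTR    = 4,   /* DTA pointer, 2 bytes */
    FD_EOF        = 6,   /* EOF, 2 bytes */
    FD_FLAGS      = 8,   /* flags byte */
    FD_DIR_SECTOR = 9,   /* directory sector, 4 bytes */
    FD_DIR_OFFSET = 13,  /* directory offset, 2 bytes */
    FD_CUR_SECTOR = 15,  /* current sector, 4 bytes */
    FD_DRIVE      = 19,  /* drive number (Elf/OS v5) */
    FD_FS_TYPE    = 20,  /* drive FS type (Elf/OS v5) */
    FD_PAD        = 21,  /* 1 byte padding */
    FD_DTA        = 22,  /* start of the DTA */
    FD_DTA_SIZE   = 512,
    FD_SIZE       = FD_DTA + FD_DTA_SIZE
};

/* Compile-time check that the layout adds up to 534 bytes. */
typedef char fd_size_check[(FD_SIZE == 534) ? 1 : -1];

/* A raw byte buffer avoids any compiler-inserted structure padding. */
typedef struct {
    uint8_t bytes[FD_SIZE];
} elfc_fd;

/* Return the flags byte of a file descriptor. */
uint8_t fd_flags(const elfc_fd *fd)
{
    assert(fd != 0);
    return fd->bytes[FD_FLAGS];
}

/* Return a pointer to the 512-byte DTA inside the descriptor. */
uint8_t *fd_dta(elfc_fd *fd)
{
    assert(fd != 0);
    return &fd->bytes[FD_DTA];
}
```

A byte array is used instead of a struct with typed members because a hosted compiler could insert alignment padding after the flags byte, which would break the 534-byte layout.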

Translation Limits

Section 5.2.4.1 of the ANSI C89/C90 specification defines the following minimum translation limits. ElfC defines constants for most of these values in the defs.h header file in the source code, where they can be changed.

| Description | ElfC Limit | Meets Spec? | Limiting Factor |
|---|---|---|---|
| 15 nesting levels of compound statements | 16 | Yes | MAXBREAK |
| 8 nesting levels of conditional inclusion | 16 | Yes | MAXIFDEF |
| 12 pointer, array, and function declarators (in any combination) in a declaration | 15 | Yes, with some Exceptions | MAXPTR (See Notes Below for Exceptions) |
| 31 nesting levels of parenthesized declarators | 1 | No | Parentheses in a declaration are only supported for declaring a function pointer. |
| 32 nesting levels of parenthesized expressions within a full expression | 1024 | Yes | NSYMBOLS |
| 31 significant initial characters in an internal identifier or a macro name | 32 | Yes | NAMELEN |
| 6 significant initial characters in an external identifier | 32 | Yes | NAMELEN |
| 511 external identifiers in one translation unit | 1024 | Yes | NSYMBOLS |
| 127 identifiers with block scope declared in one block | 1024 | Yes | NSYMBOLS |
| 1024 macro identifiers defined in one translation unit | 1024 | Yes | NSYMBOLS |
| 31 parameters in one function definition | 32 | Yes | MAXFNARGS |
| 31 arguments in one function call | 32 | Yes | MAXFNARGS |
| 31 parameters in one macro definition | 32 | Yes | MAXMARGS |
| 31 arguments in one macro invocation | 32 | Yes | MAXMARGS |
| 509 characters in a logical source line | 512 | Yes | TEXTLEN |
| 509 characters in a character string literal (after concatenation) | 512 | Yes | TEXTLEN |
| 32767 bytes in an object (in a hosted environment) | 65535 | Yes | Asm/02 and Link/02 Limit |
| 8 nesting levels for #included files | 32 | Yes | MAXFILES |
| 257 case labels for a switch statement (including label for default case) | 257 | Yes | MAXCASE + 1 for default case label |
| 127 members in a single structure or union | 1024 | Yes | NSYMBOLS |
| 127 enumeration constants in a single enumeration | 1024 | Yes | NSYMBOLS |
| 15 levels of nested structure or union definitions in a single declaration list | 1024 | Yes, with some Exceptions | NSYMBOLS (See Notes Below for Exceptions) |

Notes:

  • Up to 15 levels of indirection are supported in a declaration involving pointers, arrays and structures/unions.
  • ElfC does not support multi-dimensional arrays, e.g. int a[3][4]; is not supported.
  • Pointers to function pointers are not supported, e.g. int (**f)(); is not supported.
  • ElfC supports structures and unions, pointers to struct/union, and pointers to struct/union pointers, e.g. struct stc, struct stc *p and struct stc **p are supported.
  • ElfC does not support pointers to pointers to structure or union pointers, e.g. struct stc ***p; is not supported.
  • The only supported parenthesized declaration syntax is int (*f)() which declares f as a function pointer.
  • ElfC has an implementation defined limit of 32 local string initializations per function.
  • ElfC has an implementation defined limit of 32 integer values per initialization list.
  • ElfC has an implementation defined limit of 64 bytes (63 characters plus the terminating null character) for an initialization string.
  • The maximum total number of symbols in the ElfC symbol pool is 16348 (POOLSIZE).
  • Each of the various types of symbols in the symbol pool has a limit of 1024 (NSYMBOLS) for symbols of that type.
  • Both Asm/02 and Link/02 generate and link object files in a 16-bit address space, giving a maximum limit of 64K.
  • Struct/union definition declarations cannot be nested, but struct/union object declarations can be nested.
  • Struct/union definition declarations must be separate from the declarations of struct/union objects, i.e. struct p { int x, y; } q; will not work.
  • Struct/union definition declarations must be global, however, struct/union objects may be declared locally.
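
The notes above can be illustrated with a short piece of standard C that stays within the described limits. This is a sketch checked only against a hosted C compiler, not against ElfC itself, so whether ElfC accepts every line is an assumption. The names grid, grid_get, grid_set, answer and call_it are hypothetical, and the availability of assert.h under ElfC is not confirmed.

```c
#include <assert.h>

/* The struct definition is global and separate from any object. */
struct point {
  int x;
  int y;
};

#define ROWS 3
#define COLS 4

/* int grid[3][4] is described as unsupported, so a flat array with
   a computed index stands in for the two-dimensional array. */
int grid[ROWS * COLS];

int grid_get(int row, int col) {
  assert(row >= 0 && row < ROWS && col >= 0 && col < COLS);
  return grid[row * COLS + col];
}

void grid_set(int row, int col, int value) {
  assert(row >= 0 && row < ROWS && col >= 0 && col < COLS);
  grid[row * COLS + col] = value;
}

int answer() {
  return 42;
}

int call_it() {
  int (*f)();          /* the supported parenthesized declarator */
  struct point p;      /* struct objects may be declared locally */
  struct point *pp;    /* pointer to struct */
  struct point **ppp;  /* pointer to struct pointer */

  f = answer;
  pp = &p;
  ppp = &pp;
  (*ppp)->x = (*f)();
  pp->y = 0;
  return p.x + p.y;
}
```

The flat array trades the a[row][col] syntax for an explicit row * COLS + col index, which gives the same row-major layout that a two-dimensional array would have.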