Most of the code in ABC is written in the ABC language. However, there are several cases in which it is advantageous or necessary to code in ABC assembler:
This manual assumes that you have a good understanding of programming, assemblers, compilers, interpreters and stack-based execution. Any message you implement incorrectly can corrupt an environment or crash the ABC system, so use the assembler with caution.
The following sections describe the processes and elements to compile methods and run them using the interpreter.
The statements for a method are compiled into bytecodes which are executed by a stack-based interpreter. The function of the compiler is to concatenate the bytecodes for each message into one bytecode stream for the whole method. For non-ASM methods, the message is converted into bytecodes which will call the appropriate method at runtime. However, for ASM methods, the ASM code you supply is used to replace the message call and no call is made at runtime.
The interpreter uses a stack of pointers for its stack space. When an ABC session begins, the execution stack is initialized with the contents shown below. Note that the stack pushes and pops pointers to objects, not the objects themselves. Any variable accessible to a method is located somewhere on this stack.
Position   Object
>10        Computation Space
10         myTask
9          unused
8          Command Line (argv)
7          Event Task
6          Array of Types
5          Tail of task queue
4          system
3          exception
2          widgets
1          master
0          register
When a task is initiated and its methods are invoked, the stack space above task is allocated and released as needed. If this task terminates or must relinquish control of the interpreter, the stack contents for the current task are saved and the contents for the new task are restored.
The interpreter primarily executes opcodes which push and pop object pointers on the stack. In some cases, it will also call external C routines to conduct window operations, complex computations or other operations. It will also call on the Taskmaster to manage the task queue and determine the next ready task to execute.
A section of assembly code consists of opcodes and associated parameters, label names beginning with dollar signs, assembler directives like %load and %store, or nested code surrounded by curly brackets. See below for an example.
whileDo aBlock:Block
ASM
{ { START.
BRF END.
%load aBlock.
%load me.
BRU START.
END.
}.
PUF.
}
The layout of the lines is not significant, but each opcode must be terminated by a period. Branches to labels are normally resolved within a nested code section, although branches from inner to outer code sections are permitted.
The compiler concatenates the ASM code you define to code which pushes the receiver on the stack. Assembler code representing the parameters may be included in the assembler code by use of the %load and %store directives. Each specifies the name of a parameter or the receiver me. The %load directive can also be used to load constants of the basic types as shown below.
asString
ASM {
%load "%d".
IST.
}
Integers and reals can be specified in a similar manner. If you want a bitstring, precede the double-quoted string with a percent sign. Make sure that your ASM code leaves a result on the stack representing the REPLY object of the message.
The interpreter opcodes manipulate pointers to objects. Pointers are defined in C to be:
typedef unsigned int utype;
typedef union {
    int32 Value;              /* the whole pointer viewed as one integer */
#if LSBORDER
    struct {
        utype IsaPtr : 1;     /* 1 for a pointer, 0 for an integer */
        utype Type   : 3;
        utype Marked : 1;     /* used by the garbage collector */
        utype Loc    : 27;    /* where the object resides */
    } Ptr;
#else
    struct {
        utype Type   : 3;
        utype Marked : 1;
        utype Loc    : 27;
        utype IsaPtr : 1;
    } Ptr;
#endif
} DPointer;
The pointer encodes the Type (Boolean, String, Real, Bit or Pointer), Marked (used by the garbage collector), Loc (pointer to where the object resides) and IsaPtr, which is 1 if the pointer is actually a pointer and 0 if it is an integer. The constant LSBORDER compensates for compilers that lay out the fields of a structure in a different order.
All integers are right-shifted by 1 when stored and IsaPtr is set to 0. Thus integers do not take up additional memory although their magnitude is reduced by half from what is possible in the hardware. Booleans are also encoded in the pointer with IsaPtr as 1, Type as Boolean and Loc as 1 if TRUE and 0 if FALSE. NIL is encoded in the pointer with IsaPtr as 1, Type as Pointer and Loc as 0. This encoding speeds computation and reduces storage requirements since integers and booleans can be accessed directly and require no additional storage. All other types of objects have actual storage allocated for their data.
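The Boolean and NIL encodings described above can be sketched with the same bit-field widths as DPointer. This is only an illustration: the type codes TYPE_POINTER and TYPE_BOOLEAN and the helper names are assumptions (the manual does not list the real type codes), and only one field layout is shown because access through field names does not depend on the layout.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned int utype;

/* Same field widths as DPointer; one layout is enough for this sketch. */
typedef union {
    int32_t Value;
    struct {
        utype IsaPtr : 1;
        utype Type   : 3;
        utype Marked : 1;
        utype Loc    : 27;
    } Ptr;
} DPointer;

/* Hypothetical type codes; the real values are not given in this manual. */
enum { TYPE_POINTER = 0, TYPE_BOOLEAN = 1 };

/* Boolean: IsaPtr = 1, Type = Boolean, Loc = 1 for TRUE or 0 for FALSE. */
static DPointer make_boolean(int truth) {
    DPointer p;
    p.Value = 0;
    p.Ptr.IsaPtr = 1;
    p.Ptr.Type = TYPE_BOOLEAN;
    p.Ptr.Loc = truth ? 1 : 0;
    return p;
}

/* NIL: IsaPtr = 1, Type = Pointer, Loc = 0. */
static DPointer make_nil(void) {
    DPointer p;
    p.Value = 0;
    p.Ptr.IsaPtr = 1;
    p.Ptr.Type = TYPE_POINTER;
    p.Ptr.Loc = 0;
    return p;
}

static int is_true(DPointer p) {
    return p.Ptr.IsaPtr == 1 && p.Ptr.Type == TYPE_BOOLEAN && p.Ptr.Loc == 1;
}

static int is_nil(DPointer p) {
    return p.Ptr.IsaPtr == 1 && p.Ptr.Type == TYPE_POINTER && p.Ptr.Loc == 0;
}
```

Because both values live entirely inside the pointer word, testing them needs no memory access beyond the pointer itself.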
An object is stored in a contiguous area consisting of a four byte header and the data for the object. The header consists of Size and Type fields and some flags used by the system for bookkeeping, recovering unused objects and copying objects. The data will either be a null-terminated string for strings, a non-null-terminated string for Bits (eight bits to a byte), four bytes for a real or a vector of pointers for all other object types. The first pointer of a pointer object is a pointer to its type.
PARTS
{ entries:Array
tally:Integer
}
Suppose that you CALL an external procedure with the first parameter being a dictionary. You could retrieve the tally with the following C code:
DPointer *dict,*ptr;
int tally;
dict = ptrparm(1); /* get dictionary */
ptr = GetPtr(dict,2); /* tally is the second part */
tally = ValtoInt(ptr); /* convert to integer */
Note that ptrparm and GetPtr are macros which return the address of a DPointer, an actual location in memory. Don't assume that these pointers will stay accurate for a long period of time since the page they are on may be paged out during subsequent pointer accesses. Use them immediately and refresh later if necessary by executing GetPtr again. Also note that ptr must be converted to an integer with the ValtoInt macro. Other macros include GetStr(p) and GetReal(p). You can test if a Boolean pointer is true with IsTrue(p) and determine if a pointer is NIL with IsNIL(p). Finally, you can convert an integer to a pointer with InttoVal(i). Don't nest the macros.
Each opcode is a two or three letter mnemonic which may or may not have a parameter depending on the opcode. The opcodes are divided into categories depending on their use.
Opcodes designated as binary opcodes (binary) pop the top two pointers, perform the operation and push the result. Normally, the operands are pushed in the order they are encountered (e.g. 3 + 4 would have 4 as the top and 3 below it on the stack). Unary opcodes replace the top with the result of the operation.
Terminates an ABC session. This opcode copies all pages from the temporary environment to the permanent environment and closes the environment.
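The binary and unary conventions can be illustrated with a miniature operand stack. This is a sketch under assumptions: the real interpreter stacks pointers to objects, whereas plain ints are used here, and the names Stack, push, pop, op_subtract and op_negate are hypothetical.

```c
#include <assert.h>

#define STACK_MAX 64

/* Hypothetical miniature operand stack holding plain ints. */
typedef struct {
    int items[STACK_MAX];
    int depth;              /* number of items currently on the stack */
} Stack;

static void push(Stack *s, int v) {
    assert(s->depth < STACK_MAX);
    s->items[s->depth++] = v;
}

static int pop(Stack *s) {
    assert(s->depth > 0);
    return s->items[--s->depth];
}

/* Binary opcode: the right operand is on top and the left operand is
   below it, so pop right first, then left, and push the result. */
static void op_subtract(Stack *s) {
    int right = pop(s);
    int left = pop(s);
    push(s, left - right);
}

/* Unary opcode: replace the top with the result of the operation. */
static void op_negate(Stack *s) {
    push(s, -pop(s));
}
```

For 7 - 4 the operands are pushed as 7 then 4, and op_subtract leaves the single value 3 on the stack. Operand order matters only for non-commutative operations, which is why subtraction is used here.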
Returns from executing a method. The top of the stack will contain the object to return from the method.
Executes (sends) a method. See PUM for stack contents.
Calls an external method. The stack contents are the same as for SND.
Signals an error. The stack top is a string which is the reason for the error.
Resignals a previous error.
Creates a suspended task from the parameters and method on the stack and pushes the new task object on the stack.
Transfers execution to top (a task).
Pushes next ready task.
Terminates top (a task).
Exports an object (top) to a file (top-1) in object text format.
Read an object definition from a file (top) and push the newly created object.
Suspends task (top - 1) to sleep for n (top) milliseconds.
Executes the C system(s) call where s is a string on the top of the stack.
Provides current date if b is true or current time if b is false.
Recompiles a method to insert debug code if b is true or remove debug code if b is false.
Debugger NOP opcode.
Suspends execution of task and notifies debugger task.
Performs a debugger send.
Pushes the nth global variable. "n" is taken from the next byte in the bytecode and is relative to the stack bottom.
Pushes the nth typeref of a method. If the typeref already refers to the proper type, the appropriate part is returned. If not, a lookup is performed to find the part.
Pushes the nth typeref of a method. If the typeref already refers to the proper type, the appropriate method is returned. If not, a lookup is performed to find the method. A stack frame is ordered as follows:
top locals and temps
old me offset
old pc
method
n
parameter n
...
parameter 1
me
previous method temporaries
The execution of PUM will push the previous method's pc and the offset of the old receiver. The pc will then be set to the beginning of the new method and NILs will be pushed on the stack for any local variables in the method.
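The frame layout above can be modeled in a few lines of C. This is a rough sketch, not the interpreter's code: the Machine structure, the use of long slots in place of object pointers and the function enter_method are all assumptions made for illustration.

```c
#include <assert.h>

#define STACK_MAX 128
#define NIL_SLOT  0L          /* stand-in for a NIL pointer */

/* Hypothetical interpreter state. */
typedef struct {
    long slots[STACK_MAX];
    int  depth;               /* index of the next free slot */
    int  pc;                  /* current program counter */
    int  me_offset;           /* stack index of the current receiver */
} Machine;

static void push(Machine *m, long v) {
    assert(m->depth < STACK_MAX);
    m->slots[m->depth++] = v;
}

/* Expects the stack to hold, from lower to higher positions:
   me, parameter 1..n, n, method. Saves the caller's pc and receiver
   offset, switches to the new method and pushes one NIL per local. */
static void enter_method(Machine *m, int nparams, int method_start,
                         int nlocals) {
    /* me lies below the parameters, the count n and the method. */
    int new_me = m->depth - (nparams + 3);
    assert(new_me >= 0);
    push(m, m->pc);           /* old pc */
    push(m, m->me_offset);    /* old me offset */
    m->me_offset = new_me;
    m->pc = method_start;
    for (int i = 0; i < nlocals; i++)
        push(m, NIL_SLOT);
}
```

Storing the old receiver as an offset rather than an address keeps the frame valid if the stack contents are saved and restored during a task switch.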
Like PUM, but starts the method search in the parent of the receiver.
Pushes the nth part of the receiver on the stack.
Pushes n NILs on the stack.
Pushes the nth variable on the stack. "n" will be negative if a parameter and positive if a local variable.
Pushes the nth (top) element of object (top-1). "object" may be a string or list of pointers.
Pushes an integer encoded in the next four bytes of the bytecode.
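Reading a four-byte integer operand from the bytecode can be sketched as follows. Big-endian byte order and the function name read_int32_operand are assumptions; the manual does not state how the four bytes are ordered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads the 32-bit integer operand stored in the four bytes that follow
   the opcode at code[pc]. Big-endian order is assumed in this sketch. */
static int32_t read_int32_operand(const unsigned char *code,
                                  size_t code_len, size_t pc) {
    assert(code != NULL && pc + 4 < code_len);
    uint32_t raw = ((uint32_t)code[pc + 1] << 24) |
                   ((uint32_t)code[pc + 2] << 16) |
                   ((uint32_t)code[pc + 3] << 8)  |
                    (uint32_t)code[pc + 4];
    /* Convert from two's complement without relying on
       implementation-defined narrowing. */
    if (raw <= 0x7FFFFFFFu)
        return (int32_t)raw;
    return (int32_t)(raw - 0x80000000u) - 0x7FFFFFFF - 1;
}
```

After the operand is read, the interpreter would advance the pc by five bytes (one opcode byte plus four operand bytes) under the same assumptions.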
Pushes a copy of top.
Pushes the nth (top) bit of object (top-1). "object" is of type Bits.
Push TRUE.
Push FALSE.
Pushes the register A (global variable 0).
Pushes the current exception reason.
Pushes a list of objects onto the stack.
Pushes a dynamic list of objects onto the stack.
Pushes the literal "string" on the stack.
Pushes an immediate real on the stack.
Pushes an immediate bitstring on the stack.
Pushes an immediate type on the stack.
Pop and discard the top.
The store opcodes copy one or more items from the stack into an object. Note that most store opcodes do not pop the top element of the stack when it is stored.
Store the top in the nth global variable.
Stores top in part specified by the nth typeref of method.
Stores top in part n of receiver.
Stores top in variable (parameter or local) n.
Stores value (top) in the nth (top-1) element of object (top-2). "object" may be a string or list of pointers. After STX completes, "value" and "n" will be popped and "object" will be left.
Stores top in register A.
Stores value (top) in the nth (top-1) element of object (top-2). Value should be a boolean and object should be a Bits. After STX completes, "value" and "n" will be popped and "object" will be left.
Forms the top n objects into a list.
Creates a new object based on the type which is on the top of the stack. The top is replaced with the new object.
Replaces the name on top with the type which has that name.
Replaces top with the type of top.
Replaces top with its size. Integer, Boolean and Real have size 1. String will have a size which is the number of characters, Bits will have a size which is the number of bytes needed to represent the bitstring, and all other objects will have a size representing the number of pointers, excluding the type pointer.
Shallow copies object (top-1) by n (top) units. The meaning of the units is based on the type of object. See SZ for details. Both object and n are popped and the new object is pushed.
Makes a deep copy of the top, replacing it on the stack.
Returns TRUE if top is defined (not NIL), FALSE otherwise.
Bit AND. (binary)
Bit OR. (binary)
Bit NOT. (unary)
Bit right shift. Shift the bitstring (top-1) to the right n (top) bits.
Bit left shift. Shift the bitstring (top-1) to the left n (top) bits.
Bit EXOR. (binary)
Bit EQUAL. (binary)
The number of ON bits in bit string replaces top.
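Counting the ON bits of a byte-packed bitstring could look like the sketch below. The function name count_on_bits is hypothetical, and the sketch assumes that any unused bits in the final byte are zero.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the ON bits in a bitstring stored eight bits to a byte.
   Assumes unused trailing bits in the last byte are zero. */
static size_t count_on_bits(const unsigned char *bits, size_t nbytes) {
    assert(bits != NULL || nbytes == 0);
    size_t count = 0;
    for (size_t i = 0; i < nbytes; i++) {
        unsigned char b = bits[i];
        while (b != 0) {
            b &= (unsigned char)(b - 1);   /* clear the lowest ON bit */
            count++;
        }
    }
    return count;
}
```

The inner loop runs once per ON bit rather than once per bit position, which keeps sparse bitstrings cheap to count.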
Converts top to string. It is actually a byte-encoded bitstring.
Converts top to integer. This is only valid for bitstrings which have fewer than 32 bits.
Boolean AND. (binary)
Boolean OR. (binary)
Boolean NOT. (unary)
Branches use a relative offset which is encoded in the next two bytes of the bytecode.
Branch if top is TRUE. Pop the top.
Branch if top is FALSE. Pop the top.
Branch unconditionally.
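Resolving a branch target from its two-byte relative offset can be sketched as follows. Several details are assumptions because the manual does not specify them: the offset is treated as signed, stored big-endian, and measured from the byte following the operand. The function name branch_target is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Computes the destination of the branch opcode at code[pc].
   Assumptions: signed 16-bit offset, big-endian, relative to the byte
   that follows the two operand bytes. */
static size_t branch_target(const unsigned char *code, size_t code_len,
                            size_t pc) {
    assert(code != NULL && pc + 2 < code_len);
    unsigned raw = ((unsigned)code[pc + 1] << 8) | (unsigned)code[pc + 2];
    long offset = raw < 0x8000u ? (long)raw : (long)raw - 0x10000L;
    long target = (long)pc + 3 + offset;
    assert(target >= 0 && (size_t)target <= code_len);
    return (size_t)target;
}
```

A conditional branch would pop the top of the stack first and jump to this target only when the popped Boolean matches the branch condition; otherwise execution continues at pc + 3.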
A total of 10 files may be open at one time.
Opens a file based on mode (top; "r", "w" or "a"), name (top-1) and file object (top-2). Pops these three elements and pushes the file index.
Read file based on maximum characters to copy (top) and file id (top-1).
Writes string (top) to fileid (top-1).
Close fileid (top).
Deletes a file based on name (top).
Returns TRUE if file (top) exists.
The integer comparisons are binary operations which compare top and top-1 and push a Boolean result: IEQ, INE, ILT, ILE, IGT, IGE.
Integer add. (binary)
Integer subtract. (binary)
Integer multiply. (binary)
Integer divide. (binary)
Integer absolute value. (unary)
Integer negation. (unary)
Integer modulus. (binary)
Converts integer to real. (unary)
Converts top to a string based on format in top-1 (C formatting).
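Since the conversion uses C formatting, its effect can be approximated with snprintf. This is a minimal sketch: the function name int_to_string and the fixed-size output buffer are assumptions, not part of ABC.

```c
#include <assert.h>
#include <stdio.h>

/* Formats value with a printf-style format such as "%d". Returns the
   number of characters written, or -1 if the buffer is too small. */
static int int_to_string(char *out, size_t out_size, const char *format,
                         int value) {
    assert(out != NULL && format != NULL && out_size > 0);
    int n = snprintf(out, out_size, format, value);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

With the format "%d" the value 42 becomes the string "42". The format must contain exactly one integer conversion, since it is passed straight to the C library.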
Increment top.
Decrement top.
Convert integer to bitstring. (unary)
The real comparisons are binary operations which compare top and top-1 and push a Boolean result: REQ, RNE, RLT, RLE, RGT, RGE.
Real addition. (binary)
Real subtraction. (binary)
Real multiplication. (binary)
Real division. (binary)
Real absolute. (unary)
Real negation. (unary)
Real exponentiation. (binary)
Converts real to integer (truncates). (unary)
Converts top to string based on format top-1.
The string comparisons are binary operations which compare top and top-1 and push a Boolean result: SEQ, SNE, SLT, SLE, SGT, SGE.
Concatenates n strings on stack and pushes string. "n" is found in the next byte in the bytecode. The strings to concatenate are ordered so that lowest on stack is leftmost and highest is rightmost.
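The ordering rule for concatenation can be sketched in C. The array stands in for the n stack entries, with index 0 being the lowest on the stack; the function name concat_strings and the use of malloc for the result are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Joins n strings where strings[0] is lowest on the stack (leftmost in
   the result) and strings[n-1] is highest (rightmost). The caller frees
   the result. Returns NULL if memory cannot be allocated. */
static char *concat_strings(const char *const *strings, size_t n) {
    assert(strings != NULL || n == 0);
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += strlen(strings[i]);

    char *result = malloc(total + 1);
    if (result == NULL)
        return NULL;

    char *cursor = result;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(strings[i]);
        memcpy(cursor, strings[i], len);
        cursor += len;
    }
    *cursor = '\0';
    return result;
}
```

Measuring the total length first allows a single allocation, instead of growing the result once per string.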
Make a substring of obj (top-2) starting at i(top-1) for n (top) characters and push on the stack.
Finds the position of source (top) in pattern (top-1). Returns 0 if not found.
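A position search that returns 0 when nothing is found implies that positions are counted from 1. The sketch below makes that assumption explicit; the function name find_position and the parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the 1-based position of the first occurrence of sought in
   text, or 0 if it does not occur. 1-based numbering is an assumption
   inferred from 0 being the "not found" result. */
static size_t find_position(const char *text, const char *sought) {
    assert(text != NULL && sought != NULL);
    const char *hit = strstr(text, sought);
    return hit != NULL ? (size_t)(hit - text) + 1 : 0;
}
```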
Finds the first occurrence of an element of a set (top) of characters in given string (top-1).
Converts a string to upper case.
Converts a string to lower case.
Centers a string (top-1) in a field of width (top) characters.
Converts a string to an integer.
Converts a string to a real.
Prints a string to the console. The string is not popped.
Pushes a string from the console.
Assembles a string into bytecodes.
Capitalizes a string.
Finds prefix(top-2) using string prefix(top-1) starting at position(top).
Formats string(top) according to format(top-1).
RAC (arccos), RAS (arcsin), RAT (arctan), RCS (cos), RSN (sine), RTN (tan), RCH (cosh), RSH (sinh), RTH (tanh), RLG (log2), RLX (log10), RSQ (sqrt), REX (exp)
NOP byte alignment for 1,2 or 3 bytes.
Break/step opcode.
Copy a block for later execution.
Run a block.
Reply to a message.
Checkpoint the environment.
Fork a block.
Invalidate the message and parts cache.
Get an environment symbol.
Get start of line locations for debugger.
String scan for character in pattern.
Abort a session.
Compress a checkpoint file during execution.
Change the message quota of a running task.
A special text format is used to represent objects. The table below shows the symbols used to represent the various objects. Spacing and formatting is not normally significant.
Type      Example   Format
Integer   234       optional sign, digits
Real      3.141     optional sign, optional digits, decimal, digits
String    "test"    characters surrounded by double quotes
Bits      %"\3"     string preceded by percent
Boolean   T or F    F (false) or T (true)
Others    (...)     objects surrounded by parentheses
NIL       N         NIL pointer
Type      @Real     @ followed by name
Opcodes   {...}     opcode code format surrounded by curly brackets
Dupl.     *         asterisk
Dup ref   $33       dollar sign followed by number
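A reader for this format can decide what kind of object follows by looking at the leading character of a token. The sketch below illustrates that idea only: the enum, the function name classify_token and the assumption that tokens arrive already split are hypothetical, and the contents of strings, objects and opcode blocks are not validated.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum {
    TOK_INTEGER, TOK_REAL, TOK_STRING, TOK_BITS, TOK_BOOLEAN, TOK_NIL,
    TOK_OBJECT, TOK_TYPE, TOK_OPCODES, TOK_DUPLICATE, TOK_DUP_REF,
    TOK_UNKNOWN
} TokenKind;

/* Classifies one token of the object text format by its first character;
   numbers are checked more fully to separate Integer from Real. */
static TokenKind classify_token(const char *tok) {
    assert(tok != NULL);
    switch (tok[0]) {
    case '"': return TOK_STRING;
    case '%': return TOK_BITS;
    case 'T': case 'F': return TOK_BOOLEAN;
    case 'N': return TOK_NIL;
    case '(': return TOK_OBJECT;
    case '@': return TOK_TYPE;
    case '{': return TOK_OPCODES;
    case '*': return TOK_DUPLICATE;
    case '$': return TOK_DUP_REF;
    default:  break;
    }

    size_t i = 0;
    if (tok[i] == '+' || tok[i] == '-')
        i++;                                   /* optional sign */
    size_t int_digits = 0;
    while (isdigit((unsigned char)tok[i])) { i++; int_digits++; }

    if (tok[i] == '.') {                       /* Real: digits after point */
        size_t frac_digits = 0;
        i++;
        while (isdigit((unsigned char)tok[i])) { i++; frac_digits++; }
        return (frac_digits > 0 && tok[i] == '\0') ? TOK_REAL : TOK_UNKNOWN;
    }
    return (int_digits > 0 && tok[i] == '\0') ? TOK_INTEGER : TOK_UNKNOWN;
}
```

Checking the single-character prefixes first keeps the numeric scan as the fallback, which matches the table: only Integer and Real lack a distinguishing lead character.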
Most objects, except the basic types, contain other objects and will be surrounded by parentheses. The first pointer of such an object will be a type constant. An example of a method is shown below.
( @Method 5 2
{ BRU EXC.###PPS 2.REP.#
PUV 0.RET.EXC.RSG.}
"size;" "Integer"
"size\n
REPLY Integer\n
{ REPLY tally.\n
}\n"
4
(@Array)
(@Array)
@Dictionary F
)
It has parts consisting of the Method type pointer, starting PC (5), scope (2), bytecodes (the opcodes between the brackets), the method name (size;), the reply type (Integer), the source code, count of local variables (4), array of variables (empty), array of references (empty), type in which the method is defined (Dictionary) and a Boolean indicating whether the method is a type method (F). The format for the bytecodes is described in the section on the Assembler format.
In some cases, two objects may need to refer to the same object. If so, the first object to refer to the object will define it preceded by an asterisk. Subsequent objects will refer to it with the reference format. The number refers to the nth asterisk which was encountered from the beginning of the object. Duplicates are normally generated by the export function.