The FSB virtual machine emulates a hypothetical computer designed to host dynamically typed, high-level programming languages. Some of its highlights include:
The FSB features an object oriented memory in several senses:
An FSB heap contains up to 216 page objects, each of which is 64K bytes in size. Thus, the FSB machine has a 232 bit address space.
Smaller objects are carved from pages, by dividing the pages into equally sized pieces (except for a page header, in some cases):
| Obj Size Code |
Heap Page Formats | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 64K obj | |||||||||||||||
| 11 | 32K obj | 32K obj | ||||||||||||||
| 10 | 16K obj | 16K obj | 16K obj | 16K obj | ||||||||||||
| 9 | 8K obj | 8K obj | 8K obj | 8K obj | 8K obj | 8K obj | 8K obj | 8K obj | ||||||||
| 8 | hdr (4k) | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj | 4K obj |
| 7 | hdr (4k) | ... 30 x 2K byte objects ... | ||||||||||||||
| 6 | hdr (4k) | ... 60 x 1K byte objects ... | ||||||||||||||
| 5 | hdr (4k) | ... 120 x 512 byte objects ... | ||||||||||||||
| 4 | hdr (4k) | ... 240 x 256 byte objects ... | ||||||||||||||
| 3 | hdr (4k) | ... 480 x 128 byte objects ... | ||||||||||||||
| 2 | hdr (4k) | ... 960 x 64 byte objects ... | ||||||||||||||
| 1 | hdr (4k) | ... 1920 x 32 byte ... | ||||||||||||||
| 0 | hdr (4k) | ... 3840 x 16 byte objects ... | ||||||||||||||
All pointers in an FSB system point to an object, or to an internal field within an object. Given a pointer, even a pointer to an internal field, programs can quickly locate both the object referred to by the pointer, and the page on which the object resides.
The FSB features three object registers, each of which
always contains a pointer to a (special) register object.
Register objects are page-size objects (64K bytes each)
and comprise the "register files" of the FSB virtual CPU.
The FSB additionally includes a single, 4-bit task selector register. This is used for the cooperative multi-tasking features described below.
Each of the three register objects contains:
Each FSB instruction cycle begins by examining the four bit task selector register and using that value to select one of the three registers and, from that register, one of the four program counters and code-block pointers. The instruction in that code-block, at the offset indicated by that program counter, is the next instruction to execute.
| thread selector value | which instruction to run next | illustration |
|---|---|---|
| 0x0 | reg[0].pc/block[0] | ![]() |
| 0x1 | reg[0].pc/block[1] | |
| 0x2 | reg[0].pc/block[2] | |
| 0x3 | reg[0].pc/block[3] | |
| 0x4 | reg[1].pc/block[0] | |
| 0x5 | reg[1].pc/block[1] | |
| 0x6 | reg[1].pc/block[2] | |
| 0x7 | reg[1].pc/block[3] | |
| 0x8 | reg[2].pc/block[0] | |
| 0x9 | reg[2].pc/block[1] | |
| 0xA | reg[2].pc/block[2] | |
| 0xB | reg[2].pc/block[3] | |
| 0xC | unused, reserved | |
| 0xD | unused, reserved | |
| 0xE | unused, reserved | |
| 0xF | unused, reserved |
Every object is one of the 13 standard object sizes and thus is some multiple of 16 bytes. The (aligned) 16-byte regions within an object are called quads. Quads are further divided into pairs, words, half-words, and bytes:
The minimum size of an object in memory is 16 bytes. The maximum size is 64K bytes. Heap objects can be allocated at any power-of-two number of bytes in between those sizes.
"Ordinary" objects, called referencable objects, are reference counted and the FSB assists in reclaiming objects whose reference count drops to 0. Cycles among referencable objects must be explicitly broken by applications. (Note, though, that applications can also create classes of object that are collected by graph-tracing garbage collection.)
"Limb" objects are objects with the invariant property that exactly one object -- a referencable object -- contains a pointer to the limb. The referencing object also contains type and size information about the limb. The lifetime of a limb is identical to the lifetime of the (unique) object that points to the limb. When the referencing object is reclaimed, the limb is reclaimed as well.
Every referencable object in memory begins with a mandatory two-word header. (Limbs do not require headers because they are not reference counted and their type information is stored elsewhere.) Here is the format of the two-word header found in all referencable objects:
(top)| Small Object Headers (all referencable heap objects) | |||
|---|---|---|---|
| quad[0] of the object | |||
| pair[0]: the object header |
pair[1]: varies |
||
| word[0]: the object's (machine) type* |
word[1]: reference count and gc bits** |
||
|
* The type code encodes information sufficient to infer what machine type of value is stored in each location within the object. (The type code stored in an object can vary over the object's lifetime.) Type codes are described in detail elsewhere in this document. ** word[1] of the object header contains a 28-bit reference count and set of 4 GC bits that govern the use of the reference count (these are described in greater detail, elsewhere). Objects are normally reference counted with a requirement that cycles must be explicictly broken to reclaim an object. Objects which are managed instead by graph-tracing collection can be created by a separate mechanism (also described elsewhere). |
|||
Objects larger than 16 bytes (four words) have a larger object header: a four word (single quad) header:
(top)| Tagged Object Headers (referencable heap objects 32 bytes or larger) | |||
|---|---|---|---|
| quad[0] of the multi-quad object | |||
| pair[0]: object header |
pair[1]: additional object header |
||
| word[0]: the object's (machine) type |
word[1]: reference count and gc bits |
word[0]: user tag and object size* |
word[1]: object hash value** |
|
* An object size code is a four bit value. The object is 2(size-code + 3) bytes in size. The user tag is an application-controlled 28-bit tagging field. ** The object hash value is a cache containing either 0 or a hash value for the object. As an exception, in free, small objects the hash slot holds a freelist pointer linking free objects on the same page. |
|||
Objects 64-bytes and larger are called huge objects and are further subdivided into a head (not to be confused with an "object header") and a tail. The head of a huge object is exactly one quad in size. The tail of a huge object is the oversall size of the object, minus two quads:
| Huge Object Format | ||
|---|---|---|
| quad[0] object header |
quad[1] object head |
quad[2 .. (obj-size / 16) - 2] object tail |
| Note: The division of objects into head and tail is largely a convenience of applications however it plays an important role in the type tagging of objects, explained elsewhere in this document. | ||
Here is an illustration giving an overview of the preceding object format tables:
![]() |
In this section, we provide C code defining a low-level C interface to an implementation of an FSB memory subsystem.
| C code |
|---|
|
Porting Alert This code makes unwarranted assumptions about the byte-sizes of various C types. It may need modification if the program is ported. |
|
<Declare the Integer Types>
typedef char fsb_int8_t;
typedef unsigned char fsb_uint8_t;
typedef int fsb_int16_t;
typedef unsigned int fsb_uint16_t;
typedef int fsb_int32_t;
typedef unsigned int fsb_uint32_t;
typedef int fsb_int64_t;
typedef unsigned int fsb_uint64_t;
(code section) |
|
The C programming language lacks integer types of guaranteed sizes. In these declarations we make reasonable guesses. The primitive C types used must be large enough to hold the indicated number of bits. In theory, this reference implementation of FSB should work correctly even if the primitive C types used are "too large" -- contain extra bits. However, little work has been done to make sure that is the case so bugs are likely to be found. |
| C code |
|---|
|
<Declare the Character Type>
typedef fsb_uint32_t fsb_codept_t;
(code section) |
|
The FSB character set has 21 low-order bits defined by Unicode, plus 11 "bucky bits". |
| C code |
|---|
|
<Declare the Floating Point Types>
typedef float fsb_float_t;
typedef double fsb_double_t;
(code section) |
|
We assume that these types are IEEE 32 and 64 bit (single and double) precision floating point numbers. |
| C code |
|---|
|
<Declare Pointer Types>
union fsb_byte;
union fsb_half_word;
union fsb_word;
union fsb_pair;
union fsb_quad;
typedef union fsb_byte_ptr fsb_byte_ptr_t;
union fsb_byte_ptr
{
union fsb_byte * fsb_bytes;
};
typedef union fsb_half_word_ptr fsb_half_word_ptr_t;
union fsb_half_word_ptr
{
union fsb_byte * fsb_bytes;
union fsb_half_word * fsb_half_words;
};
typedef union fsb_word_ptr fsb_word_ptr_t;
union fsb_word_ptr
{
union fsb_byte * fsb_bytes;
union fsb_half_word * fsb_half_words;
union fsb_word * fsb_words;
};
typedef union fsb_pair_ptr fsb_pair_ptr_t;
union fsb_pair_ptr
{
union fsb_byte * fsb_bytes;
union fsb_half_word * fsb_half_words;
union fsb_word * fsb_words;
union fsb_pair * fsb_pairs;
};
typedef union fsb_quad_ptr fsb_quad_ptr_t;
union fsb_quad_ptr
{
union fsb_byte * fsb_bytes;
union fsb_half_word * fsb_half_words;
union fsb_word * fsb_words;
union fsb_pair * fsb_pairs;
union fsb_quad * fsb_quad;
};
(code section) |
|
All FSB pointers are pointers into virtual machine pages and always point at or into an FSB object. The FSB memory model implies that such pointers always point to a (virtual machine) quad, pair, word, half-word, or byte. |
| C code |
|---|
|
<Declare Bytes and Half Words> #define FSB_BYTE_FIELDS(MULT) \ fsb_int8_t fsb_int8[MULT]; \ fsb_uint8_t fsb_uint8[MULT] typedef union fsb_byte fsb_byte_t; union fsb_byte { FSB_BYTE_FIELDS (1); }; #define FSB_HALF_WORD_FIELDS(MULT) \ FSB_BYTE_FIELDS (2 * (MULT)); \ fsb_byte_t fsb_byte[2 * (MULT)]; \ fsb_int16_t fsb_int16[MULT]; \ fsb_uint16_t fsb_uint16[MULT] typedef union fsb_half_word fsb_half_word_t; union fsb_half_word { FSB_HALF_WORD_FIELDS (1); }; (code section) |
|
The FSB machine's memory model is defined in terms of virtual bytes, half-words, words, pairs, quads and pages. We can't, strictly speaking, rely on the C implementation using exactly those sizes in a convenient way. Therefore, it's helpful to wrap an abstraction barrier around those data types. |
| C code |
|---|
|
<Declare Words> #define FSB_WORD_FIELDS(MULT) \ FSB_HALF_WORD_FIELDS (2 * (MULT)); \ fsb_half_word_t fsb_half_word[2 * (MULT)]; \ fsb_int32_t fsb_int32[MULT]; \ fsb_uint32_t fsb_uint32[MULT]; \ fsb_codept_t fsb_codept[MULT]; \ fsb_float_t fsb_float[MULT]; \ fsb_byte_ptr_t fsb_byte_ptr[MULT]; \ fsb_word_ptr_t fsb_word_ptr[MULT]; \ fsb_pair_ptr_t fsb_pair_ptr[MULT]; \ fsb_quad_ptr_t fsb_quad_ptr[MULT] typedef union fsb_word fsb_word_t; union fsb_word { FSB_WORD_FIELDS (1); }; (code section) |
| C code |
|---|
|
<Declare Pairs> #define FSB_PAIR_FIELDS(MULT) \ FSB_WORD_FIELDS (2 * (MULT)); \ fsb_word_t fsb_word[2 * (MULT)]; \ fsb_int64_t fsb_int64[MULT]; \ fsb_uint64_t fsb_uint64[MULT]; \ fsb_double_t fsb_double[MULT] typedef union fsb_pair fsb_pair_t; union fsb_pair { FSB_PAIR_FIELDS (1); }; (code section) |
|
Pairs are the smallest memory type large enough to hold double precision floats and 64 bit integers. |
| C code |
|---|
|
<Declare Quads> #define FSB_QUAD_FIELDS(MULT) \ FSB_PAIR_FIELDS (2 * (MULT)); \ fsb_pair_t fsb_pair[2 * (MULT)] typedef union fsb_quad fsb_quad_t; union fsb_quad { FSB_QUAD_FIELDS (1); }; (code section) |
|
All heap objects are some multiple number of quads in size. The smallest heap objects are exactly one quad. The object header of all objects 32-bytes in size or larger is a quad. |
| C code |
|---|
|
<Declare Register Sets>
#define FSB_N_REGISTERS (3)
typedef struct fsb_registers fsb_registers_t;
struct fsb_registers
{
/* An invariant is maintained:
* Each register contains either NULL or
* a pointer to a page-sized register object.
*/
fsb_quad_ptr_t fsb_reg[FSB_N_REGISTERS];
unsigned int fsb_task_selector:4;
};
(code section) |
|
See Registers and Register Pages and Register Format and Cooperative Multi-tasking |
A single node FSB machine operates cyclicly, executing one instruction at a time. In each cycle, the task selector register is consulted to select one of the three registers, and within that register, one of the four program counter / block pairs. The block is an ordinary object; the program counter is an index into the tail of that object.
The tail location indexed by the program counter can contain any object. That indicated object is the next instruction to run. All objects can be executed as instructions.
The default behavior of objects, when they are executed as instructions, is to push a reference to themselves on stack[0] of the register indicated by the task selector register, and to increment the program counter indicated by the task selector.
A distinguished class of objects, called syms, has a different behavior. When a sym is executed, a corresponding built-in function (or primitive instruction; there is no distinction) is called. This function can perform any combination of computations and side effects permitted by the micro-code API.
Instruction execution is a series of atomic steps, transitioning the machine state from one consistent state to another. Interrupts may be delivered (in environment-specific circumstances) only between instructions. Interrupts are delivered, generally speaking, by saving the current task selector, program counter and block, and transferring control to an appropriate interrupt handler.
| Note: Cooperative Tasking and Multiple Stacks |
|---|
|
The FSB's multiple stacks, program counters, and block pointers combine nicely with the FSB's model of interrupt handling. For example, a system which wants to emulate a unix-like kernel with multiple "user processes" can cheaply use one FSB stack for interrupt services and/or kernel services, using the others for currently schedulable user processes. |
FSB instructions are constrained in what side effects they may perform in a single instruction cycle. In particular, they can directly modify registers and register objects without restriction. In some circumstances, they can modify objects directly referenced by register objects -- these are called first level objects. Under no circumstances can instructions modify other objects:
| Illustration: Restrictions on Instruction Mutations |
|---|
![]() |
|
If an object has a reference count of 1, it is called a linear object. First level linear objects may be freely modified by instructions. Only certain locations within non-linear objects can be modified. The machine type of an object distinguishes mutable locations within an object from immutable locations. Instructions are permitted only to modify explicitly mutable locations within first level, non-linear objects. |
The restrictions on side effects are key to the FSB approach to true concurrency: the question of what happens when an FSB machine is implemented atop a multi-processor shared memory system. The FSB design is based on the premise that communication between processors that share memory is relatively expensive: the more computation that takes place in memory private to a processor, the better.
| Illustration: True Concurrency |
|---|
![]() (top) |
|
The FSB architecture encourages as much work as practical to be done on register stacks, private to each processor. The architecture encourages mutations to be primarily limited to linear objects for which no locking is required. Truly shared first level objects being modified require page-level locking. The rest of the shared heap is immutable. One interesting application of this property is to customize paging algorithms: the FSB will access no page that is not directly addressed by a register object. Paging algorithms, rather than responding to "random access" to pages, can observe when FSB programs move a pointer to a heap object into a register: a logical point in time at which to "page in" an object not resident in memory. |
The FSB machine uses a system of dynamic type tags. The type of the value stored in each register and heap location can be computed at run-time, given only a pointer to that location. This section presents the type tagging system, and the details of object formats.
FSB type tagging is neither especially simple, or especially intuitive. It is carefully balanced to reduce the cost of a number of common operations. While the typing system is relatively surprising in its details, it is also designed to be easy to use effectively either "by hand" or via a compiler.
Diving in to the deep end, then, here is the low-level anatomy of an FSB type code
Every referencable heap object includes at least a small object header which includes, in its first word, a 32-bit type code.
FSB type codes are regarded as a two 4-element lists of 4-bit values. The first list of 4-bit values is called the head type of the object, the second list of 4-bit values is called the tail type of the object. Roughly speaking, the head type of an object records the value types stored in the head of the object, the tail type the tail of the object however there are exceptions to that too-simple rule.
Each 4-bit value in a type code is called a type quark.
| type code format | |||||||
|---|---|---|---|---|---|---|---|
| head type (four quarks) |
tail type (four quarks) |
||||||
| quark 0 (four bits) |
quark 1 (four bits) |
quark 2 (four bits) |
quark 3 (four bits) |
quark 0 (four bits) |
quark 1 (four bits) |
quark 2 (four bits) |
quark 3 (four bits) |
| C code |
|---|
|
<Type Code Definitions> <Declare Type Quarks> <Declare Type Pairs> <Declare Type Quads> <Declare Type Words> |
| C code |
|---|
|
<Declare Type Quarks>
enum fsb_ty_quark
{
fsb_ty_n = 0, /* pronounced "nil" -- must be 0 */
fsb_ty_b = 1, /* pronounced "signed byte" */
fsb_ty_B = 2, /* pronounced "unsigned byte" */
fsb_ty_s = 3, /* pronounced "signed short" */
fsb_ty_S = 4, /* pronounced "unsigned short" */
fsb_ty_i = 5, /* pronounced "signed int" */
fsb_ty_I = 6, /* pronounced "unsigned int" */
fsb_ty_l = 7, /* pronounced "signed long" */
fsb_ty_L = 8, /* pronounced "unsigned long" */
fsb_ty_c = 9, /* pronounced "character" */
fsb_ty_f = 10, /* pronounced "float" */
fsb_ty_d = 11, /* pronounced "double" */
fsb_ty_o = 12, /* pronounced "object" */
fsb_ty_x = 13, /* pronounced "cell" */
fsb_ty_m = 14, /* pronounced "limb" */
fsb_ty_y = 15, /* pronounced "sym" */
/* No more! There must be exactly 16 type quarks */
};
typedef struct fsb_type_quark fsb_type_quark_t;
struct fsb_type_quark
{
unsigned int fsb_type_quark:4;
};
#define fsb_ty_quark(N) ((fsb_type_quark_t){ (N) })
#define fsb_n fsb_ty_quark (fsb_ty_n)
#define fsb_b fsb_ty_quark (fsb_ty_b)
#define fsb_B fsb_ty_quark (fsb_ty_B)
#define fsb_s fsb_ty_quark (fsb_ty_s)
#define fsb_S fsb_ty_quark (fsb_ty_S)
#define fsb_i fsb_ty_quark (fsb_ty_i)
#define fsb_I fsb_ty_quark (fsb_ty_I)
#define fsb_l fsb_ty_quark (fsb_ty_l)
#define fsb_L fsb_ty_quark (fsb_ty_L)
#define fsb_c fsb_ty_quark (fsb_ty_c)
#define fsb_f fsb_ty_quark (fsb_ty_f)
#define fsb_d fsb_ty_quark (fsb_ty_d)
#define fsb_o fsb_ty_quark (fsb_ty_o)
#define fsb_x fsb_ty_quark (fsb_ty_x)
#define fsb_m fsb_ty_quark (fsb_ty_m)
#define fsb_y fsb_ty_quark (fsb_ty_y)
|
|
Type quarks are often but not always used to
indicate a value of the type they are named after.
For example, |
| C code |
|---|
|
<Declare Type Pairs>
typedef struct fsb_type_pair fsb_type_pair_t;
struct fsb_type_pair
{
unsigned int fsb_type_pair:8;
};
#define fsb_ty_pair_code(L,R) (((L) << 4) | (R))
#define fsb_ty_pair(L,R) ((struct fsb_type_pair){fsb_ty_pair_code ((L).fsb_type_quark, (R).fsb_type_quark)})
#define fsb_ty_pair_left(PT) fsb_ty_quark (((PT).fsb_type_pair >> 4) & 0xf)
#define fsb_ty_pair_right(PT) fsb_ty_quark ((PT).fsb_type_pair & 0xf)
#undef FSB_X
#undef FSB_Y
#undef FSB_Z
#define FSB_X(A,B) \
static const fsb_type_pair_t fsb_ ## A ## B \
= ((struct fsb_type_pair){(fsb_ty_ ## A << 4) | fsb_ty_ ## B})
#define FSB_Y(X) \
FSB_X(X,n); FSB_X(X,b); FSB_X(X,B); FSB_X(X,s); \
FSB_X(X,S); FSB_X(X,i); FSB_X(X,I); FSB_X(X,l); \
FSB_X(X,L); FSB_X(X,c); FSB_X(X,f); FSB_X(X,d); \
FSB_X(X,o); FSB_X(X,x); FSB_X(X,m); FSB_X(X,y)
#define FSB_Z \
FSB_Y(n); FSB_Y(b); FSB_Y(B); FSB_Y(s); \
FSB_Y(S); FSB_Y(i); FSB_Y(I); FSB_Y(l); \
FSB_Y(L); FSB_Y(c); FSB_Y(f); FSB_Y(d); \
FSB_Y(o); FSB_Y(x); FSB_Y(m); FSB_Y(y)
FSB_Z;
|
|
Having a short-hand notation for type pairs is handy. |
| C code |
|---|
|
<Declare Type Quads>
typedef struct fsb_type_quad fsb_type_quad_t;
struct fsb_type_quad
{
unsigned int fsb_type_quad:16;
};
#define fsb_ty_quad(L,R) ((struct fsb_type_quad){((L).fsb_type_pair << 8) | (R).fsb_type_pair})
#define FSB_TY_QUAD(W,X,Y,Z) fsb_ty_quad (fsb_ ## W ## X, fsb_ ## Y ## Z)
#define fsb_ty_quad_left(QT) ((struct fsb_type_pair){((QT).fsb_type_quad >> 8) & 0xff})
#define fsb_ty_quad_right(QT) ((struct fsb_type_pair){((QT).fsb_type_quad >> 8) & 0xff})
#define fsb_ty_quad_left_l(QT) fsb_ty_pair_left (fsb_ty_quad_left (QT))
#define fsb_ty_quad_left_r(QT) fsb_ty_pair_right (fsb_ty_quad_left (QT))
#define fsb_ty_quad_right_l(QT) fsb_ty_pair_left (fsb_ty_quad_right (QT))
#define fsb_ty_quad_right_r(QT) fsb_ty_pair_right (fsb_ty_quad_right (QT))
|
|
Naturally, a type quad is two type pairs or, equivalently, four type quarks. |
| C code |
|---|
|
<Declare Type Words>
typedef struct fsb_type fsb_type_t;
struct fsb_type
{
unsigned int fsb_type:32;
};
#define fsb_type(HD,TL) ((struct fsb_type){((HD) << 16) | (TL)})
#define fsb_type_head(QT) fsb_ty_quad (((QT) >> 16) & 0xffff)
#define fsb_type_tail(QT) fsb_ty_quad ((QT) & 0xffff)
#define fsb_type_head_l(QT) fsb_ty_quad_left (fsb_type_head (QT))
#define fsb_type_head_ll(QT) fsb_ty_quad_left_l (fsb_type_head (QT))
#define fsb_type_head_lr(QT) fsb_ty_quad_left_r (fsb_type_head (QT))
#define fsb_type_head_r(QT) fsb_ty_quad_right (fsb_type_head (QT))
#define fsb_type_head_rl(QT) fsb_ty_quad_right_l (fsb_type_head (QT))
#define fsb_type_head_rr(QT) fsb_ty_quad_right_r (fsb_type_head (QT))
#define fsb_type_tail_l(QT) fsb_ty_quad_left (fsb_type_tail (QT))
#define fsb_type_tail_ll(QT) fsb_ty_quad_left_l (fsb_type_tail (QT))
#define fsb_type_tail_lr(QT) fsb_ty_quad_left_r (fsb_type_tail (QT))
#define fsb_type_tail_r(QT) fsb_ty_quad_right (fsb_type_tail (QT))
#define fsb_type_tail_rl(QT) fsb_ty_quad_right_l (fsb_type_tail (QT))
#define fsb_type_tail_rr(QT) fsb_ty_quad_right_r (fsb_type_tail (QT))
/* some of those have synonyms: */
#define fsb_type_head_lead(QT) fsb_type_head_l(QT)
#define fsb_type_head_trail(QT) fsb_type_head_r(QT)
#define fsb_type_tail_lead(QT) fsb_type_tail_l(QT)
#define fsb_type_tail_trail(QT) fsb_type_tail_r(QT)
#define fsb_type_head_0(QT) fsb_type_head_ll(QT)
#define fsb_type_head_1(QT) fsb_type_head_lr(QT)
#define fsb_type_head_2(QT) fsb_type_head_rl(QT)
#define fsb_type_head_3(QT) fsb_type_head_rr(QT)
#define fsb_type_tail_0(QT) fsb_type_tail_ll(QT)
#define fsb_type_tail_1(QT) fsb_type_tail_lr(QT)
#define fsb_type_tail_2(QT) fsb_type_tail_rl(QT)
#define fsb_type_tail_3(QT) fsb_type_tail_rr(QT)
|
|
Type words are the complete type codes stored in the first word of referencable objects |
Ordinarilly, a type quad describes the machines types of the values stored in a quad-sized region of memory.
If a quad is being used to store four values, each having a word-sized machine type, then the quarks of the type quad are in one-to-one correspondence with the words of the memory quad:
| quark N | value stored in word N | illustration |
|---|---|---|
| b | fsb_int8[4] | ![]() |
| B | fsb_uint8[4] | |
| s | fsb_int16[2] | |
| S | fsb_uint16[2] | |
| i | fsb_int32[1] | |
| I | fsb_uint32[1] | |
| c | fsb_codept[1] | |
| f | fsb_float[1] | |
| o | fsb_byte_ptr[1]* | |
| x | fsb_byte_ptr[1]* | |
| m | fsb_quad_ptr[1]** | |
|
* always NULL or points at or into a referencable object. ** always NULL or points at a limb. |
||
Any single-word type quark in position 0 or 2 can be extended to two words by storing a nil quark in positions 1 or 3, respectively.
64-bit integers and doubles always require two words. "object" and "cell" quarks change meaning, subtlely, when extended to two words:
| quarks N/N+1 | value stored in words N/N+1 | illustration |
|---|---|---|
| bn | fsb_int8[8] | ![]() |
| Bn | fsb_uint8[8] | |
| sn | fsb_int16[4] | |
| Sn | fsb_uint16[4] | |
| in | fsb_int32[2] | |
| In | fsb_uint32[2] | |
| ln | fsb_int64[1] | |
| Ln | fsb_uint64[1] | |
| cn | fsb_codept[2] | |
| fn | fsb_float[2] | |
| dn | fsb_double[1] | |
| on | fsb_uint32[1]‡ +fsb_word[1]* |
|
| xn | fsb_uint32[1]‡ +fsb_word[1]* |
|
| mn | fsb_uint32[1]‡‡ +fsb_quad_ptr[1]** |
|
| yn*** | fsb_uint32[1] +fsb_word[1] |
|
|
* always NULL or points at or into a referencable object. ** always NULL or points at a limb. *** this is a "sym reference", described elsewhere in this document ‡ contains type information about the following word ‡‡ contains type information about the limb pointed at by the following word |
||
Ordinary and extended width types can be mixed, as in these examples:
| quarks 0..3 | value stored in words 0..3 | illustration |
|---|---|---|
| Ifdn | fsb_uint32[1]+fsb_float[1] fsb_double[1] |
![]() |
| dnIf | fsb_double[1] fsb_uint32[1]+fsb_float[1] |
Just as type quarks can be extended to two words, they can be extended to four words:
| quarks 0..3 | value stored in words 0..3 | illustration |
|---|---|---|
| bnnn | fsb_int8[16] | ![]() |
| Bnnn | fsb_uint8[16] | |
| snnn | fsb_int16[8] | |
| Snnn | fsb_uint16[8] | |
| innn | fsb_int32[4] | |
| Innn | fsb_uint32[4] | |
| lnnn | fsb_int64[2] | |
| Lnnn | fsb_uint64[2] | |
| cnnn | fsb_codept[4] | |
| fnnn | fsb_float[4] | |
| dnnn | fsb_double[2] | |
| onnn | fsb_uint32[1]+fsb_word[1] fsb_uint32[1]+fsb_word[1] |
|
| xnnn | fsb_uint32[1]+fsb_word[1] fsb_uint32[1]+fsb_word[1] |
|
| mnnn | fsb_uint32[1]+fsb_quad_ptr[1] fsb_uint32[1]+fsb_quad_ptr[1] |
|
| ynnn | fsb_uint32[1]+fsb_word[1] fsb_uint32[1]+fsb_word[1] |
|
|
See an earlier table for more details. |
||
| Tagged Object Head Types |
|---|
|
Objects 32-bytes in size and larger are tagged objects which include a type code in the first word, and application data in the final four words. If the head type of the of the object's type code is an ordinary, extended, mixed, or quad width type quad, then it directly describes the type of the head of the object as illustrated in this example. |
|
| Huge Object tail Types |
|---|
|
Objects larger than 32-bytes in size are huge objects which include a type code in the first word, and application data in both the head and tail of the object. If both the the head and tail types of the of the object's type code are ordinary, extended, mixed, or quad width type quads, then the tail quad directly describes the type of the tail of the object. Tail types of this sort describe the layout of each quad of the tail. The tail is a repeating sequence of this pattern: |
|
Many type quads fall into one of the categories ordinary, extended width, mixed width, or quad width and are therefore used to describe the physical layout of an object head or tail.
There are, however, many type quads which do not describe a valid physical layout, as in these examples:
| exceptional type code | reason for being exceptional |
|---|---|
| dddd | four double-precision floats can't fit in four words of memory |
| nIII | nil does not take up any space -- only three words are accounted for |
Some (not all) exceptional type codes are given special meanings. These are described in detail in an appendix although some uses for them are described in the next section.
Immediate objects are immutable objects which can be stored in 64 bits (8 bytes) or fewer of memory.
Immediate objects can be stored in aligned, 2-word fields within
an object. Immediate objects can only be stored in locations
tagged with one of the type pairs:
fsb_on or fsb_xn. (See the
explanation of extended
width types for an introduction to type pairs.)
Immediate objects are also called smobs, a contraction of Small iMmutable OBjectS.
Most numeric types are smobs. Syms are smobs. Perhaps confusingly: references to heap objects are also smobs. Note that heap objects themselves are not smobs but "references" (type-tagged pointers) to heap objects are smobs.
Smobs use a system of staggered tagging. At least the first byte of a smob is used to discern its type. In some cases, additional bits of the smob must also be consulted.
Smobs are discussed in detail in an appendix. Here some examples are illustrated.
(top)
32-bit integer smobs
|
32-bit unsigned integer smobs
|
||||||||||||||||||||||||||||||||||||
single precision floating point smobs
|
pair of signed 16-bit integers
|
||||||||||||||||||||||||||||||||||||
object reference
|
|||||||||||||||||||||||||||||||||||||
reference to an object known to be a linear object**
|
|||||||||||||||||||||||||||||||||||||
sym***
|
|||||||||||||||||||||||||||||||||||||
|
* These two-letter codes refer to type pairs. ** When an object is known to be linear it is handy to record this fact in the reference to the object. That way, the linearity of the object can be recognized without having to examine the object's reference count. *** This is not the only ways syms are represented. See the exceptional objects appendix. |
|||||||||||||||||||||||||||||||||||||
If a pair within an object is tagged with the type tag fsb_mn then the pair is a limb handle:
(top)
32-bit integer smobs
* See the Limb Formats appendix for more detail. |
||||||||||||||||||||||||||||||||||
Linear objects may be freely modified if they are addressed directly via a register object.
Only certain fields of non-linear objects can modified (and
again, only when those objects are directly addresed via
register objects). In particular, word pairs of type fsb_mn can be modified to
contain any SMOB. Individual
words of the un-extended
type fsm_m can be modified to
contain NULL or an fsb_byte_ptr_t
which points to or within some referencable object.
Each of the three FSB registers points to a register page. This chapter describes the internal structure of a register page.
| Register Format | |
|---|---|
|
The head of a register page contains four stack pointers and program counters. The tail of the page is divided into several regions. Each region is divided into pairs each of which contains an array of SMOBs. Each register contains four stacks (details discussed below) and four "overflow area links" for those stacks. Each register contains a variety of system registers (described in detail below) and an index: a set of 2048, randomly accessible, general purpose registers. |
|
| Stack Details | |
|---|---|
|
Each register contains four stack pointers, four stacks, and four "displays". For each stack there is an overflow variable, pointing to heap allocated objects that are used to save values pushed deeply within the stack. Virtual machine instructions can directly address only the top 256 stack entries and only while those entries are all present in the stack object (rather than stored in the heap, in an overflow object). In other words, each stack pointer addresses a floating 256 variable "stack window" at the top of the stack. Each stack area in a register object has room for 768 stack entries. When the corresponding stack pointer is between 256 and 511 (inclusive), the stack is is said to be normalized meaning that a full 256 stack variables are currently addressable, and it is safe to push at least 256 new values on the stack, both with no danger of needing to write to or read from stack overflow objects. When a stack is not normalized, the FSB machine will sometimes automatically move values within the stack area and move value to or from stack overflow objects in order to re-normalize the stack. Each display is a 256-variable area of per-stack, application-controlled variables that do not move. |
|
| PC and Code Pointer Details | |
|---|---|
|
Each register can be used to execute up to four tasks, each with its own program counter and code pointer. Code pointers point to arbitrary, heap-allocated objects and, roughly speaking, represent a "subroutine". Each program counter is a 16-bit unsigned integer index to a tail field of the corresponding code object. At all times, the next instruction scheduled for a task is the object stored in the task's code object tail field, indexed by the task's pc. Primitive operations can effect control transfers by modifying either or both their task's code pointer and program counter. |
|
| Free Lists | |
|---|---|
|
Each register contains a list of 13 free lists: one for each object size. Small object size free-lists are heaps of pages that contain non-empty free-lists of objects of that size. Large object size free-lists are heaps of unallocated large objects. |
|
All heap-resident objects either are or are part of page-sized objects. The maximum size of a heap object is one (virtual) page (64Kbytes).
An FSB instance contains at least one page (one register object). It contains at most 216 pages.
Pages are "first accessed" by an FSB node in some definite order and, at that time, each new page acquires a reference count.
When first accessed, pages are assigned sequential numbers, starting from 1. These numbers are page ids.
If a page's reference count drops to 0, it's page id may be re-used for any later page. If page ids are re-used, they are re-used from lowest-numbered to highest-numbered. Thus, overall, implementations of the FSB machine make a reasonable effort to keep page ids densely packed, near 0.
Given a pointer to a page, or an object within a page, or an internal field -- the corresponding page id can be quickly and inexpensively computed.
Page ids 0..3 are special: they are never assigned to any actual page. (Strictly speaking, therefore, the FSB machine has a limit of 216 - 4 pages.)
| Page Ids on 32-bit Systems |
|---|
| While not entirely portable, a natural implementation of page ids on 32-bit systems is to use the upper 16 bits of a pointer to a page: |
![]() |
| C code |
|
<Low-level Page Code> <Page ID Computations> <Raw Allocation of Memory for Pages> <Raw Page Re-use> |
| C code |
|---|
|
Porting Alert The reference implementation assumes that pointers are 32 bits (or, more specifically, that all pointers to FSB pages are unique in bits 16..31). There are many "truly portable" ways to implement page IDs in C but this implementation will work well on almost all popular systems and has the virtue that it can be coded in but a few lines of code. |
|
<Page ID Computations>
/*
* Note that fsbx__page_to_id works for a
* pointer to *any* location within a page,
* not just pointers to the start of pages.
*
* For example, fsbx__page_to_id can tell you
* the page id of an object, given a pointer to
* that object.
*/
static inline fsb_uint16_t
fsbx__page_to_id (fsb_quad_ptr_t page)
{
return (fsb_uint16_t)(0xffff & ((unsigned long)page.fsb_quad >> FSB_PAGE_LOG2));
}
static inline fsb_quad_ptr_t
fsbx__id_to_page (fsb_uint16_t id)
{
return (fsb_quad_ptr_t){ .fsb_quad = (fsb_quad_t *)((unsigned long)id << FSB_PAGE_LOG2) };
}
|
The page id code works only if page pointers are guaranteed to be aligned on 2FSB_PAGE_LOG2 boundaries. We'd best make sure that this is true when we allocate pages:
| C code |
|---|
|
Porting Alert
The reference implementation assumes an environment that
provides an environment that provides the
Here is how the lowest levels allocate a new page from the system: |
|
<Raw Allocation of Memory for Pages> extern fsb_quad_ptr_t fsbx__alloc_raw_pagemem (void); #ifdef FSB_LIB fsb_quad_ptr_t fsbx__alloc_raw_pagemem (void) { while (1) { void * chunk = 0; fsb_quad_ptr_t answer; (void)posix_memalign (&chunk, FSB_SIZEOF_PAGE, FSB_SIZEOF_PAGE); if (!chunk) fsbx__abort ("page allocation failed"); if ((unsigned long)chunk & ((1UL << FSB_PAGE_LOG2) - 1)) fsbx__abort ("improperly aligned page allocated"); memset (chunk, 0, FSB_SIZEOF_PAGE); answer = (fsb_quad_ptr_t) { .fsb_quad = (fsb_quad_t *)chunk }; if (fsbx__page_to_id (answer) > 3) return answer; } } #endif |
| C code |
|---|
|
Porting Alert A more complicated allocator might "release pages back to the underlying system" in different ways. |
|
<Raw Page Re-use>
extern void fsbx__free_raw_pagemem (fsb_quad_ptr_t);
#ifdef FSB_LIB
void
fsbx__free_raw_pagemem (fsb_quad_ptr_t page)
{
free ((void *)page.fsb_quad);
}
#endif
|
In each instruction cycle an FSB node examines the task selector to pick one of the 12 program counters and code pointers. It executes the primitive operation represented by the object stored in the tail of the code object, at the offset indicated by the program counter. This chapter presents the details of that process.
The complete state of an FSB machine is, in essence, a directed graph of objects within page, rooted in the three registers. The FSB machine maintains a number of invariants that constrain the structure of this graph (e.g, reference counting constraints).
An FSB's state graph is, by definition, finite. For example, the system can contain at most 216 pages of memory, 64Kbyte each.
Therefore, we can usefully think of "all valid FSB machine states" as a set of graphs:
FSB_states := the set of all valid FSB state graphs
From a programmer's perspective, an FSB state graph is represented by a set of registers: three registers pointing to register objects plus a task selector. Mathematically, we can equivocate by defining them to be the same thing:
FSB_regiseter_set_states := FSB_states
Execution of FSB programs is, semantically, a sequential execution of atomic primitive operations. Mathematically, a primitive is simply a function, albeit, a function that (realistically) can fail if we try to compute it:
FSB_primitives := FSB_states → { FSB_states, ⊥ }
Read: The set of primitives is identical with the set of functions mapping FSB states to FSB states, or mapping an FSB state to a "failure state" ("⊥" aka "bottom").
The execution of a primitive has a simple mathematical model:
Executing a primitive:
statet1 := prim (statet0)
Read: the machine state at time t1 is the state returned by the primitive function when given the state at time t0 as an argument.
The sequential execution of two primitives is the same as executing a single primitive which is the composition of the two primitive functions:
Sequential execution is function composition:
statet1 := prima (statet0)
statet2 := primb (statet1)
⇒ statet2 = primb (prima (statet0))
⇒ statet2 = primb ∘ prima (statet0)
That function composition models sequential execution of primitives is a useful property:
Primitive operations are mathematically functions. It is useful to model them in C as C functions. This section describes using C functions as primitive operations, but does not present the C API actually used. Rather, a slightly simplified API is shown. A subsequent section fills in the details and gives the actual C API.
Consider a hypothetical primitive operation such as push_nil. It's job is very simple: It must increment the stack pointer (the new top of stack is guaranteed to be initialized to nil already). It must also increment the program counter. In pseudocode, it is easy to represent this as a C function that takes a set of registers as its parameter, and returns a new set of registers as its result:
fsb_registers_t
prim_push_nil (fsb_registers_t regs)
{
SET_SP (regs, 1 + GET_SP (regs));
SET_PC (regs, 1 + GET_PC (regs));
return regs;
}
A code block is, in effect, an array of primitive operations. For example, a code block defining a subroutine to push two nils onto the stack might look like:
[ push_nil push_nil ]
Because of the compositional property of primitives, such a block is trivial to compile into C:
[ push_nil push_nil ]
⇒
fsb_registers_t
compiled_block (fsb_registers_t regs)
{
return prim_push_nil (prim_push_nil (regs));
}
One unproven hope of such compilation is that the C compiler can perform useful optimizations, particularly if inlining is used. For example, can primitive operations be define inline:
static inline fsb_registers_t
prim_push_nil (fsb_registers_t regs)
{
SET_SP (regs, 1 + GET_SP (regs));
SET_PC (regs, 1 + GET_PC (regs));
return regs;
}
such that compiled code like:
return prim_push_nil (prim_push_nil (regs));
is, in effect, expanded and optimized by the C compiler to
as follows?:
SET_SP (regs, 2 + GET_SP (regs));
SET_PC (regs, 2 + GET_PC (regs));
return regs;
Nota bene: While the reference implementation within makes a half-hearted attempt to so use GCC's inline function facility and optimizations, nevertheless, no attempt has yet been made to examine the code actually being generated. The expectation is that the generated code is not yet especially good and that more work is required to fully put this trick into practice.
Note of interest: The FSB's restrictions on memory mutation, in theory, are constraints on the aliasing of memory locations that are subject to reads and writes by primitive operations. The trick will be to make those constraints usefully obvious to the C compiler.
Ordinary primitive operations are implemented as compiled functions, callable from C. Primitives may be built-in to an FSB instance or dynamically loaded. The FSB machine facilitates the "unloading" of primitive functions which can no longer be referenced from any object.
Each FSB instance is limited as to the number of primitives that may be present at any one time: there may be at most 224 - 1. However, over the lifetime of an FSB instance, a larger number of primitive functions may be referenced (any number) with only the constraint that no more than 224 - 1 can be concurrently present. (This limitation is "reasonable" in relation to the FSB machine's overall address size limit and it is a helpful restriction when designing the representation of SMOBs.)
All primitive operations are assigned globally unique names. To facilitate the distributed allocation of these names, the FSB requires that all primitives be named within an XML namespace, designated by an arbitrary URI, reserving the URI method "fsb:" for special interpretations.
In addition to an XML namespace, each primitive also has a unique XML "QName" within that namespace. The combination of a primitive's XML namespace and qname should be globally unique.
Finally, each primitive also has an arbitrary "pretty print" string which is some white-space-free string of graphical, ascii characters, unique among primitives within the primitive's XML namespace.
For example, the full name of the "push-nil" primitive described above might be:
| primitive name example | |
|---|---|
| namespace uri | fsb:experimental-std:/0 |
| QName | push-nil |
| pretty-print name | '() |
The model of primitives developed above is simple, but in some ways too simple. Our first approximation of a C API permits only primitives that take no "arguments" other than the state of a machine.
Consider the programming challenge: "Push the 32-bit integer 5294 onto a stack." What sequence of primitives might accomplish this? One could imagine a specialized primitive:
fsb_registers_t
prim_push_5294 (fsb_registers_t regs)
{
SET_SP (regs, 1 + GET_SP (regs));
SET_STACK_ENTRY_INT32 (regs, 0, 5294);
SET_PC (regs, 1 + GET_PC (regs));
return regs
}
however it would be awkward to require such a primitive for every
integer, every floating point number, every character, and so
forth.
When a "call" to a primitive operation is encoded in a code block it is stored as a 64 bit SMOB. At most 224 - 1 primitives can be loaded, so the choice of which primitive is to be invoked can be encoded in as few as 24 bits of that 64 bit object. The type tag system is arranged such that, of the remaining 40 bits, the encoded call to primitive may include up to 32 additional bits of data, that data to be passed to the primitive as its operands
For example, in the presence of operands, we need only a single function to push a 32-bit integer (any 32-bit integer) on the stack:
fsb_registers_t
prim_push_i32 (fsb_int32_t n, fsb_registers_t regs)
{
SET_SP (regs, 1 + GET_SP (regs));
SET_STACK_ENTRY_INT32 (regs, 0, n);
SET_PC (regs, 1 + GET_PC (regs));
return regs
}
In a sense, our mathematical model is too simple. Rather than functions, primitives that accept an operand argument could be considered a kind of functor. On the other hand, one can regard the combination of a primitive with a particular operand value as, itself, our simpler kind of primitive-as-function. For example, we might still hope to see:
return prim_push_i32 (100, prim_push_i32 (200, regs))
to be compiled as:
SET_SP (regs, 2 + GET_SP (regs));
SET_STACK_ENTRY_INT32 (regs, 1, 200);
SET_STACK_ENTRY_INT32 (regs, 0, 100);
SET_PC (regs, 2 + GET_PC (regs));
return regs
rather than as:
SET_SP (regs, 1 + GET_SP (regs));
SET_STACK_ENTRY_INT32 (regs, 0, 200);
SET_PC (regs, 1 + GET_PC (regs));
SET_SP (regs, 1 + GET_SP (regs));
SET_STACK_ENTRY_INT32 (regs, 0, 100);
SET_PC (regs, 1 + GET_PC (regs));
return regs
Operands to primitives must be one of but a few types:
| Operand Types | |
|---|---|
fsb_int8_tfsb_uint8_tfsb_int16_tfsb_uint16_tfsb_int32_tfsb_uint32_tfsb_float_t |
e.g., All machine numeric type that fit in 32 bits can be used as operand types. |
fsb_quad_ptr_t
|
The pointer points at or into some referencable object in the state graph of the register passed as an argument. The primitive is not automatically granted a reference count to the object: if the primitive mutates the heap but needs the object pointer to remain valid, it must first add (and later remove, before returning) a reference count to the object. The primitive may freely mutate this object itself if the object has a reference count of 1 (i.e., it is a linear object). Regardless of the reference count, the primitive may mutate (with page locking) any cell of the object. |
fsb_pair_ptr_t
|
The pointer points into the code object from which the primitive was invoked. In particular, it points to a SMOB. This facility allows primitives to manage their operands as (up to) 32-bits of opaque data such as a pointer to a data structure from a "foreign" C library. |
fsb_uint28_t
|
As an optimization, a special 28-bit operand type is defined. This is because the tagging system includes a representation of primitives in which the primitive is represented in 32 bits rather than 24, but the operand in only 28 bits. That representation has an advantage on many underlying real machines: a 32-bit representation of a primitive is (usually) enough storage to contain a pointer directly to a C function that implements the pointer. Thus, a primitive represented with exactly 28 bits of operand can (often) be invoked more quickly than a primitive represented with as many as 32-bits of operand. 28 is a particularly useful number of bits for an operand to have, as the next section about an addressing mode demonstrates. |
A convenient class of primitives are the three address operands, each of which expects a 28 bit operand.
The 28 bits are interpreted as a choice of stack and, within that stack, an index identifying two source operands and one destination:
| Three Address Ops Code |
|---|
![]() |
There are other useful ways to break down 28-bit parameters into adresses into register objects. These are illustrated in association with the definitions of corresponding primitives.
Each 64K page is subdivided into, at most, 3,840 objects. Thus, a 12-bit integer suffices to identify any object within a sub-divided page. That integer is called the object page index.
The combination of a 16-bit page id with a 12-bit object page index is sufficient to uniquely identify any heap object. This gives rise to a useful property:
Although the FSB machine has a 32-bit address space, pointers to heap objects can be compressed to 28-bits, by converting them to a page id and object page index.
|
Porting Alert The reference implementation, as usual in the lowest-level code, assumes a 32-bit architecture. Additionally, it assumes that objects, being quad-aligned, are 4-bit aligned. |
![]() |
| C code |
|---|
|
<Object Page Indexes>
static inline fsb_uint32_t
fsbx__object_id (fsb_quad_ptr_t obj)
{
return (fsb_uint16_t)(0xfffffff & ((unsigned long)obj.fsb_quad >> FSB_QUAD_LOG2));
}
static inline fsb_uint16_t
fsbx__object_index (fsb_quad_ptr_t obj)
{
return (fsb_uint16_t)(0xfff & fsbx__object_id (obj));
}
static inline fsb_uint16_t
fsbx__object_page_id (fsb_quad_ptr_t obj)
{
return (fsb_uint16_t)(0xffff & (fsbx__object_id (obj) >> 12));
}
static inline fsb_quad_ptr_t
fsbx__page_and_index_to_obj (fsb_uint16_t page_id, fsb_uint16_t obj_index)
{
fsb_quad_ptr_t pg = fsbx__id_to_page (page_id);
return (fsb_quad_ptr_t){ .fsb_quad = pg.fsb_quad + obj_index };
}
static inline fsb_quad_ptr_t
fsbx__object_page (fsb_quad_ptr_t obj)
{
return fsbx__page_and_index_to_obj (fsbx__object_page_id (obj), 0);
}
|
Referencable heap objects 32-bytes and larger include a 28-bit user tag in their third word of memory.
| User Tags and Object Properties |
|---|
|
A user tag is either NULL (0), or a
28-bit object id (in C, an If a user tag is not NULL, the object containing the tag holds a reference count on the object referred to. This referred to object is called the referring object's properties: |
|
The FSB has built-in support for sparse arrays indexed by sixteen-bit integers. These are particularly handy in implementing the FSB's relational features, which are explained in a later chapter. Here, we simply explain how these sparse arrays work.
The particular variety of sparse array that we use is a two-level tree. Keys can be located with at most 8 comparisons. While 8 may seem a rather high number, the tree is also dynamically re-organized so that more popular keys will tend to require fewer comparisons than less popular keys. Finally, the data structure rewards both locality of reference in key space and numerically-dense-near-an-origin keys. The internal uses of sparse arrays in the FSB take advantage of these performance attributes.
Here is a series of illustrations which helps to explain the 16-bit sparse array data structure:
| Sixteen-bit Sparse Arrays | |
|---|---|
|
A sparse array object's tail is a series of quads, each quad a key-value pair. This object is the "cache" for the sparse array. All keys having some number of low-order bits in common map to the same set of key value cache entries in a hash table object. A key can be found or stored in any position from that set. |
![]() |
|
Each key-value pair contains a 16-bit key and 64-bit value, as well as data about a (possibly present) limb which is used as a kind of table-entry overflow bucket. |
![]() |
|
"Short limbs" contain (possibly sparse) key-value pairs. "Long limbs" contains 256 values for keys whose low-order 8-bits are all the same. |
![]() |
|
When a key is inserted, it is put into a cache entry (in the top-level object) if one is available. Failing that, if there is a limb that can hold the key and room in the limb, they key is inserted there. If there is a full limb, smaller in size than the region that points to it, the size of that limb is doubled and the key inserted. If there is a full limb the same size as its region, and the top-level object already contains the next larger size region, the limb is split in two and the attempt to insert the key is restarted. If there is no room to split a limb, then the size of the top level object is doubled, and the key is inserted into one of the newly created cache entries. |
![]() |
|
A Note About Optimization The above description is deliberately vague about the order in which cache entries are filled and the exact manner of searching short limbs. It also omits any mention of rearranging values, such as by swapping a key found on a limb with a key found in a cache entry. Multiple policies are possible in each of those areas, leading to a wide variety of performance characteristsics over a given set of inputs. |
|
URHERE
| C code |
|---|
|
A Test Program The first test program does nothing but allocate and initialize registers. |
|
<Register Allocation Test Program>
#ifdef FSB_REG_ALLOC_TEST
int
main (int argc, char * argv[])
{
fsb_registers_t regs = fsbx__new_regs ();
return 0;
}
#endif
|
fsbx__new_regs
| C code |
|---|
|
Three new pages from the underlying system will do. Zero is as reasonable an initial value for the task selector as any. Some individual fields on the register pages have to be allocated, though: |
|
<fsbx__new_regs>
extern fsb_registers_t fsbx__new_regs (void);
#ifdef FSB_LIB
fsb_registers_t
fsbx__new_regs (void)
{
fsb_registers_t regs;
int x;
regs.fsb_task_selector = 0;
for (x = 0; x < FSB_N_REGISTERS; ++x)
{
regs.fsb_reg[x] = fsbx__alloc_raw_pagemem ();
<Initialize a Register Page>
}
return regs;
}
#endif
|
| C code |
|---|
|
The three register pages are "sui generis" objects. We have to initialize the object header fields "by hand" as a result. |
|
<Initialize a Register Page>
fsbx__set_object_head_type (regs.fsb_reg[x], FSB_REG_HEAD_TYPE);
fsbx__set_object_tail_type (regs.fsb_reg[x], FSB_REG_TAIL_TYPE);
fsbx__set_object_refcnt (regs.fsb_reg[x], 1);
fsbx__set_object_gc_bits (regs.fsb_reg[x], FSB_GC_REGISTER_PAGE);
fsbx__set_object_size (regs.fsb_reg[x], FSB_SZ_PAGE);
|
| C code |
|---|
|
There is only a single mechanism for increasing the depth of an FSB stack: pushing between 1 and 256 nil values. (To push a non-nil value, programs first increase the stack depth and then copy the desired value to the newly allocated location.) Increasing the depth of the stack can cause an oveflow condition if there is not enough room in the register object to hold the new entries. In that case, some stack entries in the register object are moved to an overflow object, to make room in the register object: |
|
<Growing Stacks>
extern fsb_registers_t fsbx__lower_stack (fsb_registers_t regs, int r, int s);
static inline fsb_registers_t
fsbx__push (fsb_registers_t regs, int r, int s, int amt)
{
fsb_uint16_t maybe_new_sp = amt + fsbx__sp (regs, r, s);
if (maybe_new_sp < 768)
{
return fsbx__set_sp (regs, r, s, maybe_new_sp);
}
else
{
regs = fsbx__lower_stack (regs, r, s);
return fsbx__set_sp (regs, r, s, amt + fsbx__sp (regs, r, s));
}
}
<fsbx__lower_stack>
|
|
"Lowering the stack" means moving stack entries from the bottom of the stack area in the register object, to the top of the |