Reimplementation of Beowulf in C, with compiler

grendel

A reimplementation of Beowulf bootstrapped in C, with a compiler following, basically, Abdulaziz Ghuloum's recipe.

Memory model

It seems I obsess over how things are represented in memory. Although most people who build Ghuloum-style compilers treat memory as something of an afterthought, I'm starting with it.

In the beginning was the Word

My intention is that memory will be treated as an array of 64-bit words.

Each word may be considered as

  1. a cons cell: two instances of object32, each having one mark bit, three tag bits and 28 payload bits;
  2. a single object64, having one mark bit, seven tag bits, and 56 payload bits.

Note that, for any word, the first four bits comprise the mark and (part or all of) the tag, whether the word is an object64 or a cons of two object32s. So that the two cases can be told apart by inspecting those four bits alone, every object64 will have all of the first three bits of its tag set. So:

                                   3 3                                6
 0 1 3 4    8                      1 2                                3
+-+---+-----------------------------+-+---+----------------------------+
|M|tag| payload...                  |M|tag| payload...                 |
+-+---+----+------------------------+-+---+----------------------------+
|M|111 tag | payload...                                                |
+-+--------+-----------------------------------------------------------+
where `M` represents `mark`
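
A minimal sketch of how the two layouts might be told apart, assuming that bit 0 in the diagram is the least significant bit of the word and that the first object32 occupies bits 0..31. The names `word_t`, `word_tag3`, `word_is_object64`, `word_first32` and `word_second32` are hypothetical, not part of the code base.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for one 64-bit word of memory. */
typedef uint64_t word_t;

/* Assumption: bit 0 in the diagram is the least significant bit. */
#define TAG3_SHIFT 1u
#define TAG3_MASK  UINT64_C(0x7)

/* The three tag bits shared by both layouts: bits 1..3 of the word. */
static inline unsigned word_tag3(word_t w) {
    return (unsigned)((w >> TAG3_SHIFT) & TAG3_MASK);
}

/* A word is an object64 exactly when those three bits are all set. */
static inline bool word_is_object64(word_t w) {
    return word_tag3(w) == 0x7u;
}

/* Otherwise the word holds two object32s: bits 0..31 and bits 32..63. */
static inline uint32_t word_first32(word_t w) {
    return (uint32_t)(w & UINT64_C(0xFFFFFFFF));
}

static inline uint32_t word_second32(word_t w) {
    return (uint32_t)(w >> 32);
}
```

Only the first object32 of a word takes part in this test; the tag of the second object32 is unconstrained.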

I've tried to do this with C structs but I've failed to get the bit fields to pack properly so I'm just going to be a barbarian and use bit masks and bit shifts.
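
The barbarian approach might look something like the sketch below. It assumes bit 0 is the least significant bit, so the mark sits in bit 0, the tag starts at bit 1, and the payload fills the remaining high bits. All macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 0 is the least significant bit of each object. */
#define MARK_MASK           UINT64_C(0x1)
#define TAG_SHIFT           1u

#define OBJ32_TAG_MASK      UINT32_C(0x7)                 /* 3 tag bits      */
#define OBJ32_PAYLOAD_SHIFT 4u
#define OBJ32_PAYLOAD_MASK  UINT32_C(0x0FFFFFFF)          /* 28 payload bits */

#define OBJ64_TAG_MASK      UINT64_C(0x7F)                /* 7 tag bits      */
#define OBJ64_PAYLOAD_SHIFT 8u
#define OBJ64_PAYLOAD_MASK  UINT64_C(0x00FFFFFFFFFFFFFF)  /* 56 payload bits */

/* Build an unmarked object32 from a 3-bit tag and a 28-bit payload. */
static inline uint32_t obj32_make(uint32_t tag, uint32_t payload) {
    return ((payload & OBJ32_PAYLOAD_MASK) << OBJ32_PAYLOAD_SHIFT)
         | ((tag & OBJ32_TAG_MASK) << TAG_SHIFT);
}
static inline uint32_t obj32_tag(uint32_t o)     { return (o >> TAG_SHIFT) & OBJ32_TAG_MASK; }
static inline uint32_t obj32_payload(uint32_t o) { return o >> OBJ32_PAYLOAD_SHIFT; }

/* Build an unmarked object64 from a 7-bit tag and a 56-bit payload. */
static inline uint64_t obj64_make(uint64_t tag, uint64_t payload) {
    return ((payload & OBJ64_PAYLOAD_MASK) << OBJ64_PAYLOAD_SHIFT)
         | ((tag & OBJ64_TAG_MASK) << TAG_SHIFT);
}
static inline uint64_t obj64_tag(uint64_t o)     { return (o >> TAG_SHIFT) & OBJ64_TAG_MASK; }
static inline uint64_t obj64_payload(uint64_t o) { return o >> OBJ64_PAYLOAD_SHIFT; }

/* Join two object32s into one word: first in bits 0..31, second in 32..63. */
static inline uint64_t cons_make(uint32_t first, uint32_t second) {
    return ((uint64_t)second << 32) | first;
}

/* Mark bit of an object64, or of the first object32 in a cons word.
   The second object32's mark bit would be bit 32 of the word. */
static inline bool     is_marked(uint64_t o)  { return (o & MARK_MASK) != 0; }
static inline uint64_t set_mark(uint64_t o)   { return o | MARK_MASK; }
static inline uint64_t clear_mark(uint64_t o) { return o & ~MARK_MASK; }
```

Under these assumptions `obj32_make(1, 42)` yields `0x2A2`: the payload 42 shifted left by four bits, with the tag 1 in bits 1..3 and the mark bit clear.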

Tag! You're it!

Tags will be allocated as follows:

| 3-bit value | 7-bit value | Hex | Interpretation |
|---|---|---|---|
| 0 | 0 | 0x0 | a pointer; an offset into the vector of words. |
| 1 | 1 | 0x1 | a signed 28-bit integer. |
| 2 | 2 | 0x2 | a character; possibly just a byte, or possibly a 16-bit wchar. |
| 3 | 3 | 0x3 | unassigned (possibly a floating point number, later.) |
| 4 | 4 | 0x4 | unassigned |
| 5 | 5 | 0x5 | unassigned |
| 6 | 6 | 0x6 | unassigned |
| 7 | 7 | 0x7 | a cons cell |
| 7 | 15 | 0xf | a symbol cell (this implies a symbol can have only up to seven, or if compressed to five bits per character, eleven characters) |
| 7 | 23 | 0x17 | a pointer to a compiled function (there's a problem here; it means we can only allocate a function in the lower 72,057,594,037,927,936 bytes of memory; I think that's not going to byte us on the bum, pun intended). |
| 7 | 31 | 0x1f | a pointer to a compiled special form (same problem as above). |
| 7 | 39 | 0x27 | unassigned ? a ratio cell ? |
| 7 | 47 | 0x2f | unassigned ? a big number ? |
| 7 | 55 | 0x37 | unassigned ? a string ? |
| 7 | 63 | 0x3f | unassigned |
| 7 | 71 | 0x47 | unassigned |
| 7 | 79 | 0x4f | unassigned |
| 7 | 87 | 0x57 | unassigned |
| 7 | 95 | 0x5f | unassigned |
| 7 | 103 | 0x67 | unassigned |
| 7 | 111 | 0x6f | unassigned |
| 7 | 119 | 0x77 | unassigned |
| 7 | 127 | 0x7f | a free cell |
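
As a rough illustration of two rows of the tag allocation, the sketch below encodes a signed 28-bit integer (tag 0x1) and a symbol of up to seven one-byte characters (tag 0xf). It assumes the mark is bit 0 (least significant), the tag starts at bit 1, and the payload fills the high bits. The enum and function names are hypothetical, and so are the byte order of characters within the symbol payload and the choice to truncate over-long names silently. The five-bits-per-character packing is not shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assigned tag values from the allocation above (names are hypothetical). */
enum tag3 { TAG_POINTER = 0x0, TAG_INTEGER = 0x1, TAG_CHARACTER = 0x2 };
enum tag7 { TAG_CONS = 0x07, TAG_SYMBOL = 0x0f, TAG_FUNCTION = 0x17,
            TAG_SPECIAL = 0x1f, TAG_FREE = 0x7f };

#define INT28_SIGN_BIT   UINT32_C(0x08000000)  /* bit 27 of the payload  */
#define SYMBOL_MAX_CHARS 7u                    /* 56 payload bits / 8    */

/* Encode n as an object32; only the low 28 bits of n are kept. */
static inline uint32_t int28_make(int32_t n) {
    return (((uint32_t)n & UINT32_C(0x0FFFFFFF)) << 4)
         | ((uint32_t)TAG_INTEGER << 1);
}

/* Decode an integer object32, sign-extending from bit 27 of the payload. */
static inline int32_t int28_value(uint32_t o) {
    uint32_t payload = o >> 4;
    return (int32_t)(payload ^ INT28_SIGN_BIT) - (int32_t)INT28_SIGN_BIT;
}

/* Pack up to seven bytes of name into an object64 symbol cell.
   Assumption: character i lives in payload bits 8*i .. 8*i+7. */
static inline uint64_t symbol_make(const char *name) {
    uint64_t payload = 0;
    size_t len = strlen(name);
    if (len > SYMBOL_MAX_CHARS) {
        len = SYMBOL_MAX_CHARS;  /* assumption: silently truncate */
    }
    for (size_t i = 0; i < len; i++) {
        payload |= (uint64_t)(unsigned char)name[i] << (8u * i);
    }
    return (payload << 8) | ((uint64_t)TAG_SYMBOL << 1);
}

/* Fetch character i of a symbol cell; '\0' past the end of the name. */
static inline char symbol_char(uint64_t o, unsigned i) {
    return i < SYMBOL_MAX_CHARS
        ? (char)((o >> (8u * (i + 1u))) & 0xFFu)
        : '\0';
}
```

With this encoding the representable integers run from -134,217,728 to 134,217,727, and a seven-letter name such as `grendel` exactly fills a symbol cell.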