paged space objects #13

Open
opened 2026-03-20 22:17:34 +00:00 by simon · 2 comments
Owner

Paged space objects will be a generalisation of cons space objects. A page — all pages will be of the same size — will comprise a header, followed by an array of objects. All the objects in a given page will be of the same size. The number of objects in a page is simply the number of objects of the size which the page supports that will fit into the page.

All paged space objects will have a header of the same size which will be broadly the same as the version 0.0.6 cons space object header, except that it will have a payload-size field, which will probably have to be stolen out of the reference count field. However, if we issue only paged space objects whose payload sizes in 64 bit words are powers of two, we would then have

Bits Field value Number of words Number of bytes
0000 0 1 8
0001 1 2 16
0010 2 4 32
0011 3 8 64
0100 4 16 128
0101 5 32 256
0110 6 64 512
0111 7 128 1024
1000 8 256 2048
1001 9 512 4096
1010 10 1024 8192
1011 11 2048 16384
1100 12 4096 32768
1101 13 8192 65536
1110 14 16384 131072
1111 15 32768 262144

The page itself obviously also has to have an object size field; and for computational efficiency probably ought to have both the bit field and a uint32 specifying the equivalent number of words (? or bytes?).

Which means we get up to quarter megabyte objects storeable in pages by robbing only four bits out of the reference count, meaning we can have up to quarter of a billion references to an object before it is locked in. However, extracting a four bit bitfield and a 28 bit bitfield out of the same 32 bit uint32, and treating each as uints, while economic of memory, may be quite expensive of compute, because each time we want to operate on a value, we need to mask it out, operate on it, limit check it, and then mask it back in. This isn't such a big deal for the four bit size field which we will often read but never write, but significantly more important for the refcount field which we will increment and decrement frequently.

So it probably makes more sense to rob four (or even eight) bits out of the tag field, which is often read but written only on assignment and on garbage collection. This means sacrificing the human-readable tags, but they are absurdly large and wasteful.

We can still float objects of greater size than we decide to store in a page, in vector space, but we don't need to have a separate vector pointer object for each object in page space, only for free floating objects. Pages are themselves obviously objects in vector space, so probably do need a VECP (vector pointer object) pointing to each of them.

We won't initialise a page for objects of a given size until we have an object of that size to store. When all existing pages for objects of a given size is full, we will automatically allocate a new page for objects of that size. When a page for objects of existing size is entirely empty (i.e. all objects in it are FREE), that page should be deallocated and the freelist for objects of that size fixed up.

Cons-space objects simply become paged-space-object instances of size 2; stack frames become instances of size 16 (and can grow extra registers, because there's space).

Hash tables can be of any paged-space-object size above 16; the table knows how many buckets it has, since the number of buckets is necessarily the word size of the object, less one word for a pointer to the hashing function (to allow custom hashing functions) and one word for a pointer to the write ACL, since I think the only practical difference between a hash table and a namespace will be that namespaces have a write ACL that allows writes by somebody, whereas a hash table will have a write ACL of nil — i.e. that blocks writes from anyone.

(Indeed, actually, it would be possible to have variable size stack frames, but I think that would make eval/apply very complicated)

I think this works.

Paged space objects will be a generalisation of cons space objects. A page — all pages will be of the same size — will comprise a header, followed by an array of objects. All the objects in a given page will be of the same size. The number of objects in a page is simply the number of objects of the size which the page supports that will fit into the page. All paged space objects will have a header of the same size which will be broadly the same as the version 0.0.6 cons space object header, except that it will have a payload-size field, which will probably have to be stolen out of the reference count field. However, if we issue only paged space objects whose payload sizes in 64 bit words are powers of two, we would then have | Bits | Field value | Number of words | Number of bytes | | ---- | ------------- | ------------------ | ------------------- | | 0000 | 0 | 1 | 8 | | 0001 | 1 | 2 | 16 | | 0010 | 2 | 4 | 32 | | 0011 | 3 | 8 | 64 | | 0100 | 4 | 16 | 128 | | 0101 | 5 | 32 | 256 | | 0110 | 6 | 64 | 512 | | 0111 | 7 | 128 | 1024 | | 1000 | 8 | 256 | 2048 | | 1001 | 9 | 512 | 4096 | | 1010 | 10 | 1024 | 8192 | | 1011 | 11 | 2048 | 16384 | | 1100 | 12 | 4096 | 32768 | | 1101 | 13 | 8192 | 65536 | | 1110 | 14 | 16384 | 131072 | | 1111 | 15 | 32768 | 262144 | The page itself obviously also has to have an object size field; and for computational efficiency probably ought to have both the bit field and a uint32 specifying the equivalent number of words (? or bytes?). Which means we get up to quarter megabyte objects storeable in pages by robbing only four bits out of the reference count, meaning we can have up to quarter of a billion references to an object before it is locked in. However, extracting a four bit bitfield and a 28 bit bitfield out of the same 32 bit uint32, and treating each as uints, while economic of memory, may be quite expensive of compute, because each time we want to operate on a value, we need to mask it out, operate on it, limit check it, and then mask it back in. This isn't such a big deal for the four bit `size` field which we will often read but never write, but significantly more important for the `refcount` field which we will increment and decrement frequently. So it probably makes more sense to rob four (or even eight) bits out of the `tag` field, which is often read but written only on assignment and on garbage collection. This means sacrificing the human-readable tags, but they are absurdly large and wasteful. We can still float objects of greater size than we decide to store in a page, in vector space, but we don't need to have a separate vector pointer object for each object in page space, only for free floating objects. Pages are themselves obviously objects in vector space, so probably do need a `VECP` (vector pointer object) pointing to each of them. We won't initialise a page for objects of a given size until we have an object of that size to store. When all existing pages for objects of a given size is full, we will automatically allocate a new page for objects of that size. When a page for objects of existing size is entirely empty (i.e. all objects in it are `FREE`), that page should be deallocated and the freelist for objects of that size fixed up. Cons-space objects simply become paged-space-object instances of size 2; stack frames become instances of size 16 (and can grow extra registers, because there's space). Hash tables can be of any paged-space-object size above 16; the table knows how many buckets it has, since the number of buckets is necessarily the word size of the object, less one word for a pointer to the hashing function (to allow custom hashing functions) and one word for a pointer to the write ACL, since I think the only practical difference between a hash table and a namespace will be that namespaces have a write ACL that allows writes by *somebody*, whereas a hash table will have a write ACL of `nil` — i.e. that blocks writes from anyone. (Indeed, actually, it would be possible to have variable size stack frames, but I think that would make `eval`/`apply` very complicated) I think this works.
simon added the
enhancement
label 2026-03-20 22:17:34 +00:00
Author
Owner

Obviously, we need separate free lists for each size of paged space objects; we can have a [0...16] array of pointers to those free lists.

Obviously, we need separate free lists for each size of paged space objects; we can have a [0...16] array of pointers to those free lists.
Author
Owner

A perverse thought:

We need to store four bits. We currently have four bytes for the tag field, but we currently interpret those bytes as upper case ASCII characters. Which means we don't use the case bit on those characters, and we don't use the high bit on those characters. Hiding the size field in the tag field in the case bit would be entirely plausible.

Extracting it would be expensive, of course; but how often would we actually need to extract it?

A perverse thought: We need to store four bits. We currently have four bytes for the `tag` field, but we currently interpret those bytes as upper case ASCII characters. Which means we don't use the case bit on those characters, and we don't use the high bit on those characters. Hiding the size field in the tag field in the case bit would be entirely plausible. Extracting it would be expensive, of course; but how often would we actually need to extract it?
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: simon/post-scarcity#13
No description provided.