# State of Play

## 20260505

### The stack frame corruption(?) bug

I have a weird bug in `read_symbol`, which at present I'm not understanding.

Stack frames in `0.1.0` are [paged space objects](https://www.journeyman.cc/blog/posts-output/2026-03-23-Paged-space-objects/), like all other objects; specifically they are objects of size class 4, which is to say they have a payload size of fourteen words. The first eight arguments to the function being called (which in most cases will be all the arguments) are held directly in the frame.

`read_symbol` expects its arguments to be as follows (I'm numbering from zero here, although I consider that perverse and confusing, because the substrate language is C which uses numbering from zero):

| Argument | Expected value  | Expected type                        |
| -------- | --------------- | ------------------------------------ |
| 0        | input stream    | input stream                         |
| 1        | read table      | store (cons, hashtable or namespace) |
| 2        | first character | character object                     |

`read_symbol` then reads characters sequentially from the stream until it encounters a white-space character; for each character it reads, it creates a symbol object representing that character, and conses that object onto the list of the characters it has read so far. So if the user has typed

> xyz

the internal representation is now a sequence

```lisp
(z y x)
```

Obviously, this now has to be reversed. So `read_symbol` then calls `reverse`. But wait! Because we're still in the bootstrap layer, the version of `read_symbol` I'm talking about is written in C. So *at the time of writing* it actually calls a wrapper function called `c_reverse` which builds the Lisp stack frame for `reverse` and then calls `reverse` with that stack frame.
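
The cons-then-reverse step described above can be sketched in isolation. This is a toy model in plain C, not the `0.1.0` code: `toy_cell`, `toy_cons`, `toy_read_chars`, `toy_reverse`, `toy_to_string` and `toy_free` are hypothetical names invented for illustration, a `malloc`ed struct stands in for a paged space object, and there are no stack frames or reference counts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a cons cell holding one character. This is NOT the
 * layout of a paged space object; it only models the shape of the list. */
struct toy_cell {
    char ch;
    struct toy_cell *next;
};

/* Cons one character onto the front of `tail`; returns the new head. */
struct toy_cell *toy_cons( char ch, struct toy_cell *tail ) {
    struct toy_cell *cell = malloc( sizeof *cell );
    if ( cell == NULL ) abort(  );
    cell->ch = ch;
    cell->next = tail;
    return cell;
}

/* Accumulate the characters of `input` up to the first white-space
 * character (or the end of the string). Each character is consed onto
 * the front, so the result is in reverse order of reading. */
struct toy_cell *toy_read_chars( const char *input ) {
    struct toy_cell *acc = NULL;
    size_t length = strcspn( input, " \t\n" );
    for ( size_t i = 0; i < length; i++ ) {
        acc = toy_cons( input[i], acc );
    }
    return acc;
}

/* Build a fresh list holding the same characters in the opposite order. */
struct toy_cell *toy_reverse( const struct toy_cell *list ) {
    struct toy_cell *result = NULL;
    for ( const struct toy_cell *cursor = list; cursor != NULL;
          cursor = cursor->next ) {
        result = toy_cons( cursor->ch, result );
    }
    return result;
}

/* Copy the characters of `list` into `buf`, truncating to fit `cap`
 * bytes including the terminating NUL. */
void toy_to_string( const struct toy_cell *list, char *buf, size_t cap ) {
    assert( cap > 0 );
    size_t i = 0;
    for ( ; list != NULL && i + 1 < cap; list = list->next ) {
        buf[i++] = list->ch;
    }
    buf[i] = '\0';
}

/* Release every cell of a list. */
void toy_free( struct toy_cell *list ) {
    while ( list != NULL ) {
        struct toy_cell *next = list->next;
        free( list );
        list = next;
    }
}
```

Given the input `"xyz rest"`, `toy_read_chars` yields the characters in the order z, y, x, and `toy_reverse` restores x, y, z.
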
There was an earlier version of `c_reverse` which failed to create a new stack frame, and which would account for the bug I'm seeing; but that version has been replaced and the current version does certainly create the new stack frame:

```c
/**
 * @brief reverse a sequence.
 *
 * A sequence is a list or a string-like-thing. A dotted pair is not a
 * sequence.
 *
 * @param sequence a pointer to a sequence.
 * @return a sequence like the `sequence` passed, but reversed; or `nil` if
 * the argument was not a sequence.
 */
struct pso_pointer c_reverse( struct pso_pointer frame_pointer,
                              struct pso_pointer sequence ) {
    struct pso_pointer result = nil;

    if ( stackp( frame_pointer ) ) {
        result = reverse( make_frame( 1, frame_pointer, sequence ) );
    }

    return result;
}
```

So, I can see in the debugger that the sequence created in `read_symbol` is passed to `c_reverse` as the sequence argument; I can see it is put into the new frame as the first (index 0) argument; the new frame is directly passed into `reverse`. `reverse` expects the argument in its stack frame to look like this:

| Argument | Expected value | Expected type                              |
| -------- | -------------- | ------------------------------------------ |
| 0        | sequence       | sequence (cons, keyword, string or symbol) |

`reverse` throws an exception:

```lisp
```

D'oh! And, of course, in trying to explain the bug, I've found the bug. It wasn't what I thought it was, so I was looking in the wrong place. It was this:

```diff
     struct pso_pointer sequence = fetch_arg( pointer_to_pso4( frame_pointer ), 0 );

-    for ( struct pso_pointer cursor = sequence; !c_nilp( sequence );
+    for ( struct pso_pointer cursor = sequence; !c_nilp( cursor );
           cursor = c_cdr( cursor ) ) {
         struct pso2 *object = pointer_to_object( cursor );
         switch ( get_tag_value( cursor ) ) {
```

I was checking for `nil` on the sequence, which obviously didn't change, not on the cursor, which did. D'oh!
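
Why that one-word change matters can be shown with a toy model. This is a sketch under stated assumptions, not the project's code: `toy_node`, `toy_walk_terminates` and `toy_demo` are hypothetical names, a `NULL` pointer plays the part of `nil`, and a step limit stands in for the exception the real loop threw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy singly linked list node; a NULL pointer plays the part of `nil`. */
struct toy_node {
    int value;
    const struct toy_node *next;
};

/* Walk a list in the manner of the loop in the diff above, giving up
 * after `limit` iterations.
 *
 * If `test_cursor` is true the loop condition tests the cursor (the fixed
 * code); otherwise it tests the unchanging head (the buggy code).
 *
 * Returns true if the loop condition itself ended the walk, false if the
 * walk ran off the end of the list or used up its step limit. */
bool toy_walk_terminates( const struct toy_node *head, bool test_cursor,
                          size_t limit ) {
    const struct toy_node *cursor = head;

    for ( size_t steps = 0; steps < limit; steps++ ) {
        const struct toy_node *tested = test_cursor ? cursor : head;

        if ( tested == NULL ) {
            return true;        /* loop condition false: normal exit */
        }
        if ( cursor == NULL ) {
            return false;       /* the loop body would now be handed `nil` */
        }
        cursor = cursor->next;
    }

    return false;
}

/* Usage: an empty list hides the bug; any non-empty list exposes it. */
void toy_demo( void ) {
    const struct toy_node c = { 3, NULL };
    const struct toy_node b = { 2, &c };
    const struct toy_node a = { 1, &b };

    assert( toy_walk_terminates( &a, true, 100 ) );
    assert( !toy_walk_terminates( &a, false, 100 ) );
    assert( toy_walk_terminates( NULL, false, 100 ) );
}
```

The last assertion is the reason a bug like this can survive light testing: on an empty list the head and the cursor are both `nil`, so the two conditions behave identically.
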
### About debuggers

I switched to Eclipse for this session, because Eclipse has really good, really easy to use, debugger integration. But I don't, as I said yesterday, much like Eclipse. It is too helpful; it gets in the way too much.

Zed, Gram, Gnome Builder and VS Codium (discussed yesterday) all claim to have debugger integration, and I'm pretty sure the debugger used in all cases is the [GNU debugger, `gdb`](https://sourceware.org/gdb/) (edited: I'm wrong. Zed, and so presumably also Gram, use [`lldb`](https://lldb.llvm.org/)).

`Gdb` is an excellent debugger with a truly atrocious user interface, but fortunately there's a large range of tools which wrap more or less good user interfaces around `gdb`, of which I use (and like) ['seer'](https://github.com/epasveer/seer). However it's *much* more productive to have your debugger integrated with your editor.

I've tried this morning to get each of these to enter a useful debugging session. It has taken some work.

Gnome Builder fails (for me) because although selecting `Run with Debugger` from the `run` menu does start both a `psse` session and a `gdb` session, and although terminating the `psse` session does show `[Inferior 1 (process 248474) exited normally]` on the GDB console, when I attempt to set a breakpoint (you don't seem to be able to set one in the GUI), I get the following:

```
> break src/c/ops/eval_apply.c:784
Make breakpoint pending on future shared library load? (y or [n]) [answered N; input not from terminal]
> n
Cannot execute this command without a live selected thread.
```

So there is something alive there, and probably with a bit of struggle I could make it work.

Zed and Gram are much the same, because Gram is a fork of Zed. Zed appears(?) to copy VS Codium's (and thus VS Code's) approach to interacting with `gdb`. VS Codium *appears*(?) to need some sort of JSON configuration in `launch.json`. I've tried this:

```json
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "PSSE Debug (gdb Attach)",
            "type": "cppdbg",
            "request": "attach",
            "program": "target/psse",
            // "args": ["-p", "-s1000", "-v1023"],
            "processId": "${command:pickProcess}",
            "MIMode": "gdb",
            "setupCommands": [
                {
                    "description": "Enable pretty-printing for gdb",
                    "text": "-enable-pretty-printing",
                    "ignoreFailures": true
                }
            ]
        }
    ]
}
```

It does not work, at least not in VS Codium.

Zed's debugger [configuration documentation](https://zed.dev/docs/debugger) is better. Using it, I was able to compose this stanza:

```json
{
    "label": "PSSE Start debugger config",
    "adapter": "CodeLLDB",
    "request": "launch",
    "program": "target/psse",
    "cwd": "$ZED_WORKTREE_ROOT",
},
```

which successfully launches a debugger session. It's easy to set breakpoints in the editor windows; it's probably as easy to find your way around variables and stack frames as it is in Eclipse or Seer, once you get used to it (I haven't yet). I haven't yet worked out how to get it to automatically rebuild before running if it needs to do so, but I expect I shall.

This is usable; but I shall need to get used to it.

## 20260504

My monster, she builds!

Admittedly, she doesn't yet do much, but...

### Evaluating editors

My favourite Clojure editor, [LightTable](http://lighttable.com/), went dark — or at least, ceased to be actively developed — about five years ago; and as it depends on libraries which are not available in Debian Trixie, the published executable will no longer run. At about the time it died I did have a look at whether it would be feasible for me to take over maintenance of it, and I came to the conclusion that it would be too much work.
#### VS Codium

So, some years ago, I switched to [VSCodium](https://vscodium.com/), which is a fork of Microsoft's supposedly open source VS Code editor with all the proprietary Microsoft shit taken out.

VS Codium, like VS Code, is built on [Electron](https://www.electronjs.org/), which means it's built, fundamentally, on a JavaScript library stack, with all the instability and insecurity that implies. I have been getting increasingly nervous about my use of VSCodium in the light of [increasingly frequent attacks](https://krebsonsecurity.com/2025/09/18-popular-code-packages-hacked-rigged-to-steal-crypto/) on the JavaScript ecosystem.

This is not to say I dislike VSCodium; I don't. It's been, mainly, a pleasure to use. It's stable, it doesn't get in my way, it's highly configurable and extensible. I just don't have the bandwidth to monitor and audit the libraries it is using.

#### Emacs

In April I had one of my periodic attempts to switch back to [Emacs](https://www.gnu.org/software/emacs/) — that ancient editor which is Generally Not Used Except by Middle Aged Computer Scientists. Back in the day I didn't use Emacs for editing Lisp, of course, because back in the day I was using real Lisps like Portable Standard Lisp and InterLisp which had built in structure editors. But I used to use Emacs for almost everything else, including reading my mail, browsing [Usenet](https://en.wikipedia.org/wiki/Usenet), and editing shell scripts and programs in the languages of [οἱ](https://en.wiktionary.org/wiki/οἱ#Ancient_Greek) [πολλοί](https://en.wiktionary.org/wiki/πολλοί#Ancient_Greek).

And given that the substrate of Post Scarcity is (still) being written in C, just as KnacqTools was back in the day, why not Emacs? After all, it is extremely stable, and extraordinarily configurable and extensible.

The answer, dear reader, is that Emacs is determined to get in my way in every possible way. It is obnoxious to use.
Every key binding, every mouse action, which works in every other software package on a modern windowed user interface does something completely different in Emacs (and vice versa). Your muscle memory no longer works. Every keystroke, every command action, has to be carefully thought about. You have two choices: you can switch entirely to living only in Emacs and relearning the Emacs keybindings, or you can live in a permanent hell of confusion, overthinking and self-doubt.

And, in this day and age, there are many things which Emacs does not do nearly so well as more modern packages do. You **can** browse the web in Emacs — of course you can! — but, dear reader, you really wouldn't want to.

#### Eclipse

When I finally switched away from using Emacs for everything, sometime around 2000, I tried a number of things and ended up with [Eclipse](https://eclipseide.org/), which was at the time a fairly simple but fairly solid Java oriented integrated development environment (IDE). I stayed with Eclipse then for about a decade; but when I moved to mainly developing in Clojure, which Eclipse just didn't do very well, I switched back to Emacs for a while, was driven mad by it again, and found LightTable as a blissful release; which takes us back to the beginning of this section.

Last month, when I was searching for something to replace VSCodium and had realised once again how much I hate using Emacs for serious development, I tried Eclipse. It's... not awful? It's become a very polished, very configurable IDE; it has excellent facilities for C development. But I found it intrusively over-helpful: its continual 'helpful' suggestions got in my way. I used it for about ten days. I wasn't enjoying it.
But what made me give up on it was that it won't follow your configured desktop colour theme, and I wasn't able to find a dark-mode theme for it that worked for me: there are plenty of themes, but they are only applied to the editing panels, not to the chrome or to any of the other panels. I find white backgrounds really unpleasant on my eyes.

#### KDevelop and Gnome Builder

I know I tried [KDevelop](https://kdevelop.org/) at some stage in this process. I can't remember why I rejected it. There's probably a reason.

I also tried [Gnome Builder](https://apps.gnome.org/en-GB/Builder/) and rejected it very quickly; again, I can't remember why. Having a wee play with it just now it feels quite nice, and I may have another try. However, the Debian package of Gnome Builder [does not include the help files](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1111418), and, without them, I haven't found out how to invoke the debugger.

#### Basic text editors

I obviously have a basic text editor, [gedit](https://gedit-text-editor.org/), on my system. It does C syntax highlighting very well, but doesn't do code completion, and doesn't have any integration with a build system or debugger. I have various debugger user interfaces — I like [seergdb](https://github.com/epasveer/seer) — but I do find it convenient to have a debugger integrated into my editor, rather than having to switch between two separate applications. Similarly, it's convenient to have a terminal integrated with the development environment, although it doesn't need to be.

GEdit, plus seergdb, plus a terminal, plus some sort of a git browser, would work for me.

#### New editors

People online have suggested I try two new editors: [Zed](https://zed.dev/) and Gram: these are essentially the same editor, in fact. Zed proudly announces itself as

> a minimal code editor crafted for speed and collaboration with humans and AI

The Zed project seems to want to monetise their work by selling you AI tokens.
Which LLM is behind their AI I don't know. Open Source development needs to be funded somehow; funding it through a tax on people who use AI is as good a way as any.

Dear reader, I do **not want** to collaborate with AI; I don't want any of that shit in my working environment. So that immediately got my back up. It also doesn't have a Debian installer. But I was able to build it from source, and have been using it consistently over the last couple of days, and it's very pleasant. There's a built in debugger, but I cannot get it to work. Beyond that, my build crashes occasionally — maybe once every two or three hours; but it doesn't seem to lose anything when it crashes, so this is not obnoxious.

If I ignore the 'AI' features, the lack of a working debugger is the only mark against it.

[Gram](https://gram.liten.app/) is said to be a fork of Zed with the AI features removed. It has a proper Debian installation repository, which is a significant step up over Zed. Unfortunately, it won't run on my desktop machine, due to [a problem with the video card](https://codeberg.org/GramEditor/gram/issues/256). On my laptop, it runs fine, and seems generally usable — although, again, I can't get the debugger to work.

#### Conclusion for now

The conclusion for now is that I don't have a conclusion for now. Any of Gnome Builder, Zed and Gram are sort of good enough. Zed crashes, which is not desirable; Gram only launches on my laptop, but I mostly do serious development on my desktop; I can't yet work out how to launch the debugger on any of them. But none are annoying, none get in my way. I'll keep on evaluating.

## 20260503

Right, so, it's a week since my last entry. The version of eval/apply copied from `0.0.6` still doesn't compile, let alone work. There are reasons. I've been ill — my brain really is fucked — and I've had outdoor work it's felt urgent to do.

There is progress. I am cleaning up bits of old cruft as I go.
But I don't think copying the old code was a good decision. Probably, if I had started a clean room implementation a week ago, I would now have a working evaluator. Certainly, I'd have a better one. Probably, the first thing I should do when I get the old one working is write a new, clean, one.

## 20260427

### eval/apply, yet again

OK, OK. So the version of `eval`/`apply` written in C is the `:bootstrap` version — which is to say, sufficient to get Lisp bootstrapped, and to run the compiler. One or both can then be replaced by new implementations written in Lisp, to provide the `:system` versions. And any user should in principle be able to override the system versions with their own versions (although troubling worries about security come into that).

So yesterday, I decided to copy the versions of `eval` and `apply` from `0.0.6` (which, after all, do work — there are lots of problems with the `0.0.6` prototype, but the interpreter is not one of them) into `0.1.0`. But then last night I read the chapter in Cees de Groot's [The Genius of Lisp](https://cdegroot.com/programming/lisp/2026/02/17/why-i-wrote-the-genius-of-lisp.html) and I'm back to wanting to reimplement them *yet again*.

I'm not sure that this is wise.

## 20260424

### To have `c_` functions or not to have `c_` functions, revisited

Right, I was hugely pleased with my 'make everything a Lisp function, and then call it from C' idea. I wrote things like:

```c
print( make_frame( 2, base_of_stack,
    eval( make_frame( 1, base_of_stack,
        read( make_frame( 1, base_of_stack, input_stream ) ) ) ),
    output_stream ) );
```

Isn't it beautiful? Isn't it elegant? Isn't it clear?

Yes, it is.

Does it work? Yes, actually, it does.

Is it a total crock? Unfortunately, dear reader, it is.

In this pattern, we don't have a handle on any of the stack frames made with `make_frame`, so we can't `dec_ref` them, so they don't get garbage collected.
And while during bootstrap it's inevitable that there's a little crud left over because it was created before we have enough infrastructure set up, what I'm seeing at present from a 'start up and shut down run' is

| Size class | Allocated | Deallocated | Remaining |
| ---------- | --------- | ----------- | --------- |
| 2          | 453       | 1           | 452       |
| 3          | 1         | 0           | 1         |
| 4          | 49        | 4           | 45        |
| 5          | 0         | 0           | 0         |
| 6          | 0         | 0           | 0         |

The 452 unfreed objects in size class two are cons cells and string fragments, and they mostly represent the metadata on the streams `*in*`, `*out*`, `*log*` and `*sink*`, all of which are deliberately protected from garbage collection because, frankly, you don't want those things going away under you; so that's kind of OK.

The one in size class three is an exception, and I'm quite pleased I'm only throwing one exception during bootstrap (although it would be nice if it got cleaned up).

But the 45 unfreed objects in size class four are stack frames, and the reason they're unfreed is the coding pattern you see above.

So, how to get around this? The code snippet above could be rewritten:

```c
struct pso_pointer next = inc_ref( make_frame( 1, base_of_stack, input_stream ) );
struct pso_pointer read_value = inc_ref( read( next ) );
dec_ref( next );

next = inc_ref( make_frame( 1, base_of_stack, read_value ) );
struct pso_pointer eval_value = inc_ref( eval( next ) );
dec_ref( next );
dec_ref( read_value );

next = inc_ref( make_frame( 2, base_of_stack, eval_value, output_stream ) );
print( next );
dec_ref( next );
dec_ref( eval_value );
```

This is much more prolix and, to me, less elegant; but it does get the garbage collected. In each stanza we're first setting up a frame with the arguments for the function we're about to call, then calling that function with the frame we've set up, and then `dec_ref`ing the frame.
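
The three steps of each stanza can also be modelled in miniature, to show why holding a handle on the frame matters. What follows is a sketch under stated assumptions, not the project's allocator: `toy_obj`, `toy_make`, `toy_inc_ref`, `toy_dec_ref`, `toy_double`, `toy_call` and `toy_live_objects` are all hypothetical names, a frame is modelled as an object that holds exactly one argument, and, unlike the real system, a result is not referenced by the frame that created it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object. `held` is one other object that this one
 * keeps alive, in the way a frame keeps its argument alive. */
struct toy_obj {
    int refs;
    int payload;
    struct toy_obj *held;
};

/* Number of objects allocated and not yet freed. */
int toy_live_objects = 0;

/* Take one reference; returns its argument for convenient nesting. */
struct toy_obj *toy_inc_ref( struct toy_obj *obj ) {
    if ( obj != NULL ) obj->refs++;
    return obj;
}

/* Drop one reference; at zero, release whatever the object held and
 * free it. */
void toy_dec_ref( struct toy_obj *obj ) {
    if ( obj == NULL ) return;
    assert( obj->refs > 0 );
    if ( --obj->refs > 0 ) return;
    toy_dec_ref( obj->held );
    toy_live_objects--;
    free( obj );
}

/* Allocate an object with a reference count of zero, taking a reference
 * to `held` if there is one. */
struct toy_obj *toy_make( int payload, struct toy_obj *held ) {
    struct toy_obj *obj = malloc( sizeof *obj );
    if ( obj == NULL ) abort(  );
    obj->refs = 0;
    obj->payload = payload;
    obj->held = toy_inc_ref( held );
    toy_live_objects++;
    return obj;
}

/* A stand-in for a Lisp function: takes a frame, returns a fresh object. */
struct toy_obj *toy_double( struct toy_obj *frame ) {
    return toy_make( 2 * frame->held->payload, NULL );
}

/* The three steps in one place: build and hold the frame, call the
 * function, release the frame. The caller owns one reference to the
 * result. */
struct toy_obj *toy_call( struct toy_obj *( *fn ) ( struct toy_obj * ),
                          struct toy_obj *arg ) {
    struct toy_obj *frame = toy_inc_ref( toy_make( 0, arg ) );
    struct toy_obj *result = toy_inc_ref( fn( frame ) );
    toy_dec_ref( frame );
    return result;
}
```

In this model, calling `toy_double( toy_make( 0, y ) )` directly leaves the frame, and therefore `y`, allocated after the caller has released everything it can name; going through `toy_call` leaves nothing behind.
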
We shouldn't need to `dec_ref` the value returned by `print`, since we don't use it and the only thing holding a reference to it is the frame in which it was created, which we do `dec_ref`.

I could `dec_ref` `read_value`, for instance, as soon as I've put it into the frame for `eval` rather than after `eval` has actually been invoked, since the frame is now protecting it from garbage collection; but I've delayed doing so until afterwards out of caution.

Once we have `eval`/`apply` working, we won't need to do all this bureaucratic incrementing and decrementing of reference counts explicitly, since `eval`/`apply` *should* take care of it automatically. I'm still not 100% confident I can make the reference counting garbage collector work reliably, irrespective of whether it's actually efficient.

### To recode or not to recode?

There are 55 calls to `make_frame` in existing C code, and they're almost all written in the 'elegant but insanitary' pattern. Could they be rewritten more cleanly? Yes, they could.

But my hope is most of this code will be replaced with code written in Lisp, once we have Lisp sufficiently bootstrapped to make that possible. So I think I'm going to put up with the uncollected garbage until we get to that point, at which point I'll audit the C code to see what is actually still in use, sanitise that, and delete the rest.

However, any new C code (and there is going to have to be some) *must* be written in the sanitary but bureaucratic pattern.

#### 21:24

Well, at the end of the day I think the git log says it all:

```
commit 63906fe817d509adb6171a72d16c045c2793ebed (HEAD -> feature/reengineering-17-21)
Author: Simon Brooke
Date:   Fri Apr 24 21:20:23 2026 +0100

    Print is less badly broken. Read is less badly broken. GC is too aggressive.

commit 22b0160a266999c939c9a21df150542f8b2f0b25 (origin/feature/reengineering-17-21)
Author: Simon Brooke
Date:   Fri Apr 24 09:22:06 2026 +0100

    Builds and runs, but print is badly broken. Need some rethink.
```

I could just disable the garbage collector until I've got `eval`/`apply` working. I *believe* that with `eval`/`apply` I'll be able to automate all the garbage collection bookkeeping work. I hope so. Mark and sweep, or even my preferred mark but don't sweep, on a massively parallel machine, just doesn't bear thinking on.

## 20260421

### To have `c_` functions or not to have `c_` functions?

Up to now I've had a conscious design pattern of having C functions with names beginning with `c_` which were 'the simplest possible way of solving the problem in C', and C functions with names beginning `lisp_` which were (usually) wrappers around those functions designed to be callable from Lisp.

The current refactoring exercise — and the `0.1.0` design doctrine that I should only code in C things which are absolutely necessary to bootstrap the Lisp compiler — is calling into question the need for many of the `c_` functions. After all, the `lisp_` functions are callable from C, it's just a little more prolix.

However, there is an overhead to calling a `lisp_` function: you have to generate a new stack frame, and that is an overhead, and consequently a time penalty. It may be in the long term it will be worth reviving `c_` functions for performance optimisation; but I think the priority for `0.1.X` is functionality, not performance.

### Type checking stack frames

Passing everything around as `pso_pointers` bypasses even C's rather lax type safety. Of course this doesn't matter for code written in Lisp, because it is the compiler's responsibility to mechanically make sure that **only** stack frames are passed into functions as stack frames. But if something else was passed in as a stack frame, the results probably wouldn't be pretty, and at least while I'm mostly running bootstrap functions written in C, there is a risk.

Type checking the stack frame every time a function is entered is an overhead that will grow big quickly. I'm inclined to not do it in production.
But I think it's essential to do it during debugging. proposal [here]().

## 20260420

Still on side projects, but those side-projects are giving me thinking time; and over the past few days I've logged four issues that I've tagged [`Architecture change`](https://git.journeyman.cc/simon/post-scarcity/issues?q=&type=all&state=open&labels=15&milestone=0&assignee=0&poster=0). These are:

* 17: [Add readtables; implement quote and keyword through readtables.](https://git.journeyman.cc/simon/post-scarcity/issues/17)
* 18: [Consider converting from `wchar_t` to `char32_t`, everywhere.](https://git.journeyman.cc/simon/post-scarcity/issues/18)
* 20: [Environment in stack frame.](https://git.journeyman.cc/simon/post-scarcity/issues/20)
* 21: [Temporary objects in a function must be bound to a locals slot in the stack frame](https://git.journeyman.cc/simon/post-scarcity/issues/21)

These, especially the last, mean a fundamental change not only to the Lisp calling convention, but also to everything which may create objects — even if they're never expected to be called directly from Lisp. Generally, **every** such thing must be called with the standard Lisp calling convention (and so potentially could be called directly from Lisp), except for those very rare things where calling them with the standard calling convention would cause a runaway infinite recursion (the obvious ones are the constructors for `stack_frame` and `cons`, but there may be others); and the Lisp calling convention has to change.

Which means a lot of things which have already been written for `0.1.0` have to change.

So I have this morning started a new feature branch, `feature/reengineering-17-21`, to work on these four issues together; and I think the first thing to do is to audit the existing code for functions that are affected by these changes (I mean: *every* Lisp-callable function is affected by 20, but apart from that).
This may also resolve the [`MANAGED_POINTER_ONLY`](https://git.journeyman.cc/simon/post-scarcity/src/commit/812a1be7d9eb97c25aa07477eb71605b1af93397/src/c/payloads/function.h#L16) issue (see [20260415](#20260415)). I *may* leave that in as a compile time switch because passing the unmanaged pointer is certainly a performance optimisation, but it will make writing the compiler a bit harder.

I'm not ignoring the fact that a lot of stuff in `0.1.0` is still fundamentally broken, and the REPL still doesn't work; but getting the calling convention right at this point is still the right thing to do, and won't make any of those problems worse. Indeed, it may resolve some of them.

I think this week is going to be mostly a thinking week — partly because the weather forecast is unusually benign, and it would be sensible to get some outdoor work done.

### 21:30

Right, I have spent a lot of time hauling timber out of the wood today, but I've also done a substantial amount of coding, doing a sort of hybrid not-quite-standard-lisp calling convention; and I'm now convinced all this work is wrong and needs to be backed out, and I need to go for full on Lisp calling convention.
So where I'm now calling `make_cons` as in this sample:

```c
struct pso_pointer c_reverse( struct pso4* frame, struct pso_pointer sequence ) {
    struct pso_pointer result = nil;

    for ( struct pso_pointer cursor = sequence; !nilp( cursor );
          cursor = c_cdr( cursor ) ) {
        result = make_cons( frame, c_car( cursor ), result );
    }

    return result;
}
```

we would instead be doing this:

```c
struct pso_pointer reverse( struct pso_pointer frame ) {
    struct pso_pointer sequence = fetch_arg( frame, 0 );
    struct pso_pointer result = nil;

    for ( struct pso_pointer cursor = sequence; !nilp( cursor );
          cursor = cdr( make_frame( 1, frame, cursor ) ) ) {
        result = cons( make_frame( 2, frame,
                                   car( make_frame( 1, frame, cursor ) ),
                                   result ) );
    }

    return result;
}
```

Note that instead of `c_reverse`, `c_cdr`, `c_car` this is using `reverse`, `cdr`, `car`. That's because these are actual Lisp functions, callable from Lisp, which don't have to be duplicated or wrapped in Lisp-compatible wrappers.

This *has* to be the right way to go.

## 20260415

OK, I have been diverted down a side-project on a side-project. I decided that since Post Scarcity definitely needs a compiler, I should learn to write a compiler, and so I should start by writing one for a simpler Lisp than Post Scarcity. So I started to write [one in Guile Scheme for Beowulf](https://git.journeyman.cc/simon/naegling). This is started but a long way from finished. I'm also not very enamoured of Guile Scheme, and am starting to wonder whether in fact I should be writing it in [Beowulf](https://git.journeyman.cc/simon/beowulf) for Beowulf.

I do believe I can complete the Naegling/Beowulf compiler, and that having written it, I can write a Post Scarcity compiler in Post Scarcity. But to do that I still need to have at least all of

* apply
* assoc
* bind! (or put! or set!, but I *think* I prefer `bind!`)
* car
* cdr
* cons
* cond
* eq?
* equal?
* eval
* λ
* nil
* print
* read
* t

and, essentially, have all the parts of a working REPL.
My brain is not working very well at present; I can't do more than a very few hours of focussed work a day, and jumping between Naegling and Post Scarcity is probably not a good plan; but in periods when I need to do thinking about where I'm going with Naegling I may switch to Post Scarcity (and vice versa).

### Standard signature for compiled functions

While I'm on this, I'm wondering whether I've got the standard signature for compiled functions right. What we've inherited from the `0.0.X` branch is documented as:

```c
/**
 * pointer to a function which takes a cons pointer (representing
 * its argument list) and a cons pointer (representing its environment) and a
 * stack frame (representing the previous stack frame) as arguments and returns
 * a cons pointer (representing its result).
 * \todo check this documentation is current!
 */
struct cons_pointer ( *executable ) ( struct stack_frame *,
                                      struct cons_pointer,
                                      struct cons_pointer );
```

But actually the documentation here is wrong, because what we actually pass is a C pointer to a stack frame object (which in `0.0.X` is in vector space), a cons pointer to the cons space object which is the vector pointer to that stack frame, and a cons pointer to the environment.

We definitely don't need to pass a pointer to the argument list (and in fact we didn't before, the documentation is *wrong*); we also don't need to pass both a C pointer and a cons pointer to the frame, since the frame is now in paged space, so passing our managed pointer is enough.

It *might be* that passing both an unmanaged and a managed pointer is worth doing, since recovering the managed pointer from the unmanaged pointer is very expensive, and while recovering the unmanaged pointer from the managed pointer is cheap, it isn't free. But it's worth thinking about.

## 20260331

Substrate layer `print` is written; all the building blocks for substrate layer `read` are in place.
This will read far less than the 0.0.6 reader, but it will be extensible with read macros *written in Lisp*, so much more flexible, and will gradually grow to read more than the non-extensible 0.0.6 reader did. Pleased with myself.

The new print may grow to be extensible in Lisp, as well. In fact, it will have to!

## 20260326

Most of the memory architecture of the new prototype is now roughed out, but in C, not in a more modern language. It doesn't compile yet.

My C is getting better... but it needed to!

## 20260323

I started an investigation of the [Zig language](https://ziglang.org/) and have come away frustrated. It's definitely an interesting language, and *I think* one capable of doing what I want. But in trying to learn, I checked out someone else's [Lisp interpreter in Zig](https://github.com/cryptocode/bio). The last commit to this project was six months ago, so fairly current; project documentation is polished, implying the project is well advanced and by someone competent.

It won't build.

It won't build because there are breaking changes to the build system in the current version of Zig, and, according to helpful people on the Zig language Discord, breaking changes in Zig versions are quite frequent.

Post-scarcity is a project which proceeds slowly, and is very large indeed. I will certainly not complete it before I die. I don't feel unstable tools are a good choice.

I have, however, done more thinking about [Paged space objects], and think I now have a buildable specification.

## 20260319

Right, the `member?` bug [is fixed](https://git.journeyman.cc/simon/post-scarcity/issues/11). There are, of course, lots more bugs. But I nevertheless propose to release 0.0.6 **now**, because there will always be more bugs, quite a lot works, and I'm thinking about completely rearchitecting the memory system and, at the same time, trying once more to move away from C.

The reasons are given in [this essay](The-worlds-slowest-ever-rapid-prototype.md).
This, of course, completely invalidates the [roadmap](Roadmap.md) that I wrote less than a month ago, but that's because I really have been thinking seriously about the future of this project.

## 20260316

OK, where we're at:

* The garbage collector is doing *even worse* than it was on 4th February, when I did the last serious look at it.
* The bignum bugs are not fixed.
* You can (optionally) limit runaway stack crashes with a new command line option.
* If you enable the stack limiter feature, `(member? 5 '(1 2 3 4))` returns `nil`, as it should, and does not throw a stack limit exception, but if you do not enable it, `(member? 5 '(1 2 3 4))` causes a segfault. WTAF?

## 20260314

When I put a debugger on it, the stack limit bug proved shallow.

I'm tempted to further exercise my debugging skills by having another go at the bignum arithmetic problems.

However, I've been rethinking the roadmap of the project, and written a long [blog post about it](https://www.journeyman.cc/blog/posts-output/2026-03-13-The-worlds-slowest-ever-rapid-prototype/). This isn't a finalised decision yet, but it is something I'm thinking about.

## 20260311

I've still been having trouble with runaway recursion — in `member`, but due to a primitive bug I haven't identified — so this morning I've tried to implement a stack limit feature. This has been a real fail at this stage. Many more tests are breaking.

However, I think having a configurable stack limit would be a good thing, so I'm not yet ready to abandon this feature. I need to work out why it's breaking things.

## 20260226

The bug in `member` turned out to be because when a symbol is read by the reader, it has a null character appended as its last character, after all the visibly printing characters. When the type string is being generated, it doesn't. I've fudged this for now by giving the type strings an appended null character, but the right solution is almost certainly to not add the null character in either case — i.e.
revert today's 'fix' and instead fix the reader.

I've also done a lot of documentation, and I've found the courage to do some investigation on the bignum bug. However, I've worked until 04:00, which is neither sane nor healthy, so I shall stop.

## 20260225

A productive day!

I awoke with a plan to fix `cond`. This morning, I executed it, and it worked. This afternoon, I fixed `let`. And this evening, I greatly improved `equal`.

The bug in `member` is still unresolved.

We're getting very close to the release of 0.0.6.

## 20260224

Found a bug in subtraction, which I hoped might be a clue into the bignum bug; but it proved just to be a careless bug in the small integer cache code (and therefore a new regression). Fixed this one, easily.

In the process spotted a new bug in subtracting rationals, which I haven't yet looked at.

Currently working on a bug which is either in `let` or `cond`, which is leading to non-terminating recursion...

H'mmm, there are bugs in both.

#### `let`

The unit test for `let` is segfaulting. That's a new regression today, because in last night's build it didn't segfault. I don't know what's wrong, but to be honest I haven't looked very hard because I'm trying to fix the bug in `cond`.

#### `cond`

The unit test for `cond` still passes, so the bug that I'm seeing is not triggered by it. So it's not necessarily a new bug. What's happening? Well, `member` doesn't terminate.

The definition is as follows:

```lisp
(set! nil?
  (lambda (o)
    "`(nil? object)`: Return `t` if object is `nil`, else `nil`."
    (= o nil)))

(set! member
  (lambda (item collection)
    "`(member item collection)`: Return `t` if this `item` is a member of this `collection`, else `nil`."
    (cond
      ((nil? collection) nil)
      ((= item (car collection)) t)
      (t (member item (cdr collection))))))
```

In the execution trace, with tracing of bind, eval and lambda enabled, I'm seeing this loop on the stack:

```
Stack frame with 1 arguments: Context: <= (member item (cdr collection)) <= ((nil?
collection) nil) <= (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) <= "LMDA" Arg 0: CONS count: 6 value: (member item (cdr collection)) Stack frame with 3 arguments: Context: <= ((nil? collection) nil) <= (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) <= "LMDA" <= (member item (cdr collection)) Arg 0: CONS count: 7 value: ((nil? collection) nil) Arg 1: CONS count: 7 value: ((= item (car collection)) t) Arg 2: CONS count: 7 value: (t (member item (cdr collection))) Stack frame with 1 arguments: Context: <= (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) <= "LMDA" <= (member item (cdr collection)) <= ((nil? collection) nil) Arg 0: CONS count: 8 value: (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) Stack frame with 2 arguments: Context: <= "LMDA" <= (member item (cdr collection)) <= ((nil? collection) nil) <= (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) Arg 0: STRG count: 19 value: "LMDA" Arg 1: NIL count: 4294967295 value: nil Stack frame with 1 arguments: Context: <= (member item (cdr collection)) <= ((nil? collection) nil) <= (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) <= "LMDA" Arg 0: CONS count: 6 value: (member item (cdr collection)) Stack frame with 3 arguments: Context: <= ((nil? collection) nil) <= (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) <= "LMDA" <= (member item (cdr collection)) Arg 0: CONS count: 7 value: ((nil? collection) nil) Arg 1: CONS count: 7 value: ((= item (car collection)) t) Arg 2: CONS count: 7 value: (t (member item (cdr collection))) Stack frame with 1 arguments: Context: <= (cond ((nil? 
collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) <= "LMDA" <= (member item (cdr collection)) <= ((nil? collection) nil) Arg 0: CONS count: 8 value: (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) Stack frame with 2 arguments: Context: <= "LMDA" <= (member item (cdr collection)) <= ((nil? collection) nil) <= (cond ((nil? collection) nil) ((= item (car collection)) t) (t (member item (cdr collection)))) Arg 0: STRG count: 19 value: "LMDA" Arg 1: NIL count: 4294967295 value: nil ``` This then just goes on, and on, and on. The longest instance I've got the trace of wound up more than a third of a million stack frames before I killed it. What appears to be happening is that the cond clause ```lisp ((nil? collection) nil) ``` Executes correctly and returns nil; but that instead of terminating the cond expression at that point it continues and executes the following two clauses, resulting in (infinite) recursion. This is bad. But what's worse is that the clause ```lisp ((= item (car collection)) t) ``` also doesn't terminate the `cond` expression, even when it should. And the reason? From the trace, it appears that clauses *never* succeed. But if that's true, how come the unit tests are passing? Problem for another day. I'm not going to commit today's work to git, because I don't want to commit something I know segfaults. ## 20260220 ### State of the build The only unit tests that are failing now are the bignum tests, which I have consciously parked as a future problem, and the memory leak, similarly. The leak is a lot less bad than it was, but I'm worried that stack frames are not being freed. If you run ``` cat lisp/fact.lisp | target/psse -d 2>&1 |\ grep 'Vector space object of type' | sort | uniq -c | sort -rn ``` you get a huge number (currently 394) of stack frames in the memory dump; they should all have been reclaimed. 
There's other stuff in the memory dump as well:

```
    422 CONS  ;; cons cells, obviously
    394 VECP  ;; pointers to vector space objects -- specifically, the stack frames
    335 SYMB  ;; symbols
    149 INTR  ;; integers
     83 STRG  ;; strings
     46 FUNC  ;; primitive (i.e. written in C) functions
     25 KEYW  ;; keywords
     10 SPFM  ;; primitive special forms
      3 WRIT  ;; write streams: `*out*`, `*log*`, `*sink*`
      1 TRUE  ;; t
      1 READ  ;; read stream: `*in*`
      1 NIL   ;; nil
      1 LMDA  ;; lambda function, specifically `fact`
```

Generally, for each character in a string, symbol or keyword there will be one cell (`STRG`, `SYMB`, or `KEYW`), so the high number of STRG cells is not especially surprising.

It looks as though none of the symbols bound in the oblist are being recovered on exit, which is undesirable but not catastrophic, since it's a fixed burden of memory which isn't expanding.

But the fact that stack frames aren't being reclaimed is serious.

### Update, 19:31

Right, investigating this more deeply, I found that `make_empty_frame` was doing an `inc_ref` it should not have been. Having fixed that I'm down to 27 frames left in the dump. That's very close to the number which will be generated by running `(fact 25)`, so I expect it is now only stack frames for interpreted functions which are not being reclaimed. This gives me something to work on!

## 20260215

Both of yesterday's regressions are fixed. Memory problem still in much the same state.

> Allocation summary: allocated 1210; deallocated 10; not deallocated 1200.

That left the add ratios problem which was deeper. I had unintended unterminated recursion happening there. :-(

It burned through 74 cons pages each of 1,024 cons cells, total 76,800 cells, and 19,153 stack frames, before it got there; and then threw the exception back up through each of those 19,153 stack frames. But the actual exception message was `Unrecognised tag value 0 ( )`, which is not enormously helpful.
However, once I had recognised what the problem was, it was quickly fixed, with the added bonus that the new solution will automatically work for bignum fractions once bignums are working.

So we're down to eight unit tests failing: the memory leak, one unimplemented feature, and the bignum problem.

At the end of the day I decided to chew up some memory by doing a series of moderately large computations, to see how much memory is being successfully deallocated.

```lisp
:: (mapcar fact '(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20))

(1 2 6 24 120 720 5,040 40,320 362,880 3,628,800 39,916,800 479,001,600 1,932,053,504 1,278,945,280 2,004,310,016 2,004,189,184 4,006,445,056 3,396,534,272 109,641,728 2,192,834,560)
::

Allocation summary: allocated 10136; deallocated 548; not deallocated 9588.
```

So, about 5%. This is still a major problem, and is making me doubt my reference counting strategy.

Must do better!

Note that the reason that the numbers become erratic past about two billion is the bignum arithmetic bug.

## 20260214

### Memory leaks

The amount of memory I'm leaking is now down by an order of magnitude, but the problem is not fixed. Better, not good enough. And although I'm aware of the extent to which Lisp objects are not being reclaimed, there may also be transient C objects (chiefly strings) which are also not being freed. This is an ongoing process.

But you'll remember that a week ago my base case was:

> Allocation summary: allocated 19986; deallocated 245; not deallocated 19741.

Now it's

> Allocation summary: allocated 1188; deallocated 10; not deallocated 1178.

That is better.

### Unit tests

The unit test system got into a mess because the bignum tests are failing.
But because I know some tests are failing, and the bignum problem feels so intractable that I don't want to tackle it, I've been ignoring the fact that tests are failing; which means I've missed regressions — until I started to get an 'Attempt to take value of unbound symbol' exception for `nil`, which is extremely serious and broke a lot of things. That arose out of work on the 'generalised key/value stores' feature, logged under [#20260203](20260203), below. However, because I wasn't paying attention to failing tests, it took me a week to find and fix it. But I've fixed that one. And I've put a lot of work into [cleaning up the unit tests](https://git.journeyman.cc/simon/post-scarcity/commit/222368bf640a0b79d57322878dee42ed58b47bd6). There is more work to do on this. ### Documentation I'm also gradually working through cleaning up documentation. ### Regressions Meantime we have some regressions which are serious, and must be resolved. #### equals The core function `equals` is now failing, at least for integers. Also. ```lisp (= 0.75 3/4) ``` fails because I've never implemented a method for it, which I ought. #### cond The current unit test for `cond` and that for `recursion` both fail but *I think* this is because `equals` is failing. #### rational arithmetic I have a horrible new regression in rational arithmetic which looks as though something is being freed when it shouldn't be. #### All tests failing as at 20260214 As follows: 1. unit-tests/bignum-expt.sh => (expt 2 61): Fail: expected '2305843009213693952', got '' 2. unit-tests/bignum-expt.sh => (expt 2 64): Fail: expected '18446744073709551616', got '' 3. unit-tests/bignum-expt.sh => (expt 2 65): Fail: expected '36893488147419103232', got '' 4. unit-tests/bignum-print.sh => unit-tests/bignum-print.sh => printing 576460752303423488: Fail: expected '576460752303423488', got '0' 5. unit-tests/bignum-print.sh => printing 1152921504606846976: Fail: expected '1152921504606846976', got '0' 6. 
unit-tests/bignum-print.sh => printing 1152921504606846977: Fail: expected '1152921504606846977', got '1'
7. unit-tests/bignum-print.sh => printing 1329227995784915872903807060280344576: Fail: expected '1329227995784915872903807060280344576', \n got '0'
8. unit-tests/bignum.sh => unit-tests/bignum.sh => Fail: expected '1,152,921,504,606,846,976', got '0'
9. unit-tests/bignum-subtract.sh => unit-tests/bignum-subtract.sh => subtracting 1 from 1152921504606846976: Fail: expected '1152921504606846975', got '4294967295'
10. unit-tests/bignum-subtract.sh => subtracting 1 from 1152921504606846977: Fail: expected '1152921504606846976', got '0'
11. unit-tests/bignum-subtract.sh => subtracting 1 from 1152921504606846978: Fail: expected '1152921504606846977', got '1'
12. unit-tests/bignum-subtract.sh => subtracting 1152921504606846977 from 1: Fail: expected '-1152921504606846976', got '0'
13. unit-tests/bignum-subtract.sh => subtracting 10000000000000000000 from 20000000000000000000: Fail: expected '10000000000000000000', got '2313682944'
14. unit-tests/cond.sh => unit-tests/cond.sh: cond with one clause... Fail: expected '5', got 'nil'
15. unit-tests/memory.sh => Fail: expected '1188', got '10'
16. unit-tests/ratio-addition.sh => Fail: expected '1/4', got 'Error: Unrecognised tag value 4539730 ( REE)'
17. unit-tests/recursion.sh => Fail: expected 'nil 3,628,800', got ''

### New master version

I haven't done a 'release' of Post Scarcity since September 2021, because I've been so despondent about the bignum problem. But actually a lot of this *is* usable, and it's at least sufficiently interesting that other people might want to play with it, and possibly even might fix some bugs.

So I'm currently planning to release a new master before the end of this month, and publicise it.
## 20260204

### Testing what is leaking memory

#### Analysis

If you just start up and immediately abort the current build of psse, you get:

> Allocation summary: allocated 19986; deallocated 245; not deallocated 19741.

Allocation summaries from the current unit tests give the following ranges of values:

|                 | Min   | Max   |
| --------------- | ----- | ----- |
| Allocated       | 19991 | 39009 |
| Deallocated     | 238   | 1952  |
| Not deallocated | 19741 | 37057 |

The numbers go up broadly in sync with one another; that is to say, broadly, as the number allocated rises, so do both the numbers deallocated and the numbers not deallocated. But not exactly.

#### Strategy: what doesn't get cleaned up?

Write a test wrapper which reads a file of forms, one per line, from standard input, and passes each in turn to a fresh invocation of psse, reporting the form and the allocation summary.

```bash
#!/bin/bash

# Feed each form (one per line on standard input) to a fresh psse process
# and report its allocation summary alongside the form.
while IFS= read -r form; do
    allocation=$(echo "${form}" | ../../target/psse 2>&1 | grep Allocation)
    echo "* ${allocation}: ${form}"
done
```

So, from this:

* Allocation summary: allocated 19986; deallocated 245; not deallocated 19741.:
* Allocation summary: allocated 19990; deallocated 249; not deallocated 19741.: ()
* Allocation summary: allocated 20019; deallocated 253; not deallocated 19766.: nil

Allocating an empty list allocates four additional cells, all of which are deallocated. Allocating 'nil' allocates a further **29** cells, 25 of which are not deallocated. WTF?

Following further work I have this, showing the difference added to the base case of cells allocated, cells deallocated, and, most critically, cells not deallocated.

From this we see that reading and printing `nil` allocates an additional 33 cells, of which only eight are cleaned up, leaving 25 behind. That's startling, and worrying.

But the next row shows us that reading and printing an empty list costs only four cells, each of which is cleaned up.
Further down the table we see that an empty map is also correctly cleaned up. Where we're leaking memory is in reading (or printing, although I doubt this) symbols, either atoms, numbers, or keywords (I haven't yet tried strings, but I expect they're similar.)

| **Case**                          | **Delta Allocated** | **Delta Deallocated** | **Delta Not Deallocated** |
| --------------------------------- | ------------------- | --------------------- | ------------------------- |
| **Basecase**                      | 0                   | 0                     | 0                         |
| **nil**                           | 33                  | 8                     | 25                        |
| **()**                            | 4                   | 4                     | 0                         |
| **(quote ())**                    | 39                  | 2                     | 37                        |
| **(list )**                       | 37                  | 12                    | 25                        |
| **(list 1)**                      | 47                  | 14                    | 33                        |
| **(list 1 1)**                    | 57                  | 16                    | 41                        |
| **(list 1 1 1)**                  | 67                  | 18                    | 49                        |
| **(list 1 2 3)**                  | 67                  | 18                    | 49                        |
| **(+)**                           | 36                  | 10                    | 26                        |
| **(+ 1)**                         | 44                  | 12                    | 32                        |
| **(+ 1 1)**                       | 53                  | 14                    | 39                        |
| **(+ 1 1 1)**                     | 62                  | 16                    | 46                        |
| **(+ 1 2 3)**                     | 62                  | 16                    | 46                        |
| **(list 'a 'a 'a)**               | 151                 | 33                    | 118                       |
| **(list 'a 'b 'c)**               | 151                 | 33                    | 118                       |
| **(list :a :b :c)**               | 121                 | 15                    | 106                       |
| **(list :alpha :bravo :charlie)** | 485                 | 15                    | 470                       |
| **{}**                            | 6                   | 6                     | 0                         |
| **{:z 0}**                        | 43                  | 10                    | 33                        |
| **{:zero 0}**                     | 121                 | 10                    | 111                       |
| **{:z 0 :o 1}**                   | 80                  | 11                    | 69                        |
| **{:zero 0 :one 1}**              | 210                 | 14                    | 196                       |
| **{:z 0 :o 1 :t 2}**              | 117                 | 12                    | 105                       |

Looking at the entries, we see that

1. each number read costs ten allocations, of which only two are successfully deallocated;
2. the symbol `list` costs 33 cells, of which 25 are not deallocated, whereas the symbol `+` costs only one cell fewer, and an additional cell is not deallocated. So it doesn't seem that cell allocation scales with the length of the symbol;
3. Keyword allocation does scale with the length of the keyword, apparently, since `(list :a :b :c)` allocates 121 and deallocates 15, while `(list :alpha :bravo :charlie)` allocates 485 and deallocates the same 15;
4. The fact that both those two deallocate 15, and an addition of three numbers `(+ 1 2 3)` or `(+ 1 1 1)` deallocates 16, suggests to me that the list structure is being fully reclaimed but atoms are not being.
5. The atom `'a` costs more to read than the keyword `:a` because the reader macro is expanding `'a` to `(quote a)` behind the scenes.

### The integer allocation bug

Looking at what happens when we read a single digit number, we get the following:

```
2 Entering make_integer Allocated cell of type 'INTR' at 19, 507 make_integer: returning INTR (1381256777) at page 19, offset 507 count 0 Integer cell: value 0, count 0 Entering make_integer Allocated cell of type 'INTR' at 19, 508 make_integer: returning INTR (1381256777) at page 19, offset 508 count 0 Integer cell: value 10, count 0 Entering make_integer Allocated cell of type 'INTR' at 19, 509 make_integer: returning INTR (1381256777) at page 19, offset 509 count 0 Integer cell: value 2, count 0 Entering make_integer Allocated cell of type 'INTR' at 19, 510 make_integer: returning INTR (1381256777) at page 19, offset 510 count 0 Integer cell: value 0, count 0 Entering make_integer Allocated cell of type 'INTR' at 19, 506 make_integer: returning INTR (1381256777) at page 19, offset 506 count 0 Integer cell: value 0, count 0 Entering make_integer Allocated cell of type 'INTR' at 19, 505 make_integer: returning INTR (1381256777) at page 19, offset 505 count 0 Integer cell: value 0, count 0 Entering make_integer Allocated cell of type 'INTR' at 19, 504 make_integer: returning INTR (1381256777) at page 19, offset 504 count 0 Integer cell: value 0, count 0 Allocated cell of type 'STRG' at 19, 503 Freeing cell STRG (1196577875) at page 19, offset 503 count 0 String cell: character '2' (50) with hash 0; next at page 0 offset 0, count 0 value: "2" Freeing cell INTR (1381256777) at page 19, offset 504 count 0 Integer cell: value 2, count 0 2 Allocated cell of type 'SYMB' at 19, 504 Allocated cell of type 'SYMB' at 19, 503
Allocated cell of type 'SYMB' at 19, 502 Allocated cell of type 'SYMB' at 19, 501 Freeing cell SYMB (1112365395) at page 19, offset 501 count 0 Symbol cell: character '*' (42) with hash 485100; next at page 19 offset 502, count 0 value: *in* Freeing cell SYMB (1112365395) at page 19, offset 502 count 0 Symbol cell: character 'i' (105) with hash 11550; next at page 19 offset 503, count 0 value: in* Freeing cell SYMB (1112365395) at page 19, offset 503 count 0 Symbol cell: character 'n' (110) with hash 110; next at page 19 offset 504, count 0 value: n* Freeing cell SYMB (1112365395) at page 19, offset 504 count 0 Symbol cell: character '*' (42) with hash 0; next at page 0 offset 0, count 0 value: * ``` Many things are worrying here. 1. The only thing being freed here is the symbol to which the read stream is bound — and I didn't see where that got allocated, but we shouldn't be allocating and tearing down a symbol for every read! This implies that when I create a string with `c_string_to_lisp_string`, I need to make damn sure that that string is deallocated as soon as I'm done with it — and wherever I'm dealing with symbols which will be referred to repeatedly in `C` code, I need either 1. to bind a global on the C side of the world, which will become messy; 2. or else write a hash function which returns, for a `C` string, the same value that the standard hashing function will return for the lexically equivalent `Lisp` string, so that I can search hashmap structures from C without having to allocate and deallocate a fresh copy of the `Lisp` string; 3. In reading numbers, I'm generating a fresh instance of `Lisp zero` and `Lisp ten`, each time `read_integer` is called, and I'm not deallocating them. 4. I am correctly deallocating the number I did read, though! ## 20260203 I'm consciously avoiding the bignum issue for now. My current thinking is that if the C code only handles 64 bit integers, and bignums have to be done in Lisp code, that's perfectly fine with me. 
### Hashmaps, assoc lists, and generalised key/value stores I now have the oblist working as a hashmap, and also hybrid assoc lists which incorporate hashmaps working. I don't 100% have consistent methods for reading stores which may be plain old assoc lists, new hybrid assoc lists, or hashmaps working but it isn't far off. This also takes me streets further towards doing hierarchies of hashmaps, allowing my namespace idea to work — and hybrid assoc lists provide a very sound basis for building environment structures. Currently all hashmaps are mutable, and my doctrine is that that is fixable when access control lists are actually implemented. #### assoc The function `(assoc store key) => value` should be the standard way of getting a value out of a store. #### put! The function `(put! store key value) => store` should become the standard way of setting a value in a store (of course, if the store is an assoc list or an immutable map, a new store will be returned which holds the additional key/value binding). 
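
The contract described above for `assoc` and `put!` on an immutable store can be illustrated with a minimal C sketch. Everything in it is hypothetical: the `struct binding` type, the `sketch_put` and `sketch_assoc` names, and the use of C strings and `int` values are assumptions made purely for illustration, and bear no relation to the real cell layout or to the hashmap-backed stores. It only shows the assoc-list case, where putting a value returns a new store that shares the old one as its tail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified store: an immutable assoc list of
 * (string key . int value) bindings. NULL is the empty store. */
struct binding {
    const char *key;
    int value;
    const struct binding *next;  /* older bindings, shared and never modified */
};

/* Sketch of `put!` on an immutable store: returns a NEW store in which
 * `key` is bound to `value`; the store passed in is left untouched. */
const struct binding *sketch_put(const struct binding *store,
                                 const char *key, int value) {
    assert(key != NULL);
    struct binding *b = malloc(sizeof *b);
    if (b == NULL) {
        abort();  /* a real implementation would raise a Lisp exception */
    }
    b->key = key;
    b->value = value;
    b->next = store;
    return b;
}

/* Sketch of `assoc`: finds the most recent binding of `key`.
 * Returns 1 and writes the value to *out if found, otherwise returns 0. */
int sketch_assoc(const struct binding *store, const char *key, int *out) {
    for (const struct binding *cursor = store; cursor != NULL;
         cursor = cursor->next) {
        if (strcmp(cursor->key, key) == 0) {
            if (out != NULL) {
                *out = cursor->value;
            }
            return 1;
        }
    }
    return 0;
}
```

Because a newer binding shadows an older one without overwriting it, any code still holding the old store continues to see the old value; a mutable hashmap store would instead update in place and return itself.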
### State of unit tests Currently: > Tested 45, passed 39, failed 6 But the failures are as follows: ``` unit-tests/bignum-add.sh => checking a bignum was created: Fail unit-tests/bignum-add.sh => adding 1152921504606846977 to 1: Fail: expected 't', got 'nil' unit-tests/bignum-add.sh => adding 1 to 1152921504606846977: Fail: expected 't', got 'nil' unit-tests/bignum-add.sh => adding 1152921504606846977 to 1152921504606846977: Fail: expected 't', got 'nil' unit-tests/bignum-add.sh => adding 10000000000000000000 to 10000000000000000000: Fail: expected 't', got 'nil' unit-tests/bignum-add.sh => adding 1 to 1329227995784915872903807060280344576: Fail: expected 't', got 'nil' unit-tests/bignum-add.sh => adding 1 to 3064991081731777716716694054300618367237478244367204352: Fail: expected 't', got 'nil' unit-tests/bignum-expt.sh => (expt 2 60): Fail: expected '1152921504606846976', got '1' unit-tests/bignum-expt.sh => (expt 2 61): Fail: expected '2305843009213693952', got '2' unit-tests/bignum-expt.sh => (expt 2 64): Fail: expected '18446744073709551616', got '16' unit-tests/bignum-expt.sh => (expt 2 65): Fail: expected '36893488147419103232', got '32' unit-tests/bignum-print.sh => printing 1152921504606846976: Fail: expected '1152921504606846976', got '1' unit-tests/bignum-print.sh => printing 1152921504606846977: Fail: expected '1152921504606846977', got '2' unit-tests/bignum-print.sh => printing 1329227995784915872903807060280344576: Fail: expected '1329227995784915872903807060280344576', \n got '1151321504605245376' unit-tests/bignum.sh => unit-tests/bignum.sh => Fail: expected '1,152,921,504,606,846,976', got '1' unit-tests/bignum-subtract.sh => unit-tests/bignum-subtract.sh => subtracting 1 from 1152921504606846976: Fail: expected '1152921504606846975', got '0' unit-tests/bignum-subtract.sh => subtracting 1 from 1152921504606846977: Fail: expected '1152921504606846976', got '1' unit-tests/bignum-subtract.sh => subtracting 1 from 1152921504606846978: Fail: expected 
'1152921504606846977', got '2' unit-tests/bignum-subtract.sh => subtracting 1152921504606846977 from 1: Fail: expected '-1152921504606846976', got '1' unit-tests/bignum-subtract.sh => subtracting 10000000000000000000 from 20000000000000000000: Fail: expected '10000000000000000000', got '-376293541461622793' unit-tests/memory.sh ``` In other words, all failures are in bignum arithmetic **except** that I still have a major memory leak due to not decrefing somewhere where I ought to. ### Zig I've also experimented with autotranslating my C into Zig, but this failed. Although I don't think C is the right language for implementing my base Lisp in, it's what I've got; and until I can get some form of autotranslate to bootstrap me into some more modern systems language, I think I need to stick with it. ## 20250704 Right, I'm getting second and subsequent integer cells with negative values, which should not happen. This is probably the cause of (at least some of) the bignum problems. I need to find out why. This is (probably) fixable. ```lisp :: (inspect 10000000000000000000) INTR (1381256777) at page 3, offset 873 count 2 Integer cell: value 776627963145224192, count 2 BIGNUM! More at: INTR (1381256777) at page 3, offset 872 count 1 Integer cell: value -8, count 1 ``` Also, `print` is printing bignums wrong on ploughwright, but less wrong on mason, which implies a code difference. Investigate. ## 20250314 Thinking further about this, I think at least part of the problem is that I'm storing bignums as cons-space objects, which means that the integer representation I can store has to fit into the size of a cons pointer, which is 64 bits. Which means that to store integers larger than 64 bits I need chains of these objects. If I stored bignums in vector space, this problem would go away (especially as I have not implemented vector space yet). 
However, having bignums in vector space would cause a churn of non-standard-sized objects in vector space, which would mean much more frequent garbage collection, which has to be mark-and-sweep because the objects are of unequal sizes; otherwise you get heap fragmentation.

So maybe I just have to put more work into debugging my cons-space bignums.

Bother, bother.

There are no perfect solutions.

However, however, it's only the node that's short on vector space which has to pause to do a mark and sweep. It doesn't interrupt any other node, because their reference to the object will remain the same, even if it is the 'home node' of the object which is sweeping. So all the node has to do is set its busy flag, do GC, and clear its busy flag. The rest of the system can just be carrying on as normal.

So... maybe mark and sweep isn't the big deal I think it is?

## 20250313

OK, the 60 bit integer cell happens in `int128_to_integer` in `arith/integer.c`. It seems to be being done consistently; but there is no obvious reason. `MAX_INTEGER` is defined in `arith/peano.h`. I've changed both to use 63 bits, and this makes no change to the number of unit tests that fail.

With this change, `(fact 21)`, which was previously printing nothing, now prints a value, `11,891,611,015,076,642,816`. However, this value is definitely wrong; it should be `51,090,942,171,709,440,000`. But I hadn't fixed the shift in `integer_to_string`; have now... still no change in number of failed tests...

But `(fact 21)` gives a different wrong value, `4,974,081,987,435,560,960`. Factorial values returned by `fact` are correct (agree with SBCL running the same code) up to `(fact 20)`, with both 60 bit integer cells and 63 bit integer cells giving correct values.

Uhhhmmm... but I'd missed two other places where I'd had the number of significant bits as a numeric literal. Fixed those and now `(fact 21)` does not return a printable answer at all, although the internal representation is definitely wrong.
So we may be seeing why I chose 60 bits.

Bother.

## 20250312

Printing of bignums definitely doesn't work; I'm not persuaded that reading of bignums works right either, and there are probably problems with bignum arithmetic too.

The internal memory representation of a number rolls over from one cell to two cells at 1152921504606846976, and I'm not at all certain why it does because this is neither 2<sup>63</sup> nor 2<sup>64</sup>.

| Expression     | Value                |
| -------------- | -------------------- |
| 2<sup>62</sup> | 4611686018427387904  |
| 2<sup>63</sup> | 9223372036854775808  |
| 2<sup>64</sup> | 18446744073709551616 |
| Mystery number | 1152921504606846976  |

In fact, our mystery number turns out (by inspection) to be 2<sup>60</sup>. But **why**?
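
The "by inspection" step can also be checked mechanically. The following is a small C sketch, not part of the code base; the function name `sketch_exact_log2` is hypothetical. It returns the exponent when a 64 bit value is an exact power of two, and -1 otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Return n if value == 2^n exactly, otherwise -1. */
int sketch_exact_log2(uint64_t value) {
    /* A power of two has exactly one bit set, so clearing the lowest
     * set bit with value & (value - 1) must leave zero. */
    if (value == 0 || (value & (value - 1)) != 0) {
        return -1;
    }
    int n = 0;
    while (value > 1) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Self-check against the figures in the table above. */
void sketch_selfcheck(void) {
    assert(sketch_exact_log2(UINT64_C(4611686018427387904)) == 62);
    assert(sketch_exact_log2(UINT64_C(9223372036854775808)) == 63);
    assert(sketch_exact_log2(UINT64_C(1152921504606846976)) == 60);
}
```

2<sup>64</sup> itself does not fit in a `uint64_t`, so it is left out of the self-check; the point is only that the rollover value has a single bit set, at position 60.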