Added discussion of generating LLVM to README.

Simon Brooke 2026-05-10 09:52:11 +01:00
parent 0b2aa4e070
commit 82256d0012


@ -78,4 +78,31 @@ Or else, we could assemble the assembly language statements as a list of individ
But finally, and this will be my preferred outcome, one could create that list of assembly language statements in memory; one could estimate from it the size of the compiled function; one could malloc a block of memory of that size on the heap; and one could then assemble the function by writing bytes into that block of memory as specified in the assembly language statements.
If I'm going to use this side-project as an exercise to learn how to write the Post Scarcity compiler, the Post Scarcity compiler has got to be able to do that; so I should try.
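A minimal sketch of that last approach, under some loud assumptions: `struct insn`, `code_size`, `assemble` and `square_insns` are hypothetical names invented for illustration, and each statement is assumed to arrive already encoded, so the size here is exact rather than an estimate. The sample bytes encode `int square(int x)` for the System V x86-64 calling convention.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of one assembly statement, already encoded. */
struct insn {
    const char *text;   /* source text, kept only for reference */
    uint8_t bytes[15];  /* an x86-64 instruction is at most 15 bytes long */
    size_t len;         /* number of bytes actually used */
};

/* Size of the compiled function: the sum of the instruction lengths. */
size_t code_size(const struct insn *list, size_t count) {
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += list[i].len;
    }
    return total;
}

/* Allocate a block of that size on the heap and write the bytes into it.
   Returns NULL if allocation fails; the caller frees the block. */
uint8_t *assemble(const struct insn *list, size_t count, size_t *size_out) {
    size_t size = code_size(list, count);
    uint8_t *block = malloc(size);
    if (block == NULL) {
        return NULL;
    }
    size_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        memcpy(block + offset, list[i].bytes, list[i].len);
        offset += list[i].len;
    }
    assert(offset == size);  /* every byte of the block was written */
    *size_out = size;
    return block;
}

/* Illustrative statement list: int square(int x), argument in edi, result in eax. */
const struct insn square_insns[] = {
    { "mov eax, edi",  { 0x89, 0xF8 },       2 },
    { "imul eax, edi", { 0x0F, 0xAF, 0xC7 }, 3 },
    { "ret",           { 0xC3 },             1 },
};
```

Calling `assemble(square_insns, 3, &size)` yields a six-byte block. Memory from `malloc` is normally not executable, so actually calling the result would need something further, such as `mmap` or `mprotect` with `PROT_EXEC` on POSIX systems; that step is deliberately left out of this sketch.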
## Generating LLVM-IR
All of the Ghuloum-style compilers I've seen generate x86 assembly. Naegling (and the Post Scarcity compiler, which is in my mind the same project) could just do the same. But I think the resultant compiler would be more hardware-agnostic, and thus more portable, as well as potentially more efficient, if it generated [LLVM-IR](https://llvm.org/docs/LangRef.html) instead.
### Resources
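There's an [`-emit-llvm`](https://clang.llvm.org/docs/CommandGuide/clang.html#cmdoption-flto) option to the Clang C compiler which makes it emit LLVM-IR in place of native assembly or object code. That should be enough to bootstrap a compiler following the Ghuloum method, which proceeds by inspecting what a C compiler generates for small functions.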
I've found [a good introductory text on working with LLVM-IR](https://mcyoung.xyz/2023/08/01/llvm-ir/) by [Sunny Young de la Sota](https://mcyoung.xyz/). This introduced me to [an online tool, godbolt](https://godbolt.org/) which accepts source in a variety of languages including C, Scheme, Rust and Zig, and outputs corresponding LLVM-IR. This looks reasonably approachable.
There's a [discussion here on how to generate LLVM-IR from C in memory](https://stackoverflow.com/questions/34828480/generate-assembly-from-c-code-in-memory-using-libclang) which I haven't yet studied intensively but may be helpful.
### Discussion
LLVM-IR is more like an intermediate language than what I'd think of as a pure assembler; it doesn't strike me as particularly intimidating. Consider this example:
```llvm
define i32 @square(i32 %x) {
  %1 = mul i32 %x, %x
  ret i32 %1
}
```
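For comparison, C source along these lines should lower to IR close to the `square` example above. This is an assumption rather than captured compiler output: with something like `clang -S -emit-llvm -O1`, one would expect the body to match apart from extra flags and attributes (such as `nsw` on the `mul`), while unoptimised output would be noisier, with the argument spilled to a stack slot first.

```c
/* Assumed C counterpart of the @square function above;
   it does not come from the original README. */
int square(int x) {
    return x * x;
}
```

Writing small C functions like this and reading the IR that Clang emits for them is one way to apply the Ghuloum approach to LLVM-IR instead of x86 assembly.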
**However**, it seems that LLVM is opinionated about the stack: unless I am very careful about serialising my calls to the LLVM layer, it's going to build a shadow stack of its own, separate from and much less flexible than the Post Scarcity stack. I really do not want that, because it will crash me out of memory!
So I'm thinking about a layer that just runs a tight iterative loop. Each iteration generates a Lisp stack frame and evaluates a portion of a Lisp function until the next stack frame needs to be generated; it then leaves the state of that calculation in the prior Lisp stack frame and exits, and the loop iterates to do the next portion without generating a new LLVM stack frame. This is obviously not how LLVM is intended to be used and may be pretty awkward, but conceptually it seems possible.
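The loop described above resembles what is usually called a trampoline. Here is a rough sketch of the idea in plain C rather than LLVM-IR, with every name (`struct frame`, `sum_step`, `sum_to`) invented for illustration and no claim that Post Scarcity stack frames look like this. It evaluates the non-tail-recursive `sum(n) = n + sum(n - 1)` using heap-allocated frames, so the native stack never gets deeper than one step call.

```c
#include <stdint.h>
#include <stdlib.h>

struct frame;
typedef struct frame *(*step_fn)(struct frame *self, int64_t *result);

/* Hypothetical heap-allocated stand-in for a Lisp stack frame. */
struct frame {
    struct frame *prev;  /* the caller's frame */
    step_fn step;        /* code that runs, or resumes, this frame */
    int64_t n;           /* the argument */
    int pc;              /* 0 = not started, 1 = waiting on the callee */
};

static struct frame *sum_step(struct frame *self, int64_t *result);

static struct frame *new_frame(struct frame *prev, int64_t n) {
    struct frame *f = malloc(sizeof *f);
    if (f == NULL) {
        abort();  /* out of heap: a real system would signal an error */
    }
    f->prev = prev;
    f->step = sum_step;
    f->n = n;
    f->pc = 0;
    return f;
}

/* Run one portion of sum(n), then return the frame to run next. */
static struct frame *sum_step(struct frame *self, int64_t *result) {
    if (self->pc == 0 && self->n > 0) {
        self->pc = 1;                         /* leave our state in this frame */
        return new_frame(self, self->n - 1);  /* "call": the callee runs next */
    }
    /* Either the base case, or resuming after the callee has returned. */
    *result = (self->pc == 0) ? 0 : *result + self->n;
    struct frame *caller = self->prev;
    free(self);
    return caller;  /* "return": the caller resumes next */
}

/* The tight iterative loop: each step returns before the next one begins. */
int64_t sum_to(int64_t n) {
    int64_t result = 0;
    struct frame *f = new_frame(NULL, n);
    while (f != NULL) {
        f = f->step(f, &result);
    }
    return result;
}
```

A call such as `sum_to(100000)` allocates a hundred thousand frames on the heap while the native call depth stays constant. The cost is that every function has to be split into resumable portions, which is the awkwardness anticipated above.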