TL;DR: this is an exercise in implementing a C compiler from scratch. "From scratch" here means "without an existing gcc/clang," so consider civilization-destruction scenarios, aliens reading our source code, an EMP strike that takes out all smart silicon, a corporate policy that won't let you download development tools, having only a JavaScript console and gumption, etc.
To do this, you must:
1. Implement a small tool that turns hexadecimal into binary (you can do this in any language)
2. Use whatever you have (python, POSIX shell, alien crystal substrate, x86-64 machine code, ...) to implement a small VM that runs simple bytecode. The VM has 16 registers and 16MB of working memory. There are sixteen opcodes to implement for arithmetic, memory manipulation, and control flow. There are also twelve syscalls for fopen/fread/fwrite/unlink(!)/etc.
After these two steps (that you have to repeat yourself post-civilization collapse), everything's self-hosted:
3. Use the VM to write a manual linker that resolves labels
4. Use the linker to write an assembler for a custom assembly language
5. Use the assembler to implement a minimal C compiler / preprocessor, that then compiles a more complex C compiler, that can compile a C17 compiler, that then compiles doom
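To give a feel for how small step 1 can be, here is a rough sketch of a hex-to-binary tool in Python. This is my own illustration, not the project's actual tool: the function name `hex_to_bytes`, the `#` comment syntax, and the rule that a byte can't be split across lines are all assumptions, so check the project's own format spec before relying on any of it.

```python
def hex_to_bytes(text):
    """Convert hexadecimal text into raw bytes.

    Assumptions (hypothetical, not taken from the project):
      - '#' starts a comment that runs to the end of the line
      - whitespace between hex digits is ignored
      - every line must hold a whole number of bytes
    """
    out = bytearray()
    for line in text.splitlines():
        # Drop the comment, if any, then squeeze out all whitespace.
        code = line.split("#", 1)[0]
        digits = "".join(code.split())
        if len(digits) % 2 != 0:
            raise ValueError("odd number of hex digits on line: %r" % line)
        # bytes.fromhex raises ValueError on non-hex characters.
        out.extend(bytes.fromhex(digits))
    return bytes(out)
```

For example, `hex_to_bytes("7F 45 4C 46  # magic\n")` should give back the four bytes `b"\x7fELF"`. Wiring it up to stdin/stdout is a couple more lines, and the same logic is short enough to redo in shell or by hand in machine code, which is presumably the point of making this the first step.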
See also: nand2tetris (focus is on teaching, less pragmatism), Cosmopolitan C (x64 as actually portable runtime)
I wonder what the author's view on Forth is; it seems like the role of the bytecode VM here might be interchangeable with a Forth implementation.
> Security: Compiler binaries can contain malware and backdoors that insert viruses into programs they compile. Malicious code in a compiler can even recognize its own source code and propagate itself. Recompiling a compiler with itself therefore does not eliminate the threat. The only compiler that can truly be trusted is one that you've bootstrapped from scratch.
It is a laudable goal, but without using from-scratch hardware and either running the bootstrap on bare metal or on a from-scratch OS, I think "truly be trusted" isn't quite reachable with an approach that only handles user-space program execution.
From the GitHub for on-ramp: it’s “self-bootstrapping and can compile itself from scratch”. What does that mean? How can it compile itself if it doesn’t exist?
Since this hasn't gotten much attention, I just wanted to say that I think this is a cool project. Nice work!
This is so great. I've been watching the project develop and it's really neat to see this milestone!
Cool project, love the bit about aliens
as an alpine linux enthusiast, i can say that this is fantastic. keep it clean
Fascinating exercise and nice work!
Adjacent (resilient, low-level, big-vision, auditable) projects include:
http://collapseos.org/ Forth OS, bootstrappable from paper, for z80
https://urbit.org/ standalone, distributed, auditable, provable, minimalist
https://justine.lol/ APE (actually portable executable); cosmopolitan libc