I wrote this book so you can spend a boring weekend writing an operating system from scratch. You don’t have to write it in C - you can use your favorite programming language, like Rust or Zig.
I intentionally made it not UNIX-like and kept only the essential parts. Thinking about how the OS differs from Linux or Windows can also be fun. Designing an OS is like creating your own world—you can make it however you like!
BTW, you might notice some paragraphs feel machine-translated because, to some extent, they are. If you have some time to spare, please send me a PR. The content is written in plain Markdown [1].
Hope you enjoy :)
[1] https://github.com/nuta/operating-system-in-1000-lines/tree/...
Or like building your temple so you can talk to God directly
* https://rhodesmill.org/brandon/2012/one-sentence-per-line/
Making a video game is also like creating your own world!
And it's an order of magnitude less hard than making an OS...
Bonus point: you have a chance to make a living from it!
Also I'm not speaking about making a game engine along with the game.
Having a kernel dev job is different from making your own kernel, speaking from a "world builder" perspective.
The MINIX book describes more practical designs, with a more feature-rich implementation. However, UNIX features such as fork, brk, and tty are not intuitive for beginners. By writing a toy OS first, readers can compare the toy OS with MINIX, and understand that UNIX-like is just one of many possible designs. That's an important perspective IMO.
Also, readers can actually implement better algorithms described in the MINIX book. It makes the MINIX book more interesting to read.
The Tanenbaum book is great, but it is a particle physics textbook compared to this OS cookbook.
Curious, what are the prerequisites for this? Do I have to know about how kernels work? How memory management, protection rings or processes are queued? Some of these I'd definitely like to learn about.
The book in question is about how to build your own operating system kernel (i.e. a non-Linux one) from scratch.
> We'll implement basic context switching, paging, user mode, a command-line shell, a disk device driver, and file read/write operations in C. Sounds like a lot, however, it's only 1,000 lines of code!
Also, because the implementation in this book is very naive, it would stimulate your curiosity.
I thought the written text was very high quality and didn't show many of the usual tells of a non-native writer. Could you share some details of how you used AI tools to help write it?
That said, LLMs are ofc not perfect, especially when the original text (or Japanese itself) is ambiguous. So I've heavily modified the text too. That's why there are some typos :cry:
The author explicitly describes why they chose riscv32
Why would you want the author to use riscv64 instead?
But seriously, it's not that hard. Change build options to generate 64-bit ELF, replace all 32-bit-wide parts (e.g. uint32_t, lw/sw instructions), and implement a slightly more complicated page table (e.g. Sv48).
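To make the "replace all 32-bit-wide parts" step a bit more concrete, here is a rough sketch of how a page table entry changes width between Sv32 and the 64-bit modes. The helper names (`make_pte32`, `make_pte64`, `sv48_index`) are made up for illustration and are not taken from the book's code. The layout assumed here follows the RISC-V privileged spec: the physical page number sits above the low 10 flag bits in both cases, and Sv48 walks four levels of 9-bit indices.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT    12          /* 4 KiB pages */
#define PTE_PPN_SHIFT 10          /* PPN starts above the 10 low flag bits */
#define PTE_V (1u << 0)           /* valid */
#define PTE_R (1u << 1)           /* readable */
#define PTE_W (1u << 2)           /* writable */
#define PTE_X (1u << 3)           /* executable */

/* Sv32: the PTE is 32 bits wide; physical addresses are at most 34 bits. */
static uint32_t make_pte32(uint64_t paddr, uint32_t flags)
{
    assert((paddr >> 34) == 0);   /* must fit in the 22-bit PPN field */
    return (uint32_t)((paddr >> PAGE_SHIFT) << PTE_PPN_SHIFT) | flags;
}

/* Sv39/Sv48: same idea, but the PTE (and every pointer to it) is 64 bits. */
static uint64_t make_pte64(uint64_t paddr, uint64_t flags)
{
    return ((paddr >> PAGE_SHIFT) << PTE_PPN_SHIFT) | flags;
}

/* Sv48: four levels (0 = leaf, 3 = root), 9 index bits per level. */
static unsigned sv48_index(uint64_t vaddr, int level)
{
    assert(level >= 0 && level < 4);
    return (unsigned)((vaddr >> (PAGE_SHIFT + 9 * level)) & 0x1ff);
}
```

For example, `make_pte64(0x80200000, PTE_V | PTE_R)` gives `0x20080003`, the same bit pattern `make_pte32` produces for that address. The difference only shows up once physical addresses or table depth exceed what the 32-bit format can hold.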
QEMU contains a built-in GDB server. You'll need a GDB client built for the target architecture (riscv in this case), which connects to the QEMU GDB server over the network.
1) Record & Replay: Record an execution and replay it back. You can even attach GDB while replaying, and go back in time while debugging with "reverse-next" and "reverse-continue": https://qemu-project.gitlab.io/qemu/system/replay.html
2) The QEMU monitor, especially the "gva2gpa" and "xp" commands which are very useful to debug stuff with virtual memory
3) "-d mmu,cpu_reset,guest_errors,unimp": Basically causes QEMU to log when your code does something wrong. Also check "trace:help", there's a bunch of useful stuff to debug drivers
Thankfully, GDB has a multiarch build these days which should work for all well-behaved targets in a single build.
(the place it is known to fail is for badly-behaved (embedded?) targets where there are configuration differences but no way to identify them)
I tried it again 2-3 years later and took the time to go over each subject. I even planned in advance to make sure I was going to finish it.
Shameless plug: I've written a hobby OS (well, a kernel actually) in Nim for x86-64[0] and it's all documented as well. I put its development on hold until I create a JetBrains plugin for Nim (in heavy development right now).
[1] https://github.com/NJU-ProjectN/nemu/tree/master
[2] https://github.com/NJU-ProjectN/nemu/blob/master/LICENSE
Edit: wrong link
Is there any real hardware that this could run on?
Looking through this seems to use a lot of assembly. In the above the amount of assembly is kept to a minimum. Pretty much just bootstrapping and context switching. The rest is done in C.
I had a quick glance at the OS in the linked article. This seems to be based on a 32-bit RISC-V with MMU. However, AFAIK, all available RISC-V SoCs with MMU are 64-bit. The 32-bit cores are only used for embedded controllers (unless you want to start designing an FPGA-based system).
The 32 and 64 bit versions of RISC-V are _not_ binary compatible, but the differences are rather small. Porting the MMU code from 64 to 32 bit or the other way round is not very complex, see my RV32 port of xv6 at https://github.com/michaelengel/xv6-rv32 (the regular MIT xv6 version only supports RV64).
The major difference is that virtual address translation on RV32, sv32, uses a two-level page table (10 bit index for the first level, 10 bit index for the second and 12 bit offset) whereas there are several modes of translation for RV64. The most common one, sv39, uses 39 bits of the virtual address split into three 9-bit indexes (so you need a three-level page table for 4 kB pages) plus 12 bit offset.
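A small sketch of the address split described above, in case it helps to see it as code. The `paging_mode` struct and the function names are hypothetical, invented for this example; only the bit widths (10+10+12 for sv32, 9+9+9+12 for sv39) come from the description.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET_BITS 12       /* 4 kB pages in both modes */

/* One paging scheme: how many levels, and how many index bits per level. */
struct paging_mode {
    int levels;
    int index_bits;
};

static const struct paging_mode SV32 = { 2, 10 };  /* 10 + 10 + 12 */
static const struct paging_mode SV39 = { 3, 9 };   /* 9 + 9 + 9 + 12 */

/* Page table index for `level`, where 0 is the last (leaf) level. */
static uint32_t vpn_index(const struct paging_mode *m, uint64_t vaddr, int level)
{
    assert(level >= 0 && level < m->levels);
    int shift = PAGE_OFFSET_BITS + level * m->index_bits;
    uint64_t mask = (1ULL << m->index_bits) - 1;
    return (uint32_t)((vaddr >> shift) & mask);
}

/* Byte offset inside the 4 kB page. */
static uint32_t page_offset(uint64_t vaddr)
{
    return (uint32_t)(vaddr & ((1u << PAGE_OFFSET_BITS) - 1));
}
```

With the virtual address `0x80200abc`, sv32 yields indices `0x200` (first level) and `0x200` (second level), while sv39 yields `2`, `1` and `0` from the top level down; the page offset is `0xabc` in both. A page table walk would call `vpn_index` once per level, starting from `m->levels - 1`.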
If you make the modifications, running the OS on real hardware should not be too difficult. The Allwinner D1 is a relatively simple RV64 single-core SoC (boards can be found for $20 upwards from aliexpress) and getting the CPU and a UART to work is not that difficult. You can check out my xv6 port to the D1 as a reference: https://github.com/michaelengel/xv6-d1
Shameless plug for my html version of the xv6 book: https://xv6-guide.github.io/xv6-riscv-book/
I found a small typo/editing glitch on the "RISC-V 101 page" [1]:
- It's a trending CPU ("Instruction Set Architecture") recent years.
It should probably say "ISA" instead of "CPU", and the word "in" is missing from after the parentheses, right?
Edit: Markdown, don't format the quote as code. Oops.
1: https://operating-system-in-1000-lines.vercel.app/en/02-asse...
I was wondering that too. I'll update it with other examples (x86 and Arm).
The older I get, the more I think I can figure out most problems that don't require some really gnarly domain expertise if I have a good way to iterate on them: code something, try it, see the results, see how they compare with what I wanted. It's when pieces of that are difficult or impossible, or very slow, that things get more difficult.
* an issue, "make an ebook": https://github.com/nuta/operating-system-in-1000-lines/issue...
* an epub: https://github.com/pronoiac/operating-system-in-1000-lines/r... The epub is mostly ok but a bit broken.
Apples to oranges, though. It was a specialized firmware system. Probably the biggest part was the IEEE-488 communications handler.
Looks more like 2800 lines.
[0] https://littlegreenviper.com/wp-content/uploads/2022/07/TF30...
https://github.com/nuta/microkernel-book/
https://github.com/mit-pdos/xv6-riscv
https://operating-system-in-1000-lines.vercel.app/en/17-outr...
They bothered to stop and suggest improvements here. That's enough work for them. They don't need to go elsewhere and do more, any more than you did.
When I learned OS, I followed MIT 6.828 (https://pdos.csail.mit.edu/6.828/2017/overview.html) and implemented a small OS called JOS based on Xv6. So if you're looking for some teaching OS in x86, check it out.
Are there other virtualisation-driven designs for hardware devices out there rather than the qemu stuff?
An overview of the available devices can be found in this presentation:
https://crc.dev/blog/Container%20Plumbing%202023%20-%20vfkit...
As an industry are we not supposed to be trying to move away from hoary old unsafe C?
Could we not have hobbyist educational OSes in more of the C replacements?
Drew DeVault wrote Bunnix in Hare, in one month. There's the proof of concept.
How about tiny toy Unix-likes in Zig, Nim, Crystal, Odin, D, Rust, Circle, Carbon, Austral?
How about ones that aren't ostensibly suitable for such tasks, such as Go or Ada?
Yes I know Ada is not a good fit, but there has already been a Unix-like OS entirely implemented in a derivative of Pascal: TUNIS.
https://en.wikipedia.org/wiki/TUNIS
This might need work from skilled expert practitioners first. That's good. That's what experts are for: teaching, and uplifting newbies.
There was a project to do C# on the bare metal.
https://migeel.sk/blog/2023/12/08/building-bare-metal-bootab...
How about a Unix-like in C#? Get the Unix and .NET folks interested in this stuff.
Even if the OS never leads to anything, maybe the tooling might prove useful. I am sure someone somewhere would have uses for bare-metal GoLang.
Saying that, I really don't think we need any more Unix-like OSes. There are far far too many of those already. There is a huge problem space to be explored here, and there used to be fascinating OSes that did things no Unix-like ever did.
OSes that are by modern standards tiny and simple but explored interesting areas of OS design, and are FOSS, with code out there under permissive licenses:
* Plan 9 https://github.com/plan9foundation/plan9
* Inferno https://github.com/inferno-os/inferno-os
* Symbian https://github.com/SymbianSource
* Parhelion HeliOS https://archive.org/details/Heliosukernel
There is already an effort at Plan 9 in Rust:
https://github.com/dancrossnyc/r9
Why not Plan 9 in Zig, or Hare, or even D?
Plan 9 imposes and enforces considerably more simplicity on C as it is: you can't #include stuff that already has #include statements of its own. The result is a compilation speedup of around 3 orders of magnitude. That would be a benefit to the would-be C replacements too, wouldn't it?
Isn't it? There is a very well-developed kernel written in Ada with SPARK and formally verified at that: https://ironclad.nongnu.org
And Pascal-derived languages were very popular for operating systems in the 80s. To name a few: Apple's Lisa OS, DEC's VAXELN, and Oberon. There were others as well that didn't quite make it, like DEC's MICA and Acorn's ARX.
I did not realise VAXELN was in Pascal. The others I did know of, yes, although Ironclad only from another comment in this thread.
Why?
> https://en.wikipedia.org/wiki/TUNIS
Interesting; do you know whether the source code is available somewhere?
You might ask co-developer Prof James Cordy: https://en.wikipedia.org/wiki/James_Cordy
Or approach the University of Toronto: https://www.utoronto.ca/
Most of them don't seem to understand how anything substantially different could exist in the world of computing - every other language and operating system is seen as either an inferior copy, or as another layer of abstraction building on top of C and UNIX.
He knew how to make friends.
This industry was built on foundations laid down by people who started with BASIC on 8-bit systems.
Besides, what's the point of this comment? What if people wanted to write a million more Unix-like kernels in C? Do you think this is bad? Why do you care? If you want, just write your own in whatever language you want, with whatever design you want.
> Why not Plan 9 in Zig, or Hare, or even D?
Because nobody to this point was interested in doing this. It's really that simple.
"Make your own kernel" is a thing-in-itself, and "runs on <X> hardware/VM" + "provides <Y>-like API for programs" are tangible, concrete goals to aim for, even if you personally don't like the <Y> API or the architectural choices it implies.
To give an analogy: https://www.nand2tetris.org/ is an amazing learning experience, even though games other than Tetris should and do exist
Personally, I like the AROS project, aiming to provide an operating system that implements the AmigaOS APIs and runs on many architectures, but lots of users are interested in running it on 680x0 Amigas and spiritually-related PowerPC devices: https://en.wikipedia.org/wiki/AROS_Research_Operating_System
It's OK for programmers to write a thing just for the learning experience. If it gains adopters, that's a happy accident.
Interesting. Thanks.
> Besides, what's the point of this comment? What if people wanted to write a million more Unix-like kernels in C? Do you think this is bad? Why do you care?
Because it seems to me that modern OS design is caught in a deep deep rut, and the "OS in 1000 lines" article that we are discussing is digging that rut even deeper.
Don't repeat the mistakes of the past. Make interesting new mistakes. It's more fun.
Now’s your chance to move the state of the art forward!
The Muen Separation Kernel (muen.sk) is a secure hypervisor written in SPARK Ada.
Ada was designed for low-level programming. It makes sense it does operating systems fairly easily. Parts of them will break its safety features. They can be validated with external tools, though.
Another trick, used in House with H layer, is to wrap the lowest-level parts which you might do in assembly. Build the GC or whatever, too. Then, everything else is in the higher-level language. These days, the lowest-level portions can be specified, verified, and implemented.
I was slightly more getting at it being not a natural fit for a Unix-like OS but I accept your point.
AFAIK, sadly, Intel BiiN is lost to history now. A Register reader wrote in to me to tell me that he was one of the developers.
What I really wanted from a developer was the i960. Specifically, the version with the object protections. That might be worth buying today for secure, embedded work. If I found the right person, I'd ask them to open-source, or at least dual-license, the i960. For embedded systems, leave it as a RISC-V alternative or port it to RISC-V.
It was. (If you ever see this, thank you again, Mr Buchanan.)
> What I really wanted from a developer was the i960.
Interesting choice!
Given that Intel has a number of distinctive architectures in its historical portfolio, and is in trouble these days due to the competition from Arm and perhaps even RISC-V, I would love to see it do either experimental revivals of some of its architectures, or open up the specs for other ones.
(Someone there must bitterly regret selling off its Arm architecture license cheaply to Marvell; now, Marvell is worth more than Intel itself.)
How about modern die-shrinks of i860 and i960, or even just FPGA versions?
After the DEC/Compaq/HP implosion, Intel also ended up owning the Alpha. I would not be at all averse to a resurrected Alpha chip, even if a very low-end chip on some old cheaper process tech.
Yes, even shrinks to nodes like 180-350nm would be helpful. The older nodes are still more reliable than modern ones due to the physics involved in deep sub-micron. While not power-efficient, both i960 and Alpha would be fast and reliable.
On FPGAs, that’s a likely use. Crash-safe.org used Alpha ISA in their early prototype. It’s also just easy for experimentation. In security and accelerators, we’re seeing many companies just throw in a fat FPGA, then layer the improvements on it to avoid the NRE cost.
Btw, Alpha had something worth continuing to talk about in new designs: PALcode. From what Alpha people told me, it is in between microcode and kernel code in nature. They can switch to PAL mode to run a series of instructions as an atomic block with more access to internal parts of the CPU. Projects could essentially extend the CPU to make things like secure context switching or concurrent GCs easier. They don’t have to open their internals up as much as custom microcode either.
On that note, what I’d really prefer is custom microcode on an open ISA. There were HLL to microcode compilers, at least in academia, that let you synthesize the microcode from HLL algorithms. That would be super-helpful since one could eliminate problematic instructions or add better ones with no hardware changes. Intel could still differentiate on that, too.
What is or was crash-safe.org? It seems to be gone without a trace...
https://www.cs.brandeis.edu/~dkw/papers/ieee-hst-2013-paper....
The usable part of the work was a metadata co-processor that could enforce micro-policies:
http://nikos.vasilak.is/p/pump:hasp:2014.pdf
It was spun off as Dover’s CoreGuard which I don’t know much about:
https://www.dovermicrosystems.com/solutions/coreguard/
The original design did for your CPU what Jesus Christ does for your soul. Keeps it from burning up due to user failures or external attacks. The product can’t guarantee eternal life but others are researching that.
Back to the devices, there are at least two families of coprocessors: typed, tagged designs like Burroughs B5000 and capability security like CHERI. SAFE was more like Burroughs or even System/38’s object-centered approach. If patents are a concern, one could always just redo the B5000 model itself since it’s more secure than any mainstream architecture.
You might have better luck with prompts that try to adhere better to your intent:
> English is not my first language. Please correct the spelling/grammar and perhaps teach me up to one idiom that would fit well in the following comment, but leave the meat of the message largely the same.
> Please translate the following [franglish/spanglish/(English mixed with a few words from your native language as appropriate to better convey your point)] to English suitable for a forum post.