I've been wanting to write a mini OS for a while now to get hands-on experience with low-level
concepts that, despite working in firmware development, we often don't touch anymore since we're using SDKs from various chip vendors and real-time operating systems.
Obviously, I want to do this in Rust!
Having chosen the RISC-V architecture, I used the book "OS in 1,000 Lines" as a starting point, especially for the linker script and the QEMU setup. I then continued with the documentation and eventually reached interrupt handling, which isn't covered in the book.
So I won't dwell on the initial setup, which is the same as the book's except that I compile Rust instead of C.
I want to focus on the problems I encountered in this very first step, specifically in compilation due to LLVM, how I solved them, and most importantly, what I learned. The cargo files and linker script are these:
# Cargo.toml
[package]
name = "riscv32-nogui-os"
version = "0.1.0"
edition = "2024"

[dependencies]

[profile.dev]
panic = "abort"

[profile.release]
panic = "abort"

[[bin]]
name = "riscv32-nogui-os"
path = "src/kernel.rs"

# .cargo/config.toml
[target.riscv32imac-unknown-none-elf]
rustflags = [
    "-C", "link-arg=-Tkernel.ld", # our memory map
    "-C", "link-arg=-Map=kernel.map",
]

[build]
target = "riscv32imac-unknown-none-elf"

/* kernel.ld */
ENTRY(boot)

SECTIONS {
    . = 0x80200000;

    .text : {
        KEEP(*(.text.boot));
        *(.text .text.*);
    }

    .rodata : ALIGN(4) {
        *(.rodata .rodata.*);
    }

    .data : ALIGN(4) {
        *(.data .data.*);
    }

    .bss : ALIGN(4) {
        __bss = .;
        *(.bss .bss.* .sbss .sbss.*);
        __bss_end = .;
    }

    . = ALIGN(4);
    . += 128 * 1024; /* 128KB */
    __stack_top = .;
}

Very simply, the linker script defines the address to start from, which with OpenSBI is 0x80200000.
The .text section (our code) will be placed there, followed by the data sections and the stack.
Below is the Rust code where there's a small error that cost me some debugging time but from which I learned a lot:
#![no_std]
#![no_main]

use core::arch::asm;

// Symbols defined by the linker script (kernel.ld).
unsafe extern "C" {
    static __bss: u8;
    static __bss_end: u8;
    static __stack_top: u8;
}

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

#[unsafe(no_mangle)]
fn memset(dest: *mut u8, val: u8, count: usize) {
    for i in 0..count {
        unsafe {
            *dest.offset(i as isize) = val;
        }
    }
}

#[unsafe(no_mangle)]
fn kernel_main() {
    // Zero the .bss section before doing anything else.
    let bss_size = unsafe {
        (&__bss_end as *const u8)
            .offset_from(&__bss as *const u8)
    } as usize;
    unsafe { memset(&__bss as *const u8 as *mut u8, 0, bss_size) };

    loop {}
}

#[unsafe(no_mangle)]
#[unsafe(link_section = ".text.boot")]
fn boot() -> ! {
    unsafe {
        asm!(
            "mv sp, {stack_top}", // set the stack pointer
            "j {kernel_main}",    // jump to the kernel entry
            // A raw pointer rather than a reference is passed as the operand.
            stack_top = in(reg) &raw const __stack_top,
            kernel_main = sym kernel_main,
            options(noreturn),
        );
    }
}

Good. In short, this is a translation of the code in the boot section of the book "OS in 1,000 Lines", to see if everything works before moving forward on my own.
I compile and launch my kernel through QEMU, and the program counter never seems to move.
This seems very strange, as if nothing had ever started, or as if for some reason execution kept returning to the first instruction, the boot function. So I check the generated assembly with objdump and, to my great surprise, I notice two things:
80200038 <boot>:
80200038: 80220537 lui a0, 0x80220
8020003c: 0ec50513 addi a0, a0, 0xec
80200040: 812a mv sp, a0
80200042: 0060006f j 0x80200048 <kernel_main>
80200046: 0000 unimp
The boot function wasn't at 0x80200000 (why? there's a directive saying to put the boot function at the beginning of .text), and the code generated for memset was a bit too long for the simple for loop written in the corresponding function of my Rust code:
80200070 <memset>:
80200070: 1141 addi sp, sp, -0x10
80200072: c606 sw ra, 0xc(sp)
80200074: c422 sw s0, 0x8(sp)
80200076: 0800 addi s0, sp, 0x10
80200078: 46c1 li a3, 0x10
8020007a: 06d66263 bltu a2, a3, 0x802000de <memset+0x6e>
8020007e: 40a006b3 neg a3, a0
80200082: 0036f813 andi a6, a3, 0x3
80200086: 01050733 add a4, a0, a6
8020008a: 00e57963 bgeu a0, a4, 0x8020009c <memset+0x2c>
8020008e: 87c2 mv a5, a6
80200090: 86aa mv a3, a0
80200092: 00b68023 sb a1, 0x0(a3)
80200096: 17fd addi a5, a5, -0x1
80200098: 0685 addi a3, a3, 0x1
8020009a: ffe5 bnez a5, 0x80200092 <memset+0x22>
8020009c: 41060633 sub a2, a2, a6
802000a0: ffc67693 andi a3, a2, -0x4
802000a4: 96ba add a3, a3, a4
802000a6: 00d77e63 bgeu a4, a3, 0x802000c2 <memset+0x52>
802000aa: 0ff5f813 andi a6, a1, 0xff
802000ae: 010107b7 lui a5, 0x1010
802000b2: 10178793 addi a5, a5, 0x101
802000b6: 02f807b3 mul a5, a6, a5
802000ba: c31c sw a5, 0x0(a4)
802000bc: 0711 addi a4, a4, 0x4
802000be: fed76ee3 bltu a4, a3, 0x802000ba <memset+0x4a>
802000c2: 8a0d andi a2, a2, 0x3
802000c4: 00c68733 add a4, a3, a2
802000c8: 00e6f763 bgeu a3, a4, 0x802000d6 <memset+0x66>
802000cc: 00b68023 sb a1, 0x0(a3)
802000d0: 167d addi a2, a2, -0x1
802000d2: 0685 addi a3, a3, 0x1
802000d4: fe65 bnez a2, 0x802000cc <memset+0x5c>
802000d6: 40b2 lw ra, 0xc(sp)
802000d8: 4422 lw s0, 0x8(sp)
802000da: 0141 addi sp, sp, 0x10
802000dc: 8082 ret
802000de: 86aa mv a3, a0
802000e0: 00c50733 add a4, a0, a2
802000e4: fee564e3 bltu a0, a4, 0x802000cc <memset+0x5c>
802000e8: b7fd j 0x802000d6 <memset+0x66>
I start checking that I was actually using my linker script during the linking phase, and everything was ok.
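One cheap way to confirm the linker script is being used is to redo its arithmetic by hand and compare it with the disassembly. The sketch below is only a sanity check under one assumption: that memset is the last thing placed in the image, so the image ends right after the 2-byte instruction at 0x802000e8. The function names are made up for illustration.

```rust
/// Round `addr` up to the next multiple of `align` (a power of two),
/// mirroring what `ALIGN(4)` does in the linker script.
pub fn align_up(addr: u32, align: u32) -> u32 {
    (addr + align - 1) & !(align - 1)
}

/// Stack size reserved by the linker script: `128 * 1024`.
pub const STACK_SIZE: u32 = 128 * 1024;

/// Address the script would assign to `__stack_top`, given the address
/// just past the last byte of the image (assumption: nothing follows memset).
pub fn stack_top(image_end: u32) -> u32 {
    align_up(image_end, 4) + STACK_SIZE
}

/// Value built by a `lui`/`addi` pair. `imm12` is the already
/// sign-extended 12-bit immediate of the `addi`.
pub fn lui_addi(upper20: u32, imm12: i32) -> u32 {
    (upper20 << 12).wrapping_add(imm12 as u32)
}
```

With the numbers from the two listings, `stack_top(0x802000e8 + 2)` and `lui_addi(0x80220, 0xec)` both give 0x802200ec, so the value loaded into sp in boot matches what the script computes for __stack_top.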
Then I focus on the memset and discover that LLVM recognizes loops that behave like certain well-known functions (memset, memcpy, etc.) and replaces them with a call to an optimized implementation of that function.
As far as I'm concerned, I'm writing a micro OS and would like to have everything under control, so this already didn't please me.
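For anyone who wants the loop to stay a loop, one option (not what I ended up doing here) is to write each byte with a volatile store. To my knowledge the optimizer has to emit volatile accesses as written, so it cannot fold them into a memset call. A minimal sketch, with a hypothetical function name:

```rust
use core::ptr;

/// Zero `count` bytes starting at `dest`, one volatile byte store at a time.
///
/// # Safety
/// `dest` must be valid for writes of `count` bytes.
pub unsafe fn zero_bytes_volatile(dest: *mut u8, count: usize) {
    for i in 0..count {
        // Volatile stores must be emitted one by one, so this loop is not
        // a candidate for being replaced by a single call to `memset`.
        unsafe { ptr::write_volatile(dest.add(i), 0) };
    }
}
```

The price is speed: every byte is written individually, which is fine for clearing a small .bss once at boot but not something to use everywhere.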
But in fact, there shouldn't be any reason why this memset wouldn't work fine. The stack should be properly formed at the time of the call, since it's set up in the assembly of the boot function.
I try asking my faithful companion ChatGPT, who notices the mul instruction, which isn't supported on some RISC-V implementations.
But in my nice .cargo/config.toml file, I specify target = "riscv32imac-unknown-none-elf", where the "m" stands precisely for the integer multiplication/division extension.
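To make the naming concrete, here is a small host-side sketch that pulls the extension letters out of a target name. In "imac", i is the base integer ISA, m is multiplication/division, a is atomics, and c is compressed instructions. The helper names are hypothetical, and the sketch only handles single-letter extensions; it does not expand shorthands such as g.

```rust
/// Return the single-letter ISA extensions encoded in a RISC-V target
/// name such as "riscv32imac-unknown-none-elf" (here: "imac").
/// Returns `None` for targets that are not riscv32/riscv64.
pub fn isa_extensions(target: &str) -> Option<&str> {
    // The architecture is the part before the first '-'.
    let arch = target.split('-').next()?;
    // Strip the "riscv32" / "riscv64" prefix; the rest are the letters.
    arch.strip_prefix("riscv32")
        .or_else(|| arch.strip_prefix("riscv64"))
}

/// Check whether a given extension letter appears in the target name.
pub fn has_extension(target: &str, ext: char) -> bool {
    isa_extensions(target).map_or(false, |exts| exts.contains(ext))
}
```

For my target, `has_extension("riscv32imac-unknown-none-elf", 'm')` is true, so the mul in the generated memset is legal.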
ChatGPT also adds that in bare-metal environments like mine, it would be better to write my own self-contained functions that don't use the stack, since the stack might not be well-formed.
I give it a nice pat on the back and put it to sleep. The stack in theory is well-formed, and I know I should have my own functions - I'm already angry with LLVM!
However, I notice a detail, an oversight: the #[unsafe(no_mangle)] directive was missing above my memset function. This directive leaves the function name unchanged. As far as I understand, without it my function was emitted under a mangled name, so the memset symbol in the binary was not mine.
Indeed, the book's C code defines its own memset function, and LLVM (yes, the book uses clang) doesn't dream of substituting its own version but correctly generates the 4 lines necessary to "fill" a memory area.
I add the directive and everything works. The program counter is magically in the right place inside the kernel_main function, and the memset function is actually as long as it should be.
All's well that ends well. I want to see my first beautiful "Hello World" on my nice terminal, without any OS underneath - actually, I'm writing it with MY OS by directly instructing my beautiful virtual CPU.
I succeed, but something was missing. Why did the PC remain stuck with the LLVM version of memset? In any case, that code should work - it does too much, true, but it should work.
ChatGPT doesn't know what to say; every generated token is a wasted token, energy almost wasted - almost. Indeed, among a thousand other things, it tells me there must be something between 0x80200000 and the address where boot is placed, since I, stupid human, insist that it's not the memset that's the problem.
So I check through QEMU what instruction is at 0x80200000, and there's something there: not an instruction but data. I check again with objdump, but this time I want to see everything, not just the generated assembly, and here we are:
Contents of section .eh_frame:
80200000 10000000 00000000 017a5200 017c0101 .........zR..|..
80200010 1b0c0200 1c000000 18000000 54000000 ............T...
80200020 7a000000 00420e10 44810188 02420c08 z....B..D....B..
80200030 00000000 00000000 ........
At 0x80200000 there was an .eh_frame section. I discover that it's a section used for stack unwinding in case of exceptions or, in the case of Rust code, after a panic.
So for now, it's totally useless in a system where I'll have to handle everything manually. Worse, it sat exactly at the address where execution starts, so the CPU was being pointed at data instead of the boot function.
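To convince myself that those bytes really are unwinding data and not code, the first record of the dump can be decoded by hand. An .eh_frame section starts with a CIE (Common Information Entry): a 4-byte length, a 4-byte id that is 0 for a CIE, a version byte, and a NUL-terminated augmentation string. Below is a minimal host-side sketch (it uses String, so it is not kernel code); the type and function names are hypothetical, and it ignores the 64-bit extended-length form.

```rust
/// The fields at the start of a CIE in `.eh_frame`.
#[derive(Debug, PartialEq)]
pub struct CieHeader {
    pub length: u32,
    pub cie_id: u32,
    pub version: u8,
    pub augmentation: String,
}

/// Read a little-endian u32 at `offset`, if enough bytes are available.
fn read_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
    let b = bytes.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parse the start of a little-endian CIE. Returns `None` if the input is
/// too short or the record is not a CIE (non-zero id).
pub fn parse_cie_header(bytes: &[u8]) -> Option<CieHeader> {
    let length = read_u32_le(bytes, 0)?;
    let cie_id = read_u32_le(bytes, 4)?;
    if cie_id != 0 {
        return None;
    }
    let version = *bytes.get(8)?;
    // The augmentation string runs from byte 9 up to the first NUL.
    let rest = bytes.get(9..)?;
    let nul = rest.iter().position(|&b| b == 0)?;
    let augmentation = String::from_utf8(rest[..nul].to_vec()).ok()?;
    Some(CieHeader { length, cie_id, version, augmentation })
}

/// First 16 bytes of the dump at 0x80200000, in memory order.
pub const EH_FRAME_START: [u8; 16] = [
    0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x01, 0x7a, 0x52, 0x00, 0x01, 0x7c, 0x01, 0x01,
];
```

Feeding `EH_FRAME_START` to `parse_cie_header` gives length 0x10, id 0, version 1 and augmentation "zR", which is the "zR" visible in the ASCII column of the dump.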
I also discover that orphan sections, that is, input sections the linker script doesn't mention, can be placed at the beginning (WHY EVER??), so I remove the section by adding a piece to the SECTIONS block of my linker script:
/DISCARD/ : { *(.eh_frame*) *(.eh_frame_hdr) }
and everything works with LLVM's memset (which I still don't want, and that's why I've left the no_mangle directive).
This is just the beginning of the project that will lead me toward writing my micro OS, but I'm already very happy because, thanks to a small oversight, I had to read articles and documentation and understand some lines of assembly (how I loved the computer architecture exercises at university where we had to act as a compiler and translate C code into MIPS assembly).
Now as I write, these seem like trivial matters, but at the time I couldn't find solutions and explanations, so here I am.
Next, I'll try to dive deep into Rust macros to implement my own basic println! macro.