Making a custom `std`!

Preface: Everything written here relies on undocumented compiler implementation details and may change at any time. If you are looking for a reliable, future-proof way to do this, I'm afraid one doesn't exist and almost certainly never will on the stable toolchain, so use this at your own risk. However, if this does break at some point in the future, I will do my best to keep this post updated with the changes necessary to keep it working.


If you work on projects such as custom operating systems or embedded programs, you may have at some point asked yourself, "What if I had a custom std that was as ergonomic to use as the regular Rust std?" Well, as part of my own journey in Rust OS dev, I managed to find out how to do just that :)

Step 1: Surrendering to the Nightly Overlords

As you probably expected, this is only possible on the nightly toolchain as we'll be using some feature flags to accomplish our goal, but this is likely already the case for those who are doing OS dev or developing embedded binaries.

Step 2: Trampolining into main

The initial setup for this is pretty simple, but I'll go over all the different pieces required for this to work. Let's first create our library crate. The first item on the requirements list: the crate must be named std. The project directory itself does not need that name, but if you give the directory a different name, you will need to set the name field in Cargo.toml to "std". The reason is that the compiler has some heuristics for determining which core library to import the prelude from: if you make a #![no_std] crate, rustc will auto-import core's prelude, and if it's a regular crate using std, it will use std's prelude, as we're all familiar with.

Next up is making the crate #![no_std], since we can't really import ourselves now, can we? So go ahead and slap that at the top of your lib.rs. At this point you can fill in all the functions and modules you want your applications to use, as well as things like the global allocator.
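As one example of what could live in such a crate, here is a minimal sketch of a bump allocator built on core's GlobalAlloc trait. Everything specific in it is an assumption made for illustration: the names BumpAlloc and HEAP_SIZE, the fixed 4 KiB static buffer, and the choice to never free memory. A real OS or embedded project would likely want something smarter.

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical heap size, picked only for the sketch.
const HEAP_SIZE: usize = 4096;

// Backing storage, aligned so that small alignments need no padding.
#[repr(align(16))]
struct Heap(UnsafeCell<[u8; HEAP_SIZE]>);

// Hypothetical bump allocator: hands out memory front to back, never reuses it.
pub struct BumpAlloc {
    heap: Heap,
    next: AtomicUsize, // offset of the first free byte
}

// Safe to share: the only mutation of shared state goes through the atomic offset.
unsafe impl Sync for BumpAlloc {}

impl BumpAlloc {
    pub const fn new() -> Self {
        BumpAlloc {
            heap: Heap(UnsafeCell::new([0; HEAP_SIZE])),
            next: AtomicUsize::new(0),
        }
    }
}

unsafe impl GlobalAlloc for BumpAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.heap.0.get() as *mut u8;
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment.
            let addr = base as usize + current;
            let aligned = (addr + layout.align() - 1) & !(layout.align() - 1);
            let start = aligned - base as usize;
            // Out of memory is reported with a null pointer.
            let end = match start.checked_add(layout.size()) {
                Some(end) if end <= HEAP_SIZE => end,
                _ => return core::ptr::null_mut(),
            };
            // Claim the range; retry if another caller moved the offset first.
            match self
                .next
                .compare_exchange_weak(current, end, Ordering::Relaxed, Ordering::Relaxed)
            {
                Ok(_) => return unsafe { base.add(start) },
                Err(observed) => current = observed,
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never reclaims memory.
    }
}
```

In the custom std you would then register an instance with something like `#[global_allocator] static ALLOC: BumpAlloc = BumpAlloc::new();`, which is what lets binaries using the crate allocate through it.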

Now we can start making use of the feature flags I alluded to earlier, particularly: lang_items and prelude_import. These two feature flags will allow us to define a custom entry-point to any application that uses our std, just like the regular std, as well as letting us define our own custom prelude for the compiler to import. So after adding both feature flags to the top of our lib.rs, the first thing we'll want to do is write our entry-point function. You'll likely want the asm feature flag as well, and perhaps naked_functions, depending on your use-case. Since the entry function is very context dependent, I won't really focus on the details of implementing it. But let's take a look at the general format it'll have:

#[no_mangle]
unsafe extern "C" fn _start() -> ! {
    extern "C" {
        fn main(argc: isize, argv: *const *const u8) -> isize;
    }

    ...

    main(..., ...);

    ...
}

For those not familiar with what is being defined here:

#[no_mangle] is going to make it so that the linker can properly find our entry-point function. The name of the function will depend on what your linker script defines as the entry point for executables, so change it as necessary. It's also marked as unsafe since, well, there's probably going to be some initial unsafe setup that needs to be performed before calling main. That said, the unsafe marker is optional if you'd rather use unsafe blocks instead.

The extern "C" block containing a main function might look a bit strange, and this is where I had a lot of trouble finding information on how to do a custom std: rustc autogenerates a main function for you, and its definition is fixed on all platforms. I was only able to find this by digging into the rustc source and finding a similar entry-point function declared for an OS (I don't remember which one). This generated function is what ends up calling our binary's fn main. It is extern "C" because the Rust ABI is not stable, so we need a C ABI jumping-off point into the start lang item, which itself uses the Rust ABI.
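To make that call chain easier to picture, here is a rough model of it that compiles on a regular toolchain. It is only a sketch: none of these functions are the real symbols, and the names user_main, start_stand_in, generated_main_stand_in, and entry_stand_in are made up for the illustration. The real generated main is emitted by rustc and cannot be written by hand like this.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether the stand-in for the binary's `fn main()` actually ran.
static USER_MAIN_RAN: AtomicBool = AtomicBool::new(false);

// Stand-in for the `fn main()` written in the binary crate.
fn user_main() {
    USER_MAIN_RAN.store(true, Ordering::SeqCst);
}

// Stand-in for the `start` lang item: Rust ABI, receives the user's main.
fn start_stand_in<T>(main: fn() -> T, _argc: isize, _argv: *const *const u8) -> isize {
    main();
    0
}

// Stand-in for the C ABI `main` symbol that rustc generates. Its job is to
// hand the user's main, plus argc and argv, to the start lang item.
extern "C" fn generated_main_stand_in(argc: isize, argv: *const *const u8) -> isize {
    start_stand_in(user_main, argc, argv)
}

// Stand-in for `_start`: do platform setup, then jump through the C ABI shim.
fn entry_stand_in() -> isize {
    generated_main_stand_in(0, core::ptr::null())
}
```

Calling entry_stand_in() walks the same three hops the real binary takes: entry point, generated C ABI shim, then the start function that finally invokes the user's main.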

Speaking of lang items: the start lang item is the final piece of the puzzle to get fn main() executing without any special sauce in the binaries! It looks roughly like this:

#[lang = "start"]
fn lang_start<T>(main: fn() -> T, argc: isize, argv: *const *const u8) -> isize {
    ...

    main();
    
    ...

    0
}

This is where I accepted defeat when I had initially tried to get this all to work. I kept getting ICEs (internal compiler errors) when attempting to compile my start lang item! Well, it turns out that's where the generic parameter comes in. In Rust 1.26.0, the compiler started accepting return types for fn main() other than the unit type (()). To facilitate this change, the Rust developers added a trait to std called Termination, which represents a type that can be used to determine the success or failure of fn main(). This meant that the function type of fn main() changed: instead of fn() -> (), it became fn() -> T where T: Termination, and the start lang item in std had to change with it to include the generic type in its signature. Apparently, this means that rustc always expects #[lang = "start"] functions to have at least one generic parameter, otherwise the compiler will ICE. :D
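If you want the return value of fn main() to actually matter, the generic parameter is where that hooks in. Here is a minimal sketch of the idea that runs on a regular toolchain. It is not the real lang item: the Termination trait below is a hypothetical stand-in a custom std could define for itself, lang_start_sketch, demo_ok, and demo_err are made-up names, and mapping errors to an exit status of 1 is an assumption.

```rust
/// Hypothetical stand-in for std's `Termination` trait: turns whatever
/// `fn main()` returned into a numeric exit status.
pub trait Termination {
    fn report(self) -> isize;
}

// `fn main()` returning unit always counts as success.
impl Termination for () {
    fn report(self) -> isize {
        0
    }
}

// `fn main() -> Result<(), E>` succeeds on `Ok` and fails on `Err`.
impl<E> Termination for Result<(), E> {
    fn report(self) -> isize {
        match self {
            Ok(()) => 0,
            Err(_) => 1,
        }
    }
}

// Same shape as the start function above, but bounded on the stand-in trait
// so the result of `main` becomes the return value.
pub fn lang_start_sketch<T: Termination>(
    main: fn() -> T,
    _argc: isize,
    _argv: *const *const u8,
) -> isize {
    main().report()
}

// Two example "mains" with different return types.
fn demo_ok() {}

fn demo_err() -> Result<(), &'static str> {
    Err("boom")
}
```

With this shape, the value returned from the start function could be forwarded by the entry point to the platform's exit mechanism instead of a hard-coded status.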

Step 3: The Sweet, Sweet Prelude

The automatic prelude import is definitely one of the biggest ergonomics gains and, besides not needing #![no_main], one of the few reasons you would want to go this far instead of just writing a normal crate. :) This part is thankfully pretty easy to do because of the prelude_import feature! All that is necessary is to define a pub use of the items that we want automagically imported into our other crates, and slap #[prelude_import] on top, like so:

#[prelude_import]
pub use prelude::rust_2018::*;

As far as I can tell, this format is explicitly required, and I'm not sure what purpose #[prelude_import] actually serves if you can't name arbitrary modules to begin with. But then again, we are using a generally undocumented, compiler-only feature, so here be dragons. 🐉 (Fun fact, this portion even changed before I released the post! It used to be the ::v1::* path, but now seems to be tied to your Rust edition)

Toss all the goodies that you would like to import from core, along with your own types and such, into the prelude module, and you're good to go! All that's left to do is add the std crate we've made to whatever crate you want to use it with.
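For a feel of what the prelude buys you, here is a small sketch that models it with plain modules on a regular toolchain. The assumption (based on how the regular std prelude behaves) is that the compiler injects roughly the equivalent of the glob import shown in app into every crate that uses the custom std. The module names demo, my_std, and app are made up; in a real setup my_std would be the std crate itself and app a separate binary crate.

```rust
mod demo {
    // Stand-in for the custom `std` crate.
    pub mod my_std {
        pub mod prelude {
            pub mod rust_2018 {
                // Everything re-exported here becomes available without a `use`.
                pub use super::super::fancy_apis::{count_chars, MyFancyUsefulStruct};
            }
        }

        pub mod fancy_apis {
            #[derive(Debug, PartialEq)]
            pub struct MyFancyUsefulStruct;

            pub fn count_chars(s: &str) -> usize {
                s.chars().count()
            }
        }
    }

    // Stand-in for a downstream crate that depends on the custom `std`.
    pub mod app {
        // Written by hand here; with the real setup the compiler adds the
        // equivalent import for you.
        use super::my_std::prelude::rust_2018::*;

        pub fn run() -> (MyFancyUsefulStruct, usize) {
            // Both names resolve through the prelude glob import alone.
            (MyFancyUsefulStruct, count_chars("hello"))
        }
    }
}
```

Anything not re-exported from the prelude module still has to be imported by its full path, just like with the regular std.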

Wrapping it up

Using this hacky solution has served me well so far in my RISC-V OS project for making userspace binaries to run, and I'd recommend anyone whose interest is piqued by this to give it a try for any projects where they would think it would be a nice quality of life change. For anyone that does, let me know if you encounter any problems so that I can update this post accordingly. I'll leave off with the code to a simple demo of this that you can run if you're on an x86_64 Linux distribution:

To build this, you'll need to have cargo-xbuild installed, since Rust doesn't have a bare-metal x86_64 target by default, so we need to define a custom target.

Project layout:

std
 |---- src/
       |---- lib.rs
       |---- entry.rs

test_bin
 |---- bare-x86_64.json
 |---- Cargo.toml
 |---- src/
       |---- main.rs

Cargo.toml (workspace root, with std and test_bin as members)

std/src/lib.rs

#![feature(asm, lang_items, prelude_import)]
#![no_std]

mod entry;

#[prelude_import]
pub use prelude::rust_2018::*;

pub mod prelude {
    pub mod rust_2018 {
        pub use crate::{fancy_apis::MyFancyUsefulStruct, print};
        pub use core::prelude::v1::*;
    }
}

pub mod fancy_apis {
    #[derive(Debug)]
    pub struct MyFancyUsefulStruct;

    pub fn count_chars(s: &str) -> usize {
        s.chars().count()
    }
}

pub fn add(x: usize, y: usize) -> usize {
    x + y
}

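pub fn print(s: &str) {
    // Linux x86_64 `write` syscall: rax = 1 (sys_write), rdi = 1 (stdout),
    // rsi = buffer pointer, rdx = buffer length.
    unsafe {
        asm!(
            "syscall",
            // The kernel puts the return value in rax, so mark it as clobbered.
            inlateout("rax") 1usize => _,
            in("rdi") 1usize,
            in("rsi") s.as_ptr(),
            in("rdx") s.len(),
            // `syscall` itself overwrites rcx and r11.
            out("rcx") _,
            out("r11") _,
        );
    }
}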

#[panic_handler]
fn panic_handler(_: &core::panic::PanicInfo) -> ! {
    loop {}
}

std/src/entry.rs

#[no_mangle]
unsafe extern "C" fn _start() -> ! {
    extern "C" {
        fn main(argc: isize, argv: *const *const u8) -> isize;
    }

    main(0, core::ptr::null());

    asm!(
        "syscall",
        in("rax") 60,
        in("rdi") 0,
        options(noreturn)
    );
}

#[lang = "start"]
fn lang_start<T>(main: fn() -> T, _: isize, _: *const *const u8) -> isize {
    main();
    0
}

test_bin/bare-x86_64.json

{
    "llvm-target": "x86_64-unknown-none",
    "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
    "arch": "x86_64",
    "target-endian": "little",
    "target-pointer-width": "64",
    "target-c-int-width": "32",
    "os": "none",
    "executables": true,
    "linker-flavor": "ld.lld",
    "linker": "rust-lld",
    "panic-strategy": "abort",
    "disable-redzone": true,
    "features": "-mmx,-sse,+soft-float"
}

test_bin/Cargo.toml

[package]
name = "test_bin"
version = "0.1.0"
authors = ["you <you@example.com>"]
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
std = { path = "../std" }

test_bin/src/main.rs

fn main() {
    print("Hello, world!\n");
}

Output:

repnop@snek ~/d/m/test_custom_std> cargo +nightly xbuild -p test_bin --target test_bin/bare-x86_64.json; target/bare-x86_64/debug/test_bin
   Compiling test_bin v0.1.0 (/home/repnop/dev/misc/test_custom_std/test_bin)
    Finished dev [unoptimized + debuginfo] target(s) in 0.31s
Hello, world!