Preface: Everything written here relies on undocumented compiler implementation details and may change at any time. If you are looking for a reliable, future-proof way to do this, I'm afraid one doesn't exist and almost certainly never will on the stable toolchain, so use this at your own risk. However, if this does break at some point in the future, I will do my best to keep this post updated with the changes necessary to keep it working.
For those working on projects such as custom operating systems and embedded programs, you may have at some point asked yourself, "What if I had a custom std that was as ergonomic to use as the regular Rust std?" Well, as part of my own journey in Rust OS dev, I managed to find out how to do just that :)
Step 1: Surrendering to the Nightly Overlords
As you probably expected, this is only possible on the nightly toolchain as we'll be using some feature flags to accomplish our goal, but this is likely already the case for those who are doing OS dev or developing embedded binaries.
Step 2: Trampolining into main
The initial setup for this is pretty simple, but I'll go over all the different pieces required for this to work. Let's first create our library crate, and the first item on the requirements list is: it must be named std. The project directory itself does not need that name, but if you choose a different directory name, you will need to change the name field of the Cargo.toml to "std". The reason for this is that the compiler has some heuristics for determining which core library to import the prelude from. If you make a #![no_std] crate, rustc will auto-import core's prelude, and if it's a regular crate using std, it will use std's prelude, as we're all familiar with.
Next up is making the crate #![no_std], since we can't really import ourselves now, can we? So go ahead and slap that at the top of your lib.rs. At this point you can fill in all the functions and modules you want your applications to use, as well as things like the global allocator.
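Since the global allocator came up, here's a minimal sketch of what one could look like, assuming a fixed-size static buffer is good enough for your use case. BumpAllocator, Heap, and HEAP_SIZE are names I made up for illustration, and a real allocator would want to actually free memory. It only uses core, so it should fit in a #![no_std] crate:

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Size of the backing buffer; an arbitrary value picked for this sketch.
pub const HEAP_SIZE: usize = 4096;

/// Backing storage, over-aligned so small alignments need no padding at offset 0.
#[repr(align(16))]
struct Heap(UnsafeCell<[u8; HEAP_SIZE]>);

/// Hypothetical bump allocator: hands out memory front to back and never reuses it.
pub struct BumpAllocator {
    heap: Heap,
    /// Offset of the first free byte in `heap`.
    next: AtomicUsize,
}

// SAFETY: `next` is only advanced through an atomic compare-exchange, so two
// threads can never be handed overlapping regions of `heap`.
unsafe impl Sync for BumpAllocator {}

impl BumpAllocator {
    pub const fn new() -> Self {
        Self {
            heap: Heap(UnsafeCell::new([0; HEAP_SIZE])),
            next: AtomicUsize::new(0),
        }
    }
}

unsafe impl GlobalAlloc for BumpAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.heap.0.get() as *mut u8;
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment.
            let unaligned = base as usize + current;
            let aligned = (unaligned + layout.align() - 1) & !(layout.align() - 1);
            let start = aligned - base as usize;
            let end = match start.checked_add(layout.size()) {
                Some(end) if end <= HEAP_SIZE => end,
                _ => return core::ptr::null_mut(), // out of memory
            };
            match self
                .next
                .compare_exchange_weak(current, end, Ordering::Relaxed, Ordering::Relaxed)
            {
                // SAFETY: `start <= end <= HEAP_SIZE`, so the pointer stays in bounds.
                Ok(_) => return unsafe { base.add(start) },
                Err(observed) => current = observed,
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never frees individual allocations.
    }
}
```

To actually use it, you'd declare a static of this type in your std and mark it with #[global_allocator]. The trade-off is the obvious one: allocation is just a few instructions, but memory is never reclaimed.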
Now we can start making use of the feature flags I alluded to earlier, particularly lang_items and prelude_import. These two feature flags will allow us to define a custom entry point for any application that uses our std, just like the regular std, as well as letting us define our own custom prelude for the compiler to import. So after adding both feature flags to the top of our lib.rs, the first thing we'll want to do is write our entry-point function. You'll likely want the asm feature flag as well, and perhaps naked_functions, depending on your use case. Since the entry function is very context dependent, I won't really focus on the details of implementing it. But let's take a look at the general format it'll have:
#[no_mangle]
unsafe extern "C" fn _start() -> ! {
    extern "C" {
        fn main(argc: isize, argv: *const *const u8) -> isize;
    }
    ...
    main(..., ...);
    ...
}
For those not familiar with what is being defined here: #[no_mangle] is going to make it so that the linker can properly find our entry point function. The name of the function will depend on what your linker script defines as the entry point for executables, so change as necessary. It's also marked as unsafe since, well, there's probably going to be some initial unsafe setup that needs to be performed before calling main. Marking it unsafe is optional, though, if you'd rather use unsafe blocks instead.
The extern "C" block containing a main function might look a bit strange, and this is where I had a lot of problems finding information on how to do a custom std: rustc will autogenerate a main function for you, and its definition is fixed on all platforms. I was only able to find this by digging into the rustc source and finding a similar entry-point function declared for an OS (I don't remember which one). This function is what will call our binary's fn main. The reason for it being extern "C" is that the Rust ABI is not stable, so we need a jumping-off point to the start lang item, which is defined with the Rust ABI.
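To make the call chain a bit more concrete, here's a rough model of it written as ordinary functions that run on a normal toolchain. Every name here (start_model, generated_main_model, lang_start_model, user_main, USER_MAIN_CALLS) is made up for illustration; in the real thing rustc generates the middle function for you and the last hop goes through the start lang item:

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

/// Counts calls to `user_main` so the chain can be observed from outside.
pub static USER_MAIN_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the binary's own `fn main()`.
pub fn user_main() {
    USER_MAIN_CALLS.fetch_add(1, Ordering::Relaxed);
}

/// Stand-in for the `start` lang item: Rust ABI, receives the user's main.
pub fn lang_start_model<T>(main: fn() -> T, _argc: isize, _argv: *const *const u8) -> isize {
    main();
    0
}

/// Stand-in for the C-ABI `main` that rustc generates for a binary.
/// It is the bridge from the stable C ABI into the Rust-ABI start function.
pub extern "C" fn generated_main_model(argc: isize, argv: *const *const u8) -> isize {
    lang_start_model(user_main, argc, argv)
}

/// Stand-in for `_start`: do any early setup, then jump into the generated main.
pub fn start_model() -> isize {
    // Real code would set up the stack, arguments, and so on here.
    generated_main_model(0, core::ptr::null())
}
```

Calling start_model() walks the same three hops as the real thing: entry point, then the C-ABI main, then the start function that finally runs the user's main.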
Speaking of lang items, that is the final piece of the puzzle to get fn main() executing without any special sauce in the binaries! It looks roughly like this:
#[lang = "start"]
fn lang_start<T>(main: fn() -> T, argc: isize, argv: *const *const u8) -> isize {
    ...
    main();
    ...
    0
}
This is where I accepted defeat when I had initially tried to get this all to work. I kept getting ICEs when attempting to compile my start lang item! Well, it turns out that's where the generic parameter we have comes in. In Rust 1.26.0, the compiler started accepting return types other than the unit type (()), and to facilitate this change, the Rust developers added a trait to std called Termination which represents a type that can be used to determine success or failure of the fn main() function. This meant that the function type for fn main() changed: instead of fn() -> (), it became fn() -> T where T: Termination, which meant that the start lang item in std had to change with it to include the generic type in that signature. Apparently, this means that rustc always expects #[lang = "start"] functions to have at least one generic parameter, otherwise the compiler will ICE. :D
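If you want your custom std to do something useful with that generic parameter, one option is to define your own Termination-like trait and turn the result of main into an exit code. Here's a minimal sketch of that idea as plain Rust, assuming a trait with a single report method; the trait, the exit codes, and lang_start_sketch are all my own illustration, not what the real std does internally:

```rust
/// Hypothetical stand-in for a `Termination` trait in a custom std.
pub trait Termination {
    /// Converts the value returned from `main` into an exit code.
    fn report(self) -> isize;
}

impl Termination for () {
    fn report(self) -> isize {
        0
    }
}

impl<E> Termination for Result<(), E> {
    fn report(self) -> isize {
        match self {
            Ok(()) => 0,
            Err(_) => 1,
        }
    }
}

/// Plain-function model of a start function that is generic over the
/// return type of `main`, mirroring the lang item's signature.
pub fn lang_start_sketch<T: Termination>(
    main: fn() -> T,
    _argc: isize,
    _argv: *const *const u8,
) -> isize {
    main().report()
}

/// Example mains with different return types.
pub fn unit_main() {}

pub fn failing_main() -> Result<(), &'static str> {
    Err("boom")
}
```

With this, lang_start_sketch(unit_main, ...) yields 0 and lang_start_sketch(failing_main, ...) yields 1, which the entry point could then hand to the exit syscall.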
Step 3: The Sweet, Sweet Prelude
The auto prelude import is definitely one of the biggest ergonomics gains, and besides not needing #![no_main], one of the few reasons you would want to go this far to avoid writing a normal crate. :) This part is thankfully pretty easy to do because of the prelude_import feature! All that is necessary is to define a pub use of the items that we want automagically imported into our other crates, and slap #[prelude_import] on top, like so:
#[prelude_import]
pub use prelude::rust_2018::*;
As far as I can tell, this format is explicitly required, and I'm not sure what purpose #[prelude_import] actually serves if you can't name arbitrary modules to begin with. But then again, we are using a generally undocumented, compiler-only feature, so here be dragons. 🐉 (Fun fact: this portion even changed before I released the post! It used to be the ::v1::* path, but now seems to be tied to your Rust edition.)
Toss all the goodies that you would like to import from core, along with your own types and such, into the prelude module, and you're good to go! All that's left to do is add the std crate we've made to whatever crate you want to use it with.
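If it helps to picture what the prelude buys you, here's a small model that runs on a normal toolchain. It spells out by hand the glob import that downstream crates would otherwise get for free. The my_std and app modules are stand-ins I made up, and which items go in the prelude is just an example:

```rust
mod my_std {
    pub mod fancy_apis {
        #[derive(Debug)]
        pub struct MyFancyUsefulStruct;

        pub fn count_chars(s: &str) -> usize {
            s.chars().count()
        }
    }

    pub mod prelude {
        pub mod rust_2018 {
            // Everything re-exported here becomes nameable without a path
            // in any module that glob-imports the prelude.
            pub use super::super::fancy_apis::{count_chars, MyFancyUsefulStruct};
        }
    }
}

mod app {
    // Written out by hand here; with the custom std this import is implicit.
    use super::my_std::prelude::rust_2018::*;

    pub fn describe() -> (usize, String) {
        // Both names resolve through the prelude glob import above.
        (
            count_chars("\u{e9}t\u{e9}"),
            format!("{:?}", MyFancyUsefulStruct),
        )
    }
}
```

Here app::describe() can use count_chars and MyFancyUsefulStruct without naming their module, which is exactly the ergonomic win the prelude is for.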
Wrapping it up
Using this hacky solution has served me well so far in my RISC-V OS project for building userspace binaries, and I'd recommend that anyone whose interest is piqued give it a try in any project where it would be a nice quality-of-life change. If you do, let me know if you encounter any problems so that I can update this post accordingly. I'll leave off with the code for a simple demo of this that you can run if you're on an x86_64 Linux distribution:
To build this, you'll need to have cargo-xbuild installed, since Rust doesn't have a bare-metal x86_64 target by default, so we need to define a custom target.
Project layout:
std
|---- src/
      |---- lib.rs
      |---- entry.rs
test_bin
|---- bare-x86_64.json
|---- Cargo.toml
|---- src/
      |---- main.rs
Cargo.toml (workspace members)
std/src/lib.rs
#![feature(asm, lang_items, prelude_import)]
#![no_std]

mod entry;

#[prelude_import]
pub use prelude::rust_2018::*;

pub mod prelude {
    pub mod rust_2018 {
        pub use crate::{fancy_apis::MyFancyUsefulStruct, print};
        pub use core::prelude::v1::*;
    }
}

pub mod fancy_apis {
    #[derive(Debug)]
    pub struct MyFancyUsefulStruct;

    pub fn count_chars(s: &str) -> usize {
        s.chars().count()
    }
}

pub fn add(x: usize, y: usize) -> usize {
    x + y
}

pub fn print(s: &str) {
    unsafe {
        asm!(
            "syscall",
            // rax holds the syscall number (1 = write) on entry and the
            // return value on exit, so it has to be marked as overwritten.
            inlateout("rax") 1usize => _,
            in("rdi") 1usize, // fd 1 = stdout
            in("rsi") s.as_ptr(),
            in("rdx") s.len(),
            // The syscall instruction itself clobbers rcx and r11.
            out("rcx") _,
            out("r11") _,
        );
    }
}

#[panic_handler]
fn panic_handler(_: &core::panic::PanicInfo) -> ! {
    loop {}
}
std/src/entry.rs
#[no_mangle]
unsafe extern "C" fn _start() -> ! {
    extern "C" {
        fn main(argc: isize, argv: *const *const u8) -> isize;
    }

    main(0, core::ptr::null());

    // exit(0)
    asm!(
        "syscall",
        in("rax") 60,
        in("rdi") 0,
        options(noreturn)
    );
}

#[lang = "start"]
fn lang_start<T>(main: fn() -> T, _: isize, _: *const *const u8) -> isize {
    main();
    0
}
test_bin/bare-x86_64.json
{
  "llvm-target": "x86_64-unknown-none",
  "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
  "arch": "x86_64",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32",
  "os": "none",
  "executables": true,
  "linker-flavor": "ld.lld",
  "linker": "rust-lld",
  "panic-strategy": "abort",
  "disable-redzone": true,
  "features": "-mmx,-sse,+soft-float"
}
test_bin/Cargo.toml
[package]
name = "test_bin"
version = "0.1.0"
authors = ["you <you@example.com>"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
std = { path = "../std" }
test_bin/src/main.rs
fn main() {
    print("Hello, world!\n");
}
Output:
repnop@snek ~/d/m/test_custom_std> cargo +nightly xbuild -p test_bin --target test_bin/bare-x86_64.json; target/bare-x86_64/debug/test_bin
Compiling test_bin v0.1.0 (/home/repnop/dev/misc/test_custom_std/test_bin)
Finished dev [unoptimized + debuginfo] target(s) in 0.31s
Hello, world!