Saturday 7 December 2019

Tiny Windows executable in Rust

I have recently spent a lot of time writing pixel shaders and given that I have already written a pure Rust mod player I have started to think about trying my hand at writing a 64K intro in Rust.
One of the main challenges in writing a 64K intro is to squeeze all the code and assets into 64K of memory. There are several tiny frameworks written in C++ that set up a Windows app with a modern OpenGL context. I could not find anything similar for Rust so I decided to create the smallest possible bare bones app that does just that.

First attempt at a minimal windows app

There are quite a few good crates for creating a small windows apps, such as winit. My first attempt was to create a windows program using winit and enabling all possible size optimizations to see what I end up with. If the resulting program is small enough I could declare success and stop there.
After setting up a small project using the winit example and enabling most of the size optimizations listed in johnthagen's excellent article on I ended up with the following toml file.
lto = true 
codegen-units = 1
opt-level = "z"   
panic = 'abort'

winit = "0.20.0-alpha4"
The settings in the toml file enable the have the following optimizations;
  • lto = true Enables link time optimizations which tells the compiler to optimize code generation at link time and can result in dropping code that is not used.
  • codegen-units = 1 Tells the compiler to use only one code generator instead of running several in parallel. Disabling parallel code generation makes the compilation slower but makes all optimizations possible.
  • opt-level = "z" Tells the compiler to optimize for minimal code size. This will make the code less performant but it will take up less space.
  • panic = 'abort' Stops Rust generating a helpful stack trace and panic message when it panics. With this optimization it will be much harder to figure out what went wrong when the program crashes.
With all these optimizations enabled the release build of the winit window example comes to 233 kilobytes. This is nearly 4 times as large as the entire 64K intro can be and the app does not do anything yet!


The one optimization I didn't initially enable from johnthagen's article was using xargo as it seemed like a lot of work to set up all the necessary tools and figuring out the command line but my initial result make it clear that I need to use all the available tools to minimize the executable size. ( Re-reading the xargo section I realized that the setup was actually quite simple)
Xargo tells Rust to compile its custom version of the std crate. By default a standard version is used that includes a lot of unused functionality. Enabling xargo requires a new setup file; xargo.toml that describes how to compile the std library.
std = {default-features=false,features=["panic_immediate_abort"]}
It also requires switching to nightly build and installing xargo. The command line for running xargo is the same as cargo but you also need to define the target platform ( which you can find out by looking up hostwhen running the command rustc -vV). In my case the command is
xargo build --target x86_64-pc-windows-msvc --release
With xargo the size of the executable is now 118 kilobytes. This is a big improvement but it is still too large. It is also a lot bigger than the 30kb that johnthagen achieved at this point. The big difference is that we are building a windows app and using the winit crate to access Windows functionality. Clearly, we need to get rid of the crate and access Windows API without any helper crates.

Winapi crate

The winapi crate provides foreign function interfaces to the Windows API but does not provide any coded functionality of its own. This means that the application is responsible for all of functionality not provided by Windows.
I created a new project using the previous cargo.toml and xargo.toml files as my starting point but change the dependencies section in cargo.toml to;
winapi = { version = "0.3.8", features = ["winuser", "libloaderapi" ] }
Earlier versions of the winapi crate split the functionality across multiple crate but now some parts of the Windows API are enabled using 'features'. The APIs I need are in winuser and libloaderapi.

The main function

I learned a lot about setting up the plain Window in Rust from TheSatoshiChiba's gist and lot of the code here is based on his example.
The top of the file has the following statements;
#![windows_subsystem = "windows"]
The first line tells the compiler the the code will provide its own C entry point to the code. Normally this would be provided by the std crate but that uses additional std functionality that bloats the executable.
The second line defines this as a windows program that does not have a console window. Some older Rust Windows applications needed additional functionality to close the console window. This eliminates the need for that. There is full explanation on
The actual main functions is;
pub fn main(_argc: i32, _argv: *const *const u8) {
    let mut window = create_window(  );

    loop {
        if !handle_message( &mut window ) {
The line #[no_mangle] turns off the Rust name mangling making it conform with the signature expected for the the main functions. The rest of the code 'just' creates a window and enters a message processing loop.

Character sets with W and A funtions

There are two variants of many Windows functions; the W and A variants. The difference is in which character set they use. The W versions use wide characters strings (16 bit unicode ) and the A versions use ANSI characters.
Rust natively uses UTF-8 for its strings. For the first 127 characters UTF-8 and ANSI have the same encoding, so if I stick with those characters I can just assume that they are the same.
If I was writing an interactive application that needs to run in different regions I should really be using the W versions but because I intend to use this for writing a 64K intro I am going to use the A version because it makes the required string handling much easier.

Creating the window

All of the interaction with the Windows API must be marked as unsafe code because Rust cannot verify that the code satisfies all the usual Rust rules regarding ownership and initialization. All of the code dealing with the Windows API is surrounded by unsafe{ ... } block.
The window creation function performs two operations; registering a new window class, and creating a windoww of that class.
The code for registering the window class is
let hinstance = GetModuleHandleA( 0 as *const i8 );
let wnd_class = WNDCLASSA {
    style : CS_OWNDC | CS_HREDRAW | CS_VREDRAW,     
    lpfnWndProc : Some( window_proc ),
    hInstance : hinstance,
    lpszClassName : "MyClass\0".as_ptr() as *const i8,
    cbClsExtra : 0,         
    cbWndExtra : 0,
    hIcon: 0 as HICON,
    hCursor: 0 as HICON,
    hbrBackground: 0 as HBRUSH,
    lpszMenuName: 0 as *const i8,
RegisterClassA( &wnd_class );
The first line gets the handle to the module that launched the current module. In a C program this would be passed into winMain as an argument but calling GetModuleHandleA with a null argument returns the same handle.
The rest of the function uses the WNDCLASSA constructor to setup windows class object. Because I am using A variants of the Windows API I can directly use the string constant without any conversion. The one details is that Rust string slices are not null terminated which is why I have added the zero to the end of the class name.
The rest of the create_window function creates the actual window and returns the handle.
        let handle = CreateWindowExA(
            0,                                  // dwExStyle 
            "MyClass\0".as_ptr() as *const i8, // class we registered.
            "MiniWIN\0".as_ptr() as *const i8,  // title
            WS_OVERLAPPEDWINDOW | WS_VISIBLE, // dwStyle
            // size and position
            0 as HWND,                  // hWndParent
            0 as HMENU,                 // hMenu
            hinstance,                  // hInstance
            0 as LPVOID );              // lpParam

The message pump

The second part of the code is the message handler which reads messages from the message queue and processes them.
fn handle_message( window : HWND ) -> bool {
    unsafe {
        let mut msg = MaybeUninit::uninit();
        if GetMessageA( msg.as_mut_ptr(), window, 0, 0 ) > 0 {
            TranslateMessage( msg.as_ptr() ); 
            DispatchMessageA( msg.as_ptr() ); 
        } else {
The noteworthy feature here is the use of MaybeUninit::uninit() which creates a structure on the stack without initializing it. Under normal Rust safety rules I would have to use the constructor to create the structure but here the msg structure is filled in by GetMessageA. Because a lot of Windows APIs follow this pattern it is nice to be able to use MaybeUninit::uninit() to avoid having to call constructors with dummy values.

The window_proc

The final piece is to create the window proc that handles all the messges that are routed to it. Strictly speaking this is not necessary because we could just have passed DefWindowProcA to RegisterClassA but I want to see something that proves that this is a well functioning normal Windows app.
pub unsafe extern "system" fn windowProc(hwnd: HWND,
    msg: UINT, wParam: WPARAM, lParam: LPARAM) -> LRESULT {

    match msg {
        winapi::um::winuser::WM_PAINT => {
            let mut paint_struct = MaybeUninit::uninit();
            let mut rect = MaybeUninit::uninit();
            let hdc = BeginPaint(hwnd, paint_struct.as_mut_ptr());
            GetClientRect(hwnd, rect.as_mut_ptr());
            DrawTextA(hdc, "Test\0".as_ptr() as *const i8, -1, rect.as_mut_ptr(), DT_SINGLELINE | DT_CENTER | DT_VCENTER);
            EndPaint(hwnd, paint_struct.as_mut_ptr());
        winapi::um::winuser::WM_DESTROY => {
        _ => { return DefWindowProcA(hwnd, msg, wParam, lParam); }
    return 0;

The shrinking executable

The new program comes to 10 kilobytes (or precisely 10240 bytes). This is a big improvement and takes the excutable into the range required for 64K intro. ( An exe packer like UPX reduces the size of the executable into 7680 bytes)
I had a look at the produced exe file and it looks like the small parts of the std library that do get used bring in a lot of extra code. It is time to completely jettison std library.
I add the following line to the of to completely break off any dependency on std
Removing std causes two bits of automatically provided functionality to go missing. First, the panic handler is missing. This can be added with the following code;
pub extern fn panic( _info: &PanicInfo ) -> ! { loop {} }
The second issue is that we do not have a mainCRTStartup. This does work like initializing the runtime, getting the arguments etc. before calling main. None of that is required for the app so I can just repurpose the old main function into mainCRTStartup. The new function looks like;
pub extern "system" fn mainCRTStartup() {
    let window = create_window(  );
    loop {
        if !handle_message( window ) {
Without std the executable is now 3584 bytes. The executable still looks quite padded out so the executable can probably still be made smaller yet but I think this requires more control of the linker. I might yet return to that but for now I think the executable is small enough to be a starting point for a 64K intro.

Next steps

The next stage is to setup a moden OpenGL rendering context without growing the code too much. All the code is on github

1 comment:

Rust Game Development

  Ever since I started learning Rust my intention has been to use it for writing games. In many ways it seems like the obvious choice but th...