Testing a custom Rust malloc for C

Writing a malloc implementation for C using Rust, and testing it using mtrace.

Check out the repo konfiscator for the source code.

As an educational exercise, I’m working through a university assignment on implementing my own version of malloc.

To keep myself motivated on side projects, I like to mix things I’m familiar with with things I’m not familiar with. I’ve done some Rust interop with C before.

Thus, konfiscator was born - rewriting malloc for C using Rust.

Replacing malloc in glibc

From the GNU guideline for replacing malloc:

The minimum set of functions which has to be provided by a custom malloc is given in the table below. [malloc, free, calloc, realloc] These malloc-related functions are required for the GNU C Library to work.

Code

My Rust malloc signature looks like this:

#![crate_type = "cdylib"]

use libc::{c_void, size_t};

#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
    // implementation goes here
    unimplemented!()
}

The crate type cdylib produces a dynamic library (a .so file on Linux) that C programs can load.

The testbench is malloc.c:

#include "./malloc.h"
#include <errno.h>
#include <stdio.h>
#include <stdlib.h> /* exit, strtol, malloc */

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "Usage: malloc <size in bytes>\n");
        exit(1);
    }

    errno = 0; /* strtol only sets errno on failure, so clear any stale value first */
    long int desired_size = strtol(argv[1], NULL, 10);
    if ((errno == EINVAL) || (errno == ERANGE))  {
        fprintf(stderr, "%s couldn't be converted to an int\n", argv[1]);
        exit(1);
    }

    malloc(desired_size);

    return 0;
}

I use LD_PRELOAD to override glibc malloc with my own malloc:

sevagh:konfiscator $ gcc ./malloc.c -o malloc
sevagh:konfiscator $ LD_PRELOAD=./target/debug/libkonfiscator.so ./malloc 512

From man ld.so:

LD_PRELOAD A list of additional, user-specified, ELF shared objects to be loaded before all others.

The problem with this is that malloc is so fundamental that exporting LD_PRELOAD=my_shitty_malloc.so in your shell will typically fuck up every program you start afterwards, so don’t export it. Set it only on the command line of the one program you’re testing.
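A minimal sketch of the same idea from Rust, assuming you want to launch the testbench from code rather than from the shell: set LD_PRELOAD on the child command only, so nothing else inherits it. The helper name preloaded_command is made up for this example.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Build a command that runs `program` with LD_PRELOAD set for that child
/// process only; the parent's own environment is left untouched.
/// `preloaded_command` is a hypothetical helper, not part of konfiscator.
pub fn preloaded_command(lib: &str, program: &str, args: &[&str]) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args).env("LD_PRELOAD", lib);
    cmd
}

/// True if the command carries an explicit LD_PRELOAD override.
pub fn has_preload(cmd: &Command) -> bool {
    cmd.get_envs()
        .any(|(k, v)| k == OsStr::new("LD_PRELOAD") && v.is_some())
}
```

Calling preloaded_command("./target/debug/libkonfiscator.so", "./malloc", &["512"]).status() would then run the testbench with the override applied to that one process.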

Initial mistakes

Infinite loop

This was an early placeholder. I just wanted to ensure my code worked:

use libc::{c_void, size_t};

#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
    // libc::malloc is an unsafe extern fn, so the call needs an unsafe block
    unsafe { libc::malloc(size) }
}

My code hung indefinitely. gdb would hang as well:

(gdb) set environment LD_PRELOAD=./target/debug/libkonfiscator.so
(gdb) file ./malloc
(gdb) run

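On the #rust IRC channel (irc.mozilla.org), somebody guessed that it was probably an infinite loop. libc::malloc is only a declaration of the C symbol malloc, and my library now defines that symbol, so the call resolves right back to my own function: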

fn malloc() {
    malloc()
}
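One common way out of this kind of recursion, sketched below, is to ask the dynamic linker for the next definition of malloc in lookup order with dlsym(RTLD_NEXT, ...) and forward to that. This is not what konfiscator does. The name forward_malloc is made up, and it is deliberately not exported as malloc so the sketch can run inside an ordinary program. The RTLD_NEXT value is the glibc one and is an assumption on other platforms. dlsym is also reported to allocate in some situations, so a real interposer would need a fallback for that bootstrap window.

```rust
use std::ffi::c_void;
use std::os::raw::c_char;
use std::sync::atomic::{AtomicUsize, Ordering};

extern "C" {
    // Declared by hand so the sketch needs no external crate.
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
}

// glibc defines RTLD_NEXT as ((void *) -1l); assumed here.
const RTLD_NEXT: *mut c_void = -1isize as *mut c_void;

type MallocFn = unsafe extern "C" fn(usize) -> *mut c_void;

// Address of the next `malloc` in lookup order, resolved on first use.
static NEXT_MALLOC: AtomicUsize = AtomicUsize::new(0);

/// Forward to the next `malloc` after the current object instead of
/// calling the symbol by name (which would resolve back to ourselves
/// if we were the preloaded definition).
pub fn forward_malloc(size: usize) -> *mut c_void {
    let mut addr = NEXT_MALLOC.load(Ordering::Relaxed);
    if addr == 0 {
        let name = b"malloc\0".as_ptr() as *const c_char;
        addr = unsafe { dlsym(RTLD_NEXT, name) } as usize;
        if addr == 0 {
            // No further definition found: report failure like malloc does.
            return std::ptr::null_mut();
        }
        NEXT_MALLOC.store(addr, Ordering::Relaxed);
    }
    // A function pointer and usize have the same size on supported targets.
    let next: MallocFn = unsafe { std::mem::transmute::<usize, MallocFn>(addr) };
    unsafe { next(size) }
}
```

Caching the address in an atomic avoids a dlsym lookup on every call without needing a lock or any heap-backed state.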

Even println uses malloc

I had a println! statement in my Rust malloc, but it didn’t print anything, because, again, println! probably uses malloc somewhere deep down.
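If you still want debug output from inside an allocator, one option is to skip std’s formatting and buffering and hand a stack buffer straight to write(2). The sketch below assumes a POSIX system; fmt_hex and log_bytes are names invented for this example, not konfiscator functions.

```rust
extern "C" {
    // POSIX write(2), declared by hand so the sketch needs no external crate.
    fn write(fd: i32, buf: *const u8, count: usize) -> isize;
}

/// Format `n` as lowercase hex ("0x...") into a caller-provided stack
/// buffer and return the used part. 18 bytes fit "0x" plus 16 digits.
pub fn fmt_hex(n: usize, buf: &mut [u8; 18]) -> &[u8] {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let mut i = buf.len();
    let mut rest = n;
    // Fill the buffer from the right, least significant digit first.
    loop {
        i -= 1;
        buf[i] = DIGITS[rest & 0xf];
        rest >>= 4;
        if rest == 0 {
            break;
        }
    }
    i -= 1;
    buf[i] = b'x';
    i -= 1;
    buf[i] = b'0';
    &buf[i..]
}

/// Write raw bytes to stderr (fd 2) without any heap allocation.
/// Returns what write(2) returns: bytes written, or -1 on error.
pub fn log_bytes(msg: &[u8]) -> isize {
    unsafe { write(2, msg.as_ptr(), msg.len()) }
}
```

Inside a malloc you could then do something like log_bytes(b"malloc ") followed by log_bytes(fmt_hex(size, &mut buf)), with buf being a local [0u8; 18].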

Using vec as malloc

I got this idea from a Reddit post by brson: using Rust’s Vec as a very easy malloc. The underlying logic uses jemalloc and a slew of complicated things, but it’s just a one-liner in my Rust malloc:

use libc::{c_void, size_t};
use std::mem::drop;

#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
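    // Caveat: the Vec is a temporary and is dropped when this function returns,
    // so the pointer stays usable only if that drop never really releases the memory.
    (&mut vec![0u8; size]).as_mut_ptr() as *mut c_void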
}

#[no_mangle]
pub extern "C" fn free(ptr: *mut c_void) {
    drop(ptr);
}

Testing my malloc with mtrace

First, I tried valgrind, which doesn’t actually help here: valgrind replaces malloc with its own malloc to perform its profiling. This Stack Overflow answer demystified it.

The answer mentioned mtrace as an alternative, and I followed the steps on the Wikipedia article:

sevagh:konfiscator $ export MALLOC_TRACE=./malloc_trace_out.txt
sevagh:konfiscator $ cat malloc.c
[...]
#include <mcheck.h>
    [...] 

    mtrace();
    malloc(desired_size);
    muntrace();
    [...]
sevagh:konfiscator $ ./malloc 512
sevagh:konfiscator $
sevagh:konfiscator $
sevagh:konfiscator $ mtrace malloc malloc_trace_out.txt

Memory not freed:
-----------------
           Address     Size     Caller
0x00000000018dbd00    0x200  at 0x7f845d5c3f4b
 

0x200 is 512 in decimal, which is good: it’s exactly what we malloced. However, I still wasn’t sure if LD_PRELOAD was working, so I did this instead:

(&mut vec![0u8; 1337]).as_mut_ptr() as *mut c_void

Instead of using the size parameter, I just malloced 1337, and ran mtrace again:

sevagh:konfiscator $ mtrace malloc malloc_trace_out.txt

Memory not freed:
-----------------
           Address     Size     Caller
0x00000000018dbd00    0x539  at 0x7f845d5c3f4b

0x539 is 1337 in decimal - success!

Free not working and clippy to the rescue

So, after adding a call to free() in my C code, mtrace still figured that the memory was not freed. Running clippy gave me an idea of what I was doing wrong:

warning: calls to `std::mem::drop` with a value that implements Copy. Dropping a copy leaves the original intact.
  --> src/vec.rs:11:5
   |
11 |     drop(ptr);
   |     ^^^^^^^^^
   |
   = note: #[warn(drop_copy)] on by default
note: argument has type *mut libc::c_void
  --> src/vec.rs:11:10
   |
11 |     drop(ptr);
   |          ^^^

The C code with the free isn’t very different from what it was before:

mtrace();
void *x = malloc(desired_size);
free(x);
muntrace();

I adjusted the Rust code of free() to this, which made clippy happy:

#[no_mangle]
pub extern "C" fn free(ptr: c_void) {
    drop(ptr);
}

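I expected this would work, but unfortunately it didn’t. In hindsight that makes sense: drop only runs the destructor of the value it is handed, and neither a raw pointer nor a c_void has one, so nothing gets released. Taking c_void by value also no longer matches C’s free(void *) signature.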

sevagh:konfiscator $ ./malloc 512
sevagh:konfiscator $ mtrace malloc mtrace.txt

Memory not freed:
-----------------
           Address     Size     Caller
0x000000000184e6a0    0x200  at 0x7fe9835e1f4b

Reading through the Vec docs, I saw the function from_raw_parts() which seems promising.
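Here is a rough sketch of how from_raw_parts() could close the loop, under some assumptions of mine. free() only receives a pointer, so the allocation size has to be stashed somewhere; this version hides the Vec’s capacity in a 16-byte header in front of the returned pointer. The names vec_malloc, vec_free and Block are made up, and the functions are not exported as malloc/free so the sketch can run in a normal program. Wiring this into the real exported symbols would also require that Vec’s own allocator doesn’t route back through them.

```rust
use std::ffi::c_void;
use std::mem::ManuallyDrop;
use std::ptr;

/// 16-byte, 16-aligned unit so returned pointers are suitably aligned.
#[repr(C, align(16))]
#[derive(Clone, Copy)]
struct Block([u8; 16]);

/// Allocate `size` bytes backed by a Vec. Block 0 is a header holding the
/// Vec's capacity; the caller gets a pointer to block 1.
/// Note: an absurdly large `size` makes `vec!` panic or abort instead of
/// returning null, which a real malloc would have to handle.
pub fn vec_malloc(size: usize) -> *mut c_void {
    // Payload rounded up to whole blocks, plus one header block.
    let blocks = match size.checked_add(15) {
        Some(n) => n / 16 + 1,
        None => return ptr::null_mut(),
    };
    // ManuallyDrop keeps the Vec's buffer alive after this function returns.
    let mut v = ManuallyDrop::new(vec![Block([0; 16]); blocks]);
    let base: *mut Block = v.as_mut_ptr();
    let cap = v.capacity();
    unsafe {
        (base as *mut usize).write(cap);
        base.add(1) as *mut c_void
    }
}

/// Release memory obtained from `vec_malloc`.
/// Safety: `p` must be null or a pointer returned by `vec_malloc` that has
/// not been freed yet.
pub unsafe fn vec_free(p: *mut c_void) {
    if p.is_null() {
        return;
    }
    unsafe {
        let base = (p as *mut Block).sub(1);
        let cap = (base as *const usize).read();
        // Rebuild the Vec with length 0 (Block has no destructor) and let
        // its Drop impl hand the buffer back to the allocator.
        drop(Vec::from_raw_parts(base, 0, cap));
    }
}
```

The header costs 16 bytes per allocation, which is the price of not having to track sizes in a side table.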

That’s work for the future. Also, I intend to write brk/sbrk and mmap/munmap implementations.

Doing ugly things with build.rs

I’m tired of Makefiles and shell scripts to build stuff. cargo doesn’t support a post-build script. There are some RFCs about it, and the discussions are fair - cargo isn’t meant to be Yet Another Build Tool.

I used the pre-build build.rs to perform the following steps:

  1. Invoke gcc to compile the malloc testbench
  2. Export LD_PRELOAD in the calling shell

Step 2 wasn’t easy. I used nix::unistd::getppid() to get the parent pid. Then, I read /proc/{ppid}/status to get the ppid of the ppid, i.e. the parent pid of the parent pid. This works on Linux because the call hierarchy goes like this:

sevagh:konfiscator $ cargo build
shell -> cargo -> build.rs
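The grandparent lookup described above can be sketched with just the standard library, assuming Linux procfs. The function names are invented for this example, and std::os::unix::process::parent_id() stands in for the nix call.

```rust
use std::fs;

/// Extract the "PPid:" field from the text of /proc/<pid>/status.
pub fn parse_ppid(status: &str) -> Option<u32> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("PPid:"))
        .and_then(|rest| rest.trim().parse().ok())
}

/// Parent pid of an arbitrary process, read from procfs (Linux only).
pub fn parent_of(pid: u32) -> Option<u32> {
    let status = fs::read_to_string(format!("/proc/{}/status", pid)).ok()?;
    parse_ppid(&status)
}

/// From inside build.rs: our parent is cargo, and cargo's parent is
/// (in the simple case shown above) the shell that ran `cargo build`.
pub fn grandparent_pid() -> Option<u32> {
    parent_of(std::os::unix::process::parent_id())
}
```

If cargo is launched through a wrapper, the grandparent is the wrapper rather than your shell, so the hierarchy assumption is worth double-checking.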

Then, to inject an environment variable into a pid that’s not your child, I used gdb:

(gdb) attach <pppid>
(gdb) call putenv ("LD_PRELOAD=./target/debug/libkonfiscator.so")
(gdb) detach

The path of the compiled so file is either target/debug or target/release. Fortunately, from within build.rs, the environment variable $PROFILE (again, another tip learned from the Rust IRC channel) tells you whether the build is debug or release.
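A small sketch of picking the library path from that variable inside build.rs. The helper names are hypothetical, and the target/ directory layout is assumed to be the default one.

```rust
use std::env;
use std::path::PathBuf;

/// Path of the compiled library for a given profile ("debug" or "release"),
/// assuming cargo's default target directory.
pub fn library_path(profile: &str) -> PathBuf {
    ["target", profile, "libkonfiscator.so"].iter().collect()
}

/// Same, but reading the profile from the PROFILE variable that cargo sets
/// for build scripts. Returns None when run outside of a build script.
pub fn library_path_from_env() -> Option<PathBuf> {
    env::var("PROFILE").ok().map(|p| library_path(&p))
}
```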

fn _inject_env_var(pid: i32, k: &str, v: &str) {
    let gdb_in = format!("attach {}\ncall putenv (\"{}={}\")\ndetach\n", pid, k, v);
    // gdb reads the three commands above from its stdin
    _exec("gdb", Some(&gdb_in)).unwrap();
}
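The _exec helper isn’t shown here, so below is a guess at what such a helper could look like using only std: spawn a program, feed it a short script on stdin, and collect its stdout. run_with_stdin and gdb_script are hypothetical names; only the gdb command text comes from the snippet above.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Build the gdb command script that injects `key=value` into process `pid`.
pub fn gdb_script(pid: i32, key: &str, value: &str) -> String {
    format!("attach {}\ncall putenv (\"{}={}\")\ndetach\n", pid, key, value)
}

/// Run `program`, write `input` to its stdin, and return its stdout.
/// Writing everything before reading is fine for short scripts like the
/// one above; large inputs would need concurrent reading and writing.
pub fn run_with_stdin(program: &str, args: &[&str], input: &str) -> io::Result<String> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    {
        // Dropping the handle at the end of this block closes the pipe,
        // which the child sees as end of input.
        let mut stdin = child.stdin.take().expect("stdin was requested as piped");
        stdin.write_all(input.as_bytes())?;
    }
    let output = child.wait_with_output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

With these, the injection would read roughly as run_with_stdin("gdb", &[], &gdb_script(pid, k, v)). Attaching to a process you don’t own, or on systems with ptrace restrictions, may still fail regardless of how gdb is launched.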