Testing a custom Rust malloc for C

Writing a malloc implementation for C using Rust, and testing it using mtrace.

Published at: 2017-12-31

Note

The repo in this post is https://github.com/sevagh/konfiscator. The original blog post and repo were very flawed and amateurish. The repo konfiscator has since been totally rewritten (the new idea is to expose Prometheus metrics from the allocator).

Check out the repo konfiscator for the source code.

As an educational exercise, I’m working through a university assignment on implementing my own version of malloc.

To keep myself motivated on side projects, I like to mix things I’m familiar with with things I’m not familiar with. I’ve done some Rust interop with C before.

Thus, konfiscator was born - rewriting malloc for C using Rust.

Replacing malloc in glibc

From the GNU guideline for replacing malloc:

The minimum set of functions which has to be provided by a custom malloc is given in the table below. [malloc, free, calloc, realloc] These malloc-related functions are required for the GNU C Library to work.

Code

My Rust malloc signature looks like this:

#![crate_type = "cdylib"]

use libc::{c_void, size_t};

#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
   <implementation here>
}

The crate type cdylib produces a dynamic library (i.e. .so file) for C.

The testbench is malloc.c:

#include "./malloc.h"
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "Usage: malloc <size in bytes>\n");
        exit(1);
    }

    errno = 0; /* strtol only sets errno on failure, so clear it first */
    long int desired_size = strtol(argv[1], NULL, 10);
    if ((errno == EINVAL) || (errno == ERANGE))  {
        fprintf(stderr, "%s couldn't be converted to an int\n", argv[1]);
        exit(1);
    }

    malloc(desired_size);

    return 0;
}

I use LD_PRELOAD to override glibc malloc with my own malloc:

sevagh:konfiscator $ gcc ./malloc.c -o malloc
sevagh:konfiscator $ LD_PRELOAD=./target/debug/libkonfiscator.so ./malloc 512

From man ld.so:

LD_PRELOAD A list of additional, user-specified, ELF shared objects to be loaded before all others.

The problem with this is that malloc is so fundamental that running anything with LD_PRELOAD=my_shitty_malloc.so will typically fuck everything up, so set it per command as above and don’t export it in your shell.

Initial mistakes

Infinite loop

This was an early placeholder. I just wanted to ensure my code worked:

use libc::{c_void, size_t};

#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
    libc::malloc(size)
}

My code hung indefinitely. gdb would hang as well:

(gdb) set environment LD_PRELOAD=./target/debug/libkonfiscator.so
(gdb) file ./malloc
(gdb) run

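On the #rust IRC channel (irc.mozilla.org), somebody guessed that it was probably an infinite loop: my library exports the symbol malloc, so the call to libc::malloc resolves right back to my own function: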

fn malloc() {
    malloc()
}
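One common way out of this recursion is to ask the dynamic linker for the next malloc in the lookup order via dlsym with RTLD_NEXT. The following is only a sketch under assumptions, not what konfiscator does: forward_malloc is a hypothetical name (in the real library this would be the body of the exported malloc), the RTLD_NEXT value is the glibc one, and the libc functions are declared by hand instead of going through the libc crate. Be aware that on some systems dlsym may itself allocate, which this sketch does not handle.

```rust
use std::ffi::c_void;
use std::mem;
use std::os::raw::c_char;
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// `unsafe extern` needs Rust 1.82+; older compilers accept plain `extern "C"`.
unsafe extern "C" {
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
}

// Assumption: RTLD_NEXT is (void *)-1, as in glibc's <dlfcn.h>.
const RTLD_NEXT: *mut c_void = -1isize as *mut c_void;

type MallocFn = unsafe extern "C" fn(usize) -> *mut c_void;

// Cached address of the next `malloc` in the lookup order (null = not resolved yet).
static REAL_MALLOC: AtomicPtr<c_void> = AtomicPtr::new(ptr::null_mut());

/// Hypothetical stand-in for the body of the exported `malloc`:
/// forwards to the next `malloc` the dynamic linker knows about.
pub fn forward_malloc(size: usize) -> *mut c_void {
    let mut addr = REAL_MALLOC.load(Ordering::Relaxed);
    if addr.is_null() {
        let name = b"malloc\0".as_ptr() as *const c_char;
        addr = unsafe { dlsym(RTLD_NEXT, name) };
        if addr.is_null() {
            // No further `malloc` found; report failure like malloc would.
            return ptr::null_mut();
        }
        REAL_MALLOC.store(addr, Ordering::Relaxed);
    }
    // Turn the looked-up address back into a callable function pointer.
    let real: MallocFn = unsafe { mem::transmute::<*mut c_void, MallocFn>(addr) };
    unsafe { real(size) }
}
```

Because the lookup starts after the current object, the call lands in glibc’s malloc rather than looping back into the preloaded one.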

Even println uses malloc

I had a println!() statement in my Rust malloc, but it didn’t print anything because, again, println! probably calls malloc somewhere deep down.
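A way to get debug output without going through println! is to format into a stack buffer and write the bytes straight to file descriptor 2. This is a minimal sketch, assuming a Unix target; fmt_usize, debug_write and log_malloc are hypothetical helper names, and I am assuming (not guaranteeing) that File::write_all does no heap allocation of its own.

```rust
use std::fs::File;
use std::io::Write;
use std::mem::ManuallyDrop;
use std::os::unix::io::FromRawFd;

/// Format `n` in decimal into a caller-provided stack buffer.
/// Returns the slice of `buf` that holds the digits.
pub fn fmt_usize(mut n: usize, buf: &mut [u8; 20]) -> &[u8] {
    let mut i = buf.len();
    loop {
        i -= 1;
        buf[i] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    &buf[i..]
}

/// Write raw bytes to stderr (fd 2) without a buffered writer.
pub fn debug_write(bytes: &[u8]) {
    // ManuallyDrop keeps the File from closing stderr when it goes out of scope.
    let mut stderr = ManuallyDrop::new(unsafe { File::from_raw_fd(2) });
    let _ = stderr.write_all(bytes);
}

/// Log a call such as `malloc(512)` using only stack memory.
pub fn log_malloc(size: usize) {
    let mut digits = [0u8; 20];
    debug_write(b"malloc(");
    debug_write(fmt_usize(size, &mut digits));
    debug_write(b")\n");
}
```

Calling log_malloc(size) at the top of the allocator would then show each request on stderr.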

Using vec as malloc

I got this idea from a Reddit post by brson - using Rust’s Vec as a very easy malloc. The underlying logic uses jemalloc and a slew of complicated things, but it’s just a one-liner in my Rust malloc:

use libc::{c_void, size_t};
use std::mem::drop;

#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
    (&mut vec![0u8; size]).as_mut_ptr() as *mut c_void
}

#[no_mangle]
pub extern "C" fn free(ptr: *mut c_void) {
    drop(ptr);
}

Testing my malloc with mtrace

First, I tried valgrind, which doesn’t actually help here: Valgrind replaces malloc with its own implementation to do its profiling. This Stack Overflow answer demystified it.

The answer mentioned mtrace as an alternative, and I followed the steps on the Wikipedia article:

sevagh:konfiscator $ export MALLOC_TRACE=./malloc_trace_out.txt
sevagh:konfiscator $ cat malloc.c
[...]
#include <mcheck.h>
    [...]

    mtrace();
    malloc(desired_size);
    muntrace();
    [...]
sevagh:konfiscator $ ./malloc 512
sevagh:konfiscator $
sevagh:konfiscator $
sevagh:konfiscator $ mtrace malloc malloc_trace_out.txt

Memory not freed:
-----------------
           Address     Size     Caller
0x00000000018dbd00    0x200  at 0x7f845d5c3f4b

0x200 is 512 in decimal, which is good: this is what we malloced. However, I still wasn’t sure whether LD_PRELOAD was working, so I hardcoded the size instead:

(&mut vec![0u8; 1337]).as_mut_ptr() as *mut c_void

Instead of using the size parameter, I just malloced 1337, and ran mtrace again:

sevagh:konfiscator $ mtrace malloc malloc_trace_out.txt

Memory not freed:
-----------------
           Address     Size     Caller
0x00000000018dbd00    0x539  at 0x7f845d5c3f4b

0x539 is 1337 in decimal - success!

Free not working and clippy to the rescue

So, after adding a call to free() in my C code, mtrace still reported that the memory was not freed. Running clippy gave me an idea of what I was doing wrong:

warning: calls to `std::mem::drop` with a value that implements Copy. Dropping a copy leaves the original intact.
  --> src/vec.rs:11:5
   |
11 |     drop(ptr);
   |     ^^^^^^^^^
   |
   = note: #[warn(drop_copy)] on by default
note: argument has type *mut libc::c_void
  --> src/vec.rs:11:10
   |
11 |     drop(ptr);
   |          ^^^

The C code with the free isn’t very different from what it was before:

mtrace();
void *x = malloc(desired_size);
free(x);
muntrace();

I adjusted the Rust code of free() to this, which made clippy happy:

#[no_mangle]
pub extern "C" fn free(ptr: c_void) {
    drop(ptr);
}

I expected this to work, but unfortunately it didn’t; neither a raw pointer nor a c_void value owns the allocation, so drop() releases nothing:

sevagh:konfiscator $ ./malloc 512
sevagh:konfiscator $ mtrace malloc mtrace.txt

Memory not freed:
-----------------
           Address     Size     Caller
0x000000000184e6a0    0x200  at 0x7fe9835e1f4b
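The clippy warning can be reproduced in isolation. Here is a small sketch (Tracked and drop_demo are names I made up for illustration) that counts destructor runs: dropping a raw pointer only discards a copy of the address, while rebuilding the owning Box and dropping that actually runs the destructor and frees the memory.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times Tracked's destructor has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked(u64);

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns (destructor runs after dropping the raw pointer,
///          destructor runs after dropping the rebuilt Box).
#[allow(unknown_lints, dropping_copy_types)]
pub fn drop_demo() -> (usize, usize) {
    let start = DROPS.load(Ordering::SeqCst);
    // Hand ownership of a heap allocation over to a raw pointer.
    let ptr: *mut Tracked = Box::into_raw(Box::new(Tracked(42)));

    // Raw pointers are Copy: this drops a copy of the address, nothing else.
    drop(ptr);
    let after_raw = DROPS.load(Ordering::SeqCst) - start;

    // Rebuilding the Box restores ownership, so dropping it frees the value.
    unsafe { drop(Box::from_raw(ptr)) };
    let after_box = DROPS.load(Ordering::SeqCst) - start;

    (after_raw, after_box)
}
```

The same reasoning applies to the free() above: it never gets back an owning value, so there is nothing for drop() to release.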

Reading through the Vec docs, I saw the function from_raw_parts(), which seemed promising.

That’s work for the future. Also, I intend to write brk/sbrk and mmap/munmap implementations.
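As a rough idea of where the from_raw_parts() direction could go, here is a sketch under my own assumptions, not the code from the repo. The Vec is kept alive with ManuallyDrop instead of being dropped at the end of the expression, its capacity is stashed in a header block in front of the returned pointer, and the free side rebuilds the Vec from that header. Block, vec_malloc and vec_free are hypothetical names, and overflow on huge sizes is ignored. Note that the Vec still gets its memory from Rust’s global allocator, so this only illustrates the ownership hand-off.

```rust
use std::mem::ManuallyDrop;

/// 16-byte, 16-aligned unit so returned pointers are suitably aligned.
#[repr(C, align(16))]
#[derive(Clone, Copy)]
struct Block([u8; 16]);

/// Allocate at least `size` bytes backed by a Vec that is deliberately not dropped.
pub fn vec_malloc(size: usize) -> *mut u8 {
    // One extra leading block stores the capacity that vec_free needs later.
    let blocks = 1 + (size + 15) / 16;
    let mut v = ManuallyDrop::new(vec![Block([0; 16]); blocks]);
    let base = v.as_mut_ptr();
    unsafe {
        (base as *mut usize).write(v.capacity());
        // Hand out the memory just past the header block.
        base.add(1) as *mut u8
    }
}

/// Free a pointer previously returned by `vec_malloc`.
///
/// Safety: `ptr` must be null or come from `vec_malloc`, and must not be freed twice.
pub unsafe fn vec_free(ptr: *mut u8) {
    if ptr.is_null() {
        return;
    }
    unsafe {
        // Step back to the header block and read the stored capacity.
        let base = (ptr as *mut Block).sub(1);
        let cap = (base as *const usize).read();
        // Rebuild the Vec so that dropping it releases the allocation.
        drop(Vec::from_raw_parts(base, 0, cap));
    }
}
```

The header trick is what lets free() work with nothing but the pointer, which is all the C side ever passes in.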

Doing ugly things with build.rs

I’m tired of Makefiles and shell scripts to build stuff. cargo doesn’t support a post-build script. There are some RFCs you can find, and the discussions are fair - cargo isn’t meant to be Yet Another Build Tool.

I used the pre-build build.rs to perform the following steps:

1. Invoke gcc to compile the malloc testbench
2. Export LD_PRELOAD in the calling shell

Step 2 wasn’t easy. I used nix::getppid() to get the parent pid. Then, I used /proc/{ppid}/status to get the ppid of the ppid - i.e., the parent pid of the parent pid. This works on Linux because the call hierarchy goes like this:

sevagh:konfiscator $ cargo build
shell -> cargo -> build.rs
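The /proc lookup can be sketched roughly like this, assuming the usual Linux status format with a line of the form "PPid:<tab><number>". parse_ppid and parent_of are hypothetical helper names, not the ones from the repo.

```rust
use std::fs;

/// Extract the parent pid from the contents of a /proc/<pid>/status file.
pub fn parse_ppid(status: &str) -> Option<i32> {
    status
        .lines()
        // Only the "PPid:" line matches; the "Pid:" line has a different prefix.
        .find_map(|line| line.strip_prefix("PPid:"))
        .and_then(|rest| rest.trim().parse().ok())
}

/// Look up the parent of an arbitrary pid via procfs (Linux only).
pub fn parent_of(pid: i32) -> Option<i32> {
    let status = fs::read_to_string(format!("/proc/{}/status", pid)).ok()?;
    parse_ppid(&status)
}
```

Calling parent_of on the pid returned by getppid() gives the pid of the shell that launched cargo.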

Then, to inject an environment variable into a pid that’s not your child, I used gdb:

(gdb) attach <pppid>
(gdb) call putenv ("LD_PRELOAD=./target/debug/libkonfiscator.so")
(gdb) detach

The path of the compiled .so file is either target/debug or target/release. Fortunately, from within build.rs, the environment variable $PROFILE (again, another tip learned from the Rust IRC channel) tells you whether the build is debug or release.

fn _inject_env_var(pid: i32, k: &str, v: &str) {
    let gdb_in = format!("attach {}\ncall putenv (\"{}={}\")\ndetach\n", pid, k, v);
    _exec(&format!("gdb"), Some(&gdb_in)).unwrap();
}
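The _exec helper isn’t shown here, so the following is only a guess at its shape: a sketch that spawns a program, feeds it a script on stdin and collects its stdout, next to helpers that build the .so path from the profile and the gdb script from the snippet above. preload_path, gdb_script and exec_with_stdin are all hypothetical names. In build.rs, the profile string would come from std::env::var("PROFILE").

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Path of the compiled library for a given cargo profile ("debug" or "release").
pub fn preload_path(profile: &str) -> String {
    format!("./target/{}/libkonfiscator.so", profile)
}

/// The gdb commands that set `key=value` in the environment of process `pid`.
pub fn gdb_script(pid: i32, key: &str, value: &str) -> String {
    format!("attach {}\ncall putenv (\"{}={}\")\ndetach\n", pid, key, value)
}

/// Run `program`, optionally piping `input` to its stdin, and return its stdout.
pub fn exec_with_stdin(program: &str, input: Option<&str>) -> io::Result<String> {
    let mut child = Command::new(program)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // Taking stdin out of the child closes the pipe at the end of this block,
    // so the program sees EOF after the script.
    if let Some(mut stdin) = child.stdin.take() {
        if let Some(text) = input {
            stdin.write_all(text.as_bytes())?;
        }
    }
    let output = child.wait_with_output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

With these pieces, the injection step amounts to passing gdb_script(pppid, "LD_PRELOAD", &preload_path(&profile)) as the input of exec_with_stdin("gdb", ...).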