Testing a custom Rust malloc for C
Writing a malloc implementation for C using Rust, and testing it using mtrace.
Note
The repo for this post is https://github.com/sevagh/konfiscator. The original blog post and repo were very flawed and amateurish. The konfiscator repo has since been totally rewritten (the new idea is to expose Prometheus metrics from the allocator).
Check out the repo konfiscator for the source code.
As an educational exercise, I’m going through a malloc university assignment to implement my own version of malloc.
To keep myself motivated on side projects, I like to mix things I’m familiar with with things I’m not familiar with. I’ve done some Rust interop with C before.
Thus, konfiscator was born - rewriting malloc for C using Rust.
Replacing malloc in glibc
From the GNU guideline for replacing malloc:
The minimum set of functions which has to be provided by a custom malloc is given in the table below. [malloc, free, calloc, realloc] These malloc-related functions are required for the GNU C Library to work.
Code
My Rust malloc signature looks like this:
#![crate_type = "cdylib"]
use libc::{c_void, size_t};
#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
    <implementation here>
}
The crate type cdylib produces a dynamic library (i.e. .so file) for C.
The testbench is malloc.c:
#include "./malloc.h"
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "Usage: malloc <size in bytes>\n");
        exit(1);
    }
    /* strtol only reports failure through errno and endptr, so reset errno first */
    char *end = NULL;
    errno = 0;
    long int desired_size = strtol(argv[1], &end, 10);
    if (errno != 0 || end == argv[1] || desired_size < 0) {
        fprintf(stderr, "%s isn't a valid size\n", argv[1]);
        exit(1);
    }
    malloc(desired_size);
    return 0;
}
I use LD_PRELOAD to override glibc malloc with my own malloc:
sevagh:konfiscator $ gcc ./malloc.c -o malloc
sevagh:konfiscator $ LD_PRELOAD=./target/debug/libkonfiscator.so ./malloc
From man ld.so:
LD_PRELOAD A list of additional, user-specified, ELF shared objects to be loaded before all others.
The problem with this is that malloc is so fundamental that running anything with LD_PRELOAD=my_shitty_malloc.so will typically fuck everything up, so don’t export it shell-wide.
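One way to keep the override from leaking into everything else is to set LD_PRELOAD on a single child process only. Here is a minimal Rust sketch of that idea, assuming the library path from the command above. `preloaded_command` is a hypothetical helper name, not something from the repo.

```rust
use std::process::Command;

/// Build a command that sees LD_PRELOAD without exporting the variable
/// to the surrounding shell or to any other process.
pub fn preloaded_command(program: &str, preload: &str) -> Command {
    let mut cmd = Command::new(program);
    // This only affects the child spawned from `cmd`; the parent's
    // environment is left untouched.
    cmd.env("LD_PRELOAD", preload);
    cmd
}
```

Something like `preloaded_command("./malloc", "./target/debug/libkonfiscator.so").arg("512").status()` would then run the testbench with the custom malloc and nothing else.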
Initial mistakes
Infinite loop
This was an early placeholder. I just wanted to ensure my code worked:
use libc::{c_void, size_t};
#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
    // libc::malloc is an unsafe fn, so the call needs an unsafe block
    unsafe { libc::malloc(size) }
}
My code hung indefinitely. gdb would hang as well:
(gdb) set environment LD_PRELOAD=./target/debug/libkonfiscator.so
(gdb) file ./malloc
(gdb) run
On the #rust IRC channel (irc.mozilla.org), somebody guessed that it was probably an infinite loop, since libc::malloc resolves to the very malloc symbol I was exporting:
fn malloc() {
    malloc()
}
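A common way around this kind of recursion in LD_PRELOAD shims is to ask the dynamic loader for the next definition of the symbol with dlsym(RTLD_NEXT, ...) instead of calling malloc by name. The sketch below is not what konfiscator does. It assumes Linux with glibc, declares dlsym by hand rather than pulling in the libc crate, and `next_symbol`, `next_malloc` and `next_free` are hypothetical names. Note that dlsym may itself allocate, so a real shim needs extra care during startup.

```rust
use std::ffi::{c_char, c_void, CStr};
use std::mem;

extern "C" {
    // void *dlsym(void *handle, const char *symbol);
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
}

// glibc defines RTLD_NEXT as ((void *) -1): search the objects that come
// after the caller in the lookup order.
const RTLD_NEXT: *mut c_void = -1isize as *mut c_void;

type MallocFn = unsafe extern "C" fn(usize) -> *mut c_void;
type FreeFn = unsafe extern "C" fn(*mut c_void);

fn next_symbol(name: &CStr) -> *mut c_void {
    unsafe { dlsym(RTLD_NEXT, name.as_ptr()) }
}

/// Look up the next `malloc` in the symbol search order.
pub fn next_malloc() -> Option<MallocFn> {
    let sym = next_symbol(CStr::from_bytes_with_nul(b"malloc\0").ok()?);
    if sym.is_null() {
        None
    } else {
        Some(unsafe { mem::transmute::<*mut c_void, MallocFn>(sym) })
    }
}

/// Look up the matching `free`.
pub fn next_free() -> Option<FreeFn> {
    let sym = next_symbol(CStr::from_bytes_with_nul(b"free\0").ok()?);
    if sym.is_null() {
        None
    } else {
        Some(unsafe { mem::transmute::<*mut c_void, FreeFn>(sym) })
    }
}
```

Inside an exported malloc, calling the function returned by `next_malloc()` would reach the C library's allocator rather than looping back into the shim.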
Even println uses malloc
I had a println!{} statement in my Rust malloc, but it didn’t print anything, because, again, println probably uses malloc somewhere deep down.
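For debugging from inside an allocator, one option is to skip Rust's buffered stdout entirely and call write(2) on a static byte string, which needs no heap allocation on the Rust side. This is a minimal sketch assuming a POSIX system. `raw_write` is a hypothetical helper, and write is declared by hand here rather than taken from the libc crate.

```rust
use std::ffi::c_void;

extern "C" {
    // POSIX: ssize_t write(int fd, const void *buf, size_t count);
    fn write(fd: i32, buf: *const c_void, count: usize) -> isize;
}

/// Write `msg` straight to a file descriptor without formatting or
/// buffering. Returns the number of bytes written, or -1 on error.
pub fn raw_write(fd: i32, msg: &[u8]) -> isize {
    unsafe { write(fd, msg.as_ptr() as *const c_void, msg.len()) }
}
```

A call such as `raw_write(2, b"in malloc\n")` prints to stderr without going through println.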
Using vec as malloc
I got this idea from a Reddit post by brson - using Rust’s vec as a very easy malloc. The underlying logic uses jemalloc and a slew of complicated things, but it’s just a one-liner in my Rust malloc:
use libc::{c_void, size_t};
use std::mem::drop;
#[no_mangle]
pub extern "C" fn malloc(size: size_t) -> *mut c_void {
    (&mut vec![0u8; size]).as_mut_ptr() as *mut c_void
}
#[no_mangle]
pub extern "C" fn free(ptr: *mut c_void) {
    drop(ptr);
}
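One caveat with the one-liner: the vec! temporary is dropped when the function returns, so the pointer handed back to C refers to a buffer Rust has already released. A minimal sketch of handing ownership out instead, using a boxed slice and Box::into_raw. `leak_buffer` and `reclaim_buffer` are hypothetical names, and the caller has to remember the size, which C's free() never gets.

```rust
use std::ffi::c_void;
use std::ptr;

/// Allocate `size` zeroed bytes and give up ownership, so the buffer
/// stays alive after this function returns.
pub fn leak_buffer(size: usize) -> *mut c_void {
    let boxed: Box<[u8]> = vec![0u8; size].into_boxed_slice();
    Box::into_raw(boxed) as *mut u8 as *mut c_void
}

/// Take ownership back and release the buffer.
///
/// Safety: `p` must come from `leak_buffer(size)` with the same `size`
/// and must not be used afterwards.
pub unsafe fn reclaim_buffer(p: *mut c_void, size: usize) {
    if p.is_null() {
        return;
    }
    unsafe {
        let slice = ptr::slice_from_raw_parts_mut(p as *mut u8, size);
        drop(Box::from_raw(slice));
    }
}
```

Since free() only receives the pointer, a real allocator has to record the size somewhere itself.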
Testing my malloc with mtrace
First, I used valgrind, which doesn’t actually help. Valgrind replaces malloc with its own malloc to perform its profiling. This Stack Overflow answer demystified it.
The answer mentioned mtrace as an alternative, and I followed the steps on the Wikipedia article:
sevagh:konfiscator $ export MALLOC_TRACE=./malloc_trace_out.txt
sevagh:konfiscator $ cat malloc.c
[...]
#include <mcheck.h>
[...]
mtrace();
malloc(desired_size);
muntrace();
[...]
sevagh:konfiscator $ ./malloc 512
sevagh:konfiscator $
sevagh:konfiscator $
sevagh:konfiscator $ mtrace malloc malloc_trace_out.txt
Memory not freed:
-----------------
Address Size Caller
0x00000000018dbd00 0x200 at 0x7f845d5c3f4b
0x200 in hex is 512 bytes, which is good. This is what we malloced. However, I still wasn’t sure if LD_PRELOAD was working, so I did this instead:
(&mut vec![0u8; 1337]).as_mut_ptr() as *mut c_void
Instead of using the size parameter, I just malloced 1337, and ran mtrace again:
sevagh:konfiscator $ mtrace malloc malloc_trace_out.txt
Memory not freed:
-----------------
Address Size Caller
0x00000000018dbd00 0x539 at 0x7f845d5c3f4b
0x539 in hex is 1337 - success!
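To avoid doing the hex conversion by eye, the "Memory not freed" rows can be parsed mechanically. A small sketch, assuming rows shaped like the ones above (address, size, then the caller). `parse_leak_line` is a hypothetical helper, not part of mtrace.

```rust
/// Parse one row of mtrace's "Memory not freed" table into
/// (address, size in bytes). Returns None for headers and other lines.
pub fn parse_leak_line(line: &str) -> Option<(u64, u64)> {
    let mut fields = line.split_whitespace();
    let addr = parse_hex(fields.next()?)?;
    let size = parse_hex(fields.next()?)?;
    Some((addr, size))
}

/// Parse a "0x"-prefixed hexadecimal number.
fn parse_hex(field: &str) -> Option<u64> {
    u64::from_str_radix(field.strip_prefix("0x")?, 16).ok()
}
```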
Free not working and clippy to the rescue
So, after adding a call to free() in my C code, mtrace still figured that the memory was not freed. Running clippy gave me an idea of what I was doing wrong:
warning: calls to `std::mem::drop` with a value that implements Copy. Dropping a copy leaves the original intact.
--> src/vec.rs:11:5
|
11 | drop(ptr);
| ^^^^^^^^^
|
= note: #[warn(drop_copy)] on by default
note: argument has type *mut libc::c_void
--> src/vec.rs:11:10
|
11 | drop(ptr);
| ^^^
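What the warning is getting at can be shown in isolation: dropping a raw pointer discards only the pointer, while dropping the owning Box actually runs the destructor and frees the memory. A minimal sketch with a hypothetical `Tracked` type that counts its drops (the `dropping_copy_types` lint is silenced on purpose):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked {
    _payload: u64,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns how many `Tracked` drops ran after (a) dropping the raw
/// pointer and (b) dropping the Box rebuilt from that pointer.
#[allow(unknown_lints, dropping_copy_types)]
pub fn drop_pointer_vs_box() -> (usize, usize) {
    let start = DROPS.load(Ordering::SeqCst);
    let raw: *mut Tracked = Box::into_raw(Box::new(Tracked { _payload: 7 }));

    // Raw pointers are Copy: this discards a copy of the pointer only.
    drop(raw);
    let after_pointer_drop = DROPS.load(Ordering::SeqCst) - start;

    // Rebuilding the Box restores ownership, so dropping it frees the value.
    unsafe { drop(Box::from_raw(raw)) };
    let after_box_drop = DROPS.load(Ordering::SeqCst) - start;

    (after_pointer_drop, after_box_drop)
}
```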
The C code with the free isn’t very different from what it was before:
mtrace();
void *x = malloc(desired_size);
free(x);
muntrace();
I adjusted the Rust code of free() to this, which made clippy happy:
#[no_mangle]
pub extern "C" fn free(ptr: c_void) {
    drop(ptr);
}
I expected this to work, but unfortunately it didn’t:
sevagh:konfiscator $ ./malloc 512
sevagh:konfiscator $ mtrace malloc mtrace.txt
Memory not freed:
-----------------
Address Size Caller
0x000000000184e6a0 0x200 at 0x7fe9835e1f4b
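In hindsight, drop() on a raw pointer (or on a c_void value) only discards that value and never deallocates the memory behind it. Reading through the Vec docs, I saw the function from_raw_parts(), which seems promising: it rebuilds a Vec from a pointer, length and capacity, so that dropping the Vec releases the buffer.
That’s work for the future. Also, I intend to write brk/sbrk and mmap/munmap implementations.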
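A rough sketch of where the from_raw_parts() idea could go, under some loud assumptions: the functions are deliberately named `sketch_malloc` and `sketch_free` (hypothetical, and not exported as malloc/free), the block size is stashed in a 16-byte header in front of the returned pointer, and alignment is whatever the underlying allocator happens to give a byte buffer, which a real malloc could not rely on. Allocation failure aborts here instead of returning NULL.

```rust
use std::ffi::c_void;
use std::ptr;

/// Bytes reserved in front of every block to remember its total size.
const HEADER: usize = 16;

pub fn sketch_malloc(size: usize) -> *mut c_void {
    let total = match size.checked_add(HEADER) {
        Some(total) => total,
        None => return ptr::null_mut(),
    };
    // A boxed slice guarantees length == capacity, which from_raw_parts
    // needs to get exactly right later.
    let block: Box<[u8]> = vec![0u8; total].into_boxed_slice();
    let base = Box::into_raw(block) as *mut u8;
    unsafe {
        // The byte buffer has no alignment guarantee, so write unaligned.
        ptr::write_unaligned(base as *mut usize, total);
        base.add(HEADER) as *mut c_void
    }
}

/// Safety: `p` must be null or a pointer returned by `sketch_malloc`
/// that has not been freed yet.
pub unsafe fn sketch_free(p: *mut c_void) {
    if p.is_null() {
        return;
    }
    unsafe {
        let base = (p as *mut u8).sub(HEADER);
        let total = ptr::read_unaligned(base as *const usize);
        // Rebuild the Vec so that dropping it releases the whole block.
        drop(Vec::from_raw_parts(base, total, total));
    }
}
```

The header is what lets the free side recover the length and capacity that from_raw_parts() requires, since C only passes the pointer.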
Doing ugly things with build.rs
I’m tired of Makefiles and shell scripts to build stuff. cargo doesn’t support a post-build script. There are some RFCs you can find, and the discussions are fair - cargo isn’t meant to be Yet Another Build Tool.
I used the pre-build build.rs to perform the following steps:
1. Invoke gcc to compile the malloc testbench
2. Export LD_PRELOAD in the calling shell
Step 2 wasn’t easy. I used nix::unistd::getppid() to get the parent pid. Then, I used /proc/{ppid}/status to get the ppid of the ppid - i.e., the parent pid of the parent pid. This works in Linux because the process hierarchy goes like this:
sevagh:konfiscator $ cargo build
shell -> cargo -> build.rs
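The grandparent lookup boils down to reading the PPid field out of /proc/{pid}/status. A minimal sketch using only the standard library, assuming Linux's procfs layout. `parse_ppid` and `parent_of` are hypothetical helper names, not the ones in the repo.

```rust
use std::fs;

/// Extract the PPid field from the text of /proc/<pid>/status.
pub fn parse_ppid(status: &str) -> Option<i32> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("PPid:"))
        .and_then(|rest| rest.trim().parse().ok())
}

/// Look up the parent pid of an arbitrary process via procfs.
pub fn parent_of(pid: i32) -> Option<i32> {
    let status = fs::read_to_string(format!("/proc/{}/status", pid)).ok()?;
    parse_ppid(&status)
}
```

From build.rs, calling `parent_of` on the build script's own parent pid (cargo) would give the pid of the shell.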
Then, to inject an environment variable into a pid that’s not your child, I used gdb:
(gdb) attach <pppid>
(gdb) call putenv ("LD_PRELOAD=./target/debug/libkonfiscator.so")
(gdb) detach
The path of the compiled .so file is either target/debug or target/release. Fortunately, from within build.rs, the environment variable $PROFILE (again, another tip learned from the Rust IRC channel) tells you whether the build is debug or release.
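Picking the path from $PROFILE could look roughly like this. It is a sketch that assumes the default target directory layout and the library name used earlier in this post. `preload_path` and `preload_path_from_env` are hypothetical helpers.

```rust
use std::env;
use std::path::PathBuf;

/// Path of the built shared library for a given profile
/// ("debug" or "release"), assuming the default ./target layout.
pub fn preload_path(profile: &str) -> PathBuf {
    PathBuf::from("./target")
        .join(profile)
        .join("libkonfiscator.so")
}

/// Same, but reading the profile from the PROFILE variable that cargo
/// sets for build scripts. Returns None outside of a build script.
pub fn preload_path_from_env() -> Option<PathBuf> {
    env::var("PROFILE").ok().map(|profile| preload_path(&profile))
}
```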
fn _inject_env_var(pid: i32, k: &str, v: &str) {
    // gdb script: attach to the target pid, call putenv() inside it, then detach
    let gdb_in = format!("attach {}\ncall putenv (\"{}={}\")\ndetach\n", pid, k, v);
    _exec("gdb", Some(&gdb_in)).unwrap();
}
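The _exec helper isn't shown here. As a guess at its shape, a function that runs a program and feeds it a script on stdin could be sketched like this with std::process. `exec_with_stdin` is a hypothetical stand-in, not the actual helper from the repo.

```rust
use std::io::{self, Write};
use std::process::{Command, Output, Stdio};

/// Run `program` with `args`, optionally write `input` to its stdin,
/// then wait for it and collect stdout/stderr.
pub fn exec_with_stdin(program: &str, args: &[&str], input: Option<&str>) -> io::Result<Output> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // Taking stdin out of the child means it is closed (EOF) as soon as
    // this block ends, so the program knows the script is complete.
    if let Some(mut stdin) = child.stdin.take() {
        if let Some(text) = input {
            stdin.write_all(text.as_bytes())?;
        }
    }

    child.wait_with_output()
}
```

The gdb commands built in _inject_env_var would be passed as `input`, with "gdb" as the program.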