gdb 101: tracing a Mesa segfault in Docker, part 2

In the previous part, gdb was giving 'File not found' errors. I'll attempt to acquire those files.

The “file not found” error from last time was easily solved by including the ezquake source files in the final Docker container.

Once that was cleared, I hit the old issue of the segfaults from i965_dri.so:

[Screenshot: gdbgui]

The question marks indicate that debug symbols are still missing, even though I thought I had installed them with the libgl1-mesa-dri-dbgsym_17.2.4-0ubuntu2_amd64.ddeb package.

Time to find out why these are missing:

root@5a2f9c99d982:/ezquake/ezquake-source# nm --debug-sym /usr/lib/x86_64-linux-gnu/dri/i965_dri.so | grep debug
nm: /usr/lib/x86_64-linux-gnu/dri/i965_dri.so: no symbols
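A "no symbols" answer from nm is what a stripped library gives either way: on Ubuntu the debug info ships in a separate file from the -dbgsym package, and gdb finds it by build ID under /usr/lib/debug/.build-id/. Here is a minimal sketch for checking whether that file is actually present. The function names are made up for illustration, and it assumes readelf (from binutils) is installed in the container.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original setup.

# Print the GNU build ID of an ELF file (needs readelf from binutils).
build_id() {
    readelf -n "$1" | sed -n 's/.*Build ID: *//p'
}

# Map a build ID to the path where gdb looks for separate debug info.
debug_file_for_build_id() {
    id=$1
    rest=${id#??}            # everything after the first two hex digits
    prefix=${id%"$rest"}     # the first two hex digits
    printf '/usr/lib/debug/.build-id/%s/%s.debug\n' "$prefix" "$rest"
}

# Report whether the separate debug file for a library is installed.
has_debug_file() {
    path=$(debug_file_for_build_id "$(build_id "$1")")
    if [ -f "$path" ]; then
        echo "debug info found: $path"
    else
        echo "debug info missing: $path"
        return 1
    fi
}
```

Calling has_debug_file /usr/lib/x86_64-linux-gnu/dri/i965_dri.so would then show which .debug file gdb expects and whether it exists.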

As it turns out, the Ubuntu Bionic image I was using has moved on to a newer Mesa, while my pinned copy of the debug symbols was built for the older version:

root@5a2f9c99d982:/blobs# dpkg -i ./libgl1-mesa-dri-dbgsym_17.2.4-0ubuntu2_amd64.ddeb
Selecting previously unselected package libgl1-mesa-dri-dbgsym:amd64.
(Reading database ... 21081 files and directories currently installed.)
Preparing to unpack .../libgl1-mesa-dri-dbgsym_17.2.4-0ubuntu2_amd64.ddeb ...
Unpacking libgl1-mesa-dri-dbgsym:amd64 (17.2.4-0ubuntu2) ...
dpkg: dependency problems prevent configuration of libgl1-mesa-dri-dbgsym:amd64:
 libgl1-mesa-dri-dbgsym:amd64 depends on libgl1-mesa-dri (= 17.2.4-0ubuntu2); however:
  Version of libgl1-mesa-dri:amd64 on system is 18.0.0~rc4-1ubuntu3.
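This kind of mismatch can be caught before dpkg complains. A rough sketch, assuming the .ddeb follows the usual name_version_arch file naming (which leaves out any epoch) and that dpkg-query is available; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for comparing a pinned .ddeb against the installed package.

# Extract the version from a package file name of the form name_version_arch.ext.
pkg_version_from_filename() {
    base=${1##*/}                 # strip any directory part
    base=${base#*_}               # drop "name_"
    printf '%s\n' "${base%_*}"    # drop "_arch.ext"
}

# Compare the .ddeb's version with the version of the installed package.
check_dbgsym_version() {
    ddeb=$1
    pkg=$2
    want=$(pkg_version_from_filename "$ddeb")
    have=$(dpkg-query -W -f='${Version}' "$pkg")
    if [ "$want" = "$have" ]; then
        echo "ok: $pkg $have matches $ddeb"
    else
        echo "mismatch: $pkg is $have, but $ddeb is built for $want"
        return 1
    fi
}
```

Running check_dbgsym_version ./libgl1-mesa-dri-dbgsym_17.2.4-0ubuntu2_amd64.ddeb libgl1-mesa-dri as a build step would make the image build fail early instead of silently leaving gdb without symbols.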

Keep this in mind if you don’t pin your container image or dependencies and then leave a project dormant - shit happens.

There’s a better way of installing dbgsym packages than downloading them from https://launchpad.net/ubuntu/+source/mesa/18.0.0~rc4-1ubuntu3/+build/14406495 and embedding those with git-lfs - however I don’t really want to fuck with apt keys so I’ll go ahead with the old method.
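For reference, the apt route would look roughly like the sketch below. It is not what I'm doing here, and it rests on assumptions: the source-line layout for ddebs.ubuntu.com is how I recall it from the Ubuntu wiki, the ubuntu-dbgsym-keyring package is assumed to provide the signing key, and the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: print apt source lines for Ubuntu's debug-symbol
# archive (ddebs.ubuntu.com) for a given release codename.
ddebs_sources() {
    codename=$1
    for suite in "$codename" "$codename-updates" "$codename-proposed"; do
        printf 'deb http://ddebs.ubuntu.com %s main restricted universe multiverse\n' "$suite"
    done
}

# Assumed usage inside the container (not run here):
#   ddebs_sources bionic > /etc/apt/sources.list.d/ddebs.list
#   apt-get install ubuntu-dbgsym-keyring   # assumed source of the archive key
#   apt-get update && apt-get install libgl1-mesa-dri-dbgsym
```

The appeal is that apt resolves the dbgsym version against whatever libgl1-mesa-dri is installed, so the two can't drift apart the way a pinned .ddeb can.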

With the correct debug symbols installed, we can trace the segfault back to the originating call:

[Screenshot: gdbgui]

Here’s the error:

#0 0x00007fffe4b23b0b in intel_disable_rb_aux_buffer (brw=brw@entry=0x7ffff7f9b040, draw_aux_buffer_disabled=draw_aux_buffer_disabled@entry=0x7fffffffdf30, min_level=min_level@entry=0, num_levels=num_levels@entry=1, usage=0x7fffe4e8f6af "for sampling", tex_mt=, tex_mt=) at ../../../../../../src/mesa/drivers/dri/i965/brw_draw.c:361

I found the source for that file online: https://github.com/intel/external-mesa/blob/master/src/mesa/drivers/dri/i965/brw_draw.c#L342

Let’s see if we can understand the source here. One of the parameters passed is usage, set to "for sampling", which can be traced in the same file: https://github.com/intel/external-mesa/blob/master/src/mesa/drivers/dri/i965/brw_draw.c#L469

I filed a bug report against the repo: https://github.com/intel/external-mesa/issues/73

We’ll see what the maintainers advise. gdbgui gives me plenty of info (asm, memory inspection, etc.), so hopefully I can provide whatever they ask for to turn this into a proper bug report.
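If the maintainers want a plain-text backtrace rather than screenshots, gdb can produce one non-interactively. A small sketch, assuming gdb is installed in the container; the function name and the GDB override variable are my own additions, and the binary path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: run a program under gdb in batch mode and print a
# backtrace once it stops (for example on SIGSEGV).
# GDB can be overridden for a dry run; it defaults to "gdb".
gdb_backtrace() {
    "${GDB:-gdb}" -batch -ex run -ex bt --args "$@"
}

# Assumed usage (the binary path is a placeholder):
#   gdb_backtrace ./ezquake > backtrace.txt 2>&1
# Swapping "bt" for "bt full" would also dump local variables per frame.
```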