Catching SIGSEGV caused by jump to null pointer hangs qemu-arm-static

Bug #1090038 reported by Chris McClelland
Affects: Linaro QEMU
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Host system: Ubuntu 12.04 on x86_64

I'm maintaining a fork of UnitTest++ (https://github.com/makestuff/libutpp), a C/C++ unit-testing framework. It sets up signal-handlers in order to catch tests which crash, so it can report and continue with the next test. This signal handler works fine on all Linux platforms I've tried, including armel Debian running on regular QEMU. But on Linaro QEMU, the handler hangs.

Reproduction with standalone test-case:

# Verify test-case works on host...
$ wget -qO foo.c http://pastebin.com/raw.php?i=FRNwwsHn
$ gcc -Wall -pedantic-errors -std=gnu99 foo.c
$ ./a.out
Hello World
catchCrash(sayHello) returned 0
catchCrash(writeZero) returned 1
catchCrash(readZero) returned 1
catchCrash(jmpZero) returned 1

# Now setup armel chroot...
$ cd /var/qemu
$ apt-get install qemu-user-static
$ apt-get install debootstrap
$ qemu-debootstrap --arch armel wheezy armel http://ftp.de.debian.org/debian
$ mount -t proc proc armel/proc
$ mount -t sysfs sysfs armel/sys
$ mount -o bind /dev armel/dev
$ LC_ALL=C chroot armel

# In the chroot...
$ wget -qO foo.c http://pastebin.com/raw.php?i=FRNwwsHn
$ gcc -Wall -pedantic-errors -std=gnu99 foo.c
$ ./a.out
Hello World
catchCrash(sayHello) returned 0
catchCrash(writeZero) returned 1
catchCrash(readZero) returned 1
<...hangs...>

What's interesting is that a SIGSEGV resulting from simply dereferencing a null pointer works fine, but calling a null function pointer hangs.

You can see a prettified rendering of the code here:

http://pastebin.com/embed_iframe.php?i=FRNwwsHn
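
The pastebin source is not reproduced in this report. As a rough sketch of the pattern it exercises (not the original foo.c), the code below installs a SIGSEGV handler and uses sigsetjmp/siglongjmp to recover from a crashing function. The names catch_crash, say_hello, write_zero, read_zero and jmp_zero are illustrative stand-ins chosen to mirror the output above, and the use of sigsetjmp is an assumption about how such a handler might be written.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Where to resume after a crash; set by sigsetjmp() in catch_crash(). */
static sigjmp_buf recover_point;

/* SIGSEGV handler: abandon the crashing function and jump back. */
static void on_sigsegv(int signum)
{
    (void)signum;
    siglongjmp(recover_point, 1);
}

/* Run fn(); return 0 if it returned normally, 1 if it raised SIGSEGV. */
int catch_crash(void (*fn)(void))
{
    struct sigaction sa, old;
    int crashed;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigsegv;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old);

    /* Second argument 1: save the signal mask so that SIGSEGV is
     * unblocked again after siglongjmp() leaves the handler. */
    if (sigsetjmp(recover_point, 1) == 0) {
        fn();
        crashed = 0;
    } else {
        crashed = 1;
    }

    sigaction(SIGSEGV, &old, NULL);   /* restore the previous handler */
    return crashed;
}

void say_hello(void)
{
    puts("Hello World");
}

/* The pointer variables are volatile so the compiler cannot assume they
 * are null and optimise the faulting access away. */
void write_zero(void)
{
    volatile int *volatile p = NULL;
    *p = 42;                          /* data write to address 0 */
}

void read_zero(void)
{
    volatile int *volatile p = NULL;
    int value = *p;                   /* data read from address 0 */
    (void)value;
}

void jmp_zero(void)
{
    void (*volatile fn)(void) = NULL;
    fn();                             /* instruction fetch from address 0 */
}
```

Run natively on Linux, catch_crash() should return 0 for say_hello and 1 for each of the other three functions, matching the host output above. The first two faulting functions touch address 0 as data, while jmp_zero makes the CPU fetch an instruction from address 0, which is the case that hangs under qemu-arm-static.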

I also tried with a build of qemu-arm-static from what I believe is the latest source:

# Host-side...
$ cd /var/qemu
$ wget -qO- --no-check-certificate https://launchpad.net/qemu-linaro/trunk/2012.09/+download/qemu-linaro-1.2.0-2012.09.tar.gz | tar zxf -
$ cd qemu-linaro-1.2.0-2012.09
$ ./configure --target-list=arm-linux-user --static
$ make
$ cd ..
$ mv armel/usr/bin/qemu-arm-static armel/usr/bin/qemu-arm-static-backup
$ cp qemu-linaro-1.2.0-2012.09/arm-linux-user/qemu-arm armel/usr/bin/qemu-arm-static
$ LC_ALL=C chroot armel

# In the chroot...
$ wget -qO foo.c http://pastebin.com/raw.php?i=FRNwwsHn
$ gcc -Wall -pedantic-errors -std=gnu99 foo.c
$ ./a.out
Hello World
catchCrash(sayHello) returned 0
catchCrash(writeZero) returned 1
catchCrash(readZero) returned 1
<...hangs...>

Any help or suggestions gratefully received.

Chris

Tags: armel
Peter Maydell (pmaydell) wrote :

This happens because linux-user mode doesn't emulate the guest MMU. In system emulation mode (where the test case runs OK) we identify accesses and jumps to 0 because they have no TLB entry and we end up with tlb_fill() calling raise_exception() and generating an emulated guest CPU exception (via a longjmp out to the top level). In linux-user mode we handle accesses to address 0 by catching the host SIGSEGV and identifying it as a fault in generated code which needs to be turned into a guest SIGSEGV. However if the guest jumps to address 0 then the SEGV ends up happening in QEMU proper:

#0 0x000055555560e4db in ldl_p (ptr=0x7ffefc9ca000) at /tmp/qemu-linaro-1.4.0-2013.03/include/qemu/bswap.h:261
#1 0x000055555560e55f in ldl_le_p (ptr=0x7ffefc9ca000) at /tmp/qemu-linaro-1.4.0-2013.03/include/qemu/bswap.h:294
#2 0x000055555560e64d in arm_ldl_code (env=0x55555783aec0, addr=0, do_swap=false) at /tmp/qemu-linaro-1.4.0-2013.03/target-arm/cpu.h:748
#3 0x000055555563623b in disas_arm_insn (env=0x55555783aec0, s=0x7fffffffda90) at /tmp/qemu-linaro-1.4.0-2013.03/target-arm/translate.c:6580
#4 0x000055555563e5cb in gen_intermediate_code_internal (env=0x55555783aec0, tb=0x7ffff39d99e8, search_pc=0)
    at /tmp/qemu-linaro-1.4.0-2013.03/target-arm/translate.c:9890
#5 0x000055555563ea13 in gen_intermediate_code (env=0x55555783aec0, tb=0x7ffff39d99e8) at /tmp/qemu-linaro-1.4.0-2013.03/target-arm/translate.c:10019
#6 0x000055555564dca0 in cpu_arm_gen_code (env=0x55555783aec0, tb=0x7ffff39d99e8, gen_code_size_ptr=0x7fffffffdbd8)
    at /tmp/qemu-linaro-1.4.0-2013.03/translate-all.c:174
#7 0x000055555564f044 in tb_gen_code (env=0x55555783aec0, pc=0, cs_base=0, flags=128, cflags=0) at /tmp/qemu-linaro-1.4.0-2013.03/translate-all.c:968
#8 0x00005555555a0a3b in tb_find_slow (env=0x55555783aec0, pc=0, cs_base=0, flags=128) at /tmp/qemu-linaro-1.4.0-2013.03/cpu-exec.c:125
#9 0x00005555555a0bb2 in tb_find_fast (env=0x55555783aec0) at /tmp/qemu-linaro-1.4.0-2013.03/cpu-exec.c:152
#10 0x00005555555a0fdf in cpu_arm_exec (env=0x55555783aec0) at /tmp/qemu-linaro-1.4.0-2013.03/cpu-exec.c:567
#11 0x00005555555c7fcb in cpu_loop (env=0x55555783aec0) at /tmp/qemu-linaro-1.4.0-2013.03/linux-user/main.c:707
#12 0x00005555555ca2f8 in main (argc=2, argv=0x7fffffffe5e8, envp=0x7fffffffe600) at /tmp/qemu-linaro-1.4.0-2013.03/linux-user/main.c:3984

and linux-user QEMU copes very badly with signals that are caused by bits of its own code (it turns them into signals directed at the guest, which usually means we just hang rather than aborting with a helpful error message).
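
To illustrate the distinction drawn above, here is a minimal sketch of classifying a host fault by where the faulting host program counter lies. This is not QEMU code: struct code_buffer, classify_fault and the enum values are hypothetical names, and the idea of a single contiguous buffer of generated code is a simplifying assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: where a host SIGSEGV originated. */
enum fault_origin {
    FAULT_IN_GENERATED_CODE,  /* guest load/store faulted: deliver guest SIGSEGV */
    FAULT_IN_EMULATOR         /* emulator's own code faulted, e.g. the translator */
};

/* Hypothetical: one contiguous region holding translated guest code. */
struct code_buffer {
    uintptr_t start;
    size_t size;
};

/* True if pc lies inside [start, start + size). */
static bool pc_in_buffer(const struct code_buffer *buf, uintptr_t pc)
{
    return pc >= buf->start && pc - buf->start < buf->size;
}

/* Classify a fault from the host program counter at the time of the signal. */
enum fault_origin classify_fault(const struct code_buffer *buf, uintptr_t host_pc)
{
    return pc_in_buffer(buf, host_pc) ? FAULT_IN_GENERATED_CODE
                                      : FAULT_IN_EMULATOR;
}
```

In these terms, a guest read or write of address 0 faults while generated code is executing, whereas a guest jump to address 0 faults inside the translator (ldl_p in frame #0 of the backtrace) when it tries to read the instruction at address 0, before any code for that address has been generated.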
