Foobared on armel

Bug #537458 reported by Loïc Minier
This bug affects 1 person
Affects            Status         Importance  Assigned to  Milestone
Valgrind           New            Medium
valgrind (Ubuntu)  Fix Released   Medium      Loïc Minier

Bug Description

Binary package hint: valgrind

Hi there

I updated valgrind in lucid with a snapshot which had a lot of new ARM-specific code; I enabled it for ARM when I was at it.

Sadly, it dies on startup even with simple programs on ARM. I suspect it might be due to our default of Thumb-2, but I'm not a valgrind hacker.

Thanks.

Revision history for this message
Loïc Minier (lool) wrote :

root@bee:/# valgrind lib/libc.so.6
qemu: uncaught target signal 4 (Illegal instruction) - core dumped
Illegal instruction (core dumped)

Revision history for this message
Loïc Minier (lool) wrote :

In fact:
valgrind.bin --help
Illegal instruction

Running it in gdb, I get a SIGILL in _start().

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

Valgrind support for ARM is recent, and AFAIK there is no Thumb-2 support yet and it probably doesn't support all of v7.

You could try it on an ARM (i.e., not Thumb) binary; that might work, but it still won't be useful for distributed lucid binaries yet.

I think this is being worked on, but in the meantime maybe we should not offer a binary package for this?

Revision history for this message
Paul Larson (pwlars) wrote :

Even calling valgrind with no arguments gets a SIGILL here.

Changed in valgrind (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Loïc Minier (lool) wrote :

I think removing the binary package makes it harder for someone to fix it for a relatively small win (avoids confusion of armel devs who try it and get SIGILL); I'd prefer if we keep it and ping upstream about it.

Revision history for this message
Loïc Minier (lool) wrote :

So gdb reports the crash at 0x3801eb34 in _start in /usr/lib/valgrind/memcheck-arm-linux; in the objdump -d output, I see that it's on the mvn instruction, see coregrind/m_main.c:
#elif defined(VGP_arm_linux)
asm("\n"
    "\t.align 2\n"
    "\t.global _start\n"
    "_start:\n"
    "\tldr r0, [pc, #36]\n"
    "\tldr r1, [pc, #36]\n"
    "\tadd r0, r1, r0\n"
    "\tldr r1, [pc, #32]\n"
    "\tadd r0, r1, r0\n"
    "\tmvn r1, #15\n"
    "\tand r0, r0, r1\n"
    "\tmov r1, sp\n"
    "\tmov sp, r0\n"
    "\tmov r0, r1\n"
    "\tb _start_in_C_linux\n"
    "\t.word vgPlain_interim_stack\n"
    "\t.word "VG_STRINGIFY(VG_STACK_GUARD_SZB)"\n"
    "\t.word "VG_STRINGIFY(VG_STACK_ACTIVE_SZB)"\n"
);
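
For readers less used to ARM assembly, here is a rough C rendering of what the quoted `_start` sequence appears to compute before it branches to `_start_in_C_linux`. It is only a sketch: the function name `interim_stack_top` and its parameter names are made up for illustration and do not exist in valgrind.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the arithmetic of the quoted _start code:
   the three literals (interim stack address, guard size, active size) are
   added together, then the result is rounded down to a 16-byte boundary.
   "mvn r1, #15" loads the bitwise NOT of 15 (0xFFFFFFF0) to use as the mask. */
uint32_t interim_stack_top(uint32_t stack_base,
                           uint32_t guard_size,
                           uint32_t active_size)
{
    uint32_t top = stack_base + guard_size + active_size;
    return top & ~UINT32_C(15);
}
```

The result becomes the new stack pointer, while the original `sp` is passed on in `r0` as the first argument.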

Apparently, mvn isn't available in T2 mode.

Revision history for this message
Loïc Minier (lool) wrote :

Apparently, Dave Martin worked on an updated patch which covers more combinations of arguments:
http://pastebin.ubuntu.com/395620/

--- armhelper.s 2007-12-12 15:35:44.000000000 +0000
+++ armhelper.s 2010-03-15 11:57:20.000000000 +0000
@@ -8,15 +8,22 @@
  .global privateSnippetExecutor
  .type privateSnippetExecutor, %function
 privateSnippetExecutor:
+ .fnstart @ start of unwinder entry
+
         stmfd sp!, {r0-r3} @ follow other parameters on stack
+ .pad #16 @ throw this data away on exception
  mov r0, ip @ r0 points to functionoffset/vtable
- mov ip, sp @ fix up the ip
- stmfd sp!, {fp,ip,lr,pc} @ 8 x 4 => stack remains 8 aligned
- sub fp, ip, #4 @ set frame pointer
+ mov r1, sp @ r1 points to this and params
+ @ (see cppuno.cxx:codeSnippet())
+ stmfd sp!, {r4,lr} @ save return address
+ @ (r4 pushed to preserve stack alignment)
+ .save {r4,lr} @ restore these regs on exception

- add r1, sp, #16 @ r1 points to this and params
         bl cpp_vtable_call(PLT)

- add sp, sp, #32 @ restore stack
- ldr fp, [sp, #-32] @ restore frame pointer
- ldr pc, [sp, #-24] @ return
+ add sp, sp, #4 @ no need to restore r4 (we didn't touch it)
+ ldr pc, [sp], #20 @ return, discarding function arguments
+
+ .fnend @ end of unwinder entry
+
+ .size privateSnippetExecutor, . - privateSnippetExecutor

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Loïc, did you mean to post to https://bugs.launchpad.net/bugs/417009 ?

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

Re comment #6 above, I think the most likely cause of the crash on the MVN instruction is that this is the first 32-bit instruction in the sequence (and is the first instruction which is not in Thumb-1).

From my current understanding this is probably expected due to the current lack of v7 and Thumb-2 support in valgrind generally.

Code built with the lucid defaults will almost certainly not work with Valgrind right now, since the lucid toolchain will routinely use instructions specific to v7 and/or Thumb-2.

Revision history for this message
Loïc Minier (lool) wrote :

@Dave: hmm I thought MVN wasn't available in Thumb-2? It seems to me this code is the early init code of valgrind; not part of the code which gets parsed by valgrind.

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

The MVN instruction used should be supported--- if not, we would be getting an assembler error at this point when the package is built.

Looking more carefully, there's a more straightforward problem: the offsets in the PC-relative load instructions make assumptions about the instruction size (i.e., ARM). (See https://wiki.ubuntu.com/ARM/Thumb2PortingHowto#Typical%20uses%20-%20loading%20a%20literal%20from%20the%20text%20section for a discussion.)

In Thumb-2, many of the instructions are 2 bytes instead of four, so the ldr instructions are loading junk (maybe even off the end of the text segment).
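
To make the offset problem concrete, the following sketch computes where a PC-relative `ldr` reads from. It assumes the usual rule that PC reads as the instruction address plus 8 in ARM state, and as the instruction address plus 4 rounded down to a multiple of 4 in Thumb state. The function name `pc_relative_target` is hypothetical.

```c
#include <stdint.h>

/* Address read by "ldr rX, [pc, #imm]" for an instruction at insn_addr.
   thumb != 0 selects the Thumb rule, otherwise the ARM rule is used.
   Addresses here are byte offsets from _start. */
uint32_t pc_relative_target(uint32_t insn_addr, uint32_t imm, int thumb)
{
    uint32_t pc = thumb ? ((insn_addr + 4) & ~UINT32_C(3))
                        : insn_addr + 8;
    return pc + imm;
}
```

In the `_start` sequence quoted earlier in this report there are 11 instructions before the literals, so with 4-byte ARM encodings the three `.word` entries sit at offsets 44, 48 and 52. The loads at offsets 0, 4 and 12 with immediates 36, 36 and 32 land exactly on them in ARM state. In Thumb state the PC bias changes and the literals move to wherever the shorter encodings leave them, so the same immediates no longer match.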

Can you try this patch? (There may be other problems too, of course...)

(I'm tempted to move all the \n to the end of the lines, since the code is not very readable as it stands, but in the interest of not fuzzing the patch I've only made the critical changes)

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

(For \n, read \t)

tags: added: patch
Revision history for this message
Loïc Minier (lool) wrote :

Thanks, it all makes sense now! I was also misreading the ARM/Thumb instructions reference card.

Sadly, the patch causes valgrind to FTBFS as follows:
gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../include -I../VEX/pub -DVGA_arm=1 -DVGO_linux=1 -DVGP_arm_linux=1 -I../coregrind -DVG_LIBDIR="\"/home/ubuntu/valgrind/valgrind-3.6.0~svn20100212/prefix/lib/valgrind"\" -DVG_PLATFORM="\"arm-linux\"" -O2 -g -Wall -Wmissing-prototypes -Wshadow -Wpointer-arith -Wstrict-prototypes -Wmissing-declarations -Wno-format-zero-length -fno-strict-aliasing -Wno-long-long -fno-stack-protector -Wno-pointer-sign -fno-stack-protector -MT libcoregrind_arm_linux_a-m_main.o -MD -MP -MF .deps/libcoregrind_arm_linux_a-m_main.Tpo -c -o libcoregrind_arm_linux_a-m_main.o `test -f 'm_main.c' || echo './'`m_main.c
/tmp/ccNzAyWv.s: Assembler messages:
/tmp/ccNzAyWv.s:29: Error: offset out of range
/tmp/ccNzAyWv.s:30: Error: offset out of range
/tmp/ccNzAyWv.s:32: Error: offset out of range
make[3]: *** [libcoregrind_arm_linux_a-m_main.o] Error 1
make[3]: Leaving directory `/home/ubuntu/valgrind/valgrind-3.6.0~svn20100212/coregrind'

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

The previous patch ftbfs for me due to the assembler putting the data literals too far away.

This one solves the problem by making the location of the literal pool explicit using an .ltorg directive.

Revision history for this message
Loïc Minier (lool) wrote :

Nice work, it builds with the .ltorg at the end of the asm section

Revision history for this message
Loïc Minier (lool) wrote :

I still get a SIGILL at the same location with the patch

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

Ah, it looks like _start is the real program entry point, assigned in valt_load_address_arm_linux.lds (probably).

Like any other assembler symbol, the entry point needs to be properly tagged as a function symbol in order for the process to start executing in Thumb. Currently, it looks like the code at _start is being interpreted as ARM code, causing a SIGILL.

$ gdb ./valgrind
[...]
Reading symbols from /home/ubuntu/src/review/valgrind/valgrind-3.6.0~svn20100212/tmp-prefix/usr/bin/valgrind...done.
(gdb) r
Starting program: /home/ubuntu/src/review/valgrind/valgrind-3.6.0~svn20100212/tmp-prefix/usr/bin/valgrind
Executing new program: /usr/lib/valgrind/memcheck-arm-linux
valgrind: no program specified
valgrind: Use --help for more information.

Program exited with code 01.
(gdb) q
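
The point about tagging `_start` as a function symbol can be sketched as follows. On ARM ELF, as far as I understand the convention, a Thumb function symbol carries bit 0 set in its address, and execution starts in Thumb state only when the entry address has that bit set. The helper names below are hypothetical; the example address is the one gdb reported for `_start` earlier in this report.

```c
#include <stdint.h>

/* Non-zero if an entry address asks for Thumb state (bit 0 set). */
int entry_is_thumb(uint32_t entry_addr)
{
    return (int)(entry_addr & 1u);
}

/* Address instructions are actually fetched from (bit 0 cleared). */
uint32_t entry_fetch_addr(uint32_t entry_addr)
{
    return entry_addr & ~UINT32_C(1);
}
```

Under that assumption, an untyped `_start` at 0x3801eb34 is entered in ARM state and its Thumb encodings are misread, whereas a properly typed one would give an entry address of 0x3801eb35 and be fetched from the same location in Thumb state.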

Revision history for this message
Loïc Minier (lool) wrote :

Wee \o/ latest patch finally gets rid of the first SIGILL; now I can at least run valgrind with no args, and it won't die on startup.

The next thing which occurs is a SIGILL in vgModuleLocal_call_on_new_stack_0_1(); might be another case of thumb versus ARM mode.

Revision history for this message
Loïc Minier (lool) wrote :

After patching vgModuleLocal_call_on_new_stack_0_1() to have a .type %function too, I'm hitting:
disInstr(arm): unhandled instruction 0xDFA068F8
                 cond=13(0xD) 27:20=250(0xFA) 4:4=1 3:0=8(0x8)

vex: priv/guest_arm_toIR.c:4733 (disInstr_ARM_WRK): Assertion `0 == (guest_R15_curr_instr & 3)' failed.
...

Looks like this needs a much larger porting effort at this point.
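
The fields printed in the unhandled-instruction diagnostic can be reproduced from the raw word, and the failed assertion amounts to a 4-byte alignment check on the guest PC. This is a minimal sketch; the `ArmFields` type and both function names are invented for illustration and are not VEX code.

```c
#include <stdint.h>

/* Hypothetical container for the bit fields shown in the diagnostic. */
typedef struct {
    unsigned cond;       /* bits 31:28 */
    unsigned bits27_20;  /* bits 27:20 */
    unsigned bit4;       /* bit 4 */
    unsigned bits3_0;    /* bits 3:0 */
} ArmFields;

/* Split a 32-bit word the way an ARM-state decoder would look at it. */
ArmFields arm_fields(uint32_t insn)
{
    ArmFields f;
    f.cond      = (insn >> 28) & 0xFu;
    f.bits27_20 = (insn >> 20) & 0xFFu;
    f.bit4      = (insn >> 4) & 0x1u;
    f.bits3_0   = insn & 0xFu;
    return f;
}

/* ARM-state instructions are 4 bytes long and 4-byte aligned. */
int arm_pc_is_aligned(uint32_t pc)
{
    return (pc & 3u) == 0;
}
```

For 0xDFA068F8 this gives cond=13, 27:20=250, 4:4=1 and 3:0=8, matching the log. A Thumb instruction address only needs 2-byte alignment, so a decoder that assumes ARM state can trip the alignment assertion as soon as it is pointed at Thumb code.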

Revision history for this message
Loïc Minier (lool) wrote :

Just for the sake of it, I tried to run a dummy int main(void) program built statically and in ARM mode, and that fails in a similar way; I guess some libgcc code gets copied and uses Thumb.

Revision history for this message
Matthias Klose (doko) wrote : Re: [Bug 537458] Re: Foobared on armel

On 17.03.2010 15:27, Loïc Minier wrote:
> Just for the sake of it, I tried to run a dummy int main(void) program
> built statically and in ARM mode, and that fails in a similar way; I
> guess some libgcc code gets copied and uses thumb.

you could try to work around this by building with gcc-4.3 (the static libgcc
provided by this GCC is still built in arm mode). But still libc6-dev's crt*.o
files are built in thumb mode.

Changed in valgrind:
status: Unknown → New
Revision history for this message
Jacob Bramley (jacob-bramley) wrote :

> Just for the sake of it, I tried to run a dummy int main(void) program
> built statically and in ARM mode, and that fails in a similar way; I
> guess some libgcc code gets copied and uses thumb.

Try the attached assembly program; it worked for me in an ARMv6 environment where the CRT code included some ARMv6 instructions. Build it with "-nostdlib".

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

Agreed, the complete library stack and toolchain libraries must be ARM for this to work. You could try using this valgrind build to debug Jaunty binaries in a Jaunty chroot--- that ought to work. Karmic might not; I'm not sure whether v6 and/or VFP are fully supported.

@Loïc, can you post your updated patch?

Cheers

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Loïc

Never mind, I see your patch on the upstream bug now. http://bugs.kde.org/show_bug.cgi?id=231108

Revision history for this message
Loïc Minier (lool) wrote :

Building the minimal.s which Jacob attached with:
as -o minimal.o minimal.s
gcc -nostdlib -o minimal minimal.o
and running it with a patched valgrind worked fine!

Loïc Minier (lool)
Changed in valgrind (Ubuntu):
assignee: nobody → Loïc Minier (lool)
status: Confirmed → In Progress
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package valgrind - 1:3.6.0~svn20100212-0ubuntu4

---------------
valgrind (1:3.6.0~svn20100212-0ubuntu4) lucid; urgency=low

  * New dpatch, 60_thumb-sigill-fixes, from upstream KDE #231108, fixes
    SIGILLs on startup when build in Thumb mode. This allows valgrind-ing
    simple ARM-mode binaries, but supporting Thumb-mode binaries requires much
    larger changes; LP: #537458.
 -- Loic Minier <email address hidden> Fri, 26 Mar 2010 14:47:17 +0100

Changed in valgrind (Ubuntu):
status: In Progress → Fix Released
Changed in valgrind:
importance: Unknown → Medium