openoffice.org FTBFS on armel

Bug #555977 reported by Steve Langasek
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
openoffice.org (Ubuntu) | Fix Released | High | Michael Casadevall |
Lucid | Fix Released | High | Michael Casadevall |

Bug Description

Binary package hint: openoffice.org

The last two attempts to build OOo 1:3.2.0-4ubuntu3 on armel have failed after just a couple of hours, so we seem to have a problem here:

Compiling: i18npool/unxlngr/misc/localedata_others_version.c
: && LD_LIBRARY_PATH=${LD_LIBRARY_PATH+${LD_LIBRARY_PATH}:}/build/buildd/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib ../../../unxlngr.pro/bin/saxparser en_AU en_AU.xml ../../../unxlngr.pro/misc/localedata_en_AU.cxx ../../../unxlngr.pro/bin/localedata_en_AU.rdb /build/buildd/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/bin/types.rdb
/bin/bash: line 1: 24891 Segmentation fault LD_LIBRARY_PATH=${LD_LIBRARY_PATH+${LD_LIBRARY_PATH}:}/build/buildd/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib ../../../unxlngr.pro/bin/saxparser en_AU en_AU.xml ../../../unxlngr.pro/misc/localedata_en_AU.cxx ../../../unxlngr.pro/bin/localedata_en_AU.rdb /build/buildd/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/bin/types.rdb
dmake: Error code 139, while making '../../../unxlngr.pro/misc/localedata_en_AU.cxx'

Full build log at <http://launchpadlibrarian.net/43132519/buildlog_ubuntu-lucid-armel.openoffice.org_1%3A3.2.0-4ubuntu3_FAILEDTOBUILD.txt.gz>.
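
The "Error code 139" that dmake reports in the log above matches the "Segmentation fault" line from bash: by shell convention, a status above 128 means the child was killed by signal (status - 128). A minimal sketch of that decoding follows; the helper name `decode_exit_status` is hypothetical, and signal numbers are platform dependent.

```python
import signal


def decode_exit_status(status: int) -> str:
    """Interpret a shell-style exit status such as the one dmake printed."""
    if status > 128:
        # Shell convention: 128 + N means "terminated by signal N".
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited normally with status {status}"


print(decode_exit_status(139))  # on Linux: killed by SIGSEGV (signal 11)
```

This is only a convention: a program can also call exit() with a value above 128 on its own, so the bash message in the log is the stronger evidence.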

Tags: armel
Steve Langasek (vorlon)
Changed in openoffice.org (Ubuntu Lucid):
status: New → Triaged
importance: Undecided → High
milestone: none → ubuntu-10.04-beta-2
assignee: nobody → Michael Casadevall (mcasadevall)
Chris Cheney (ccheney) wrote :

Also note that it doesn't look like anything relevant changed between the version that last built properly on arm and the subsequent ones that did not build. I may of course be wrong, but this seems like a buildd issue of some sort.

Michael Casadevall (mcasadevall) wrote :

I was able to reproduce this build failure on jocote after two hours of building. I want to retry on a Dove-based board in case the failure is actually a problem common to the entire imx51 family, but this looks like a legit bug; it appears that saxparser, rather than bash, is segfaulting, but I still don't have a clear reason why this may be ...

Chris Cheney (ccheney) wrote :

Adding link to patch of what changed between the working and failing build of OOo.

http://launchpadlibrarian.net/42029557/openoffice.org_1%3A3.2.0-4ubuntu1_1%3A3.2.0-4ubuntu2.diff.gz

It built properly on jambul, but with the diff above then failed on jambul and huito (maybe others as it was retried several times).

Steve Langasek (vorlon) wrote :

Here's what gdb gives for a backtrace:

(gdb) bt
#0 0x401d5596 in calculate ()
   from /home/vorlon/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib/libuno_cppu.so.3
#1 0x401d2da4 in ?? ()
   from /home/vorlon/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib/libuno_cppu.so.3
#2 0x401d2da4 in ?? ()
   from /home/vorlon/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib/libuno_cppu.so.3
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Here's a 'disassemble' snippet:

0x401d5586 <calculate+250>: blx 0x401d5050
0x401d558a <calculate+254>: str r0, [r7, #12]
0x401d558c <calculate+256>: ldr r2, [r7, #12]
0x401d558e <calculate+258>: ldr r1, [r7, #16]
0x401d5590 <calculate+260>: adds.w r3, r2, r1, lsl #4
0x401d5594 <calculate+264>: itttt ne
0x401d5596 <calculate+266>: strne r5, [r3, #12]
0x401d5598 <calculate+268>: strne r4, [r3, #8]
0x401d559a <calculate+270>: strne.w r8, [r3, #4]
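
For readers less familiar with Thumb-2, here is a rough model of what the instructions around the faulting address do. `adds.w r3, r2, r1, lsl #4` computes r3 = r2 + (r1 << 4) and sets the flags, and `itttt ne` makes the next four instructions execute only if that result is non-zero. The snippet shows three of those four stores; the fourth is cut off. The register values at the time of the crash are not in this report, so the inputs below are purely hypothetical.

```python
from typing import List

MASK32 = 0xFFFFFFFF  # registers are 32 bits wide on armel


def conditional_stores(base: int, index: int) -> List[int]:
    """Addresses written by the conditional stores visible in the snippet.

    base models r2 and index models r1. Returns an empty list when the sum
    is zero, because the "ne" condition then skips every store.
    """
    r3 = (base + (index << 4)) & MASK32  # adds.w r3, r2, r1, lsl #4
    if r3 == 0:                          # Z flag set, so "ne" is false
        return []
    # strne r5, [r3, #12]; strne r4, [r3, #8]; strne.w r8, [r3, #4]
    return [(r3 + offset) & MASK32 for offset in (12, 8, 4)]


# Hypothetical values: a sane base pointer and element index.
print([hex(a) for a in conditional_stores(0x1000, 2)])
# Hypothetical values: a null base with a non-zero index still passes the check.
print([hex(a) for a in conditional_stores(0x0, 1)])
```

The second call illustrates one way such code could fault at the first `strne`: the zero check only guards against the final sum being zero, not against a bad r2 or r1. Whether that is what happened here is not established by the backtrace.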

Steve Langasek (vorlon)
Changed in openoffice.org (Ubuntu Lucid):
milestone: ubuntu-10.04-beta-2 → ubuntu-10.04
Michael Casadevall (mcasadevall) wrote :

I did a test build of the latest archive version on a Dove board just to completely rule out the hardware. It segfaulted, so this isn't a hardware issue. I also ran a parallel build of ubuntu1 on an imx51 build, which also failed; the change that broke OOo appears to be toolchain related.

Alexander Sack (asac)
tags: added: armel
Michael Casadevall (mcasadevall) wrote :

With a patched gdb, I managed to get a backtrace, but it's of limited use; the stack seems fairly corrupted:

mcasadevall@dawn:~/src/ooo/current/openoffice.org-3.2.0/build/i18npool/source/localedata/data$ LD_LIBRARY_PATH=${LD_LIBRARY_PATH+${LD_LIBRARY_PATH}:}/home/mcasadevall/src/ooo/current/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib gdb ../../../unxlngr.pro/bin/saxparser
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/mcasadevall/src/ooo/current/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/i18npool/unxlngr.pro/bin/saxparser...done.
<ffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/bin/types.rdb
Starting program: /home/mcasadevall/src/ooo/current/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/i18npool/unxlngr.pro/bin/saxparser en_AU en_AU.xml ../../../unxlngr.pro/misc/localedata_en_AU.cxx ../../../unxlngr.pro/bin/localedata_en_AU.rdb /home/mcasadevall/src/ooo/current/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/bin/types.rdb
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x2ac80596 in calculate ()
   from /home/mcasadevall/src/ooo/current/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib/libuno_cppu.so.3
(gdb) bt
#0 0x2ac80596 in calculate ()
   from /home/mcasadevall/src/ooo/current/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib/libuno_cppu.so.3
#1 0x2ac7dda4 in ?? ()
   from /home/mcasadevall/src/ooo/current/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib/libuno_cppu.so.3
#2 0x2ac7dda4 in ?? ()
   from /home/mcasadevall/src/ooo/current/openoffice.org-3.2.0/ooo-build-3-2-0-7/build/OOO320_m12/solver/320/unxlngr.pro/lib/libuno_cppu.so.3

Michael Casadevall (mcasadevall) wrote :

In further test builds, building with -marm makes the segfault in i18npool disappear (and presumably would let the build finish). Given our proximity to final freeze, I think this is an acceptable workaround, and then we can properly solve it for 10.10. Any objections?
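
For context on what -marm changes: the 'disassemble' snippet posted earlier in this report is Thumb-2 code, which mixes 2-byte and 4-byte encodings, whereas ARM-mode instructions are all 4 bytes. A small sketch, using only the addresses and mnemonics quoted in that snippet, derives each instruction's size from the gap to the next address. The function name `instruction_sizes` is made up for illustration.

```python
# (address, mnemonic) pairs copied from the 'disassemble' snippet in this report.
DISASSEMBLY = [
    (0x401D5586, "blx"),
    (0x401D558A, "str"),
    (0x401D558C, "ldr"),
    (0x401D558E, "ldr"),
    (0x401D5590, "adds.w"),
    (0x401D5594, "itttt"),
    (0x401D5596, "strne"),
    (0x401D5598, "strne"),
    (0x401D559A, "strne.w"),
]


def instruction_sizes(listing):
    """Size of each instruction, taken from the distance to the next address."""
    return [
        (mnemonic, next_addr - addr)
        for (addr, mnemonic), (next_addr, _) in zip(listing, listing[1:])
    ]


sizes = instruction_sizes(DISASSEMBLY)
for mnemonic, size in sizes:
    print(f"{mnemonic:8s} {size} bytes")
```

The last entry has no successor in the snippet, so its size is not computed. The mix of 2-byte and 4-byte gaps, together with the `itttt` instruction (IT blocks exist only in Thumb), is consistent with the library having been built as Thumb-2.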

Dave Martin (dave-martin-arm) wrote :

gdb may be failing to produce a backtrace due to missing debug symbols; make sure you have the debug symbols for all of OOo's dependencies installed.

Given the amount of time required to build OOo, I suggest going with the ARM build if we have no clue how to fix the bug.

Can we build just the affected objects with -marm?

Matthias Klose (doko) wrote :

Building some subprojects with other optimization options is possible; see ooo-build/patches/dev300/ubuntu-arm-thumb.diff for how this can be done.

Matthias Klose (doko) wrote :

bridges/source/cpp_uno/{shared,gcc3_linux_arm}/makefile.mk would be candidates for

+.IF "$(CPUNAME)"=="ARM"
+CFLAGS += -marm
+.ENDIF

Steve Langasek (vorlon) wrote :

> gdb may be failing to produce a backtrace due to missing debug symbols---
> make sure you have the debug symbols for all of OOo's dependencies installed.

Doesn't seem to be the issue here, first because the stack points to an address within an object belonging to the OOo source, and second because of this error message:

> Backtrace stopped: previous frame identical to this frame (corrupt stack?)

A gdb bug, maybe; not something that's solved by having more symbols available.

Dave Martin (dave-martin-arm) wrote :

You may be right, but I recommend having all the debug symbols installed anyway. Not having them is asking for trouble right now (though work is ongoing in gdb to improve backtracing).

@Michael, is there a way I can get your debug binaries and a core dump?

Michael Casadevall (mcasadevall) wrote :

@Dave Martin, no core dump is recorded. I posted the backtrace available, but the debugger is choking on something. The problem is NOT with UNO (at least directly); it seems only i18ntools in the source is affected. I don't want to selectively change optimizations, because that may introduce subtle bugs between OOo modules. Let's just build the whole thing with -marm, and fix thumb2 building for 10.10.

Dave Martin (dave-martin-arm) wrote :

The reasoning behind my suggestion is that OOo is vast - probably the biggest single app we're dealing with, so we really want to keep Thumb-2 here.

However, I agree with you in the short term: we want to fix the real problem rather than chase random non-issues caused by a nonstandard build configuration. I agree we should go with -marm for now and focus on fixing the problem in time for lucid-rc.

For tracking problems of this type, especially when the debugger itself has some issues, it would be valuable to reconfigure the buildds to save core dumps for process crashes occurring during builds. Do you know who we can discuss that with?
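
As a rough illustration of the idea, a build step can be launched with the core-file size limit raised so that a crash like the saxparser one has a chance of leaving a core file behind. This is only a sketch and not how the buildds are actually configured; the function names are hypothetical, and whether and where a core file gets written also depends on the hard limit and on the kernel's core_pattern setting.

```python
import resource
import signal
import subprocess
import sys


def allow_core_dumps():
    """Raise the soft core-file size limit to the hard limit (runs in the child)."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def run_build_step(argv):
    """Run one build command with core dumps allowed and describe how it ended."""
    proc = subprocess.run(argv, preexec_fn=allow_core_dumps)
    if proc.returncode < 0:
        # subprocess reports death by signal N as a return code of -N.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with status {proc.returncode}"


# Harmless stand-in for a real build command.
print(run_build_step([sys.executable, "-c", "pass"]))
```

In a real build the argv would be the failing command line from the log, run with the same LD_LIBRARY_PATH. If the hard limit is 0, the limit cannot be raised this way without privileges, which may be one reason no core dump shows up.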

In the meantime, can you fire off a manual build, or is this one of those problems that only happens on the buildds?

Michael Casadevall (mcasadevall) wrote :

@dmat: no, it happens on all hardware (Dove, imx51), and I get no core dump on my system when the segfault happens. We can SRU OOo to remove the -marm option when we finally fix it properly, since it's a kludge to get OOo ready for release. We're about to reach final freeze, so I don't think we're going to have another shot at fixing this now. I've passed a patch to ccheney to force -marm, and I'm retesting it now locally. If it works, that will be uploaded later today.

Chris Cheney (ccheney) wrote :

The patch above that Michael referred to appeared to not work last night. I will likely end up having to upload my new version of OOo (1:3.2.0-6ubuntu1) with the patch he couldn't get to work and see if it has any better luck on the buildds as Final Freeze is tomorrow.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package openoffice.org - 1:3.2.0-7ubuntu1

---------------
openoffice.org (1:3.2.0-7ubuntu1) lucid; urgency=low

  * Resynchronise with Debian (r1940). Remaining changes:
    - Add Launchpad integration support.
    - Add Launchpad translations support.
    - Add package openoffice.org-style-human.
    - Add some Ubuntu-specific bitmaps. Adjust broffice diversions for these.
    - Add support for compressing debs with lzma.
    - Add support for shared /usr/share/doc directories.
    - Add support to build l10n as a separate source.
    - Add Xb-Npp-xxx tags according to "firefox distro add-on suport" spec.
    - openoffice.org-help switch to internal copy of lucene.
    - Disable check_for_running_ooo as it appears to not be needed anymore.
    - Disable gnome-vfs support since it is buggy.
    - Disable URE debconf warning.
    - Switch desktop files from %U to %F for gvfs fuse.
  * Disable lpi bug reporting for lucid per blueprint.
  * Update COMMON_DOCDIR to further reduce duplication.
  * Update Human icon theme.
  * Update Oracle logo on OOo splash screen.
  * Update Latvian translation. Closes LP: #555716
  * Correct Latvian locale settings. Closes LP: #555276
  * Work around apt upgrade bug for openoffice.org-evolution.
    Closes: LP: #556348
  * Force OOo to build for ARM instead of thumb2 to work around a segfault.
    Closes LP: #555977
 -- Chris Cheney <email address hidden> Wed, 14 Apr 2010 12:00:00 -0500

Changed in openoffice.org (Ubuntu Lucid):
status: Triaged → Fix Released
Michael Casadevall (mcasadevall) wrote :

Looking at the buildd logs, we're now 8 hours into 48-72 hours of build time, and well past the initial point of breakage with OOo. It seems we've worked around the problem.
