fails to build recipe with "bzr: out of memory"

Bug #746822 reported by Martin Pool
This bug affects 9 people
Affects           Status        Importance  Assigned to  Milestone
Bazaar            Fix Released  High        Unassigned
Launchpad itself  Fix Released  Critical    Unassigned

Bug Description

https://launchpadlibrarian.net/67824936/buildlog.txt.gz

> Retrieving 'lp:maria' to put at '/home/buildd/build-e0819dac0d749b3228f2dfbb915ad4c43c9a16fe/chroot-autobuild/home/buildd/work/tree/recipe-{debupstream}-0~{revno}'.
> bzr: out of memory

This is like a repeat of bug 681582, but I guess an additional fix is needed beyond what has been done there.

Martin Pool (mbp)
Changed in bzr:
status: New → Confirmed
importance: Undecided → High
Ian Booth (wallyworld)
Changed in launchpad:
importance: Undecided → High
status: New → Triaged
Martin Pool (mbp)
tags: added: memory performance
Steve Magoun (smagoun)
tags: added: oem-services
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

Some other things that we can do to improve the memory usage:

 * deploy a newer version of bzr which uses less memory (this is in the pipeline)
 * increase the ulimits for the recipe build jobs on the buildds; if I remember correctly they were set pretty low because of responsiveness issues with the buildds that should be less of an issue now the buildd-manager has moved to twisted

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 746822] Re: fails to build recipe with "bzr: out of memory"

On Wed, Jun 8, 2011 at 9:07 PM, Jelmer Vernooij
<email address hidden> wrote:
> Some other things that we can do to improve the memory usage:
>
>  * deploy a newer version of bzr which uses less memory (this is in the pipeline)

+1

>  * increase the ulimits for the recipe build jobs on the buildds; if I remember correctly they were set pretty low because of responsiveness issues with the buildds that should be less of an issue now the buildd-manager has moved to twisted

The buildd manager was always twistd. The buildds need to be
responsive or the manager will consider them dead.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

On 08/06/11 10:26, Robert Collins wrote:
> On Wed, Jun 8, 2011 at 9:07 PM, Jelmer Vernooij
> <email address hidden> wrote:
>> Some other things that we can do to improve the memory usage:
>>
>> * deploy a newer version of bzr which uses less memory (this is in the pipeline)
> +1
I'll have a look at this - bzr-builder also needs to be updated, might
as well do both.

>> * increase the ulimits for the recipe build jobs on the buildds; if I
>> remember correctly they were set pretty low because of responsiveness
>> issues with the buildds that should be less of an issue now the
>> buildd-manager has moved to twisted
>
> The buildd manager was always twistd. The buildds need to be
> responsive or the manager will consider them dead.
Sorry, I mean until the buildd manager was made properly twisted by
bigjools and jml last year.

I recall there were issues with builders going AWOL when they were
building large recipe builds, and taking a really long time (> 30
seconds or something) to respond to a simple "are you alive" query. The
strict ulimits were put in place specifically to try to mitigate that,
but the issue later disappeared because of another fix, roughly around
the same time the buildd manager was made fully twisted. IMBW.

I don't see any ulimits in place (in the lib/canonical/buildd scripts)
for non-recipe builds; I would expect the limits to be the same for both
regular and recipe builds.

Cheers,

Jelmer

Revision history for this message
Scott Ritchie (scottritchie) wrote :

Any updates? My daily recipes started failing 2 weeks ago (they were succeeding before then), so I seem to have crossed the threshold (or the available memory on the builders decreased).

This is still quite annoying.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

We're still waiting for a deployment of a newer bzr-builder and bzr to the builders.

Revision history for this message
Martin Pool (mbp) wrote :

Is there an RT or some similar handle for the deployment?

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

This is RT #46345

Revision history for this message
Martin Pool (mbp) wrote :

I've asked for the RT to be completed.

Revision history for this message
Scott Ritchie (scottritchie) wrote :

This is your 3 week nag reminder politely asking this be completed as it's still blocking my work :)

Revision history for this message
Martin Pool (mbp) wrote :

Thanks for the reminder, Scott. I'm sorry it is taking so long.

This is still waiting on the RT, which is apparently waiting on Jelmer reworking the packages to not require an updated quilt. Jelmer, if you can push that I will try to get IS to actually install them.

Revision history for this message
Martin Pool (mbp) wrote :

We think this is now fixed, as a more efficient version of bzr has been rolled out to the buildds. I have retried some of the builds that were previously reported to have failed and they either passed or failed for non-bzr-related reasons during the actual build.

(Retrospectively critical as a stakeholder escalation.)

Please let us know if this works well or not on your jobs.

Changed in launchpad:
importance: High → Critical
status: Triaged → Fix Released
Changed in bzr:
status: Confirmed → Fix Released
tags: added: affects-linaro
Revision history for this message
David Allwicher (aber) wrote :
Revision history for this message
Philip Muškovac (yofel) wrote :
Changed in launchpad:
status: Fix Released → Triaged
Revision history for this message
Francis J. Lacoste (flacoste) wrote :

Reopening in Launchpad; this should probably be reopened in bzr too. It's failing on at least 4 different builders (actinium, dubnium, lemon, uranium). Not sure if it works on any.

I'll investigate the RAM available on these builders.

Revision history for this message
Martin Pool (mbp) wrote :

There's some suggestion it's not actually deployed on all buildds,
which I'm investigating. Otherwise we'll have to see about
reproducing this within a limited amount of memory locally and
investigating more.

Changed in bzr:
status: Fix Released → Confirmed
Revision history for this message
Martin Pool (mbp) wrote :

It looks like the deployment of this was buggy (causing bug 884516) and the setup on qastaging was not reproduced on lpnet. So, the deployment to lpnet was inconsistent across machines and also inconsistent with what was done on qas.

We have gone back to bzr 2.4.0 on lpnet, as used last week.

We will try again to deploy the bzr* updates on qastaging, test it properly, then again deploy to lpnet.

See also bug 693524, which is a different out-of-memory error, apparently not related to bzr.

Revision history for this message
Martin Pool (mbp) wrote :

I've tested 'bzr build' on the projectneon recipes locally in a 1GB ulimit, and it worked, so I think this is ok in bzr and just needs to be actually rolled out.

Changed in bzr:
status: Confirmed → Fix Released
Changed in launchpad:
status: Triaged → In Progress
Revision history for this message
Martin Pool (mbp) wrote : Fwd: [Bug 746822] Re: fails to build recipe with "bzr: out of memory"

local test:

=time -v bzr2.4 build project-neon-kdesdk.recipe ./projectneon-build-2 -v
Building tree.
Retrieving 'lp:kdesdk' to put at './projectneon-build-2'.
Retrieving 'lp:~neon/project-neon/kdesdk-ubuntu' to put at
'./projectneon-build-2/debian'.
       Command being timed: "bzr2.4 build project-neon-kdesdk.recipe
./projectneon-build-2 -v"
       User time (seconds): 0.51
       System time (seconds): 0.27
       Percent of CPU this job got: 1%
       Elapsed (wall clock) time (h:mm:ss or m:ss): 0:53.15
       Average shared text size (kbytes): 0
       Average unshared data size (kbytes): 0
       Average stack size (kbytes): 0
       Average total size (kbytes): 0
       Maximum resident set size (kbytes): 164800
       Average resident set size (kbytes): 0
       Major (requiring I/O) page faults: 1
       Minor (reclaiming a frame) page faults: 26367
       Voluntary context switches: 103
       Involuntary context switches: 97
       Swaps: 0
       File system inputs: 56
       File system outputs: 6448
       Socket messages sent: 0
       Socket messages received: 0
       Signals delivered: 0
       Page size (bytes): 4096
       Exit status: 0

so, the maximum resident memory was 164MB and we should be safe within
the 1GB limit for buildds.
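
The comparison above (peak resident size against the buildd limit) can be sketched in a few lines of Python. This is only an illustration, not part of bzr or Launchpad: the function names are hypothetical, the sample text is the relevant line copied from the `time -v` output above, and the 1GB limit is assumed to mean 1 GiB. It also assumes the limit is compared against resident size; if the buildd ulimit restricts virtual address space instead, a process can hit it before its resident size gets that high.

```python
import re

# Assumption: the "1GB" buildd limit is treated as 1 GiB, expressed in kbytes.
ONE_GB_KBYTES = 1024 * 1024

# Relevant lines copied from the `time -v` output in the comment above.
SAMPLE = """\
       Maximum resident set size (kbytes): 164800
       Exit status: 0
"""


def max_rss_kbytes(time_output: str) -> int:
    """Return the 'Maximum resident set size (kbytes)' value from `time -v` output."""
    match = re.search(r"Maximum resident set size \(kbytes\):\s*(\d+)", time_output)
    if match is None:
        raise ValueError("no 'Maximum resident set size' line found")
    return int(match.group(1))


def fits_within(time_output: str, limit_kbytes: int) -> bool:
    """True if the reported peak resident size is at or below the limit."""
    return max_rss_kbytes(time_output) <= limit_kbytes


if __name__ == "__main__":
    print(max_rss_kbytes(SAMPLE), fits_within(SAMPLE, ONE_GB_KBYTES))
```

Feeding the same function a captured `time -v` log from any other recipe build gives a quick yes/no on whether that build stayed under the limit.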

Revision history for this message
Marcin Juszkiewicz (hrw) wrote :

15:56 hrw@puchatek:ci$ /usr/bin/time -v bzr build recipes/gcc-linaro.bzr build/gcc-linaro -v
Building tree.
Retrieving '/home/hrw/devel/canonical/2011-oneiric/ci/bazary/packaging-gcc-linaro/' to put at 'build/gcc-linaro'.
Retrieving '/home/hrw/devel/canonical/2011-oneiric/ci/bazary/gcc-linaro/' to put at 'build/gcc-linaro/src'.
        Command being timed: "bzr build recipes/gcc-linaro.bzr build/gcc-linaro -v"
        User time (seconds): 47.65
        System time (seconds): 9.59
        Percent of CPU this job got: 18%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 5:16.39
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 2893248
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 25
        Minor (reclaiming a frame) page faults: 275194
        Voluntary context switches: 19156
        Involuntary context switches: 2809
        Swaps: 0
        File system inputs: 1590280
        File system outputs: 320
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

hrw@puchatek:ci$ bzr --version
Bazaar (bzr) 2.4.1
  Python interpreter: /usr/bin/python 2.7.2
  Python standard library: /usr/lib/python2.7
  Platform: Linux-3.0.0-12-generic-x86_64-with-Ubuntu-11.10-oneiric
  bzrlib: /usr/lib/python2.7/dist-packages/bzrlib
  Bazaar configuration: /home/hrw/.bazaar
  Bazaar log file: /home/hrw/.bzr.log

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

Please note that launchpad actually uses "bzr dailydeb", rather than "bzr build". The former uses stacked branches, which can have a significant impact on performance.

Revision history for this message
Martin Pool (mbp) wrote : Re: [Bug 746822] Re: fails to build recipe with "bzr: out of memory"

Good point, I'll re-test with dailydeb.

Revision history for this message
Martin Pool (mbp) wrote :

dailydeb on the projectneon recipe uses

Maximum resident set size (kbytes): 623696

and works correctly under a 700MB ulimit, which is more strict than on the buildds.
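
For anyone wanting to repeat this kind of test, here is a minimal sketch of running a command under a memory limit from Python and reading back its peak resident size. It is not the buildd setup: the helper name is hypothetical, it assumes Linux (where `ru_maxrss` is reported in kilobytes), and it assumes the limit is applied as an address-space limit (`RLIMIT_AS`, what `ulimit -v` sets), which may not be the exact limit the buildds use.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(argv, limit_bytes):
    """Run argv with an address-space limit; return (exit code, peak RSS).

    Hypothetical helper. The peak RSS comes from RUSAGE_CHILDREN, so it is
    the largest value seen across all child processes this process has
    waited for so far, not only the most recent one.
    """

    def apply_limit():
        # Runs in the child just before exec, so only the child is limited.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    proc = subprocess.run(argv, preexec_fn=apply_limit)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Stand-in command; substitute the recipe build command being tested.
    code, peak_kb = run_with_memory_limit(
        [sys.executable, "-c", "pass"], 700 * 1024 * 1024
    )
    print("exit status:", code, "peak RSS (kbytes):", peak_kb)
```

A command that tries to allocate more than the limit should fail with a non-zero exit status instead of being allowed to grow, which is the behaviour behind the "bzr: out of memory" message in the build log.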

Revision history for this message
Martin Pool (mbp) wrote :

follow on bug 884997

tags: added: escalated
Revision history for this message
Martin Pool (mbp) wrote :

qa investigations:

one bzr daily recipe which was previously passing now fails because of
bug 885497, which does seem to be an actual integration bug rather
than a deployment issue.

we could possibly fix this bug 746822 and avoid bug 885497 by
upgrading only bzr and not bzr-builder so we're going to try that.

Revision history for this message
Martin Pool (mbp) wrote : Re: [Bug 746822] [NEW] fails to build recipe with "bzr: out of memory"

our next attempt (with only bzr and not bzr-builder) has worked ok on qa,
and it should be rolled to lpnet next week, after uds

--
Martin

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

This has now been rolled out, thanks to Lamont and Martin.

I've confirmed this fixes the issue for the Samba 4 recipes. The widelands recipe mentioned above seems to be working now too.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

The Linaro gcc source package built fine too, with this recipe: https://code.launchpad.net/~linaro-pkg/+recipe/gcc-linaro-native-daily

https://code.launchpad.net/~linaro-pkg/+archive/testing-daily-builds/+recipebuild/115540

So I think this can be considered fixed now.

Jelmer Vernooij (jelmer)
Changed in launchpad:
status: In Progress → Fix Released
Revision history for this message
Marcin Juszkiewicz (hrw) wrote :

Thanks everyone for making it fixed. Now I can go with my blueprints and make gcc-linaro daily/request working ;)

Revision history for this message
Martin Pool (mbp) wrote : Re: [Bug 746822] Re: fails to build recipe with "bzr: out of memory"

glad you like it
