'bzr pull ../other' slower with packs

Bug #172970 reported by John A Meinel
4
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Medium
Unassigned

Bug Description

I just wanted to spawn off a workingtree-less branch so I did:

bzr init x
cd x
bzr remove-tree
bzr pull ../bzr.dev

And it spent a *lot* of time. It is because of a "get_revision_graph()" call
(which is another O(history) call), because for some reason it is calling
Branch.set_revision_history().

  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/branch.py(1489)pull()
- -> self.update_revisions(source, stop_revision)
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/decorators.py(165)write_locked()
- -> return unbound(self, *args, **kwargs)
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/branch.py(1463)update_revisions()
- -> other_branch=other)
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/decorators.py(165)write_locked()
- -> return unbound(self, *args, **kwargs)
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/branch.py(1439)generate_revision_history()
- -> last_rev, other_branch))
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/decorators.py(165)write_locked()
- -> return unbound(self, *args, **kwargs)
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/branch.py(1376)set_revision_history()
- -> self._write_revision_history(rev_history)
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/branch.py(1970)_write_revision_history()
- -> if history != self._lefthand_history(history[-1]):
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/branch.py(1409)_lefthand_history()
- -> stop_graph = self.repository.get_revision_graph(revision_id)
  /Users/jameinel/dev/bzr/bzr.dev/bzrlib/decorators.py(127)read_locked()
- -> return unbound(self, *args, **kwargs)

/Users/jameinel/dev/bzr/bzr.dev/bzrlib/repofmt/knitrepo.py(201)get_revision_graph()
- -> return a_weave.get_graph([revision_id])

I'm not sure why Branch.pull is calling set_revision_history() rather than
calling Branch.set_last_revision_info().

This is a Branch6 branch, so it doesn't need to iterate the entire history,
unless it is verifying that the revision number is correct.

By the way Martin, thank you so very, very much for the SIGQUIT handler.

John
=:->

Revision history for this message
John A Meinel (jameinel) wrote :

I'm uploading the callgrind for doing this from a merge directive.

I'll upload one for a plain bazaar branch next.

description: updated
Revision history for this message
John A Meinel (jameinel) wrote :
Revision history for this message
John A Meinel (jameinel) wrote :

In the associated branch, I'm adding a function to tell the index that we are about to do something which will require most of the data (CombinedIndex.buffer_all()).

The associated branch is mostly a workaround, rather than fixing the root of the problem (we shouldn't be using get_revision_graph() here.)

Without buffering, it takes 36.521s to do the 'bzr pull'. With buffering it drops to 17.760s. Which is still way too high, but is a very, noticeable improvement. (Note it should be about 1s which is all startup time, because it really only has to set branch.last_revision_info(). Arguably it should check that the original revision is in the ancestry of the target revision, but that is it, and if we use --overwrite it shouldn't even have to check that.)

Revision history for this message
John A Meinel (jameinel) wrote :

Note that on a slower machine with knits "time bzr pull ../other" takes only 4s. So even though this helps, it still is a 4x regression versus knits. (It is a 9x regression without the buffering, though)

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 172970] Re: packs regression: 'bzr pull ../other'

On Fri, 2007-11-30 at 02:32 +0000, John A Meinel wrote:
> In the associated branch, I'm adding a function to tell the index that
> we are about to do something which will require most of the data
> (CombinedIndex.buffer_all()).
>
> The associated branch is mostly a workaround, rather than fixing the
> root of the problem (we shouldn't be using get_revision_graph() here.)
>
> Without buffering, it takes 36.521s to do the 'bzr pull'. With buffering
> it drops to 17.760s. Which is still way too high, but is a very,
> noticeable improvement. (Note it should be about 1s which is all startup
> time, because it really only has to set branch.last_revision_info().
> Arguably it should check that the original revision is in the ancestry
> of the target revision, but that is it, and if we use --overwrite it
> shouldn't even have to check that.)

I think that the time take to write and test buffer_all() could have
fixed the root cause for this bug :(
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Revision history for this message
John A Meinel (jameinel) wrote : Re: packs regression: 'bzr pull ../other'

maybe, but it probably wouldn't have fixed bzr status, bzr log, etc

Revision history for this message
John A Meinel (jameinel) wrote :

and I spent far more time submitting proper bug reports for all of the performance regressions I've encountered (including adding callgrinds, and properly tagging them, per your request)

Revision history for this message
John A Meinel (jameinel) wrote :

this was fixed a while back with Branch6 to avoid calling 'set_revision_history' and instead just 'set_last_revision_info'

Changed in bzr:
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.