indexerror in Knit._get_components_positions pulling in pack repo

Bug #154283 reported by Martin Pool
Affects   Status    Importance   Assigned to   Milestone
Bazaar    Invalid   Undecided    Unassigned

Bug Description

  affects bzr
  tags packs

mbp@grace% ~/old/bzr.20071017/pack-repository/bzr -Dfetch branch http://people.ubuntu.com/~robertc/baz2.0/repository
bzr: ERROR: exceptions.IndexError: tuple index out of range

Traceback (most recent call last):
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/commands.py", line 802, in run_bzr_catch_errors
    return run_bzr(argv)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/commands.py", line 758, in run_bzr
    ret = run(*run_argv)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/commands.py", line 492, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/builtins.py", line 893, in run
    possible_transports=[to_transport])
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/bzrdir.py", line 835, in sprout
    wt = result.create_workingtree()
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/bzrdir.py", line 1106, in create_workingtree
    return self._format.workingtree_format.initialize(self, revision_id)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/workingtree_4.py", line 1299, in initialize
    transform.build_tree(basis, wt)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/transform.py", line 1265, in build_tree
    return _build_tree(tree, wt)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/transform.py", line 1348, in _build_tree
    tree.iter_files_bytes(deferred_contents)):
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/repository.py", line 1132, in iter_files_bytes
    yield callable_data, weave.get_lines(revision_id)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/knit.py", line 942, in get_lines
    return self.get_line_list([version_id])[0]
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/knit.py", line 977, in get_line_list
    text_map, content_map = self._get_content_maps(version_ids)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/knit.py", line 989, in _get_content_maps
    record_map = self._get_record_map(version_ids)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/knit.py", line 955, in _get_record_map
    position_map = self._get_components_positions(version_ids)
  File "/home/mbp/old/bzr.20071017/pack-repository/bzrlib/knit.py", line 800, in _get_components_positions
    next = self.get_parents(cursor)[0]
IndexError: tuple index out of range

bzr 0.92.0.dev.0 on python 2.5.1.final.0 (linux2)
arguments: ['/home/mbp/old/bzr.20071017/pack-repository/bzr', '-Dfetch', 'branch', 'http://people.ubuntu.com/~robertc/baz2.0/repository']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'en_AU.UTF-8'
plugins:
  gtk /home/mbp/.bazaar/plugins/gtk [0.91.0]
  launchpad /home/mbp/old/bzr.20071017/pack-repository/bzrlib/plugins/launchpad [unknown]
  multiparent /home/mbp/old/bzr.20071017/pack-repository/bzrlib/plugins/multiparent.pyc [unknown]
  pqm /home/mbp/.bazaar/plugins/pqm [unknown]

** Please send this report to <email address hidden>
   with a description of what you were doing when the
   error occurred.
zsh: exit 4 ~/old/bzr.20071017/pack-repository/bzr -Dfetch branch
~/old/bzr.20071017/pack-repository/bzr -Dfetch branch 102.22s user 1.61s system 14% cpu 12:00.04 total

This was pulling from the branch shown above into a newly initialized
repository, then trying to extract the tree from it.

--
Martin

John A Meinel (jameinel) wrote :

I'm not following this closely but I thought I'd mention a bit of diagnosis. I think the lines in question are:
                method = self._index.get_method(cursor)
                if method == 'fulltext':
                    next = None
                else:
                    next = self.get_parents(cursor)[0]

Which would indicate that we have a record with no parents that is not marked as a fulltext. This shouldn't be possible because we don't have anything to delta against if we don't have any parents. (So it should *have* to be a fulltext).
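
To make the failure mode concrete, here is a minimal sketch of the same chain walk over a toy index. The dict layout, `TOY_INDEX`, and `component_chain` are hypothetical names for illustration only; this is not bzrlib's real data structure, just the same assumption (a delta always has a parent) expressed in a few lines.

```python
# Toy stand-in for a knit index: version id -> (compression method, parents).
# The names and layout here are illustrative, not bzrlib's real structures.
TOY_INDEX = {
    "rev-1": ("fulltext", ()),
    "rev-2": ("line-delta", ("rev-1",)),
    "rev-3": ("line-delta", ()),  # inconsistent: a delta with no parent
}


def component_chain(index, version_id):
    """Return the versions needed to rebuild version_id, newest first."""
    chain = []
    cursor = version_id
    while cursor is not None:
        method, parents = index[cursor]
        chain.append(cursor)
        if method == "fulltext":
            # A fulltext ends the chain; nothing further is needed.
            cursor = None
        else:
            # Same shape as the failing line: assumes a delta has a parent.
            cursor = parents[0]
    return chain


if __name__ == "__main__":
    print(component_chain(TOY_INDEX, "rev-2"))
    try:
        component_chain(TOY_INDEX, "rev-3")
    except IndexError as exc:
        print("IndexError:", exc)
```

The consistent record walks back to its fulltext, while the parentless delta fails on the `parents[0]` lookup in the same way as the traceback above.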

It may be that Packs change that logic, though I'm not sure how. We certainly could do:

parents = self.get_parents(cursor)
if parents:
  next = parents[0]
else:
  next = None

However, that would miss the case where we have the fulltext cached *and* parents. And it seems like data corruption to have no parents but no fulltext.

Could it just be that http://people.ubuntu.com/~robertc/baz2.0/repository is broken?

Robert Collins (lifeless) wrote : Re: [Bug 154283] Re: indexerror in Knit._get_components_positions pulling in pack repo

On Fri, 2007-10-19 at 19:45 +0000, John A Meinel wrote:
>
> Could it just be that
> http://people.ubuntu.com/~robertc/baz2.0/repository is broken?

It is broken. And this has been documented on the list many times :).

bzr.dev was broken when I started the work on packs. Aaron wrote a patch
for check and reconcile to detect this, and I later found more index
issues which Andrew added to the patch. This patch has been merged, but
I have not regenerated my repository yet.

To play with packs:
pull bzr.dev to a normal new branch.
./bzr reconcile
./bzr upgrade --experimental

Cheers,
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Martin Pool (mbp) wrote :

For other people following along, the recipe Robert gives will let you play with packs, but it's apparently not enough to let you successfully pull from his baz2.0 repository.

I've written a smaller/faster version of check in https://code.edge.launchpad.net/~bzr/bzr-check-knits/trunk which will let you know if a particular repo is affected by this problem.

Robert suggests that pull should do a check after all the data has been brought in, but before finishing the write group, and this should make sure that all the text dependencies for new knit revisions being added are present.

When this fails, it looks like this:

checking knits of GraphKnitRepository('file:///home/mbp/newbzr/robertc-repository-packs-unrec/.bzr/repository/')
in store GraphKnitTextStore('file:///home/mbp/newbzr/robertc-repository-packs-unrec/.bzr/repository/knits/')
  BAD
  file: join-branches.txt-20050309044946-f635037a889c0388
  verson: <email address hidden>
Knit KnitVersionedFile(file:///home/mbp/newbzr/robertc-repository-packs-unrec/.bzr/repository/text%3Ajoin-branches.txt-20050309044946-f635037a889c0388) corrupt:
version <email address hidden>
    has compression method 'line-delta' but index parents ()

so if this is the case, just checking that the dependencies are there won't be enough.
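
As a rough illustration of the kind of consistency check involved, the sketch below scans a toy index for records that claim to be deltas but list no parents. It assumes a plain dict mapping version id to (method, parents); `find_parentless_deltas` and the sample data are hypothetical and are not taken from bzr-check-knits.

```python
def find_parentless_deltas(index):
    """Return version ids whose method is not 'fulltext' but have no parents.

    `index` is assumed to map version id -> (method, parents tuple).
    """
    return sorted(
        version_id
        for version_id, (method, parents) in index.items()
        if method != "fulltext" and not parents
    )


if __name__ == "__main__":
    # Hypothetical sample data, only for demonstration.
    sample = {
        "rev-a": ("fulltext", ()),
        "rev-b": ("line-delta", ("rev-a",)),
        "rev-c": ("line-delta", ()),
    }
    for version_id in find_parentless_deltas(sample):
        method, parents = sample[version_id]
        print("BAD: version %s has compression method %r but index parents %r"
              % (version_id, method, parents))
```

A check of this shape looks at each record on its own, which is why it catches a case that a "are all referenced parents present?" test would pass over: here there is no referenced parent to look for.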

Martin Pool (mbp) wrote :

And that message is with this patch applied:

=== modified file 'bzrlib/knit.py'
--- bzrlib/knit.py 2007-10-22 01:23:51 +0000
+++ bzrlib/knit.py 2007-10-22 06:50:29 +0000
@@ -815,7 +815,17 @@
                 if method == 'fulltext':
                     next = None
                 else:
-                    next = self.get_parents(cursor)[0]
+                    parents = self.get_parents(cursor)
+                    try:
+                        next = parents[0]
+                    except IndexError:
+                        raise KnitCorrupt(
+                            self,
+                            "\nversion %s\n"
+                            " has compression method %r " \
+                            "but index parents %r " % (
+                                cursor, method, parents))
+                    next = parents[0]
                 index_memo = self._index.get_position(cursor)
                 component_data[cursor] = (method, index_memo, next)
                 cursor = next

Martin Pool (mbp) wrote :

Looking with Robert at the knit indices: the version that's raising this error *does* have a pointer to a compression parent, but that parent is marked 'a' for absent in the index, and that may be because it is not referenced by the revision graph or inventory graph.

_get_components_positions should actually be calling get_parents_with_ghosts, or (better) get_compression_parents, which can return both the parent(s) and the method. If we didn't exclude ghosts, then we'd get a more obvious error about not being able to reconstruct the parent.
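
The difference between the two lookups can be sketched with a toy index. `ToyIndex` below is a hypothetical stand-in that only borrows the method names mentioned above; treating a "ghost" as a parent that has no record of its own is a simplifying assumption, and `compression_parent` is an invented helper, not a bzrlib function.

```python
class ToyIndex:
    """Hypothetical stand-in for a knit index; not bzrlib's implementation."""

    def __init__(self, records):
        # records: version id -> (method, parents as recorded in the index)
        self._records = records

    def has_version(self, version_id):
        return version_id in self._records

    def get_method(self, version_id):
        return self._records[version_id][0]

    def get_parents_with_ghosts(self, version_id):
        # Every recorded parent, present or not.
        return self._records[version_id][1]

    def get_parents(self, version_id):
        # Only parents that are present; absent ones are silently dropped.
        return tuple(p for p in self._records[version_id][1]
                     if self.has_version(p))


def compression_parent(index, version_id):
    """Return the delta basis for version_id, or None for a fulltext."""
    if index.get_method(version_id) == "fulltext":
        return None
    parents = index.get_parents_with_ghosts(version_id)
    if not parents:
        raise ValueError("%r is a delta but records no parents" % (version_id,))
    parent = parents[0]
    if not index.has_version(parent):
        raise LookupError("compression parent %r of %r is absent"
                          % (parent, version_id))
    return parent


if __name__ == "__main__":
    index = ToyIndex({"rev-2": ("line-delta", ("rev-1",))})  # rev-1 is absent
    print(index.get_parents("rev-2"))
    print(index.get_parents_with_ghosts("rev-2"))
    try:
        compression_parent(index, "rev-2")
    except LookupError as exc:
        print("LookupError:", exc)
```

With ghosts filtered out, the record looks parentless and the caller trips over an empty tuple; with ghosts included, the error names the missing parent directly.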

John A Meinel (jameinel) wrote :

+1 to the patch which at least gives a better error.

Martin Pool (mbp) wrote :

Closing this bug, since it appears to have been due to a quirky/invalid source repository.

Changed in bzr:
status: New → Invalid