packs do not check for missing compression parents

Bug #165290 reported by Robert Collins
2
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Critical
John A Meinel

Bug Description

When pulling knit->pack or pack->pack, no error is raised if a
compression parent is missing, this leads to silent corruption.

 affects bzr
 tag packs
 status triaged
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Tags: packs
Martin Pool (mbp)
Changed in bzr:
importance: Undecided → Critical
milestone: none → 1.0rc1
status: Triaged → Confirmed
Revision history for this message
John A Meinel (jameinel) wrote :

I'll give this a shot.

Changed in bzr:
assignee: nobody → jameinel
status: Confirmed → In Progress
Revision history for this message
Robert Collins (lifeless) wrote :

for efficiency I suggest that this be done as:
fetch all the texts - _copy_nodes_graph
after that, use nodes to generate a set of referenced texts
text_refs = set()
text_keys = set()
        for index, key, value, references in nodes:
            if references[1]:
                text_refs.update(references[1])
            text_keys.add(key)

Then filter that by the keys copied
external_refs = text_refs - text_keys

Now look up external_refs in self._pack_collection.text_index.combined_index
found_items = list(self._pack_collection.text_index.combined_index.iter_entries(external_refs))
if len(found_items) != len(external_refs):
    identify the missing ones here.

-Rob

Revision history for this message
Martin Pool (mbp) wrote :

I don't see mail from John saying he finished this, so I'll try to pick it up.

Revision history for this message
Martin Pool (mbp) wrote :

https://code.edge.launchpad.net/~jameinel/bzr/fetch-check-parents-165290

ok, john seems to have added a plausible test for this (test_fetch_missing_basis_text), and to have code in pack_repo along the lines Robert suggests. However, the test does not actually observe the expected error when pulling into a repository that is missing texts it ought to have.

In the relevant test we're called on an InterPackRepo, and into its .fetch method. This does seem to call Packer.pack(), which should run the checks...

Strangely enough inside the fetch we have self.source._pack_collection.all_packs() returning None. However, there are two packs in the source directory, and they are named correctly by the pack-names file, and returned from the _pack_collection.names() method. But:

(Pdb) spc.get_pack_by_name('b93e1621bbfcd744e128a46c27058d79')
*** AttributeError: 'ExistingPack' object has no attribute 'transport'

ah, the problem seems to be that the ExistingPack is unprintable because its str method tries to use a missing attribute; so this is annoying but not a big problem.

to be continued...

Revision history for this message
Martin Pool (mbp) wrote :

ok, i think this is not far off, but I won't have time to finish it and have it reviewed for our deadline.

Changed in bzr:
milestone: 1.0alpha1 → 1.0alpha2
Revision history for this message
John A Meinel (jameinel) wrote :

It has been finished in the attached branch.

Changed in bzr:
status: In Progress → Fix Committed
John A Meinel (jameinel)
Changed in bzr:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.