reconcile doesn't adjust knit index references to otherwise-unreferenced file revisions

Bug #155730 reported by Martin Pool
Affects: Bazaar
Status: Fix Released
Importance: Critical
Assigned to: Andrew Bennetts

Bug Description

I have a branch of bzr.dev, and I ran the check command from the current bzr.dev. I believe this is meant to detect cases where reconciliation is needed. It does not detect any problems.

I've previously run reconcile here. Inspection of the repository shows that there are file texts present that are not referenced by their corresponding revision, and I thought reconcile was supposed to remove exactly those.

This is a problem because if I now upgrade to pack format, the inventory-unreferenced revisions will not be pulled and I'm left with deltas whose parent is missing.

Martin Pool (mbp) wrote :

The particular case I encountered first is

>>> bad_fileid
'join-branches.txt-20050309044946-f635037a889c0388'
>>> bad_revid
'<email address hidden>'
>>>

I think there are several others. All date from around this time, when there was probably a problem in the commit code causing it to make unnecessary file versions.

In the knit post-reconcile, this version is delta compressed relative to

>>> missing_parent
'<email address hidden>'

In that revision (0205c) the file was merged from two identical parents. A new file version was created, but it is not referenced by the inventory. Reconcile should remove that version and adjust later versions so they no longer depend on it.

Changed in bzr:
importance: Undecided → Critical
status: New → Confirmed
Andrew Bennetts (spiv)
Changed in bzr:
assignee: nobody → spiv
Andrew Bennetts (spiv) wrote :

More information:

The "bad_revid" version of that file is not referenced by the corresponding inventory. It is referenced by 1290 other inventories though (out of over 14000).

The "missing_parent" version of that file is not referenced by any inventory.

So this explains why the packs code decided to copy the bad_revid version, but not the missing_parent one.

Also, all 12 versions in that versioned file are identical (they have the same text and sha1).

So in this case it appears that the inventory for bad_revid is correctly not referencing the new bad_revid version, and so those 1290 other inventories ideally would not be referencing it either.
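
To illustrate the reasoning above, here is a minimal sketch of how a copy that follows inventory references can end up with a delta whose basis is missing. It is a toy model, not bzrlib code: the dictionaries, the revision names and both helper functions are hypothetical, and the real packs code is assumed to behave only roughly like this.

```python
def referenced_text_keys(inventories):
    """Collect every (file_id, revision) pair named by any inventory."""
    keys = set()
    for inventory in inventories.values():
        for file_id, text_revision in inventory.items():
            keys.add((file_id, text_revision))
    return keys


def dangling_deltas(file_id, compression_parent, referenced):
    """Return versions that would be copied while their delta basis is not."""
    dangling = []
    for version, basis in compression_parent.items():
        if basis is None:
            continue  # stored as a fulltext, so there is no basis to lose
        copied = (file_id, version) in referenced
        basis_copied = (file_id, basis) in referenced
        if copied and not basis_copied:
            dangling.append(version)
    return sorted(dangling)


# Toy data (hypothetical names, not taken from the real repository).
inventories = {
    "rev-old": {"file-a": "rev-old"},
    "rev-bad": {"file-a": "rev-old"},    # does not reference its own new text
    "rev-later": {"file-a": "rev-bad"},  # a later inventory does reference it
}
# version -> version it is delta compressed against (None means fulltext)
compression_parent = {
    "rev-old": None,
    "rev-missing": "rev-old",
    "rev-bad": "rev-missing",
}

referenced = referenced_text_keys(inventories)
print(dangling_deltas("file-a", compression_parent, referenced))
```

In this toy data "rev-bad" plays the role of bad_revid (referenced by a later inventory, so it is copied) and "rev-missing" plays the role of missing_parent (referenced by nothing, so it is left behind).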

Martin Pool (mbp) wrote :

It does indeed seem that it's not referenced by the inventory that introduced it:

>>> repo.get_inventory(bad_revid)[bad_fileid].revision
'<email address hidden>'

The plan now is that Andrew will change reconcile to insert a fulltext for revisions that are currently based on otherwise-unreferenced file versions.

Andrew Bennetts (spiv) wrote :

I have a fix submitted to the list. It is implemented as Martin describes, by forcing the storage of a full-text rather than a delta for versions based on unreferenced versions.
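
As a rough sketch of the idea only (this is not the submitted patch, and `plan_storage` and the data below are hypothetical names for illustration), the approach amounts to checking each version's delta basis against the set of inventory-referenced versions and falling back to a fulltext when the basis is unreferenced:

```python
def plan_storage(compression_parent, referenced_versions):
    """Map each version to its delta basis, or None to store a fulltext.

    A version whose current basis is not referenced by any inventory is
    forced to a fulltext so it no longer depends on that basis.
    """
    plan = {}
    for version, basis in compression_parent.items():
        if basis is not None and basis not in referenced_versions:
            plan[version] = None  # basis is unreferenced: store a fulltext
        else:
            plan[version] = basis  # keep the existing delta (or fulltext)
    return plan


# Toy data (hypothetical names): version -> current delta basis.
compression_parent = {
    "rev-old": None,
    "rev-missing": "rev-old",
    "rev-bad": "rev-missing",
    "rev-later": "rev-bad",
}
referenced_versions = {"rev-old", "rev-bad", "rev-later"}

plan = plan_storage(compression_parent, referenced_versions)
for version in sorted(plan):
    print(version, "->", plan[version] or "fulltext")
```

Only "rev-bad" changes in this example: its basis "rev-missing" is unreferenced, so it becomes a fulltext, while "rev-later" can keep its delta against "rev-bad".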

Changed in bzr:
status: Confirmed → Fix Committed
Andrew Bennetts (spiv) wrote :

Robert suggested I do an acid test. I've confirmed that I can do "bzr init --experimental foo; cd foo; bzr pull ../reconciled-bzr.dev; bzr push ../new-dir" to pull a reconciled bzr.dev into packs, and make a copy of that branch with bzr push, without any trouble.

So the fix sent to the list definitely appears to fix this bug.

Robert Collins (lifeless) wrote :

The fix is in bzr.dev.

Changed in bzr:
milestone: none → 1.0rc1
status: Fix Committed → Fix Released