overly large delta creation when fetching from 2a repositories
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Bazaar |
Fix Released
|
Critical
|
John A Meinel | ||
1.16 |
Fix Released
|
Critical
|
Unassigned | ||
Launchpad itself |
Fix Released
|
High
|
Unassigned |
Bug Description
Bzr does not correctly calculate deltas in 2a branches, resulting in AbsentContentFa
Workarounds
========
Use SFTP:// or HTTP:// (without a smart server) URLs to read from the branch.
Symptoms
========
Traceback (most recent call last):
... File "/usr/lib/
_translate_
File "/usr/lib/
raise errors.
ErrorFromSmartS
Related branches
- Vincent Ladeuil: Approve
- Diff: 267 lines api
Robert Collins (lifeless) wrote : | #1 |
Changed in bzr: | |
status: | New → Incomplete |
Martin Pool (mbp) wrote : | #2 |
So I just got this myself, trying to branch from Launchpad:
mbp@grace% bzr branch lp:~mbp/bzr/progress progress
bzr: ERROR: bzrlib.
Traceback (most recent call last):
File "/usr/lib/
return the_callable(*args, **kwargs)
File "/usr/lib/
ret = run(*run_argv)
File "/usr/lib/
return self.run(
File "/usr/lib/
source_
File "/usr/lib/
result_
File "/usr/lib/
find_
File "/usr/lib/
result = unbound(self, *args, **kwargs)
File "/usr/lib/
pb=pb, find_ghosts=
File "/usr/lib/
self.__fetch()
File "/usr/lib/
self.
File "/usr/lib/
stream, from_format, [])
File "/usr/lib/
return self._locked_
File "/usr/lib/
for substream_type, substream in stream:
File "/usr/lib/
for kind, substream in stream:
File "/usr/lib/
for bytes in byte_stream:
File "/usr/lib/
_translate_
File "/usr/lib/
raise errors.
ErrorFromSmartS
bzr 1.17dev on python 2.6.2 (linux2)
arguments: ['/usr/bin/bzr', 'branch', 'lp:~mbp/bzr/progress', 'progress']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'en_AU.UTF-8'
plugins:
explorer /home/mbp/
gtk /home/mbp/
launchpad /us...
Changed in bzr: | |
importance: | Undecided → Critical |
status: | Incomplete → Confirmed |
Martin Pool (mbp) wrote : | #3 |
It's reproducible for me running that same command. I have an hpss trace for it:
Mon 2009-06-22 20:48:07 +1000
0.026 bzr arguments: [u'-Dhpss', u'branch', u'lp:~mbp/bzr/progress', u'progress']
0.043 looking for plugins in /home/mbp/
[32345] 2009-06-22 20:48:07.761 WARNING: Unable to load plugin 'gtk'. It requested API version (1, 15, 0) of module <module 'bzrlib' from '/home/
0.119 looking for plugins in /home/mbp/
0.137 looking for plugins in /usr/lib/
0.138 Plugin name netrc_credentia
0.138 Plugin name launchpad already loaded
0.141 encoding stdout as sys.stdout encoding 'UTF-8'
2.426 hpss: Built a new medium: SmartSSHClientM
2.428 hpss call: 'BzrDir.open', '~mbp/bzr/
2.428 (to bzr+ssh:
2.473 ssh implementation is OpenSSH
8.555 result: ('yes',)
8.556 hpss call: 'BzrDir.
8.556 (to bzr+ssh:
8.930 result: ('branch', 'Bazaar Branch Format 7 (needs bzr 1.6)\n')
8.930 hpss call: 'BzrDir.
8.930 (to bzr+ssh:
9.478 result: ('ok', '', 'no', 'no', 'yes', 'Bazaar RepositoryForma
9.483 hpss call: 'Branch.
9.484 (to bzr+ssh:
9.871 result: ('ok', '/~bzr/bzr/trunk')
9.872 hpss call: 'BzrDir.open', '~bzr/bzr/trunk/'
9.872 (to bzr+ssh:
10.311 result: ('yes',)
10.311 hpss call: 'BzrDir.
10.311 (to bzr+ssh:
10.763 result: ('branch', 'Bazaar Branch Format 7 (needs bzr 1.6)\n')
10.763 hpss call: 'BzrDir.
10.763 (to bzr+ssh:
11.405 result: ('ok', '', 'no', 'no', 'yes', 'Bazaar RepositoryForma
11.405 hpss call: 'Branch.
11.405 (to bzr+ssh:
11.831 result: ('NotStacked',)
11.832 hpss call: 'Branch.
11.832 (to bzr+ssh:
12.212 result: ('ok', '4148', '<email address hidden>')
12.216 hpss call: 'BzrDir.
12.216 (to bzr+ssh:
12.613 result: ('Bazaar-NG meta directory, format 1\n', 'Bazaar RepositoryForma
12.644 Using fetch logic to copy between RemoteRepositor
Martin Pool (mbp) wrote : | #4 |
Perhaps unsurprisingly (if we believe the error is on the server not in a bad request), I get the same error using bzr.1.15. 1.13 can branch from this, but it's not using the get_stream rpc.
Martin Pool (mbp) wrote : | #5 |
sftp can be used as a workaround.
summary: |
- absent factory exception from smart server when merging + absent factory exception from smart server when merging or branching |
Martin Pool (mbp) wrote : Re: absent factory exception from smart server when merging or branching | #6 |
Do you mean bug 354036?
Martin Pool (mbp) wrote : | #7 |
10:19 <lifeless> just run the fixer, then try the command that was erroring for you yesterday
10:23 <lifeless> http://
Running this reports:
[0] mbp@grace% python ~/Desktop/
Missing inventories: set([('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',)])
python ~/Desktop/
Robert Collins (lifeless) wrote : | #8 |
Ok Martin, that means that what you encountered was the known, fixed, bug with xml based inventory stacked branches. Whatever is going on with the launchpad branch is likely conceptually the same, but its got different code involved.
summary: |
- absent factory exception from smart server when merging or branching + absent factory exception from smart server when streaming 2a stacked + branches |
Robert Collins (lifeless) wrote : Re: absent factory exception from smart server when streaming 2a stacked branches | #9 |
BLocker; we can't ship 2.0 with this open.
description: | updated |
tags: | added: brisbane-core hpss |
Changed in bzr: | |
milestone: | none → 2.0 |
Robert Collins (lifeless) wrote : | #10 |
I've filed an RT ticket to get the logs from the server. And a separate bug on launchpad-code to get future errors logged via OOPS
Changed in bzr: | |
status: | Confirmed → Triaged |
Robert Collins (lifeless) wrote : | #11 |
RT 34739
description: | updated |
Robert Collins (lifeless) wrote : | #12 |
Robert Collins (lifeless) wrote : | #13 |
Robert Collins (lifeless) wrote : | #14 |
Unfortunately, neither log appears to include the backtrace we're interested in.
Robert Collins (lifeless) wrote : | #15 |
Branch tip: <email address hidden>
graph.PendingAn
['<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>']
after copying the repo down with lftp, remove the stacked parameter
source = b.repository.
stream = source.
>>> for kind, sub in stream:
... print kind
... for record in sub:
... print record.key
... print len(record.
texts
('test_
Traceback (most recent call last):
File "<stdin>", line 5, in <module>
AttributeError: 'AbsentContentF
>>> b.repository.
set([('
Robert Collins (lifeless) wrote : | #16 |
The complete set of texts that bzr tries to send is much larger - 157 items long.
Robert Collins (lifeless) wrote : | #17 |
I believe this is the problem. Specifically, because it is considering 'whole nodes' not 'interesting items', we see all the items in new pages, not just the items that are new in those pages.
chk_bytes = self.from_
def _filter_
for record, items in chk_map.
for name, bytes in items:
if record is not None:
To me this reinforces the need to push this diff code into chk_map.py, and have a single implementation of it.
description: | updated |
description: | updated |
Robert Collins (lifeless) wrote : | #18 |
So, further investigation shows that we *do* have an appropriate function in chk_map, which is being used.
However, it is clearly outputting uninteresting items, and I don't know why yet. I need to EOD though.
John A Meinel (jameinel) wrote : | #19 |
Just to follow up on Robert's "further investigation". iter_interestin
for record, items in chk_map.
for name, bytes in items:
Those items have already gone through a set difference.
Still need to probe into why it seems we are getting more than we should. *my* contention is that something about the *values* is genuinely different because of the buggy parents issues. (one side claims more per-file parents than the other side does.)
Robert Collins (lifeless) wrote : | #20 |
Partial fix - catching the commonest case (two subtrees changed, two pushes to a stacked repo needed to trigger it) - has landed. More work is needed to completely fix, but this is a 99% fix we should get out to users.
Changed in bzr: | |
status: | Triaged → In Progress |
tags: | added: lp-needs |
Changed in launchpad-code: | |
importance: | Undecided → Critical |
Robert Collins (lifeless) wrote : | #21 |
mwhudson says the lp branch for this is getting tested now.
Jonathan Lange (jml) wrote : | #22 |
We've landed the fix on our bzr branch. Waiting for the rollout.
Changed in launchpad-code: | |
milestone: | none → 2.2.6 |
status: | New → Fix Committed |
Martin Pool (mbp) wrote : | #23 |
Was released in 1.16.1
Martin Pool (mbp) wrote : | #24 |
Also now fixed in trunk
Changed in bzr: | |
status: | In Progress → Fix Released |
Robert Collins (lifeless) wrote : | #25 |
Partly fixed in trunk.
Changed in bzr: | |
status: | Fix Released → In Progress |
Tim Penhey (thumper) wrote : | #26 |
Moving launchpad-code back to triaged, as we are awaiting the full fix from bzr.
Changed in launchpad-code: | |
status: | Fix Committed → Triaged |
milestone: | 2.2.6 → 2.2.7 |
Martin Pool (mbp) wrote : | #27 |
Robert says this error can still occur when the tree depth is different.
----
Tim reported this error yesterday; could this be a regression of the same thing?
henning@i-5f426f36$ bzr branch bzr+ssh:
Warning: Permanently added 'bazaar.
Branched 8712 revision(s).
henning@i-5f426f36$ cd /var/launchpad/
bzr: ERROR: bzrlib.
Traceback (most recent call last):
File "/usr/lib/
return the_callable(*args, **kwargs)
File "/usr/lib/
ret = run(*run_argv)
File "/usr/lib/
return self.run(
File "/usr/lib/
location, revision, remember, possible_
File "/usr/lib/
other_
File "/usr/lib/
merger.
File "/usr/lib/
self.
File "/usr/lib/
target.
File "/usr/lib/
result = unbound(self, *args, **kwargs)
File "/usr/lib/
pb=pb)
File "/usr/lib/
find_
File "/usr/lib/
result = unbound(self, *args, **kwargs)
File "/usr/lib/
pb=pb, find_ghosts=
File "/usr/lib/
self.__fetch()
File "/usr/lib/
self.
File "/usr/lib/
stream, from_format, [])
File "/usr/lib/
return self._locked_
File "/usr/lib/
for substream_type, substream in stream:
File "/usr/lib/
for kind, substrea...
Martin Pool (mbp) wrote : | #28 |
Well, the client side traceback looks the same; we'd like to get the server side traceback to know.
Robert Collins (lifeless) wrote : Re: [Bug 390563] Re: absent factory exception from smart server when streaming 2a stacked branches | #29 |
On Tue, 2009-06-30 at 00:19 +0000, Tim Penhey wrote:
> Moving launchpad-code back to triaged, as we are awaiting the full fix
> from bzr.
I think LP code should be fine with the hotfix so far - in 1.17 you'll
get the complete fix, but its something I'd be less happy hotfixing as
it requires deeper changes to the code to make work - John is rewriting
the code path.
The fix so far is a 99% fix and in practice I don't think LP will see
any of the unsolved cases before 1.17 is available.
-Rob
Robert Collins (lifeless) wrote : | #30 |
On Tue, 2009-06-30 at 00:19 +0000, Martin Pool wrote:
> Robert says this error can still occur when the tree depth is different.
>
> ----
>
> Tim reported this error yesterday; could this be a regression of the
> same thing?
This is almost certainly a different bug. Tim, please file a new bug
report for this.
-Rob
Robert Collins (lifeless) wrote : | #31 |
Actually, scratch that; misread the backtrace - it *could* be the same.
Please:
- get the server backtrace as Martin says.
- have the user upgrade to 1.16.1 or bzr nightlies.
Thanks,
Rob
Tim Penhey (thumper) wrote : | #32 |
On Tue, 30 Jun 2009 12:35:59 Robert Collins wrote:
> On Tue, 2009-06-30 at 00:19 +0000, Tim Penhey wrote:
> > Moving launchpad-code back to triaged, as we are awaiting the full fix
> > from bzr.
>
> I think LP code should be fine with the hotfix so far - in 1.17 you'll
> get the complete fix, but its something I'd be less happy hotfixing as
> it requires deeper changes to the code to make work - John is rewriting
> the code path.
>
> The fix so far is a 99% fix and in practice I don't think LP will see
> any of the unsolved cases before 1.17 is available.
We did see an issue though :(
I talked to poolie about it.
However I get your point. Perhaps not worth cherry picking - is that your
rationale?
Robert Collins (lifeless) wrote : | #33 |
On Tue, 2009-06-30 at 01:27 +0000, Tim Penhey wrote:
> We did see an issue though :(
We need to analyse that particular case in detail. I haven't pulled a
copy of the branch yet to do that. I see three cases:
- its the same problem but the current fix is less complete than
thought.
- its the corner cases we know about and have in progress.
- its something else.
> I talked to poolie about it.
>
> However I get your point. Perhaps not worth cherry picking - is that
> your rationale?
Not *safe* to cherry pick is what I mean. Deep changes need a little
bedding down.
-Rob
Ursula Junque (ursinha) wrote : Re: absent factory exception from smart server when streaming 2a stacked branches | #34 |
According to rockstar in today's Launchpad Prod. Meeting:
<rockstar> Ursinha, there have been multiple team discussions about it.
<Ursinha> rockstar, is that bug really critical?
<rockstar> Ursinha, so, yes, it's really critical, but we're dependent on bzr fixing it.
<rockstar> Ursinha, we may downgrade it for the sake of not misusing Critical. We're still discussing it.
<Ursinha> rockstar, can you update the bug as soon as you have news, pretty please?
<rockstar> Ursinha, will do.
Ursula Junque (ursinha) wrote : | #35 |
<thumper> Ursinha: the critical bug isn't really critical
<thumper> just high
<thumper> should be fixed with 1.17 release of bzr
Changed in launchpad-code: | |
importance: | Critical → High |
Robert Collins (lifeless) wrote : | #36 |
The critical component was truely critical and has been cherrypicked into lp production some time ago. The lp task would be appropriately closed, I think.
The remaining work is for the complete-case, and I think that that is is still pending a final review.
Aaron Bentley (abentley) wrote : Re: [Bug 390563] Re: absent factory exception from smart server when streaming 2a stacked branches | #37 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Robert Collins wrote:
> The critical component was truely critical and has been cherrypicked
> into lp production some time ago. The lp task would be appropriately
> closed, I think.
This is hitting us any time we use bundles to create/update branches.
It's a pretty high-frequency, high-irritation event.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkp
vU0An0Gip9U/
=bBYT
-----END PGP SIGNATURE-----
Robert Collins (lifeless) wrote : | #38 |
On Fri, 2009-07-10 at 03:00 +0000, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Robert Collins wrote:
> > The critical component was truely critical and has been cherrypicked
> > into lp production some time ago. The lp task would be appropriately
> > closed, I think.
>
> This is hitting us any time we use bundles to create/update branches.
> It's a pretty high-frequency, high-irritation event.
We should definitely solve your problem. However, I think thats a
different bug. ACF is a very generic error, its like 'segfault' for C
programs. Getting an ACF means something is wrong, but doesn't indicate
if its missing data or bad delta logic.
Bug 390563 - *this* bug - is 'when all required data is present, bzr
still fails to generate a stream'. And the fix that has been
cherrypicked should be robust except when the tree height changes.
We should debug the bundle based one. I think it is bug 393349, as JML
has suggested.
-Rob
Changed in bzr: | |
assignee: | nobody → John A Meinel (jameinel) |
John A Meinel (jameinel) wrote : | #39 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Aaron Bentley wrote:
> Robert Collins wrote:
>> The critical component was truely critical and has been cherrypicked
>> into lp production some time ago. The lp task would be appropriately
>> closed, I think.
>
> This is hitting us any time we use bundles to create/update branches.
> It's a pretty high-frequency, high-irritation event.
>
> Aaron
so... is this happening when:
1) applying a bundle to an existing stacked branch?
2) applying a bundle that was created *from* a stacked branch?
If I was guessing, I could easily say that the bundle code:
1) Doesn't contain the fulltext for inventories of parent revisions,
nor does it contain the data for all texts that are referenced by the
inventories that it *does* have.
2) Thus when creating a stacked branch from the bundle, we don't have
the parent inventories to *insert* into the new stacked repository.
(Without going to the backing repository and pulling them out)
3) And thus the stacked branch is now broken.
This points the failure at the "insert data from a bundle" code, causing
it to generate invalid stacked repositories.
This would all be simplified quite a bit if we just made a bundle
effectively just a way to transfer the data for the stream that you get
out of StreamSource. There is a small caveat that the bundle *also*
needs to think of itself as not containing any data, and thus does the
same "what is missing" checks that a stacked repository does.
So put another way:
1) BundleRepository just sets itself up as a stacked repository, stacked
on top of a hypothetical repository that only has the minimal data
defined by the target revision.
2) Alternatively we could go to the real target branch and use that.
However at present we have a habit of setting submit_location to a local
mirror, and obviously that will generate the wrong information about
"what data do you have that I don't".
I'll see if I can trivially recreate this.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkp
R14AnjjmiO/
=gcg6
-----END PGP SIGNATURE-----
Aaron Bentley (abentley) wrote : | #40 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
John A Meinel wrote:
> Aaron Bentley wrote:
>> This is hitting us any time we use bundles to create/update branches.
>> It's a pretty high-frequency, high-irritation event.
>
>> Aaron
>
> so... is this happening when:
>
> 1) applying a bundle to an existing stacked branch?
After applying the bundle to the stacked branch, our branch puller is
unable to pull the branch.
> 2) applying a bundle that was created *from* a stacked branch?
That's not the case I was describing. I don't think anyone's creating
bundles from stacked branches.
> If I was guessing, I could easily say that the bundle code:
>
> 1) Doesn't contain the fulltext for inventories of parent revisions,
Bundles aren't meant to contain fulltexts of any inventories.
> nor does it contain the data for all texts that are referenced by the
> inventories that it *does* have.
It contains the data for the texts that are referenced by the
inventories it does have, as long as the referenced texts are new. If
they are present in the basis revision, they will not be included.
> 2) Thus when creating a stacked branch from the bundle, we don't have
> the parent inventories to *insert* into the new stacked repository.
Why do we need the parent inventories?
> (Without going to the backing repository and pulling them out)
add_mpdiffs reconstructs the fulltexts and calls
VersionedFile.
regenerate the fulltexts. But it usually wouldn't try to insert
inventories that were already present in the backing repository.
> 3) And thus the stacked branch is now broken.
>
> This points the failure at the "insert data from a bundle" code, causing
> it to generate invalid stacked repositories.
>
> This would all be simplified quite a bit if we just made a bundle
> effectively just a way to transfer the data for the stream that you get
> out of StreamSource.
I started work on that at the sprint.
> There is a small caveat that the bundle *also*
> needs to think of itself as not containing any data
Pardon?
>, and thus does the
> same "what is missing" checks that a stacked repository does.
>
> So put another way:
>
> 1) BundleRepository just sets itself up as a stacked repository, stacked
> on top of a hypothetical repository that only has the minimal data
> defined by the target revision.
>
> 2) Alternatively we could go to the real target branch and use that.
In the Launchpad case, we are stacked on the real target branch.
> However at present we have a habit of setting submit_location to a local
> mirror, and obviously that will generate the wrong information about
> "what data do you have that I don't".
If the head revision of the local mirror is accurate or stale, it may
bundle more data than is strictly needed, but not too little.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkp
OhIAn02lm1hJDHv
=Xf9G
-----END PGP SIGNATURE-----
John A Meinel (jameinel) wrote : | #41 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Aaron Bentley wrote:
> John A Meinel wrote:
>> Aaron Bentley wrote:
>>> This is hitting us any time we use bundles to create/update branches.
>>> It's a pretty high-frequency, high-irritation event.
>>> Aaron
>> so... is this happening when:
>
>> 1) applying a bundle to an existing stacked branch?
>
> After applying the bundle to the stacked branch, our branch puller is
> unable to pull the branch.
>
So if you follow the rest of my emails, it turns out that creating
bundles from --2a formats is just 100% broken. regardless stacking or
not (chk pages are not being inserted into the bundle, and the raw text
from 'repo.inventories' is not sufficient without them). But there are
still issues to be discussed with how "insert_revisions" is going to
interact with stacking.
>> If I was guessing, I could easily say that the bundle code:
>
>> 1) Doesn't contain the fulltext for inventories of parent revisions,
>
> Bundles aren't meant to contain fulltexts of any inventories.
>
>> nor does it contain the data for all texts that are referenced by the
>> inventories that it *does* have.
>
> It contains the data for the texts that are referenced by the
> inventories it does have, as long as the referenced texts are new. If
> they are present in the basis revision, they will not be included.
>
>> 2) Thus when creating a stacked branch from the bundle, we don't have
>> the parent inventories to *insert* into the new stacked repository.
>
> Why do we need the parent inventories?
Stacking requirements. The specific issues are:
1) When accessing a stacked branch via bzr+ssh the *server process* does
not have access to the backing repository. This was by design, and not
just accidental fallout.
a) Because of this choice, when *streaming* data from the smart
server, the stacked branch needs enough information to generate a
stream for all revisions that it has, and then the client will ask
the fallback directly for another stream for all the remaining
data.
2) To stream the data for a set of revisions, we need the text keys that
should be transmitted. This was changed recently from:
set_of_
to
set_of_
set_
This was done so that we could be more resilient in the face of ghosts,
and inventory inconsistencies, etc.
We have two things that we wanted to trust that we've found out we can't:
file_key == (file_id, revision) #always true
ie = inventory(
ie.revision == file_key[1] # not always true
This is the inventory inconsistency issue. The other is the ghosts issue:
if file_key in inventory then either:
file_revision == inventory.revision # the text was in this revision
or
file_revision in set_of_
The issue is with round-tripping and bzr-svn. Basically, if you did work
in bzr, and push that into svn, you end up with last-modified revision
information being stored in svn which points to a bzr revision_id which
was *not* pushed to svn. (you merge a bzr change that modifi...
Robert Collins (lifeless) wrote : | #42 |
Aaron Bentley (abentley) wrote : | #43 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
John A Meinel wrote:
> Aaron Bentley wrote:
>> Why do we need the parent inventories?
>
> Stacking requirements. The specific issues are:
>
> 1) When accessing a stacked branch via bzr+ssh the *server process* does
> not have access to the backing repository.
It would be nice to support server-side bundle application, but it is
not a requirement. Bundles are deltas, and the client usually needs to
read from the repository in order to install them. If that requires
that fallbacks be available, I think that's a reasonable extension of
the current behaviour.
> 1) If you insert an line-delta for the XML inventory into the stacked
> branch, then it only references newly introduced texts, and does not
> reference all possible texts.
All possible in what context? We certainly don't want to reference
every text in the repository.
> Thus even though we don't have the parent
> inventory later when doing "streaming fetch" we don't try to send more
> data then we have available because it just wasn't referenced.
If you are suggesting that installing an inventory requires installing
all the texts it references, that would require doing it on the client
side, because storing all that in the bundle is pointless waste.
> 2) However, if you are at the end of a delta chain, etc, and you insert
> a "fulltext" for the XML inventory, suddenly when doing streaming fetch
> from that stacked branch, it will see all these text keys that it thinks
> it needs to transmit, for which it cannot find in the local repository.
> [and boom].
I think it doesn't make a lot of sense for a stacked repository to have
a copy of every text referenced in any of its inventories, so I don't
really get this.
> My point was that the streaming code might not look at the
> Branch.
> Repository.
> it *will* have the revisions you are trying to bundle.
Okay, I will make sure that the new bundle format uses
Branch.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkp
9k4AnRKP4mgQLKO
=Xs0R
-----END PGP SIGNATURE-----
John A Meinel (jameinel) wrote : | #44 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Aaron Bentley wrote:
> John A Meinel wrote:
>> Aaron Bentley wrote:
>>> Why do we need the parent inventories?
>> Stacking requirements. The specific issues are:
>
>> 1) When accessing a stacked branch via bzr+ssh the *server process* does
>> not have access to the backing repository.
>
> It would be nice to support server-side bundle application, but it is
> not a requirement. Bundles are deltas, and the client usually needs to
> read from the repository in order to install them. If that requires
> that fallbacks be available, I think that's a reasonable extension of
> the current behaviour.
>
>> 1) If you insert an line-delta for the XML inventory into the stacked
>> branch, then it only references newly introduced texts, and does not
>> reference all possible texts.
>
> All possible in what context? We certainly don't want to reference
> every text in the repository.
>
>> Thus even though we don't have the parent
>> inventory later when doing "streaming fetch" we don't try to send more
>> data then we have available because it just wasn't referenced.
>
> If you are suggesting that installing an inventory requires installing
> all the texts it references, that would require doing it on the client
> side, because storing all that in the bundle is pointless waste.
Assume you have revisions a, b, c
In the base repository you have a & b
in the stacked you have c
All revisions have texts foo, bar, baz
'c' only modifies baz
At that point, we need to be able to create a fulltext for the inventory
at c. Either via
1) A fulltext of c
or
2) A fulltext of a parent, and a delta chain including c.
At that point, the stacked branch has 2 options:
1) If it doesn't have inventory for b, it must have the file content for
foo, bar and baz
2) If it does have inventory b, it only needs the file content for baz.
The basic requirement is that a stacked repository should be independent
of its fallback for the case of streaming fetch. So either we
1) Insert a parent inventory so that we can determine that someone who
has revision 'b' will already have these file keys
or
2) Have the file keys ourselves so that we can ensure someone fetching
'c' has everything they need.
Does that make sense?
>
>> 2) However, if you are at the end of a delta chain, etc, and you insert
>> a "fulltext" for the XML inventory, suddenly when doing streaming fetch
>> from that stacked branch, it will see all these text keys that it thinks
>> it needs to transmit, for which it cannot find in the local repository.
>> [and boom].
>
> I think it doesn't make a lot of sense for a stacked repository to have
> a copy of every text referenced in any of its inventories, so I don't
> really get this.
Which is why we want to insert the inventories for parent revisions, so
that *without hitting the fallback* we can determine what file keys do
not need to be fetched.
John
=:->
>
>> My point was that the streaming code might not look at the
>> Branch.
>> Repository.
>> it *will* have the revisions you are trying to bundl...
Aaron Bentley (abentley) wrote : | #45 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
John A Meinel wrote:
>>> Thus even though we don't have the parent
>>> inventory later when doing "streaming fetch" we don't try to send more
>>> data then we have available because it just wasn't referenced.
>> If you are suggesting that installing an inventory requires installing
>> all the texts it references, that would require doing it on the client
>> side, because storing all that in the bundle is pointless waste.
>
> Assume you have revisions a, b, c
>
> In the base repository you have a & b
> in the stacked you have c
>
> All revisions have texts foo, bar, baz
> 'c' only modifies baz
>
> At that point, we need to be able to create a fulltext for the inventory
> at c. Either via
> 1) A fulltext of c
> or
> 2) A fulltext of a parent, and a delta chain including c.
>
> At that point, the stacked branch has 2 options:
>
> 1) If it doesn't have inventory for b, it must have the file content for
> foo, bar and baz
Why?
> 2) If it does have inventory b, it only needs the file content for baz.
>
> The basic requirement is that a stacked repository should be independent
> of its fallback for the case of streaming fetch. So either we
>
> 1) Insert a parent inventory so that we can determine that someone who
> has revision 'b' will already have these file keys
> or
Why do we care about that? Why can't we just send the keys we have and
let the client worry about getting the keys we didn't send?
> 2) Have the file keys ourselves so that we can ensure someone fetching
> 'c' has everything they need.
>
> Does that make sense?
Not yet.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkp
8vAAn1rhpas7k39
=oZpW
-----END PGP SIGNATURE-----
summary: |
- absent factory exception from smart server when streaming 2a stacked - branches + overly large delta creation when fetching from 2a repositories |
Tim Penhey (thumper) wrote : | #46 |
I'll move this to 2.2.8 in the hope that there is a fix in Bazaar in time to deploy to LP.
Changed in launchpad-code: | |
milestone: | 2.2.7 → 2.2.8 |
Robert Collins (lifeless) wrote : Re: [Bug 390563] Re: overly large delta creation when fetching from 2a repositories | #47 |
On Thu, 2009-07-23 at 23:03 +0000, Tim Penhey wrote:
> I'll move this to 2.2.8 in the hope that there is a fix in Bazaar in
> time to deploy to LP.
LP shouldn't be stressing about this bug now; the critical component
landed in LP shortly after it was opened.
-Rob
Changed in launchpad-code: | |
milestone: | 2.2.8 → 2.2.7 |
status: | Triaged → Fix Released |
Robert Collins (lifeless) wrote : | #48 |
The full fix is present in 1.18
Changed in bzr: | |
milestone: | 2.0 → 1.18 |
status: | In Progress → Fix Released |
Not bug 365615, probably the older stacking bug with damaged branches.