apt-get update fails hash checks on https repositories when file size changes

Bug #1157943 reported by Thomas Bushnell, BSG
24
This bug affects 1 person
Affects Status Importance Assigned to Milestone
apt (Ubuntu)
Fix Released
Undecided
Unassigned
Precise
Won't Fix
Undecided
Unassigned

Bug Description

apt uses its own strategy for sending Range: requests on https, instead of the libcurl handling. Here's is a scenario where it gets it wrong:

1) apt downloads the file but doesn't put the file in place yet (perhaps it got interrupted or something)
2) the file on the server gets replaced by a smaller file
3) the next update run wants to download the file, sees a partial read, and asks for Range: (len(file)-1)-
4) the server sees a Range: request for a byte-range past the end of (the current version of) the file, considers it invalid, and streams the entire file. (This is correct behavior.)
5) apt assumes the response is the range it expected, and appends it to the local staging copy (minus one byte).

Instead of rolling apt's own attempt to handle ranges in the https method, it should just use libcurl's. Attached is a patch which solves the problem.

Tags: patch
Revision history for this message
Thomas Bushnell, BSG (tbushnell) wrote :
Revision history for this message
Thomas Bushnell, BSG (tbushnell) wrote :

Oh, and the bug is particularly pathological. Because after it happens, the index file is in the staging area is now always larger than what's on the server, it won't self correct.

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "apt-fix.diff" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Revision history for this message
Steve Langasek (vorlon) wrote :

David, are you happy with this patch?

Revision history for this message
David Kalnischkies (donkult) wrote :

(assuming "David" means me: Better ask Michael Vogt as he is debugging another https problem at the moment)

That said, step 4 is incorrect, Range answers with an error code of 416. The behavior described is that of Range with If-Range.
I haven't tested this at all, but browsing documentation makes me doubt that this is sent by curl (which date it should sent?).

So we need to check for a range error and retry without one – still doesn't really solve the problem as we still get the "wrong" data in the scenario above (just replace "smaller" with "bigger" to have valid ranges).

I guess to really fix this we have to bit the bullet and work with a CURLOPT_HEADERFUNCTION to see what the response is. I might be complete wrong though and will leave that up to someone who actually has access to infrastructure for testing this.

Revision history for this message
Thomas Bushnell, BSG (tbushnell) wrote :

David, servers are not required to send 416; it's only a SHOULD.

Revision history for this message
Thomas Bushnell, BSG (tbushnell) wrote :

The behavior in 4 is not the If-Range + Range behavior. The latter is about what happens when the contents have changed; the behavior in 4 is not about changed content, but about a range request which starts after the end of the file on the server.

Revision history for this message
Adam Stokes (adam-stokes) wrote :

David, could you provide your thoughts in response to comment #6 and #7?

Thank you,
Adam

Revision history for this message
David Kalnischkies (donkult) wrote :

#6: SHOULD means that there are servers outwhere behaving like that. So we have to support both. Not that we can ignore should-clauses.
#7: Do I really have to comment that? I am not that good at explaining jokes, but lets try:
1. APT downloads a part of a file.
2. replaces the file on the server
3+4. Deal with the current file on the server which is from APTs view-point a new file as it has a part of the old file.
So in which way is it not dealing with changed content?

Both do not change the fact that you usually have a part smaller than the (new) file downloaded in which case the patch doesn't help (actually, I wonder how the patch is supposed to help at all as we still write the result unconditionally to the end of the file) – and it also doesn't change that I said I am not the one to talk to about https as I am not a big fan of "works correctly as the compiler isn't complaining"-patches and that is all I could do for https currently as I don't use https. (aka: I will try to help review working patches, but I am not producing one myself)

Revision history for this message
Thomas Bushnell, BSG (tbushnell) wrote :

David, #6 means that servers SHOULD send a 416. But apt should support both, because it cannot rely on servers sending a 416.

#7, you have misunderstood. Yes, there is newer contents, and the server returns the current complete contents of the file in this case, and apt is *not replacing the file correctly*.

Revision history for this message
Louis Bouchard (louis) wrote :

David, any opinion on Thomas's previous comment ? Jury is still out on wether or not the patch proposed is adequate.

Thank you,

...Louis

Revision history for this message
Adam Stokes (adam-stokes) wrote :

Ping, need some movement/response to OP comments.

Thanks!
Adam

Revision history for this message
Adam Stokes (adam-stokes) wrote :

David, sorry that ping was meant for you to try and unblock this issue.

Thanks again
Adam

Revision history for this message
David Kalnischkies (donkult) wrote :

Could you guys please refrain from pinging me for comments without reading my previous comments?

Not only does this response now eat up all the time I have available for APT this week, its also highly demotivating to have the public record of two @canoncial.com people that they don't have to read bugreports if they can instead just ping volunteers like me to read for them.

The bugreport includes already all the information anyone needs to know to get an idea what I personally think about this "patch" (not that this would be important), the likelihood of https patches being accepted by me (APPROX ZERO BECAUSE I AM NOT DOING HTTPS STUFF) and what a real patch needs to use to fix the issue (hint CURLOPT_HEADERFUNCTION hint) and not just break it differently (e.g.: removing If-Range is not fixing anything, … ).

Anything else you two want me to comment? (Rhetoric question, answers would just be 'read' by my killfile, so don't do it)
Thanks!

David, who is really happy now he has opened his mailclient at 2 o'clock in the morning.

Revision history for this message
Philipp Kern (pkern) wrote :

I'm also unconvinced of this patch. It drops the If-Range logic altogether which means that it requests a range without specifying what the mtime of the partial file was. It also does not use If-Modified-Since in this case. So you'd still end up with a file that's partially corrupted if the file changed underneath.

Curl does not seem to have logic to do If-Range properly. So in this case the return code of the HTTP request will indicate if we got the entire file (200 OK) or a partial snippet (206 Partial Content)[1]. 206 will include the range and last-modified date of the current file, which we can check against what we expect. 200 OK should replace the whole file.

(No, I'm not volunteering.)

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

Revision history for this message
Thomas Bushnell, BSG (tbushnell) wrote :

The current code completely corrupts things. It's definitely got a bug. Dropping If-Range entirely at least gets *correct* behavior.
David, people *have* read your previous comments, *and responded to them*.

If you don't want to "do https stuff", then apt needs a maintainer who does.

So, since the current code *definitely* has a bug, and since you've decided that the proposed patch is incorrect, what is your plan for fixing the bug?

Revision history for this message
David Kalnischkies (donkult) wrote :

My problem with https is that I hadn't the infrastructure set up to test it – and code needs testing, especially if it ends up on millions of systems.
Others might be good enough to write (or apply) bugfree code without testing, but I am not.

So what I do with patches I can't test is that I review them and tell people who else to ask, while being crystal clear that I am not the one to talk to. This assumes of course that the review isn't mostly ignored AND that the patch is actually tested and someone talks to the right people.

As much as I want to believe that the initial author of a patch is someone who has tested its patch intensively, that is not always the case and sometimes its just that author has a completely different environment compared to the tester, who hits a problem after everything was fine for weeks for the author thanks to a different server or what.

So, as this thread slowly gets annoying and nobody has tested the patch so far I tried getting our testframework ready for partial downloads (I needed that anyway) and https (as a bonus). We have recently got our own little webserver implementation in APT, so the Range-support part wasn't as hard as it sounds. https is relatively easy if you manage to be able to set up stunnel4. (both is proof of concept so far, so no patch yet).

What I found made me pretty unhappy though as it confirms my suspicion: The patch was never tested as range-support is utterly broken in our https client and the patch does nothing in regards to changing it: The responsecode in a partial response is 206, which is treated as an error by our https, so with or without patch we either get a 200 with the complete file which means we add the complete file on top the partial file we already have (-> CURLOPT_HEADERFUNCTION) or we get a 206 which means the error handling discards the file (which would actually be fine) [I assume that curl changed at some point its behavior as this has probably worked in the past, but I wouldn't hold my breath].

Of course, this can only be seen if the stuff is actually tested, so the criticism I had so far was "just":
Removing the "- 1" means that we have one more situation in which we have to handle a 416 (the case in which the file is complete, but hashes not computed and moved). Of course, we have to handle 416, but in the meantime "If-Range" helps a bit as it helps ignoring invalid byte ranges (those which are the result of a changed file on the server, which is the most common situation in which a 416 could happen) [assuming the first issue would be solved].

So whats the take home lesson: I am paranoid, never believe in "*correct* behavior"-claims and both is constantly reenforced by life (and bugreports).
On the "downside" I might actually continue working on that mess now, which I wanted to avoid…

Revision history for this message
Brian Murray (brian-murray) wrote :
Download full text (4.0 KiB)

It looks to me like this may be resolved in Trusty due to the following changelog entries:

apt (0.9.12) unstable; urgency=low

  [ Christian Perrier ]
  * Fix typo in apt-private/private-show.cc. Thanks to Benjamin
    Keresa. Closes: #724073

  [ Mark Hymers ]
  * fix libapt-inst for >2G debs (closes: #725483)

  [ David Kalnischkies ]
  * don't strip :any from dependencies in single-arch (Closes: 723586)
  * pkg from only trusted sources keeps being trusted (Closes: 617690)
  * compression-neutral message for missing data.tar member (Closes: 722710)
  * print-uris prints regardless of quiet-level again (Closes: 722207)
  * retry without partial data after a 416 response (Closes: 710924)
  * replace "filesize - 1" trick in http with proper 416 handling
  * fix partial (206 and 416) support in https
  * handle complete responses to https range requests (Closes: 617643, 667699)
    (LP: 1157943)
  * don't consider holds for autoremoval (Closes: 724995) ...

Read more...

Revision history for this message
Michael Vogt (mvo) wrote :

This bug should indeed be fixed in trusty with the upload that Brian identified. If a backport for precise is needed I can look into that.

Changed in apt (Ubuntu):
status: New → Fix Released
Revision history for this message
Michael Vogt (mvo) wrote :

Sorry for the slow reply. I wasn't quite sure if precise is needed or not. Attached is a debdiff with the backport of the http/https refactor that David did. The diff is a bit scary because it contains a lot of churn/shuffle of the http/https code.

Note that the debdiff needs a bit of cleanup before it can be uploaded, i.e. the autoconf changes and the po file changes are stuff that the apt build system is automatically updating and I was doing it on a trusty box instead of in a precise environment.

Plus additional testing with various scenarios (I did only light testing so far).

Revision history for this message
Steve Langasek (vorlon) wrote :

The Precise Pangolin has reached end of life, so this bug will not be fixed for that release

Changed in apt (Ubuntu Precise):
status: New → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.