[TOPBLOCKER] phased update support does not give idempotent answer for each (machine,update)

Bug #1383539 reported by Steve Langasek
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu system image
Fix Released
Critical
Barry Warsaw

Bug Description

The phased update implementation in ./systemimage/helpers.py does:

def phased_percentage(*, reset=False):
    global _pp_cache
    if _pp_cache is None:
        with open(UNIQUE_MACHINE_ID_FILE, 'rb') as fp:
            data = fp.read()
        now = str(time.time()).encode('us-ascii')
        r = random.Random()
        r.seed(data + now)
        _pp_cache = r.randint(0, 100)
    try:
        return _pp_cache
    finally:
        if reset:
            _pp_cache = None

Since the current time is used as a seed, this function will return a different value each time it's called.

The phased update model requires that, for each given machine and each given update, the calculated percentage persists. With this implementation, since the seed is different for each invocation and the result is not cached, phasing will happen much more quickly than intended.

system-image should avoid using the time as a seed, and instead use a predictable (but well-distributed) seed of the machine ID plus an image identifier (perhaps full channel name + image ID).

Related branches

Steve Langasek (vorlon)
Changed in system-image (Ubuntu):
assignee: nobody → Barry Warsaw (barry)
Revision history for this message
Steve Langasek (vorlon) wrote :

This bug is important to fix before phone images go to factory, since it impacts our ability to do phased over-the-air updates.

Changed in system-image (Ubuntu):
importance: Undecided → High
Barry Warsaw (barry)
no longer affects: system-image (Ubuntu)
Changed in ubuntu-system-image:
assignee: nobody → Barry Warsaw (barry)
importance: Undecided → High
milestone: none → 2.5.1
status: New → Triaged
Barry Warsaw (barry)
Changed in ubuntu-system-image:
status: Triaged → In Progress
Barry Warsaw (barry)
tags: added: client
Revision history for this message
Barry Warsaw (barry) wrote :

We have a problem in using the target build number in the percentage calculation. At the point that the phased updated percentage is used, we don't actually know the target yet.

When we read the index file, we filter out any images that have a percentage >= the calculated pu percentage. *Then* we calculate the winning update path from the index, which tells us the build number target.

If we changed the implementation such that the winning upgrade path was calculated first, this would imply a subtle semantic difference.

Under the old regime, let's say your upgrade path is 10:11:12 but your phased percentage is > the percentage for upgrade 12. You'd filter out the update to 12 but you would still potentially update to 11.

If we switched things around to calculate the update first, and we use the same scenario (path is 10:11:12 but device-% > image-%) then we'd just not update the device at all. Maybe that's okay, and maybe it's what was originally (albeit underspecified ;) intended.

The other possibility is that if our target is 10:11:12 but 12 is filtered out, then we'd refine the upgrade path to 10:11 and then recalculate the winners from all candidates. I think that would lead to the same outcome as before, but it's more tricky to implement and thus riskier for a quick bug fix.

Thinking...

Revision history for this message
Barry Warsaw (barry) wrote :

... Okay, so we rip out phased % calculation from index build-up, which means we calculate the winning path as if we *were* going to update to the target. Then we calculate the winning path as if phased % is 100. Then we use the target build number of the winning path in the pu calculate and if that leads us to a device-% that is > phased-%, we simply report that there's no update available.

This makes sense, and although there's a subtle semantic difference from what was happening, it's probably what was originally intended. The implementation change is not too risky.

Revision history for this message
Olli Ries (ories) wrote :

I agree that this is critical and should be fixed before 10/30

tags: added: rtm14 touch-2014-10-23
Revision history for this message
Barry Warsaw (barry) wrote :

'phased-percentage' key is tied to a specific image in the index.json file. If that image number is a full upgrade and our winning candidate lands on that image, then we can clearly use the phased-percentage to decide whether we'll select that upgrade path, or pretend we're up-to-date.

However, if the image is a delta and it's not the end-point for the upgrade what do we do? I.e. a delta update flows through an image with a non-matching percentage.

I think ideally we would just ignore any phased-percentages for non-endpoint delta updates. Ideally, the server would guarantee that such mid-upgrade deltas never had a phased-percentage key.

Thoughts?

Revision history for this message
Barry Warsaw (barry) wrote :

After discussion with slangasek and stgraber, the client will ignore any mid-point delta phased percentages. It will only look at the end-point image after the upgrade candidate winner has been selected. Thus, if the device has a higher percentage, or if the winner's last image's percentage is zero, the upgrade will not be taken.

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 1383539] Re: phased update support does not give idempotent answer for each (machine, update)

On Tue, Oct 21, 2014 at 03:37:36PM -0000, Barry Warsaw wrote:

> I think ideally we would just ignore any phased-percentages for non-
> endpoint delta updates. Ideally, the server would guarantee that such
> mid-upgrade deltas never had a phased-percentage key.

Agreed. A delta that's in the middle of the upgrade path should be either
always used, or never used if it's broken to the point of needing
blacklisting.

Revision history for this message
Barry Warsaw (barry) wrote :

On Oct 21, 2014, at 05:54 PM, Steve Langasek wrote:

>Agreed. A delta that's in the middle of the upgrade path should be either
>always used, or never used if it's broken to the point of needing
>blacklisting.

stgraber confirmed that the current server code will only put a
phased-percentage key on an end-point image.

Barry Warsaw (barry)
Changed in ubuntu-system-image:
importance: High → Critical
Revision history for this message
Barry Warsaw (barry) wrote : Re: phased update support does not give idempotent answer for each (machine,update)

For testing:

<stgraber> https://phablet.stgraber.org
<stgraber> ubuntu-touch/test
<stgraber> mako

Revision history for this message
Barry Warsaw (barry) wrote :

Two other useful things:

--info should include the machine id

We should have a switch to either override the machine id or the device's percentage.

Barry Warsaw (barry)
Changed in ubuntu-system-image:
status: In Progress → Fix Committed
Barry Warsaw (barry)
Changed in ubuntu-system-image:
status: Fix Committed → Fix Released
Olli Ries (ories)
summary: - phased update support does not give idempotent answer for each
- (machine,update)
+ [TOPBLOCKER] phased update support does not give idempotent answer for
+ each (machine,update)
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.