reused api oauth nonce causes oops

Bug #750984 reported by Martin Pool
This bug affects 1 person
Affects           Status        Importance  Assigned to      Milestone
Launchpad itself  Fix Released  Critical    Brad Crittenden

Bug Description

I have a cron job running lp:kanban periodically, across ~canonical-bazaar.

Just once, a couple of days ago, it failed as follows:

Traceback (most recent call last):
  File "./bin/kanban", line 14, in <module>
    main(sys.argv)
  File "./kanban/entry_point.py", line 25, in main
    controller.run(argv[1:])
  File "/home/mbp/lib/python/commandant/controller.py", line 205, in run
    run_bzr(argv)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 1055, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 661, in run_argv_aliases
    return self.run_direct(**all_cmd_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 665, in run_direct
    return self._operation.run_simple(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/cleanup.py", line 122, in run_simple
    self.cleanups, self.func, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/cleanup.py", line 156, in _do_with_cleanups
    result = func(*args, **kwargs)
  File "./kanban/commands.py", line 69, in run
    bugs = get_person_assigned_bugs(launchpad, person_name)
  File "./kanban/launchpad.py", line 148, in get_person_assigned_bugs
    bug_set.update(get_person_directly_assigned_bugs(launchpad, member))
  File "./kanban/launchpad.py", line 159, in get_person_directly_assigned_bugs
    assignee=person):
  File "/usr/lib/pymodules/python2.6/lazr/restfulclient/resource.py", line 735, in __iter__
    unicode(self._root._browser.get(URI(next_link))))
  File "/usr/lib/pymodules/python2.6/lazr/restfulclient/_browser.py", line 316, in get
    response, content = self._request(url, extra_headers=headers)
  File "/usr/lib/pymodules/python2.6/lazr/restfulclient/_browser.py", line 306, in _request
    raise HTTPError(response, content)
lazr.restfulclient.errors.HTTPError: HTTP Error 401: Unauthorized
Response headers:
---
content-length: 58
content-type: text/plain
date: Sat, 02 Apr 2011 16:48:07 GMT
server: zope.server.http (HTTP)
status: 401
via: 1.1 wildcard.launchpad.net
x-lazr-oopsid: OOPS-1918G1133
x-powered-by: Zope (www.zope.org), Python (www.python.org)
---
Response body:
---
Invalid nonce/timestamp: This nonce has been used already.
---

Tags: api oops qa-ok


Robert Collins (lifeless) wrote : Re: [Bug 750984] [NEW] 401 "this nonce has been used already" error from api

do you have any code that would retry requests in there?

Martin Pool (mbp) wrote :

There's no code to retry requests in lp:kanban. I don't think there
is any in lazr.restful.

The relevant code is

def get_person_directly_assigned_bugs(launchpad, person):
    """Generator yields L{Bug}s assigned to C{person}.

    @param launchpad: A C{Launchpad} instance.
    @param person: A C{person} instance from Launchpad.
    """
    for bug_task in person.searchTasks(status=RELEVANT_STATUSES,
                                       assignee=person):
        # It's nice to see fixed bugs for the sake of a sense of
        # accomplishment, but we don't want the kanban to get too big.
        trace(bug_task)
        if (bug_task.status == "Fix Released"):
            date_closed = bug_task.date_closed
            age = datetime.now(date_closed.tzinfo) - date_closed
            if (age > timedelta(days=31)):
                trace("fixed too long ago, omitting")
                continue
        yield _create_bug(bug_task)

Robert Collins (lifeless) wrote : Re: 401 "this nonce has been used already" error from api

On the server side, we shouldn't record an OOPS for this; it's something clients can trivially cause (via hacking or mistakes).

Martin Pool (mbp) wrote : Re: [Bug 750984] Re: 401 "this nonce has been used already" error from api

On 6 April 2011 17:38, Robert Collins <email address hidden> wrote:
> On the server side, we shouldn't record an OOPS for this; it's something
> clients can trivially cause (via hacking or mistakes).

Well, that's true, but on the other hand it seems like in this case it
is happening because of a fault in Launchpad, or perhaps lplib. In
that sense it's a bit like other 400-type errors.

Robert Collins (lifeless) wrote :

On Thu, Apr 7, 2011 at 11:23 AM, Martin Pool <email address hidden> wrote:
> On 6 April 2011 17:38, Robert Collins <email address hidden> wrote:
>> On the server side, we shouldn't record an OOPS for this; it's something
>> clients can trivially cause (via hacking or mistakes).
>
> Well, that's true, but on the other hand it seems like in this case it
> is happening because of a fault in Launchpad, or perhaps lplib.  In
> that sense it's a bit like other 400-type errors.

If it's a fault in LP, then yes we should fix it. However, we want
OOPSes to always indicate that there is something developers or ops
need to do: it's a server /fault/ reporting system. We don't generally
log OOPSes for 400-class errors; 404s with a canonical-site referer
are a key exception, for instance.
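
The policy described above can be sketched roughly as follows. This is only an illustration, not Launchpad's actual code: `should_record_oops`, `is_own_site` and `OWN_SITE_HOSTS` are hypothetical names, and it assumes that "canonical-site referer" means a Referer header pointing at one of the project's own hosts.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts treated as "our own site"; not taken from the thread.
OWN_SITE_HOSTS = {"launchpad.net"}


def is_own_site(referer):
    """Return True if the Referer header points at one of our own hosts."""
    if not referer:
        return False
    host = urlparse(referer).hostname or ""
    return any(host == h or host.endswith("." + h) for h in OWN_SITE_HOSTS)


def should_record_oops(status, referer=None):
    """Decide whether a response deserves a server fault report."""
    if status >= 500:
        # Server faults always need attention from developers or ops.
        return True
    if status == 404 and is_own_site(referer):
        # A broken link on our own pages is something we can fix.
        return True
    # Other 400-class responses are treated as client problems.
    return False
```

Under these assumptions, a 401 such as the one in this report would not be recorded, while a 404 reached from a link on the project's own pages would be.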

Brad Crittenden (bac)
Changed in launchpad:
assignee: nobody → Brad Crittenden (bac)
Martin Pool (mbp) wrote :

I'm not sure if bac took this on with the intention of just silencing
the oops, or of a deeper investigation.

As far as I can see the client is not sending a reused nonce, and
Launchpad is incorrectly saying it has already been used, which
would be a bug in lp. I think it is at least worth doing the investigation
before deciding to squash this class of errors: it is something that
_could_ be generated by a client error but I don't see any evidence it
actually is.

Robert Collins (lifeless) wrote :

It *can* be caused by bust clients and it's a normal situation to have
occur => it's *not* appropriate as an OOPS.

Separate to that there is a good question around why you are seeing
it, but the reason this bug is critical is the OOPSness. Let's not
conflate these things: even if we diagnose why you are seeing this and
conclude it's a server bug, we'd still have other clients able to
trivially inject terrible noise into our crash database by sending
previously used nonces at LP.

Once the OOPS is removed, this can be either closed and a new one
without the OOPS attached to it opened, or this bug can be dropped to
high-or-low priority.

-Rob

Brad Crittenden (bac) wrote : Re: 401 "this nonce has been used already" error from api

Based on Robert's comments, I will limit the fix for this bug to avoiding the OOPS for improper token/nonce/timestamp client data.
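
A minimal sketch of what that behaviour might look like, assuming an in-memory nonce record and a handler that returns a flag saying whether to file an OOPS. None of these names (`NonceStore`, `handle_request`, the exception classes, the 300-second window) come from Launchpad; they are hypothetical and only illustrate the idea that bad client data yields a 401 without a fault report.

```python
import time


class NonceAlreadyUsed(Exception):
    """Client error: the (token, nonce, timestamp) triple was seen before."""


class TimestampOutOfRange(Exception):
    """Client error: the request timestamp is too far from server time."""


class NonceStore:
    """Hypothetical in-memory record of nonces; a real server would persist it."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self._seen = set()

    def check_and_record(self, token, nonce, timestamp, now=None):
        now = time.time() if now is None else now
        if abs(now - timestamp) > self.window_seconds:
            raise TimestampOutOfRange("Timestamp outside accepted window.")
        key = (token, nonce, timestamp)
        if key in self._seen:
            raise NonceAlreadyUsed("This nonce has been used already.")
        self._seen.add(key)


def handle_request(store, token, nonce, timestamp, now=None):
    """Return (status, body, record_oops) for a request's OAuth check."""
    try:
        store.check_and_record(token, nonce, timestamp, now=now)
    except (NonceAlreadyUsed, TimestampOutOfRange) as error:
        # Bad client data: answer 401, but do not file a fault report.
        return 401, "Invalid nonce/timestamp: %s" % error, False
    return 200, "OK", False
```

In this sketch a replayed nonce still gets the same 401 body the client saw in the traceback above, but the third element of the result stays `False`, so nothing lands in the crash database.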

Martin Pool (mbp) wrote : Re: [Bug 750984] Re: 401 "this nonce has been used already" error from api

I discussed this offline with Robert.

I have no objection to squashing the oops alone.

If a valid client request causes an error (whether it oopses or not)
that also matches the definition of critical, but we can handle that
as a separate bug. I probably won't bother filing it unless I see it
occur again.

Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
Changed in launchpad:
status: Triaged → Fix Committed
Brad Crittenden (bac)
tags: added: qa-ok
removed: qa-needstesting
Martin Pool (mbp)
summary: - 401 "this nonce has been used already" error from api
+ reused api oauth nonce causes oops
William Grant (wgrant)
Changed in launchpad:
status: Fix Committed → Fix Released