Duplicate open-ils.search.z3950.search_class calls lead to drone exhaustion

Bug #1940698 reported by Jeff Davis
This bug affects 1 person
Affects: Evergreen
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

EG 3.7

We've had multiple incidents recently where we've seen hundreds or even thousands of identical open-ils.search.z3950.search_class requests over a period of 10 minutes or more. This leads to open-ils.search drone exhaustion (there are copious "no children available" warnings in the logs).

I don't know how to replicate the issue. Holding down the Enter key when doing a search in the "Import Record from Z39.50" UI will send off a search request once per second or so, but I doubt that's the cause when it happens repeatedly over an extended period.
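As a possible client-side mitigation for the identical-request pattern described above, here is a minimal sketch of a wrapper that coalesces duplicate calls while one is still in flight. The helper name `dedupeInFlight` and the `sendRequest` callback are hypothetical and are not part of Evergreen; this would not fix the root cause, only limit how many identical requests reach the drones.

```javascript
// Hypothetical helper (not an Evergreen API): wraps a request function so
// that identical calls made while an earlier one is still pending share
// that pending promise instead of sending a new request.
function dedupeInFlight(sendRequest) {
  const pending = new Map(); // serialized arguments -> pending promise

  return function (...args) {
    const key = JSON.stringify(args);
    if (pending.has(key)) {
      return pending.get(key); // reuse the request already in flight
    }
    const promise = Promise.resolve(sendRequest(...args));
    pending.set(key, promise);

    // Forget the entry once the request settles, so a later identical
    // search is allowed to run again.
    const forget = () => pending.delete(key);
    promise.then(forget, forget);

    return promise;
  };
}
```

Wrapping the function that issues the search means that holding down Enter would produce one outstanding request per distinct set of search parameters rather than one per keypress.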

tags: added: parallel-requests
Jeff Davis (jdavis-sitka) wrote :

It's also possible these requests are coming from the Dojo-based acq picklist UI, but again, I'm not quite sure how to replicate.

Galen Charlton (gmc) wrote :

I've traced one path by which this happens:

- User logs in, goes to the AngularJS Z39.50 search page, and does a search
- User leaves the Z39.50 tab open, maybe does other stuff
- Some time later, the AngularJS user session checker detects that the user session has expired, then invokes open-ils.auth.session.delete to remove it
- Ordinarily, this would redirect all tabs to the login page.
- Instead, for an as yet unknown reason, the following happens:

1. The Z3950SearchCtrl starts getting invoked repeatedly, spamming the following requests:

   open-ils.search open-ils.search.z3950.retrieve_services null
   open-ils.pcrud open-ils.pcrud.search.cbs.atomic null, {"id":{"!=":null}}, {}

2. At the same time, something repeatedly invokes $scope.search(), either directly or via a grid refresh, causing the Z39.50 search to be done repeatedly

A full page refresh does not appear to be happening, as otherwise the original Z39.50 search's parameters would not be retained. Also, a full page refresh would result in PCRUD queries of vibtg, which are not evidenced in the logs.

As it happens, the Z3950SearchCtrl initialization is occurring twice as there is both a route configuration and an ng-controller defined (via ctx.page_ctrl). Removing ctx.page_ctrl from Open-ILS/var/templates/staff/cat/z3950/index.tt2 removes the double-initialization. However, double-init is not the same as infinitely-looped init, so this may be a red herring.

So... some sort of infinite digest loop?
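Whatever the trigger turns out to be, one defensive option is to stop issuing requests once the session is known to be gone. The following is only a sketch under that assumption: `createSessionGate`, `markExpired`, and `wrap` are hypothetical names, not existing staff client code, and the real session checker would need to call `markExpired()` at the point where it currently invokes open-ils.auth.session.delete.

```javascript
// Hypothetical guard (not existing Evergreen code): once the session is
// marked expired, wrapped request functions stop reaching the network.
function createSessionGate() {
  let expired = false;
  let suppressed = 0;

  return {
    // Called by whatever detects that the session has ended.
    markExpired() {
      expired = true;
    },
    // Number of requests dropped since expiry; useful for logging.
    suppressedCount() {
      return suppressed;
    },
    // Returns a function that forwards to sendRequest only while the
    // session is live; afterwards it returns undefined without calling it.
    wrap(sendRequest) {
      return function (...args) {
        if (expired) {
          suppressed += 1;
          return undefined;
        }
        return sendRequest(...args);
      };
    },
  };
}
```

With a gate like this in front of the Z39.50 calls, a looping controller would still spin in the browser but would not exhaust open-ils.search drones.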

Also, open-ils.search.z3950.search_class lacks a direct permissions check, instead letting native-evergreen-catalog searches through regardless of whether the user is logged in. It does check permissions before performing searches for remote Z39.50 targets. However, as a mitigation it would be better overall if REMOTE_Z3950_SEARCH were checked at the top level.
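To illustrate what "checked at the top level" means, here is a sketch of the control flow only. It is not the actual service code, which is not shown in this report; the function `searchClass`, the shape of the `user` and `request` objects, and the `backends` map are all assumptions made for illustration. Only the permission name REMOTE_Z3950_SEARCH comes from the report.

```javascript
// Illustrative model only: the permission check runs once, before any
// search target (native catalog or remote) is consulted.
function searchClass(user, request, backends) {
  const REQUIRED_PERM = 'REMOTE_Z3950_SEARCH';

  // Top-level check: reject unauthenticated or unprivileged callers
  // before doing any work, including native-evergreen-catalog searches.
  if (!user || !user.permissions.includes(REQUIRED_PERM)) {
    return { error: 'PERM_FAILURE', permission: REQUIRED_PERM };
  }

  // Only now dispatch to each requested service.
  return request.services.map((name) => backends[name](request.query));
}
```

Under this arrangement, requests sent after the auth session has been deleted would fail fast at the permission check instead of each running a catalog search.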

Jeff Davis (jdavis-sitka) wrote :

We had a real-life incident with this bug today that fits Galen's analysis. At 2:10 AM, a cataloguer's auth session was deleted and a staff workstation was redirected to the staff client login page; immediately afterward, we got spammed with open-ils.search.z3950.retrieve_services and open-ils.search.z3950.search_class requests. Given the time of day, I assume the user left the client open on their workstation and got logged out automatically once their session finally timed out, which triggered the bug.

Galen Charlton (gmc) wrote :

Noting that yesterday I saw this happen on a 3.9.1 system.

Changed in evergreen:
status: New → Confirmed
importance: Undecided → Medium