More robust Elasticsearch data uploading

Bug #1457712 reported by Aaron Wells
Affects Status Importance Assigned to Milestone
Mahara
Fix Released
Medium
Aaron Wells
15.10
Fix Released
Undecided
Unassigned

Bug Description

The search/elasticsearch plugin has a cron job that loads data into the Elasticsearch server. Currently, this cron job has a couple of problems:

1. It sends all the data from search_elasticsearch_queue in a single /_bulk operation. If you've got a proxy server in front of your Elasticsearch server, the proxy is quite likely to reject the request for exceeding its maximum allowed request size.

2. It deletes the records from the queue table *before* attempting to upload them to Elasticsearch. If the upload fails, every record in that bulk operation is lost and never gets indexed.
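
To illustrate the first problem, here is a rough Python sketch (not Mahara's actual PHP code) of how a single /_bulk body grows with the size of the queue. The index name, the document fields, and the 1 MiB proxy limit are all assumptions chosen for the example.

```python
import json


def build_bulk_body(docs, index="mahara"):
    """Serialize docs into the newline-delimited JSON that /_bulk expects.

    Each document contributes two lines: an action line and a source line.
    The index name "mahara" is a placeholder, not taken from the plugin.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical queue contents: 10,000 records of roughly 500 bytes each.
queue = [{"id": i, "title": f"Page {i}", "body": "x" * 500} for i in range(10_000)]

body = build_bulk_body(queue)
size_bytes = len(body.encode("utf-8"))

# Assumed proxy request-size limit of 1 MiB; real limits vary by setup.
PROXY_LIMIT = 1 * 1024 * 1024
too_large = size_bytes > PROXY_LIMIT

print(f"bulk body: {size_bytes} bytes, exceeds proxy limit: {too_large}")
```

Because the body grows linearly with the number of queued records, a large enough backlog will eventually exceed any fixed request-size limit.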

Aaron Wells (u-aaronw)
tags: added: elasticsearch
Changed in mahara:
importance: Undecided → Medium
status: New → In Progress
milestone: none → 15.10.0
assignee: nobody → Aaron Wells (u-aaronw)
Mahara Bot (dev-mahara) wrote : A patch has been submitted for review

Patch for "master" branch: https://reviews.mahara.org/4831

Mahara Bot (dev-mahara) wrote : A change has been merged

Reviewed: https://reviews.mahara.org/4831
Committed: https://git.nzoss.org.nz/mahara/mahara/commit/f7194ce1c0c47c9da1b69026e61b71f9024b1a86
Submitter: Robert Lyon (<email address hidden>)
Branch: master

commit f7194ce1c0c47c9da1b69026e61b71f9024b1a86
Author: Aaron Wells <email address hidden>
Date: Tue Jun 9 18:00:47 2015 +1200

More robust handling of Elasticsearch bulk operations

Bug 1457712. This patch accomplishes four main things:

1. Sets a limit on the number of documents
per Elasticsearch bulk request.

2. Doesn't delete records from the queue table until
after they have been successfully sent

3. If a bulk request fails, later retries the records
individually

4. Performs deletion in bulk

Change-Id: I9ac5e3a33b473e256fdf331800dc60101c126dcc
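
The four points in the commit message can be sketched roughly as follows. This is an illustrative Python outline, not the PHP code from the patch: the function names, the `batch_size` default, and the `send_bulk` / `delete_ids` callbacks are hypothetical stand-ins for the Elasticsearch request and the queue-table deletion.

```python
from typing import Callable

Record = dict  # a queue row; assumed to carry an "id" key


def process_queue(
    records: list[Record],
    send_bulk: Callable[[list[Record]], bool],
    delete_ids: Callable[[list[int]], None],
    batch_size: int = 100,
) -> list[int]:
    """Upload records in bounded batches; return ids that still failed."""
    failed: list[Record] = []

    # 1. Limit the number of documents per bulk request.
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        if send_bulk(batch):
            # 2. Delete only after a successful send; 4. delete in bulk.
            delete_ids([r["id"] for r in batch])
        else:
            failed.extend(batch)

    # 3. Later, retry the records from failed batches one at a time.
    sent_individually: list[int] = []
    still_failed: list[int] = []
    for record in failed:
        if send_bulk([record]):
            sent_individually.append(record["id"])
        else:
            still_failed.append(record["id"])
    if sent_individually:
        delete_ids(sent_individually)

    # Records that still failed stay in the queue for a future run.
    return still_failed


# Usage with fake callbacks: record 7 is a "poison" record that always fails.
deleted: list[int] = []
records = [{"id": i} for i in range(10)]
still_failed = process_queue(
    records,
    send_bulk=lambda batch: all(r["id"] != 7 for r in batch),
    delete_ids=deleted.extend,
    batch_size=4,
)
print(sorted(deleted), still_failed)
```

In this sketch one bad record costs only itself: the rest of its batch is recovered by the individual retry, and nothing leaves the queue until it has been sent.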

Robert Lyon (robertl-9)
Changed in mahara:
status: In Progress → Fix Committed
Robert Lyon (robertl-9)
Changed in mahara:
status: Fix Committed → Fix Released