Blowing Xapian max term length corrupts index

Bug #843668 reported by Mikkel Kamstrup Erlandsen
This bug affects 1 person
Affects                        Status        Importance  Assigned to
Zeitgeist Extensions           Fix Released  High        Mikkel Kamstrup Erlandsen
zeitgeist-extensions (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

Xapian has a (not very well documented) max term length of 245 bytes. See e.g. http://xapian.org/docs/omega/termprefixes.html. For some reason this is not always gracefully handled inside Xapian and busting that limit may occasionally corrupt the index.

This is reproducible by indexing long URLs (longer than 245 bytes). We already capped URLs at 2000 characters, but that is far above the term limit and so did not prevent the problem.
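A minimal sketch of the kind of cap that would keep terms under the limit, assuming the 245-byte figure above is correct and applies to the whole term including any prefix. The names `MAX_TERM_BYTES` and `cap_term` are hypothetical and are not taken from the zeitgeist-extensions code or from the Xapian API; this is not necessarily how the released fix works.

```python
# Limit quoted in this report; treated here as an assumption.
MAX_TERM_BYTES = 245


def cap_term(value, prefix="", max_bytes=MAX_TERM_BYTES):
    """Return prefix + value, truncated so its UTF-8 form fits in max_bytes.

    Hypothetical helper: the limit is counted in bytes, not characters,
    so the string is cut after encoding rather than before.
    """
    encoded = (prefix + value).encode("utf-8")
    if len(encoded) <= max_bytes:
        return prefix + value
    # Cutting at a byte offset can split a multi-byte character;
    # errors="ignore" drops the incomplete trailing sequence.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

A caller would pass the result to the indexer instead of the raw URL, for example `cap_term(url, prefix="U")`, where the `U` prefix is only an illustration.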

Changed in zeitgeist-extensions:
assignee: nobody → Mikkel Kamstrup Erlandsen (kamstrup)
importance: Undecided → High
status: New → Triaged
Changed in zeitgeist-extensions:
status: Triaged → Fix Committed
Changed in zeitgeist-extensions:
milestone: none → fts-0.0.12
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package zeitgeist-extensions - 0.0.12-0ubuntu1

---------------
zeitgeist-extensions (0.0.12-0ubuntu1) oneiric; urgency=low

  * New upstream release:
    - fts can SIGSEGV ZG during reindex (LP: #617309)
    - zeitgeist-daemon crashed with RuntimeError in _check_index():
      basic_string::assign (LP: #839740)
    - Blowing Xapian max term length corrupts index (LP: #843668)
    - Can't recover from FTS index corruption (LP: #705944)
 -- Didier Roche <email address hidden> Thu, 08 Sep 2011 11:25:16 +0200

Changed in zeitgeist-extensions (Ubuntu):
status: New → Fix Released
Richard Boulton (richardboulton) wrote :

Note from a Xapian developer: the report here says, "For some reason this is not always gracefully handled inside Xapian and busting that limit may occasionally corrupt the index." We're not aware of any situation in which adding a term longer than the limit can result in a corrupted index, and I don't recall any such report. If you have a way to reproduce such a corruption, we'd be interested in it, so that we can fix it.

Mikkel Kamstrup Erlandsen (kamstrup) wrote :

Richard: Sure. I could never reproduce this issue myself, but one user seemed to hit it very reliably. I can check with him to see if we can narrow it down.