Display field strangeness on some 3.1+ systems

Bug #1806724 reported by Michele Morgan
This bug affects 7 people
Affects    Status     Importance  Assigned to  Milestone
Evergreen  Confirmed  Medium      Unassigned

Bug Description

We are seeing problems with highlighting and duplicated fields in record summaries in both the OPAC and the staff client on our 3.1.8 system. The problems are more noticeable when performing keyword and subject searches.

On a catalog search results page, highlighting of terms is inconsistent. A given word will be highlighted for one hit, but not the next. Screenshots are included.

On a record summary page, many fields are duplicated. Screenshots of this are included as well.

This is occurring on the following systems:

- 3.1.8 using the NACO normalizer and a synonym dictionary (NOBLE)
- 3.1.6 using the NACO normalizer and a synonym dictionary (Missouri Evergreen)
- 3.2.1 (PINES)

A 3.1.5 system using stock index configurations (Niagara Falls) does not show the problem.

Here's a link to an IRC discussion of the issues:

http://irc.evergreen-ils.org/evergreen/2018-12-03#i_387019

Michele Morgan (mmorgan) wrote :
Michele Morgan (mmorgan) wrote :
Michele Morgan (mmorgan) wrote :
Michele Morgan (mmorgan) wrote :
Changed in evergreen:
status: New → Confirmed
Martha Driscoll (mjdriscoll) wrote :

Regarding the highlighting problem, NOBLE disabled stemming by setting active to FALSE on all config.metabib_class_ts_map rows whose ts_config is english_nostop. This results in highlighting of some search terms but not others. For example, a search for 'Harry Potter' highlights Potter but not Harry (which stems to harri). A search for 'oboe' (which stems to obo) results in no highlighting.

Martha Driscoll (mjdriscoll) wrote :

Duplicate fields are being caused by additional dictionary definitions in config.metabib_class_ts_map. NOBLE added a synonym dictionary and applied it to keyword, title, and subject classes with a weight of C. A new view in 3.1, search.best_tsconfig, retrieves keyword, title, and subject classes twice, once for our synonym ts_config and once for the english_nostop ts_config.

I changed the view definition to omit the custom NOBLE ts_configs, but I'm not sure whether that's the best solution or whether the index_weight should be changed from C to something else for the custom ts_configs.

SELECT m.id,
    COALESCE(f.ts_config, c.ts_config, 'simple'::text) AS ts_config
   FROM config.metabib_field m
     LEFT JOIN config.metabib_class_ts_map c ON c.field_class = m.field_class AND c.index_weight = 'C'::bpchar AND c.ts_config <> 'synonym_noble'::text
     LEFT JOIN config.metabib_field_ts_map f ON f.metabib_field = m.id AND f.index_weight = 'C'::bpchar;
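
The duplication described above can be reproduced outside Evergreen. The following is a minimal sketch, not Evergreen code: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL (so schema prefixes and the ::text / ::bpchar casts are dropped), and the field ids and the two-class data set are hypothetical. The version of the query without the ts_config exclusion is assumed to match the stock view.

```python
import sqlite3

# Toy stand-in for the Evergreen config tables (SQLite, not PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metabib_field (id INTEGER PRIMARY KEY, field_class TEXT);
CREATE TABLE metabib_class_ts_map (field_class TEXT, ts_config TEXT, index_weight TEXT);
CREATE TABLE metabib_field_ts_map (metabib_field INTEGER, ts_config TEXT, index_weight TEXT);

-- Hypothetical field ids; only field_class matters for the join.
INSERT INTO metabib_field VALUES (1, 'subject'), (2, 'author');

-- 'subject' has two weight-C configs, 'author' has one.
INSERT INTO metabib_class_ts_map VALUES
  ('subject', 'english_nostop', 'C'),
  ('subject', 'synonym_noble',  'C'),
  ('author',  'english_nostop', 'C');
""")

# Same shape as the view quoted above; {extra} is where the exclusion goes.
VIEW_SQL = """
SELECT m.id, COALESCE(f.ts_config, c.ts_config, 'simple') AS ts_config
  FROM metabib_field m
  LEFT JOIN metabib_class_ts_map c
         ON c.field_class = m.field_class AND c.index_weight = 'C' {extra}
  LEFT JOIN metabib_field_ts_map f
         ON f.metabib_field = m.id AND f.index_weight = 'C'
 ORDER BY m.id, ts_config
"""

def best_tsconfig(extra=""):
    """Return (field id, ts_config) rows for the toy view."""
    return conn.execute(VIEW_SQL.format(extra=extra)).fetchall()

unfiltered = best_tsconfig()
filtered = best_tsconfig("AND c.ts_config <> 'synonym_noble'")

print(unfiltered)  # field 1 appears once per matching weight-C config
print(filtered)    # one row per field once the custom config is excluded
```

In this toy data set the subject field comes back twice from the unfiltered query, once per weight-C configuration, which is consistent with each display field being rendered twice.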

Martha Driscoll (mjdriscoll) wrote :

NOBLE's fix for both highlighting search terms and avoiding double subject headings is to change search.best_tsconfig to select ts_configs with a weight of 'A', which points to the non-stemmed simple indexes. The custom dictionary is still weight 'C', so it no longer produces duplicates. Highlighting is working for the dictionary words, although I'm not sure why. A search for 'no. nine' highlights 'no. 9' (nine and 9 are in the synonym list).

Obviously this is not a fix for most of the community, which uses the stemmed indexes, but I hope there is a way to accommodate custom dictionaries.

search.best_tsconfig:
SELECT m.id,
    COALESCE(f.ts_config, c.ts_config, 'simple'::text) AS ts_config
   FROM config.metabib_field m
     LEFT JOIN config.metabib_class_ts_map c ON c.field_class = m.field_class AND c.index_weight = 'A'::bpchar
     LEFT JOIN config.metabib_field_ts_map f ON f.metabib_field = m.id AND f.index_weight = 'A'::bpchar;

config.metabib_class_ts_map:
id  field_class  ts_config       active  index_weight
2   keyword      english_nostop  f       C
10  subject      english_nostop  f       C
8   series       english_nostop  f       C
4   title        english_nostop  f       C
6   author       english_nostop  f       C
11  identifier   simple          t       A
1   keyword      simple          t       A
3   title        simple          t       A
5   author       simple          t       A
7   series       simple          t       A
9   subject      simple          t       A
13  title        synonym_noble   t       C
14  subject      synonym_noble   t       C
12  keyword      synonym_noble   t       C
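
As a rough check of why the weight change removes the duplicates, the rows above can be loaded into a scratch database and joined at each weight. This is a sketch under several assumptions: it uses Python's sqlite3 rather than PostgreSQL, it invents one metabib_field row per class (real systems have many fields per class with different ids), and it treats config.metabib_field_ts_map as empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metabib_field (id INTEGER PRIMARY KEY, field_class TEXT);
CREATE TABLE metabib_class_ts_map (
    id INTEGER PRIMARY KEY, field_class TEXT, ts_config TEXT,
    active INTEGER, index_weight TEXT);
""")

# Rows copied from the config.metabib_class_ts_map listing (f -> 0, t -> 1).
CLASS_MAP = [
    (2, "keyword", "english_nostop", 0, "C"),
    (10, "subject", "english_nostop", 0, "C"),
    (8, "series", "english_nostop", 0, "C"),
    (4, "title", "english_nostop", 0, "C"),
    (6, "author", "english_nostop", 0, "C"),
    (11, "identifier", "simple", 1, "A"),
    (1, "keyword", "simple", 1, "A"),
    (3, "title", "simple", 1, "A"),
    (5, "author", "simple", 1, "A"),
    (7, "series", "simple", 1, "A"),
    (9, "subject", "simple", 1, "A"),
    (13, "title", "synonym_noble", 1, "C"),
    (14, "subject", "synonym_noble", 1, "C"),
    (12, "keyword", "synonym_noble", 1, "C"),
]
conn.executemany("INSERT INTO metabib_class_ts_map VALUES (?, ?, ?, ?, ?)", CLASS_MAP)

# Hypothetical: a single field per class, with made-up ids.
FIELDS = [(1, "keyword"), (2, "title"), (3, "author"),
          (4, "series"), (5, "subject"), (6, "identifier")]
conn.executemany("INSERT INTO metabib_field VALUES (?, ?)", FIELDS)

def rows_per_class(weight):
    """Count how many rows the class-level join yields per field class."""
    sql = """
    SELECT m.field_class, COUNT(*)
      FROM metabib_field m
      LEFT JOIN metabib_class_ts_map c
             ON c.field_class = m.field_class AND c.index_weight = ?
     GROUP BY m.field_class
    """
    return dict(conn.execute(sql, (weight,)).fetchall())

print(rows_per_class("C"))
print(rows_per_class("A"))
```

With these rows, keyword, title, and subject each join to two weight-C configurations, while every class joins to exactly one weight-A configuration. Note that the join as quoted does not look at the active column, so the inactive english_nostop rows still match at weight C.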

Mike Rylander (mrylander) wrote :

I wonder if the problem is that search.best_tsconfig just needs to respect the active flag on the class and field ts_map tables?

Changed in evergreen:
importance: Undecided → Medium
Jason Stephenson (jstephenson) wrote :

Martha's suggestion in comment #7 works for us at CW MARS. We encountered the doubling of some fields in the OPAC record summary view after the upgrade. We disabled search term highlighting because it was behaving inconsistently on our test machine. I guess this bug could explain that.

Mike, are you suggesting just modifying the original query to include a WHERE condition requiring that the config.metabib_field_ts_map row is active? If so, I can try that on a test server.

Jason Stephenson (jstephenson) wrote :

And, nope. Just adding c.active and f.active to the left join criteria doesn't resolve the doubled subjects and content notes issue for me.

Jason Stephenson (jstephenson) wrote :

I want to add that I noticed the doubling occurs when the query string is present in the URL. If you strip out the query=...; part, then the doubling does not occur, at least on our 3.2.4 system. This suggests to me that there may be an alternate way to fix this.

Michele Morgan (mmorgan) wrote :

Jason, Mike's comment #6 on bug 1822875 is relevant.

When no query string is present in the URL, you're linking directly to the record. Direct links to records do not currently use display fields. That's why the duplication doesn't appear when you remove the query string.

Josh Stompro (u-launchpad-stompro-org) wrote :

Hello, 3.3.3 also has this issue. We are getting display fields in triplicate because of our synonym dictionaries for numbers and roman numerals.

All entries in config.metabib_field_ts_map and config.metabib_class_ts_map have active=true, so I also don't see that helping.
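
To illustrate why an active filter would not help in that situation, here is a small sketch using Python's sqlite3 as a stand-in for PostgreSQL. The dictionary names synonym_numbers and synonym_roman, the single subject field, and the assumption that all three configurations sit at weight C are hypothetical; only the "every row is active" condition comes from the report above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metabib_field (id INTEGER PRIMARY KEY, field_class TEXT);
CREATE TABLE metabib_class_ts_map (
    field_class TEXT, ts_config TEXT, active INTEGER, index_weight TEXT);

-- Hypothetical single field and dictionary names; every row is active.
INSERT INTO metabib_field VALUES (1, 'subject');
INSERT INTO metabib_class_ts_map VALUES
  ('subject', 'english_nostop',  1, 'C'),
  ('subject', 'synonym_numbers', 1, 'C'),
  ('subject', 'synonym_roman',   1, 'C');
""")

def configs(active_only):
    """List the ts_configs joined to the field, optionally filtering on active."""
    extra = "AND c.active" if active_only else ""
    sql = f"""
    SELECT COALESCE(c.ts_config, 'simple')
      FROM metabib_field m
      LEFT JOIN metabib_class_ts_map c
             ON c.field_class = m.field_class
            AND c.index_weight = 'C' {extra}
     ORDER BY 1
    """
    return [row[0] for row in conn.execute(sql)]

print(configs(active_only=False))
print(configs(active_only=True))
```

Both calls return the same three configurations, so in this toy setup the field would still come back in triplicate with the active condition added to the join.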

The index weight change from Martha seems to work for us to remove the duplicates.

Josh
