See references not always displaying on browse search

Bug #1358392 reported by Tim Spindler
This bug affects 5 people
Affects: Evergreen
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Evergreen 2.5.5
Debian 3.2.57
Opensrf 2.30

See references (4xx) from the authority record are not properly generated.

The heading "Bengal tiger" has a see reference of "Panthera tigris tigris".

=LDR 01098nz a2200241n 4500
=001 274321
=003 CWMARS
=005 20100510084134.0
=008 100322i|\anannbabn\\\\\\\\\\|a\ana\\\\\c
=010 \\$ash2010004791
=035 \\$a(DLC)sh2010004791
=040 \\$aWaU$beng$cDLC
=053 \0$aQL737.C23$cZoology
=150 \\$aBengal tiger
=450 \\$aPanthera tigris bengalensis
=450 \\$aPanthera tigris tigris
=450 \\$aRoyal Bengal tiger
=550 \\$wg$aTiger
=670 \\$aWork cat.: Steiner, B.A. Biography of a Bengal tiger, c1979.
=670 \\$aITIS, Mar. 19, 2010$b(Panthera tigris tigris -- Bengal tiger)
=670 \\$aThe IUCN red list of threatened species, via WWW, Mar. 19, 2010$b(Panthera tigris ssp. tigris. Common names: English Bengal Tiger)
=670 \\$aWikipedia, Mar. 19, 2010:$bBengal tiger (The Bengal tiger, or Royal Bengal tiger (Panthera tigris tigris, previously Panthera tigris bengalensis), is a subspecies of tiger, found in India, Bangladesh, Nepal and Bhutan)
=670 \\$aNCBI taxonomy browser, Mar. 19, 2010$b(Panthera tigris tigris (Bengal tiger))
=670 \\$aWeb. 3$b(bengal tiger, usu. cap. B)
=901 \\$c274321$tauthority

The display for Bengal tiger is correct:

Bengal (India). Governor-General (1774-1785 : Hastings) (1)
Bengal tiger (4)
Bengali (4)

There is no entry for Royal Bengal tiger:

Royal Barry Wills Associates. (1)
Royal Bath Hotel (Bournemouth, England) (1)
Royal Botanic Gardens (1)
Royal Botanic Gardens, Kew (9)

There is no entry for Panthera tigris

Panther Swamp National Wildlife Refuge (Miss.) (3)
Panthera (12)
Panthers (18)

I did the following, including a reingest, to try to get the display to work.

Fixing Bengal Tigers

1. Remove entry from authority.simple_heading table
 DELETE FROM authority.simple_heading WHERE record = 274321;
2. Remove orphaned entries from metabib.browse_entry_simple_heading_map
 -- WHERE needs a boolean test, so match the orphaned map ids with IN
 DELETE FROM metabib.browse_entry_simple_heading_map WHERE id IN (SELECT m.id FROM metabib.browse_entry_simple_heading_map m LEFT JOIN authority.simple_heading ash ON m.simple_heading = ash.id WHERE ash.id IS NULL);
3. Remove entries from metabib.browse_entry
 SELECT * FROM metabib.browse_entry WHERE value~*'(Bengal Tigers)' OR value~*'(Royal Tigers)' OR value~*'(Pantheras Tigris)';

 id value index_vector sort_value
 7201731 Bengal tigers 'bengal':1 'tiger':2 bengal tigers
 9488199 Rajpur, last of the Bengal tigers 'bengal':5 'last':2 'of':3 'rajpur':1 'the':4 'tiger':6 rajpur last of the bengal tigers

 2 row(s)

 DELETE FROM metabib.browse_entry_simple_heading_map WHERE entry IN (7201731);
 DELETE FROM metabib.browse_entry_def_map WHERE entry IN (7201731);
 DELETE FROM metabib.browse_entry WHERE id IN (7201731);

4. Reingest authority for Bengal tiger

 -- force a reingest even though the MARC is unchanged
 UPDATE config.internal_flag SET enabled = 't' WHERE name = 'ingest.reingest.force_on_same_marc';
 -- use the authority record id (274321), not the browse entry id (7201731)
 UPDATE authority.record_entry SET id = id WHERE id = 274321;
 -- turn the flag back off afterwards
 UPDATE config.internal_flag SET enabled = 'f' WHERE name = 'ingest.reingest.force_on_same_marc';

 --results no change, bengal tigers reference returned but no see references.
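
For reference, the orphan cleanup in step 2 is an anti-join: delete map rows whose simple_heading no longer has a matching heading row. Below is a minimal sketch of that pattern against an in-memory SQLite database. The tables simple_heading and heading_map and all ids are toy stand-ins, not the real Evergreen schema, and the function name is hypothetical.

```python
import sqlite3


def delete_orphaned_map_rows(conn):
    """Delete map rows whose simple_heading has no matching heading row."""
    cur = conn.execute(
        """
        DELETE FROM heading_map
         WHERE id IN (
               SELECT m.id
                 FROM heading_map m
                 LEFT JOIN simple_heading ash ON m.simple_heading = ash.id
                WHERE ash.id IS NULL)
        """
    )
    return cur.rowcount  # number of rows removed


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE simple_heading (id INTEGER PRIMARY KEY, record INTEGER);
    CREATE TABLE heading_map (id INTEGER PRIMARY KEY, entry INTEGER, simple_heading INTEGER);
    -- heading 1 still exists; heading 2 was deleted, so map row 11 is orphaned
    INSERT INTO simple_heading VALUES (1, 30911);
    INSERT INTO heading_map VALUES (10, 100, 1);
    INSERT INTO heading_map VALUES (11, 101, 2);
    """
)

deleted = delete_orphaned_map_rows(conn)
remaining = [row[0] for row in conn.execute("SELECT id FROM heading_map ORDER BY id")]
print(deleted, remaining)
```

Only the map row that points at the missing heading is removed; the row that still has a heading is left alone.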

----------------------------

It also appears that see references do display in some cases. Here is an example where the see reference appears, but the record also has a See Also reference associated with it. If there is no See Also reference, the see reference does not appear to display.

=LDR 00405cz a2200169n 4500
=001 30911
=003 CWMARS
=005 19890420132941.2
=008 860211i|\anannbab|\\\\\\\\\\|b\ana\\\\\\
=010 \\$ash 85020760
=035 \\$a(DLC)sh 85020760
=040 \\$aDLC$cDLC$dDLC
=053 \\$aHD5855$bHD5856
=150 \\$aCasual labor
=450 \\$aLabor, Casual
=550 \\$wg$aEmployees
=550 \\$aMigrant labor
=550 \\$aSeasonal variations (Economics)$0(CWMARS)74343
=901 \\$c30911$tauthority

labor casual

    See Also From Tracing -- Topical Term Seasonal variations (Economics) (3)
    SEE Heading -- Topical Term Casual labor (1)

tags: added: browse search
tags: added: references see
Mike Rylander (mrylander) wrote :

Tim,

Are there any bib records making use of the exact term "Royal Bengal tiger", authority-linked or otherwise? Are there any bibs using the "Bengal tiger" heading that are authority linked?

The existence of linked bibs changes what is displayed from the authority record, because this is a bib browse interface, not an authority browse interface.

Tim Spindler (tspindler-cwmars) wrote :

Mike,

There are four records with a 650 of "Bengal Tiger" which is linked to the authority record. There are no records that have the unauthorized heading of "Royal Bengal Tiger" in the 650 or other fields.

Srey Seng (sreyseng) wrote :
Kathy Lussier (klussier) wrote :

I'm adding the description Srey gave for the duplicated bug:

When browsing for an unauthorized heading, the expected behavior is that we would see the unauthorized heading in the browse search result, with a "See" reference to its authorized heading, if the authorized heading is linked to a bib.

However, this is not the case.

Assumptions:
(1) Auth A & Auth B have been imported into Evergreen
(2) Bibs exist that are controlled by Auth A & Auth B (bib-auth linking has been run)
(3) Auth-auth linking has purposely not been run***

Auth A:
=100 1\$aPoppen, Nikki,$d1967-
=400 1\$aPoppen-Eagen, Nikki,$d1967-
=400 1\$aEagen, Nikki Poppen-,$d1967-
=500 1\$aScott, Bronwyn,$d1967-

Auth B:
=100 1\$aScott, Bronwyn,$d1967-
=500 1\$aPoppen, Nikki,$d1967-

For example, given the above Assumptions and Auth snippets:
When I browse search for "Eagen, Nikki Poppen-, 1967-", I should get a See reference to "Poppen, Nikki, 1967-." Similarly, when I browse search for "Poppen-Eagen, Nikki, 1967-", I should get a See reference to "Poppen, Nikki, 1967-." This is not the case.

Without a "complete" bib/auth set, unauthorized headings are never exposed in the browse result list. A complete set is one where two authority records are linked to each other, and at least the authority record with the targeted unauthorized heading is linked to at least one bib.

Unauthorized headings should not have to be in a "complete" set to surface; they should surface even in an isolated set. An isolated set is one where the targeted unauthorized heading exists in an authority record that is linked to a bib, without also having to be linked to another authority record.

Complete sets are a minority, and there are significantly more authority records that live in an isolated set****.

Notes:
----------------------------
***This is to emulate the scenario where we have authority records that do not link to other authority records but do link to bibs and do have unauthorized headings. In other words, we can have authority records that have unauthorized entries but do not have cross references, either because the cross references have not been imported into the database, because no cross references exist yet, or because there is no 5xx in the authority record.
****Either because these authority records simply don't have cross references, or the cross references haven't been imported.

Srey Seng (sreyseng)
Changed in evergreen:
assignee: nobody → Srey Seng (sreyseng)
Srey Seng (sreyseng) wrote :

Hello,

Here is a WIP fix:
http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/sreyseng/lp1358392-see-references-not-always-displaying-on-browse-search

Any testing/comment/feedback is much appreciated!

------------------
Sample test procedure on stock DB w/ sample data
------------------

Using Marc records from the attached zip file:
1) Import in all auths
2) Import in all bibs
3) Run auth-bib linking

You should have the following authority records linked to some sample bib records:

Poppen, Nikki:
=100 1\$aPoppen, Nikki,$d1967-
=400 1\$aPoppen-Eagen, Nikki,$d1967-
=400 1\$aEagen, Nikki Poppen-,$d1967-
=500 1\$aScott, Bronwyn,$d1967-

Scott, Bronwyn:
=100 1\$aScott, Bronwyn,$d1967-
=500 1\$aPoppen, Nikki,$d1967-

Browse for the 4xx in Poppen, Nikki's authority record. You should get See references, whereas before, you wouldn't.
------------------

Changed in evergreen:
assignee: Srey Seng (sreyseng) → nobody
Srey Seng (sreyseng) wrote :

This also appears to reduce the number of cases where the browse search "lands" on a wrong result.

Srey Seng (sreyseng) wrote :
Mike Rylander (mrylander) wrote :

I haven't looked at the code yet, but the description and screen shot suggest it is a good improvement.

However, I want to correct an assertion made in the description (copied from the duplicate bug): the behavior as it exists today was intentional, and cross-referencing authorities was intended to be a requirement for surfacing authority-only data. The reason is that this was built as a bib browse, not as an authority browse.

UPDATE: I looked at the linked WIP. Without the appropriate baseline schema changes it's difficult to see what was actually changed. It's generally the convention of the dev folks (thus far) for the upgrade script to come last. Please update the baseline with your changes.

Question: Will main entries for unauthorized, in-use headings show up if the main entry heading is not in use by any bibs?

Thanks!

Srey Seng (sreyseng) wrote :

Hi Mike,

I was looking through various browse interfaces when trying to learn more about authority browse. Through additional research on authority control and conversations with various librarians I came to the conclusion that all browse searches on unauthorized headings should show a See reference to the authorized heading (given no 'dead' references). Without fully understanding the original intention/requirements around Evergreen's bib-browse interface, I thought this way, so I do apologize for the strong assertion!

Regarding your question, unauthorized headings in use on a bib should show up for browse, even if the main authorized heading is not linked to any bibs.

As requested, here is a new branch with changes reflected in the baseline schema (I've also updated the description):
http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/sreyseng/lp1358392-see-references-not-always-displaying-on-browse-search-v2

Again, thanks a bunch for your feedback and for taking a look!

Thanks!

Mike Rylander (mrylander) wrote :

Thanks for posting the in-line change version, that's very helpful. I haven't digested the full set of changes yet, but just as a heads-up, the use of specific fields (well, the prefix thereof) won't work for the final version, but that's fine for now to get the point across. Reason being, there is nothing saying that the 4XX fields will be the location of unauthorized headings in any given control set, that just happens to be the case for the LoC one.
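
As a rough illustration of that point, compare picking unauthorized-heading fields by tag prefix with picking them from the control set definition. The dictionaries below are invented, simplified stand-ins for control set data (the second one is a made-up non-LoC layout), and the function names are hypothetical.

```python
# Hypothetical control sets: authority tag -> role of the field in that set.
LOC_LIKE_SET = {
    "100": "main",
    "150": "main",
    "400": "unauthorized",
    "450": "unauthorized",
    "500": "related",
    "550": "related",
}
OTHER_SET = {
    "200": "main",
    "290": "unauthorized",  # unauthorized headings do not live in 4XX here
}


def unauthorized_tags_by_prefix(control_set):
    """Hard-coded approach: assume unauthorized headings are always 4XX."""
    return sorted(tag for tag in control_set if tag.startswith("4"))


def unauthorized_tags_by_definition(control_set):
    """Control-set approach: ask the definition which fields are unauthorized."""
    return sorted(tag for tag, role in control_set.items() if role == "unauthorized")


loc_prefix = unauthorized_tags_by_prefix(LOC_LIKE_SET)
loc_defined = unauthorized_tags_by_definition(LOC_LIKE_SET)
other_prefix = unauthorized_tags_by_prefix(OTHER_SET)
other_defined = unauthorized_tags_by_definition(OTHER_SET)
print(loc_prefix, loc_defined, other_prefix, other_defined)
```

Both approaches agree for the LoC-like set, but the prefix check finds nothing in the second set, where only the definition-driven lookup still works.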

I'll look at the net effect and see if I can suggest a solution that fits with the rest of the code. Just as a warning, I'll be out of the office and away from all computers tomorrow through next Tuesday, so it will be a few days until I can do that.

I also want to say that I'm very glad that you are looking at the guts of this code, as often it's hard to spread the knowledge of the more complex internal workings in many areas and jumping in really is the best way to learn it well. Thanks for digging into it!

Srey Seng (sreyseng) wrote :

Got it, thanks for the heads up!

I do understand now why hard-coding specific fields might not work, given that control set definitions can change. Really appreciate the explanation!

Mike Rylander (mrylander) wrote :

As an aside, I wanted to (further) clarify what I said before about the original development.

As you're seeing, this feature set is complicated from both the code and functionality perspectives. It certainly isn't that I, or anyone else, want to artificially restrict the features where it would be useful -- we all can agree that we want the most capable system we can build.

Because authority control, particularly when it comes to how it's presented to patrons (WRT avoiding data overload and confusion), is not a simple feature set to define and build, we needed to find a balance for the first-version project. From a strict development perspective, surfacing unauthorized headings that aren't used in bibs fell on the "authority centered, not bib centered" side of the fence. If in use by a bib, obviously they should show up.

Now, all that said, the bib browse project as a whole introduced a /ton/ of infrastructure that we can build on, including currently unused backend functionality for providing more authority-centered (or, perhaps, more authority-including) aspects to the API.

And now, having gone through all that and re-familiarizing myself with the internals, I'll point you at the authority.control_set_auth_field_metabib_field_map_blind_refs_only view before I turn into a pumpkin for the next few days. That view investigates the unauthorized (non-main, non-linkable) headings. I think we can simply fold that into the definition of authority.control_set_auth_field_metabib_field_map_refs, which is already a union, and that should make the original stored proc do what you've done by hand. Along the lines of:

CREATE OR REPLACE VIEW authority.control_set_auth_field_metabib_field_map_refs AS
         SELECT control_set_auth_field_metabib_field_map_main.authority_field, control_set_auth_field_metabib_field_map_main.metabib_field
           FROM authority.control_set_auth_field_metabib_field_map_main
UNION
         SELECT control_set_auth_field_metabib_field_map_refs_only.authority_field, control_set_auth_field_metabib_field_map_refs_only.metabib_field
           FROM authority.control_set_auth_field_metabib_field_map_refs_only
UNION
         SELECT control_set_auth_field_metabib_field_map_blind_refs_only.authority_field, control_set_auth_field_metabib_field_map_blind_refs_only.metabib_field
           FROM authority.control_set_auth_field_metabib_field_map_blind_refs_only;
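
A toy version of that fold, to show only what it means for the rows the view returns. It uses an in-memory SQLite database; map_main, map_refs_only, map_blind_refs_only and map_refs are shortened stand-in names, the sources are plain tables here rather than views, and the ids are made up. It says nothing about how the stored procedure consumes the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE map_main            (authority_field INTEGER, metabib_field INTEGER);
    CREATE TABLE map_refs_only       (authority_field INTEGER, metabib_field INTEGER);
    CREATE TABLE map_blind_refs_only (authority_field INTEGER, metabib_field INTEGER);

    INSERT INTO map_main            VALUES (1, 100);
    INSERT INTO map_refs_only       VALUES (2, 100);
    INSERT INTO map_blind_refs_only VALUES (3, 100);

    -- Same shape as the proposed definition: a third UNION branch for blind refs.
    CREATE VIEW map_refs AS
        SELECT authority_field, metabib_field FROM map_main
        UNION
        SELECT authority_field, metabib_field FROM map_refs_only
        UNION
        SELECT authority_field, metabib_field FROM map_blind_refs_only;
    """
)

rows = conn.execute(
    "SELECT authority_field, metabib_field FROM map_refs ORDER BY authority_field"
).fetchall()
print(rows)
```

With the third branch in place, the blind-reference mapping (authority_field 3 in this toy data) shows up in the combined view alongside the other two.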

I'll check back in next week. If you have a chance, please do try that out.

Srey Seng (sreyseng) wrote :

Hi Mike,

Lots of good information!

---
I would like to briefly clarify the current behavior of when unauthorized entries show up (when no bibs make use of the heading).

AUTH A:
Poppen, Nikki
=100 1\$aPoppen, Nikki,$d1967-
=400 1\$aPoppen-Eagen, Nikki,$d1967-
=400 1\$aEagen, Nikki Poppen-,$d1967-
=500 1\$aScott, Bronwyn,$d1967-
BIB 240 / 241

AUTH B:
Scott, Bronwyn
=100 1\$aScott, Bronwyn,$d1967-
=500 1\$aPoppen, Nikki,$d1967-

Current behavior when no bibs make use of the unauthorized heading:
In order for unauthorized entries of AUTH A to display, AUTH A needs to be a source in the authority linking table, and the target (AUTH B) needs to link to a bib.

---

Given that, I tried updating the existing definition of "authority.control_set_auth_field_metabib_field_map_refs" with the
new definition provided (containing the addition of a new union with "control_set_auth_field_metabib_field_map_blind_refs_only").

I am assuming the original stored proc refers to staged_browse.

After updating the view with the definition, the current behavior regarding unauthorized headings remains the same.
I do not see the desired behavior (where unauthorized headings on an auth display regardless of the existence of cross-references, as long as the auth itself is linked to a bib).

I believe by the time staged_browse does the join with the updated view "authority.control_set_auth_field_metabib_field_map_refs", the unauthorized entry is already filtered out by prior joins.

Thanks!

Srey Seng (sreyseng) wrote :

Please see the attachment for many more before/after side-by-side comparisons from a test system with "real" data. A very significant number of hidden entries are exposed.

Large picture, you will have to zoom in I believe.

Kathy Lussier (klussier) wrote :

Marking this bug confirmed since Srey had independently reported it in another bug.

Changed in evergreen:
status: New → Confirmed
importance: Undecided → Medium
Kathy Lussier (klussier) wrote :

Marking this as a duplicate of bug 1638299. Galen included Srey's branch in the 3.0 authority infrastructure project and updated it to work with the new code.
