W3C wants query parameters standardized on ampersands; semicolons are now bad

Bug #1687545 reported by Dan Scott
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Evergreen
Confirmed
Medium
Unassigned
3.5
Confirmed
Medium
Unassigned
3.6
Confirmed
Medium
Unassigned

Bug Description

Per https://www.w3.org/TR/2014/REC-html5-20141028/forms.html#url-encoded-form-data, step 8 of creating URL-encoded form data, where it tells you how to add a second or subsequent key/value pair is:

  If this is not the first entry, append a single U+0026 AMPERSAND character (&) to result.

That is, semicolons are no longer an acceptable option.

Why does this matter? Because modern tools and APIs are now treating semicolons as just plain data rather than an alternative to ampersand, and this has real impact. For example, the https://developer.mozilla.org/en-US/docs/Web/API/URLSearchParams browser API and corresponding polyfill https://github.com/WebReflection/url-search-params would be a handy way to replace our custom openils.CGI code.

But given a URL of https://example.org/eg/opac/advanced?bool=and;qtype=keyword, asking for the qtype parameter returns null, and asking for the bool parameter returns more than you would expect:

  var params = new URLSearchParams(location.search.slice(1));
  params.get('qtype');
  // null
  params.get('bool');
  // "and;qtype=keyword"

We could continue using openils.CGI and adding our own workarounds to modern tools, but it might be better to standardize on ampersands sooner rather than later.

Revision history for this message
Dan Scott (denials) wrote :

Per http://perldoc.perl.org/CGI.html, "-newstyle_urls
Separate the name=value pairs in CGI parameter query strings with semicolons rather than ampersands. [...] Semicolon-delimited query strings are always accepted, and will be emitted by self_url() and query_string(). newstyle_urls became the default in version 2.64."

It's possible to return the behaviour to ampersands using the -oldstyle_urls flag. To be investigated.

Revision history for this message
Dan Scott (denials) wrote :

Confirmed, adding the flag to OpenILS/WWW/EGCatLoader.pm does the trick for much of the TPAC:

  use CGI qw(:all -utf8 -oldstyle_urls);

In Evergreen 2.10, at least, there are a lot of invocations of "use CGI":

/usr/local/share/perl/5.18.2/OpenILS/Reporter/Proxy.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/BadDebt.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/TemplateBatchBibUpdate.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/FlatFielder.pm:use CGI qw(:all -utf8);
/usr/local/share/perl/5.18.2/OpenILS/WWW/Exporter.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/EGWeb/CGI_utf8.pm:use CGI qw(:all -utf8);
/usr/local/share/perl/5.18.2/OpenILS/WWW/SuperCat/Feed.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/EGCatLoader.pm:use CGI qw(:all -utf8 -oldstyle_urls);
/usr/local/share/perl/5.18.2/OpenILS/WWW/XMLRPCGateway.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/AddedContent.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/Proxy.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/AutoSuggest.pm:use CGI qw(:all -utf8);
/usr/local/share/perl/5.18.2/OpenILS/WWW/Proxy/Authen.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/IDL2js.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/SuperCat.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/Vandelay.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/Reporter.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/PhoneList.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/Redirect.pm:use CGI ();

Revision history for this message
Galen Charlton (gmc) wrote :

*grumble grumble grumble in the direction of the W3C*

Now that that's out of my system, I note that because of URL patterns like '/eg/opac/results?bookbag=123;page=0;locg=1;depth=0' that get used as permalinks, we'll need to make sure we don't break them coming in; going out is a different matter, of course.

Revision history for this message
Dan Scott (denials) wrote :

The other piece of the puzzle to getting the TT2 mkurl() macro to generate links with & is to set the -oldstyle_urls flag in Template::Plugin::CGI. I thought we might have to fork that, but note that Template::Toolkit itself just celebrated its first release in three years on April 15, 2017: http://search.cpan.org/~abw/Template-Toolkit-2.27/ - so the timing might be right to press for at least an option to invoke -oldstyle_urls in the CGI plugin!

On the bright side, incoming URLs such as https://example.org/eg/opac/results?query=fiction;qtype=keyword;fi%3Asearch_format=;locg=124;detail_record_view=1;_adv=1 work with either & or ; when the server is doing the parsing. It's when we do client-side parsing of the URL (as when we reconstructing the copy location search options by parsing out "fi:locations" parameters) where we run into trouble with ';' delimiters with the URLSearchParams API and friends.

Luckily it seems like openils.CGI is mostly used in conify, serials, and acquisitions--not too many situations that are user (and bookmark / saved link) facing. Copy location search appears to be the only case where that happens in the js/ui/default/opac space on a quick perusal.

Revision history for this message
Robert J Jackson (rjackson-deactivatedaccount) wrote :

It looks like perhaps this impacts end users as shown by the following log entries?

gateway log line:

2017-08-02 07:11:58 app3-prod logger: xx.xxx.xxx.xx - - [02/Aug/2017:07:11:57 -0400] "GET /eg/opac/results?fi%3Afrom_metarecord=4277067%3Bmetarecord%3D4277067%3Bmodifier%3Dmetabib%3Bfi%3Aicon_format%3Ddvd HTTP/1.1" 200 113320

corresponding entry from the osrfsys log file:

osrfsys.07.log:2017-08-02 07:11:58 app3-prod open-ils.search: [INFO:10150:Application.pm:159:15016680882512309] CALL: open-ils.search open-ils.search.biblio.multiclass.query HASH(0x4814738), from_metarecord(4277067;metarecord=4277067;modifier=metabib;fi:icon_format=dvd) depth(0), 1

This is for Evergreen Indiana consortium running with 2.12.4

Revision history for this message
Lynn Floyd (lfloyd) wrote :
Changed in evergreen:
status: New → Confirmed
Changed in evergreen:
importance: Undecided → Medium
tags: added: opac
Revision history for this message
Jason Boyer (jboyer) wrote :

I have a branch here to address this: https://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/jboyer/lp1687545_chicago_handshake / working/user/jboyer/lp1687545_chicago_handshake but you have to supply your own Malört.

tags: added: pullrequest
removed: opac
Revision history for this message
Mike Rylander (mrylander) wrote :

After just 3.5 short years, I offer a fix for this:

https://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/miker/lp-1687545-amp-in-urls

From the commit message:

Since we have a local version of the CGI TT plugin, and that's what is used in the mkurl() TT macro, we can tell it to use ampersands instead of semicolons when building OPAC links. This commit imports CGI with the -oldstyle_urls flag, which does exactly that.

tags: added: opac
Revision history for this message
Mike Rylander (mrylander) wrote :
tags: removed: opac
Revision history for this message
Chris Sharp (chrissharp123) wrote :

Confirmed working on PINES production running 3.6. Pushed to master, rel_3_6 and rel_3_5. Thanks Dan, Jason, and Mike!

tags: added: signedoff
Changed in evergreen:
milestone: none → 3.6.2
status: Confirmed → Fix Committed
Jason Boyer (jboyer)
Changed in evergreen:
milestone: 3.6.2 → 3.7-beta
milestone: 3.7-beta → none
no longer affects: evergreen/3.6
Changed in evergreen:
milestone: none → 3.6.3
milestone: 3.6.3 → none
status: Fix Committed → Fix Released
milestone: none → 3.6.2
Changed in evergreen:
status: Fix Released → Confirmed
milestone: 3.6.2 → 3.7-beta2
milestone: 3.7-beta2 → none
Revision history for this message
Jason Stephenson (jstephenson) wrote :

The code for this branch was reverted from the mainline Evergreen repositories in an attempt to address bug 1917951 and bug 1918470.

This bug is therefore "re-opened" by having its status changed from Fix Committed to Confirmed.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.