curtin mirror URL contains double slash (//) after mirror hostname, impacting proxy cacheability

Bug #1592666 reported by Trent Lloyd
This bug affects 3 people
Affects   Status         Importance   Assigned to        Milestone
MAAS      Fix Released   Medium       Andres Rodriguez
2.0       Fix Released   Medium       Andres Rodriguez

Bug Description

The mirror URL configured by MAAS contains a double slash after the mirror hostname. Besides being unclean, this is bad for HTTP proxies: the extra slash makes the URL differ from the single-slash form, so a proxy caches a separate copy of the same files.

root@semisomnolent-evia:~# cat /etc/apt/sources.list
deb http://mirror.internode.on.net//pub/ubuntu/ubuntu xenial main restricted universe multiverse

The data path is as follows:

 (1) The archive URL is stored as a single string in the database: http://mirror.internode.on.net/pub/ubuntu/ubuntu

 (2) maas:src/maasserver/preseed.py:get_preseed_context() splits this into two variables for "hostname" and "directory" using get_netloc_and_path() which calls urllib.parse.urlparse() and returns only netloc and path:
     >>> urlparse("http://mirror.internode.on.net/pub/ubuntu/ubuntu")
ParseResult(scheme='http', netloc='mirror.internode.on.net', path='/pub/ubuntu/ubuntu', params='', query='', fragment='')

 (3) This is rendered into the curtin data using contrib/preseeds_v2/curtin_userdata as follows:
apt_mirrors:
  ubuntu_archive: http://{{main_archive_hostname}}/{{main_archive_directory}}
  ubuntu_security: http://{{main_archive_hostname}}/{{main_archive_directory}}

  At this stage the slash is duplicated, because it is present both at the start of the directory value (the path returned by urlparse() begins with a slash) and in the template's string concatenation.

 (4) Interestingly, the debian preseed section (debconf_selections) appears to be rendered by get_system_info() and outputs main_archive directly. There are other templates for some cases, such as enlisting, that split the URL like the original version, but that is not what ends up in /root/curtin-install-cfg.yaml after install.
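
The duplication in steps (2) and (3) can be reproduced outside MAAS with a few lines of Python. This is only a sketch: the get_netloc_and_path() below is a stand-in written from the description above, not the actual MAAS helper, and render_mirror() is a hypothetical function that imitates the template line with plain string formatting.

```python
from urllib.parse import urlparse


def get_netloc_and_path(url):
    """Stand-in for the MAAS helper (assumed): return only netloc and path."""
    parsed = urlparse(url)
    return parsed.netloc, parsed.path


def render_mirror(hostname, directory):
    """Hypothetical imitation of http://{{hostname}}/{{directory}}."""
    return "http://%s/%s" % (hostname, directory)


archive = "http://mirror.internode.on.net/pub/ubuntu/ubuntu"
hostname, directory = get_netloc_and_path(archive)

# directory is '/pub/ubuntu/ubuntu' (leading slash kept by urlparse),
# and the template adds a second slash in front of it.
rendered = render_mirror(hostname, directory)
print(rendered)  # http://mirror.internode.on.net//pub/ubuntu/ubuntu
```

The rendered string no longer matches the URL stored in the database, which is what a caching proxy keys on.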

Rather than decide where to add or remove the leading slash, I propose that we remove the hostname/directory split entirely. It does not appear to serve any purpose, and is simply glued back together again before being output from MAAS. The problem would then solve itself.

This would also have the advantage of not stripping the scheme (http/https) in case someone wishes to use https, even though that is not common right now because of gpg signing.
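
As a rough illustration of the proposal, assuming the template received the stored URL in a single main_archive value, the rendering could look like the sketch below. render_apt_mirrors() is a hypothetical helper that only imitates the apt_mirrors section with string formatting; it is not the MAAS template code.

```python
def render_apt_mirrors(main_archive):
    """Hypothetical helper: emit the stored archive URL unchanged."""
    return (
        "apt_mirrors:\n"
        "  ubuntu_archive: %s\n"
        "  ubuntu_security: %s\n" % (main_archive, main_archive)
    )


# No split and no re-join, so no extra slash, and the scheme is preserved.
http_cfg = render_apt_mirrors("http://mirror.internode.on.net/pub/ubuntu/ubuntu")
https_cfg = render_apt_mirrors("https://mirror.internode.on.net/pub/ubuntu/ubuntu")
print(http_cfg)
print(https_cfg)
```

Because the URL is never taken apart, there is no decision to make about which side owns the slash.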

Changed in maas:
importance: Undecided → Medium