Transports should explicitly track whether a port was specified

Bug #150860 reported by Andrew Bennetts
2
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Undecided
Vincent Ladeuil

Bug Description

If a URL does not include an explicit port in its netloc (e.g. "http://example.com/"), and we have a default_port registered for that protocol in bzrlib, then we treat it exactly the same as if the user had entered "http://example.com:80/". This is usually ok; we take care to normalise URLs so that e.g. we don't generate "http://example.com:80/" because 80 is the default port.

However, it's not strictly correct for all protocols that having a standard port means we can consider a URL with an explicit port equal to the standard port the same as an otherwise-identical URL with no port. For instance "bzr+ssh://example.com/" would typically connect to port 22, but it's possible that the user has configured OpenSSH to default to a different port for that hostname. So "bzr+ssh://example.com:22/" is not necessarily equivalent to "bzr+ssh://example.com/". We currently avoid this problem by not defining default_port for bzr+ssh (the OpenSSH/paramiko layers will default to 22 anyway).

So the Transport interface (particularly ConnectedTransport._split_url in the current implementation) should not be lose this information; i.e. it should distinguish between "port was unspecified" and "port was specified and happens to be the same as the default". This may make the code a little simpler and clearer. It may also help avoid more bugs like bug 146715, which was caused by the current behaviour.

It would also make it possible for us to record and display back to the user URLs like "http://example.com:80/" if that's exactly what the user chose to input, which is arguably a good thing. At the moment the default port normalisation makes that impossible.

Revision history for this message
Vincent Ladeuil (vila) wrote :

Martin suggested:

When parsing a url, if no port is specified, leave it as None in the
parsed tuple or transport. When unparsing them, None goes back to
unspecified and any explicit port, even the default, is explicitly
shown.

The default port only comes in at the point of opening the network
socket if it's still None -- often this will be done inside pycurl or
something rather than bzrlib.

The only drawback is that if we ever get proto://foo/ and
proto://foo:defaultport/ they won't be parsed the same way. We can
specifically list which protocols can or can't do this. On the other
hand I'm not sure it's really critical - if we just keep urls as
they're passed in, then we should rarely see http://host:80/ with the
port number.

Revision history for this message
Vincent Ladeuil (vila) wrote :

From the comment above, I'd say _port in Transport objects should represent what the user have provided.

At connection time if port is None, then, and only then we may force the default value for tranports that requires it (and as Martin said, it may well be outside of bzrlib).

I don't think we will lose anything by not being able to recognize that proto://foo/ and proto://foo:defaultport/ are really the same thing.

Historically that's what bzr has always done.

Then by refactoring transport I introduced a bug for the bzr scheme trying to handle the default port (bug #133965).
Then Andrew did the right thing to fix that bug but introduced another one (bug #146715).

The solution seems to be to unroll those fixes an fix the first bug by making RemoteTransport specify 4155 in _build_medium when no port have been specified by the user.

Summary: port in brzlib is what the user provided, if still None at connection time scheme default port is used.

Changed in bzr:
assignee: nobody → v-ladeuil
Vincent Ladeuil (vila)
Changed in bzr:
status: New → In Progress
Vincent Ladeuil (vila)
Changed in bzr:
status: In Progress → Fix Committed
Vincent Ladeuil (vila)
Changed in bzr:
milestone: none → 1.0rc1
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.