Comment 13 for bug 308181

In , Dnewman (dnewman) wrote :

This past weekend I was working on some "proof of concept" code to hash out
how this is going to work, and some things occurred to me that maybe others can
comment on.

If this was a simple ftp client, such as ncftp, then when I initiate a
connection to domain.org I could do a simple DNS lookup of _ftp._tcp.domain.org
and plow through the list of SRV records, sorted by priority and picked at
random by weight within the same priority. When a connection fails, remove it
from the list and try again. Once a successful connection has been established
the data concerning SRV records could be thrown away.
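
To make the simple-client case concrete, here is a rough sketch of that
selection loop. It is only an illustration under some assumptions: the
SrvRecord tuple, order_srv_records() and connect_first() are made-up names, the
DNS lookup itself is left out (the records are assumed to be already fetched),
and the weighted pick follows my reading of the selection rule in RFC 2782
rather than any existing client code.

```python
import random
from collections import namedtuple

# Hypothetical container for one SRV answer; not taken from any real library.
SrvRecord = namedtuple("SrvRecord", "priority weight port target")


def order_srv_records(records, rng=random):
    """Return the records in the order a client would try them.

    Lower priority values come first; within one priority the order is a
    weighted random draw without replacement.
    """
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        # Put zero-weight records first so they can still be picked, but
        # only when the random draw lands on 0.
        group.sort(key=lambda r: r.weight != 0)
        while group:
            total = sum(r.weight for r in group)
            pick = rng.randint(0, total)
            running = 0
            for i, rec in enumerate(group):
                running += rec.weight
                if running >= pick:
                    ordered.append(group.pop(i))
                    break
    return ordered


def connect_first(records, try_connect, rng=random):
    """Try each record in order; return the first one that connects.

    try_connect(host, port) is an assumed callback returning True on success.
    A failed record is simply skipped, which is the "remove it from the list
    and try again" step.
    """
    for rec in order_srv_records(records, rng):
        if try_connect(rec.target, rec.port):
            return rec
    return None
```

Once connect_first() returns a record, nothing else needs to be remembered,
which is exactly why this case is the easy one.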

The web, by contrast, is a completely different paradigm. Instead of having
one continuous connection throughout the life of the visit, a separate
connection is used for each "transaction". And each transaction may
consist of multiple connections when a particular page contains multiple
objects.

This is where my experience with Cisco Local Directors, ArrowPoints and Central
Dispatch comes in and confuses the issue. These devices keep running
tabs on the services that they are watching. For example, the Local Director
will mark a service as bad when it gets a failed connection. Even though the
connection would eventually make it over to a good server, the user experienced
a delay: if the server was off the network, they had to wait for TCP
timeouts. From that point on the LD would direct traffic to services it knew
were good. Every once in a while it would try the bad one just for yuks to see
if it had been fixed.

So assuming we don't want the user to get hung up on a down server every time
they traverse the service list, we need to keep a running tab on services we've
tried in the past. So does this mean that once we have queried the SRV record
we keep it cached within Mozilla until the TTL runs out? Or do we make actual
DNS queries every time and run through the same algorithm to choose a server?
Or we could keep a separate running list of the servers we've connected to and
how successful we were at it. Then we could do fresh SRV queries every time
and compare what we get back to our running list. We would
then try a server marked as down only if we had no other alternatives.
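
The last option (fresh SRV queries every time, compared against a running
list) could look something like the sketch below. The ServerHealth class and
its method names are hypothetical, invented here for illustration; the
candidate list is assumed to be (host, port) pairs already put in SRV order.

```python
import time


class ServerHealth:
    """Running list of servers that failed a connection attempt.

    Hypothetical helper for illustration, not existing Mozilla code.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._down_since = {}  # (host, port) -> time of first failure

    def mark_failed(self, host, port):
        # Keep the time of the first failure; later failures do not reset it.
        self._down_since.setdefault((host, port), self._clock())

    def mark_ok(self, host, port):
        # A successful connection clears the "down" mark.
        self._down_since.pop((host, port), None)

    def is_down(self, host, port):
        return (host, port) in self._down_since

    def reorder(self, candidates):
        """Keep SRV order, but move servers marked down to the end.

        Down servers are still returned, so they get tried when there are
        no other alternatives.
        """
        up = [c for c in candidates if not self.is_down(*c)]
        down = [c for c in candidates if self.is_down(*c)]
        return up + down
```

With this split the SRV answer itself is never cached by us; only the
success/failure tab is, and it is consulted after the normal priority/weight
ordering has been done.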

That then brings up the issue of how long we keep the information around.
I suppose it is no different than a history, but it eats up memory.
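
One way to keep the memory cost in check, sketched below purely as an
assumption, is to cap both the number of entries and their age. The
BoundedHistory name and the default limits (256 entries, 300 seconds) are
arbitrary placeholders, not proposed values.

```python
import time
from collections import OrderedDict


class BoundedHistory:
    """Success/failure history capped by entry count and by age.

    Hypothetical sketch; the limits are illustrative only.
    """

    def __init__(self, max_entries=256, max_age=300.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._max_age = max_age
        self._clock = clock
        self._entries = OrderedDict()  # key -> (is_up, timestamp)

    def record(self, key, is_up):
        # Re-insert so the most recently updated key sits at the end.
        self._entries.pop(key, None)
        self._entries[key] = (is_up, self._clock())
        # Drop the least recently updated entries once over the cap.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def lookup(self, key):
        """Return True/False for a known server, or None if unknown or stale."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        is_up, stamp = entry
        if self._clock() - stamp > self._max_age:
            # Too old to trust; forget it so the server gets retried.
            del self._entries[key]
            return None
        return is_up

    def __len__(self):
        return len(self._entries)
```

Letting a "down" entry expire also gives the occasional retry of a bad
server for free: once the entry is stale, the server is treated as unknown
and gets tried again.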