On 2/16/2011 2:35 PM, Michael Hudson-Doyle wrote:
> So, to possibly state the obvious, the problem here was that the conch
> process ran out of file handles, followed by poor error recovery?
>
> In which case there are two more or less questions: 1) why did conch run
> out of file handles? 2) can we handle this situation better?
>
> For 1), is it just that there is an extra fd open per connection, or is
> there a leak? If the former, then we're presumably pretty close to
> hitting the fd limit in production from time to time!
>
Looking again, there is one more handle per connection. It is the one we
use to detect the process exiting.
So we now have:
1) network socket from client
2) stdin
3) stderr
4) stdout
5) socket from the forking server, which will write, e.g., 'exited 10\n',
when the child process finally exits.
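For illustration, the exit notification on that fifth descriptor could be parsed like this. This is a hypothetical sketch; the function name and the exact wire format beyond the quoted 'exited 10\n' example are assumptions, not the actual Launchpad code:

```python
def parse_exit_notification(line: bytes) -> int:
    """Return the child's exit code from a notification line such as
    b'exited 10\\n' written by the forking server (format assumed from
    the example above)."""
    text = line.decode("ascii").strip()
    verb, _, code = text.partition(" ")
    if verb != "exited":
        raise ValueError("unexpected notification: %r" % text)
    return int(code)

print(parse_exit_notification(b"exited 10\n"))  # 10
```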
So we've effectively reduced our peak concurrent requests from about 250
down to about 200.
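The capacity arithmetic above can be sketched as follows. The default soft limit of 1024 descriptors and the reserve of 20 for the listening socket, logs, etc. are assumptions for illustration:

```python
import resource

# Soft limit on open file descriptors for this process
# (commonly 1024 by default on Linux; assumed here).
soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)

FDS_PER_CONNECTION_OLD = 4  # client socket, stdin, stderr, stdout
FDS_PER_CONNECTION_NEW = 5  # plus the exit-notification socket

def peak_connections(limit, fds_per_conn, reserved=20):
    """Rough ceiling on concurrent connections, keeping a few
    descriptors in reserve for the listener, logs, etc."""
    return (limit - reserved) // fds_per_conn

print(peak_connections(1024, FDS_PER_CONNECTION_OLD))  # 251, i.e. about 250
print(peak_connections(1024, FDS_PER_CONNECTION_NEW))  # 200
```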
Coupled with the poor cleanup once we've gotten into that situation,
things get bad fast.
Note that Andrew has already started work on allowing us to run multiple
Conch processes (and associated forking processes), so that we can
handle no-downtime deployments anyway, which also helps us with high
availability, etc.
John
=:->