On 2/16/2011 2:35 PM, Michael Hudson-Doyle wrote:
> So, to possibly state the obvious, the problem here was that the conch
> process ran out of file handles, followed by poor error recovery?
Basically.
>
> In which case there are two more or less questions: 1) why did conch run
> out of file handles? 2) can we handle this situation better?
1) Because we currently run close to the connection limit,
especially right after a major rollout.
2) Yes.
>
> For 1), is it just that there is an extra fd open per connection, or is
> there a leak? If the former, then we're presumably pretty close to
> hitting the fd limit in production from time to time!
>
I'm pretty sure we already run out from time to time, but recover
from it better.
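A rough way to check how close a process is to the limit, as a sketch
only. It assumes a Unix box with a per-process fd directory
(/proc/self/fd on Linux, /dev/fd elsewhere); fd_headroom() is a name I
made up for illustration, not something in our tree:

```python
import os
import resource


def fd_headroom():
    """Return (open_fds, soft_limit) for the current process."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Assumption: a per-process fd directory exists (Linux /proc, else /dev/fd).
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    # Listing the directory briefly opens one descriptor of its own.
    open_fds = len(os.listdir(fd_dir)) - 1
    return open_fds, soft_limit


if __name__ == "__main__":
    used, limit = fd_headroom()
    print("using %d of %d file handles" % (used, limit))
```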
Here is what I can sort out:
1) We might be using 1 more handle for requesting the fork, but nothing
I can find is 'leaked'.
2) We are close to running out of handles in production right now.
3) The spawnProcess() code recovers cleanly from running out of
handles: *that* connection fails, but future connections are
unaffected.
4) The LPForkingService code leaks handles when it fails to spawn. So
although we have 1k handles, once we've started running out of them
we have fewer and fewer to work with, which means that we run into
the limit more and more often until the service is completely
unusable.
5) The children spawned could clean up slightly better than they do
today.
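To illustrate the kind of cleanup that (4) is missing, here is a
minimal sketch, not the actual LPForkingService code. The function name
open_child_pipes() and the choice of three pipes are assumptions for
the example. The point is just that whatever was opened before the
failure gets closed before the error propagates:

```python
import os


def open_child_pipes(n_pipes=3):
    """Create n_pipes pipes; on failure, close the ones already opened."""
    opened = []
    try:
        for _ in range(n_pipes):
            # os.pipe() either returns both ends or raises OSError.
            read_fd, write_fd = os.pipe()
            opened.extend((read_fd, write_fd))
    except OSError:
        # Out of handles (or similar): give back everything we took so
        # this failed attempt does not shrink the pool for later ones.
        for fd in opened:
            try:
                os.close(fd)
            except OSError:
                pass
        raise
    return opened
```

With this shape a failed connection costs nothing beyond itself, which
is the behaviour described for spawnProcess() in (3).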
(3) is true because spawnProcess() creates the pipes *before* the
fork+exec, so if we were to run out of handles, we would just never
exec in the first place. Since we now have to spawn a process before
we create the handles to connect to it, this layering is broken. One
crummy option is to hold open a bunch of spare file handles, and
close them when we want to fork a new process.
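The crummy option could look roughly like this. It is a sketch under
assumptions: FdReserve is a hypothetical helper, and holding
/dev/null open is just one cheap way to occupy a descriptor slot:

```python
import os


class FdReserve(object):
    """Hold spare descriptors open so they can be freed before a spawn."""

    def __init__(self, size):
        self._size = size
        self._held = []
        self.refill()

    def refill(self):
        # Top the reserve back up to its configured size. If os.open()
        # raises, whatever we already hold stays tracked in self._held.
        while len(self._held) < self._size:
            self._held.append(os.open(os.devnull, os.O_RDONLY))

    def release(self):
        # Close every held descriptor and report how many were freed.
        held, self._held = self._held, []
        for fd in held:
            os.close(fd)
        return len(held)
```

The idea would be to call release() right before requesting a fork and
refill() once the new connection's handles are open. Nothing here stops
another connection from grabbing the freed slots in between, so it only
helps if both steps happen without returning to the reactor.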
I'm working on some code to clean up the forked children, and then I
want to clean up the Twisted code to make sure we don't leak during
failure periods.
John
=:->