[nycphp-talk] background processes (was: building an 1-way email list manager)
Hans Zaunere
lists at zaunere.com
Tue Nov 15 11:14:20 EST 2005
Chris Shiflett wrote on Tuesday, November 15, 2005 9:44 AM:
> Dan Cech wrote:
> > I for one would be very interested in a more in-depth discussion on
> > this subject, as it sounds like there is a lot more to it than I was
> > aware.
>
> You might find this useful (I did):
>
> http://netevil.org/node.php?nid=280
That's a good link (although I'm not sure I would agree that select() is a
form of threading) but it's not really the same as trying to run a process
in the background (daemonize).
The crux of a good fork is the double fork, plus severing the relationship
between the Apache process and PHP (the shared file descriptors). For
mod_php this is an infamous problem, related to things like:
http://www.securitytracker.com/alerts/2003/Dec/1008559.html
(and search for "Need to fork so apache doesn't kill us" for some C
examples).
And:
http://aspn.activestate.com/ASPN/Mail/Message/php-Dev/538021
Which, as far as I know, is still an issue in Apache/PHP/etc. (not CGI, AFAIK).
Anyway, the basic best-practice to fork is something like:
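    function Daemonize() {
        // First fork: the parent exits, so the child is not a process
        // group leader (posix_setsid() requires that).
        $pid = pcntl_fork();
        if( $pid < 0 )
            return NULL;        // fork failed
        if( $pid )
            exit(0);            // parent: nothing went wrong, so exit 0
        // New session, no controlling terminal.
        if( posix_setsid() < 0 )
            return NULL;
        // Second fork: the session leader exits, so the survivor cannot
        // reacquire a controlling terminal.
        $pid = pcntl_fork();
        if( $pid < 0 )
            return NULL;
        if( $pid )
            exit(0);            // session leader: done
        return posix_getpid();  // PID of the detached process
    }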
(from http://www.pcomd.net/pfork)
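Note that Daemonize() above ends the calling process at the first fork. If the caller needs to keep running, a common variant is to have it reap the intermediate child instead of exiting. The following is only a sketch of that variant, not something from the pfork page: spawn_detached() and its $job parameter are names made up for illustration, and it assumes the pcntl and posix extensions are loaded (normally that means the CLI).

```php
<?php
// Hypothetical helper: run $job in a detached grandchild process while
// the caller keeps running. Assumes ext/pcntl and ext/posix.
function spawn_detached(callable $job): bool
{
    $pid = pcntl_fork();
    if ($pid < 0) {
        return false;                       // first fork failed
    }
    if ($pid > 0) {
        // Caller: reap the short-lived intermediate child so it does
        // not linger as a zombie, then report whether it exited cleanly.
        pcntl_waitpid($pid, $status);
        return pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;
    }

    // Intermediate child: start a new session, then fork once more.
    if (posix_setsid() < 0) {
        exit(1);
    }
    $pid = pcntl_fork();
    if ($pid < 0) {
        exit(1);                            // second fork failed
    }
    if ($pid > 0) {
        exit(0);                            // session leader exits at once
    }

    // Grandchild: in its own session, but not the session leader.
    $job();
    exit(0);
}
```

The return value only says that the intermediate child got as far as the second fork; it says nothing about whether $job itself succeeded. The grandchild still holds every descriptor it inherited.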
That should completely detach a PHP process from its Apache process... well,
sort of. You're basically left with a copy of an Apache process, and since
fork() dupes file descriptors (read a BSD manpage for fork(2)), the child
(even though it now lives in its own session and all that) still needs to
disconnect from Apache's descriptors (similar to the "controlling terminal"
comment in the C example in the security link).
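So what has worked typically is closing the three standard streams (fclose()
takes a stream resource rather than a descriptor number, so use the
STDIN/STDOUT/STDERR constants, which the CLI SAPI defines):
    fclose(STDIN); fclose(STDOUT); fclose(STDERR);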
You should be ok in most cases after this. However, from what I understand,
the process forked from Apache inherits *all* of Apache's file descriptors,
which is not good (yes, like database connections, etc). So, in an ideal world -
and especially for very long running processes - you would cycle through all
open file descriptors in the forked PHP process, and close them all. Then,
of course, reopen the ones you need.
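To see what a forked process has actually inherited, you can at least enumerate its open descriptors. As far as I know, core PHP has no function that closes a descriptor by number, so this sketch only lists them. It assumes a Linux-style /proc/self/fd directory; list_open_fds() is a made-up name.

```php
<?php
// Hypothetical helper: map each open descriptor number of the current
// process to what it points at. Assumes Linux-style /proc/self/fd.
function list_open_fds(string $dir = '/proc/self/fd'): array
{
    $fds = [];
    $entries = @scandir($dir);
    if ($entries === false) {
        return $fds;                 // no /proc here, nothing to report
    }
    foreach ($entries as $name) {
        if (!preg_match('/^\d+$/', $name)) {
            continue;                // skip "." and ".."
        }
        // Each entry is a symlink to the file, socket or pipe behind it.
        $target = @readlink("$dir/$name");
        if ($target === false) {
            continue;                // e.g. the handle scandir() itself used
        }
        $fds[(int) $name] = $target;
    }
    ksort($fds);
    return $fds;
}

foreach (list_open_fds() as $fd => $target) {
    echo $fd, ' -> ', $target, "\n";
}
```

Run right after the fork, anything above 2 that your script did not open itself is a candidate for the "inherited from Apache" problem described above.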
Long story short: if you're running mod_php, don't expect to create a whole
other daemon out of Apache, but it's fine for long running batch processing
from what I've seen. And, in terms of preventing a user's browser from
"timing out", the important file descriptors to close are the magic three
(stdin/out/err).
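For the "magic three", here is a minimal sketch of closing them and then reopening them on /dev/null, so that descriptors 0, 1 and 2 are occupied again and a later fopen() does not end up sitting on one of them. Assumptions: the script runs under the CLI SAPI (the STDIN/STDOUT/STDERR constants are not defined under mod_php), /dev/null exists, and detach_stdio() is a name invented for this example.

```php
<?php
// Hypothetical helper: detach the three standard streams of a CLI-run
// PHP process and park descriptors 0, 1, 2 on /dev/null.
function detach_stdio(): array
{
    // fclose() takes a stream resource, hence the CLI-only constants.
    fclose(STDIN);
    fclose(STDOUT);
    fclose(STDERR);

    // The OS hands out the lowest free descriptor numbers, so these
    // three opens are expected to land on 0, 1 and 2, in that order.
    $in  = fopen('/dev/null', 'r');
    $out = fopen('/dev/null', 'w');
    $err = fopen('/dev/null', 'w');

    // The caller must keep these handles referenced; PHP closes a
    // stream once the last reference to it goes away.
    return [$in, $out, $err];
}
```

Call it in the detached process only, and hold on to the returned array for as long as that process runs. Anything echoed afterwards is discarded, so open a log file first if you want output.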
---
Hans Zaunere / President / New York PHP
www.nyphp.org / www.nyphp.com