Augie Fackler <augie@google.com> [Wed, 16 Nov 2016 23:29:28 -0500] rev 30417
merge with stable
Jun Wu <quark@fb.com> [Sat, 12 Nov 2016 03:06:07 +0000] rev 30416
worker: stop using a separate thread waiting for children
Now that we have a SIGCHLD hander, and it could get executed when waiting
for I/O. It's no longer necessary to have a separated waitpid thread. So
just remove it.
Jun Wu <quark@fb.com> [Sat, 12 Nov 2016 03:07:22 +0000] rev 30415
worker: add a SIGCHLD handler to collect worker immediately
As planned by previous patches, add a SIGCHLD handler to get notifications
about worker exits, and deals with worker failure immediately.
Note that the SIGCHLD handler gets unregistered before killworkers(), so
SIGCHLD won't interrupt "killworkers" - making it harder to send kill
signals to waited processes.
Jun Wu <quark@fb.com> [Tue, 15 Nov 2016 02:12:16 +0000] rev 30414
worker: make waitforworkers reentrant
We are going to use it in the SIGCHLD handler. The handler will be executed
in the main thread with the non-blocking version of waitpid, while the
waitforworkers thread runs the blocking version. It's possible that one of
them collects a worker and makes the other error out (no child to wait).
This patch handles these errors: ECHILD is ignored. EINTR needs a retry.
The "pids" set is designed to be only modifiable by "waitforworkers". And we
only remove items after a successful waitpid. Since a child process can only
be "waitpid"-ed once. It's guaranteed that "pids.remove(p)" won't be called
with duplicated "p"s. And once a "p" is removed from "pids", that "p" does
not need to be killed or waited any more.
Jun Wu <quark@fb.com> [Tue, 15 Nov 2016 02:10:40 +0000] rev 30413
worker: change "pids" to a set
There is no need to keep any order of the "pids" array. A set is more
efficient for the "remove" operation. And the following patch will use that.
Jun Wu <quark@fb.com> [Thu, 28 Jul 2016 20:57:07 +0100] rev 30412
worker: allow waitforworkers to be non-blocking
This patch adds a boolean flag to waitforworkers and makes it non-blocking
if set to True.
This is to make it possible that we can reap our workers while keep other
unrelated children untouched, after receiving SIGCHLD.
Jun Wu <quark@fb.com> [Thu, 28 Jul 2016 20:51:20 +0100] rev 30411
worker: wait worker pid explicitly
Before this patch, waitforworkers uses os.wait() to collect child workers, and
only wait len(pids) processes. This can have serious issues if other code
spawns new processes and does not reap them: 1. worker.py may get wrong exit
code and kill innocent workers. 2. worker.py may continue without waiting for
all workers to complete.
This patch fixes the issue by using waitpid to wait worker pid explicitly.
However, this patch introduces a new issue: worker failure may not be handled
immediately. The issue will be addressed in next patches.
Jun Wu <quark@fb.com> [Thu, 28 Jul 2016 20:49:57 +0100] rev 30410
worker: move killworkers and waitforworkers up
We need to use them in the SIGCHLD handler and SIGCHLD handler should be
installed before fork.
Jun Wu <quark@fb.com> [Fri, 11 Nov 2016 21:11:17 +0000] rev 30409
osutil: implement setprocname to set process title for some platforms
This patch adds a simple setprocname method to osutil. The operation is not
defined by any standard and is platform-specific, the current implementation
tries to cover some major platforms (ex. Linux, OS X, FreeBSD) that is
relatively easy to support. Other platforms (Windows [4], other BSDs, ...)
can be added in the future.
The current implementation supports two methods to change process title:
a. setproctitle if available (works in FreeBSD).
b. rewrite argv in place (works in Linux [1] and Mac OS X). [2] [3]
[1]: Linux has "prctl(PR_SET_NAME, ...)" but 1) it has 16-byte limit, which
is too small; 2) it is not quite equivalent to what we want - it changes
"/proc/self/comm", not "/proc/self/cmdline" - "comm" change won't show up
in "ps" output unless "-o comm" is used.
[2]: The implementation does not rewrite the **environ buffer like some
other implementations do, just to make the code simpler and safer. However,
this also means the buffer size we can rewrite is significantly shorter. If
we are really greedy and want the "environ" space, we can change the
implementation later.
[3]: It requires a CPython private API: Py_GetArgcArgv to get the original
argv. Unfortunately Python 3 makes a copy of argv and returns the wchar_t
version, so it is not supported for now. (if we really want to, we could
count backwards from "char **environ", given known argc and argv, not sure
if that's a good idea - probably not)
[4]: The feature is aimed to make it easier for forked command server
processes to show what they are doing. Since Windows does not support
fork(), despite it's a major platform, its support is not added in this
patch.
Jun Wu <quark@fb.com> [Fri, 11 Nov 2016 20:45:40 +0000] rev 30408
setup: test setproctitle before building osutil
We are going to use setproctitle (provided by FreeBSD) if it's available in
the next patch. Therefore provide a macro to give some clues to the C
pre-processor so it could choose code path wisely.