test-worker: exercise more about "killworkers" situation
This patch adds some sleep and increases numcpus to exercise the
"killworkers" situation. So race conditions could be discovered more easily,
if someone changes worker.py incorrectly. Currently worker.py should be free
of such issues.
test-worker: capture tracebacks more reliably
The traceback test may have traceback caused by SIGTERM. Instead of grepping
"Traceback", explicitly grep the exception we care about.
This makes the test less flaky.
worker: rewrite error handling so os._exit covers all cases
Previously the worker error handling is like:
pid = os.fork() --+
if pid == 0: |
.... | problematic
.... --+
try: --+
.... | worker error handling
--+
If a signal arrives when Python is executing the "problematic" lines, an
external error handling (dispatch.py) will take over the control flow and
it's no longer guaranteed "os._exit" is called (see
86cd09bc13ba for why it
is necessary).
This patch rewrites the error handling so it covers all possible code paths
for a worker even during fork.
Note: "os.getpid() == parentpid" is used to test if the process is parent or
not intentionally, instead of checking "pid", because "pid = os.fork()" may
be not atomic - it's possible that that a signal hits the worker before the
assignment completes [1]. The newly added test replaces "os.fork" to
exercise that extreme case.
[1]: CPython compiles "pid = os.fork()" to 2 byte codes: "CALL_FUNCTION" and
"STORE_FAST", so it's probably not atomic:
def f():
pid = os.fork()
dis.dis(f)
2 0 LOAD_GLOBAL 0 (os)
3 LOAD_ATTR 1 (fork)
6 CALL_FUNCTION 0
9 STORE_FAST 0 (pid)
12 LOAD_CONST 0 (None)
15 RETURN_VALUE