worker: Use buffered input from the pickle stream
On Python 3, "pickle.load" will raise an exception ("_pickle.UnpicklingError:
pickle data was truncated") when it gets a short read, i.e. it receives fewer
bytes than it requested.
On our build machine, Mercurial seems to frequently hit this problem while
updating a mozilla-central clone iff it gets scheduled in batch mode. It is easy
to trigger with:
#wipe the workdir
rm -rf *
hg update null
chrt -b 0 hg update default
I've also written the following program, which demonstrates the core problem:
from __future__ import print_function
import io
import os
import pickle
import time
obj = {"a": 1, "b": 2}
obj_data = pickle.dumps(obj)
assert len(obj_data) > 10
rfd, wfd = os.pipe()
pid = os.fork()
if pid == 0:
os.close(rfd)
for _ in range(4):
time.sleep(0.5)
print("First write")
os.write(wfd, obj_data[:10])
time.sleep(0.5)
print("Second write")
os.write(wfd, obj_data[10:])
os._exit(0)
try:
os.close(wfd)
rfile = os.fdopen(rfd, "rb", 0)
print("Reading")
while True:
try:
obj_copy = pickle.load(rfile)
assert obj == obj_copy
except EOFError:
break
print("Success")
finally:
os.kill(pid, 15)
The program reliably fails with Python 3.8 and succeeds with Python 2.7.
Providing the unpickler with a buffered reader fixes the issue, so let
"os.fdopen" create one.
https://bugzilla.mozilla.org/show_bug.cgi?id=1604486
Differential Revision: https://phab.mercurial-scm.org/D8051
--- a/mercurial/worker.py Sat Feb 01 01:32:28 2020 -0500
+++ b/mercurial/worker.py Thu Jan 30 19:16:12 2020 +0100
@@ -226,7 +226,7 @@
selector = selectors.DefaultSelector()
for rfd, wfd in pipes:
os.close(wfd)
- selector.register(os.fdopen(rfd, 'rb', 0), selectors.EVENT_READ)
+ selector.register(os.fdopen(rfd, 'rb'), selectors.EVENT_READ)
def cleanup():
signal.signal(signal.SIGINT, oldhandler)