Mercurial > hg
view tests/test-hgweb-non-interactive.t @ 49269:395f28064826
worker: avoid potential partial write of pickled data
Previously, the code wrote the pickled data using os.write(). However,
os.write() can write less bytes than passed to it. To trigger the problem, the
pickled data had to be larger than 2147479552 bytes on my system.
Instead, open a file object and pass it to pickle.dump(). This also has the
advantage that it doesn’t buffer the whole pickled data in memory.
Note that the opened file must be buffered because pickle doesn’t support
unbuffered streams because unbuffered streams’ write() method might write less
bytes than passed to it (like os.write()) but pickle.dump() relies on that all
bytes are written (see https://github.com/python/cpython/issues/93050).
The side effect of using a file object and a with statement is that wfd is
explicitly closed now while it seems like before it was implicitly closed by
process exit.
author | Manuel Jacob <me@manueljacob.de> |
---|---|
date | Sun, 22 May 2022 03:50:34 +0200 |
parents | 42d2b31cee0b |
children |
line wrap: on
line source
Tests if hgweb can run without touching sys.stdin, as is required by the WSGI standard and strictly implemented by mod_wsgi. $ hg init repo $ cd repo $ echo foo > bar $ hg add bar $ hg commit -m "test" $ cat > request.py <<EOF > import os > import sys > from mercurial import ( > dispatch, > encoding, > hg, > ui as uimod, > util, > ) > from mercurial.utils import ( > procutil, > ) > ui = uimod.ui > from mercurial.hgweb import hgweb_mod > stringio = util.stringio > > class FileLike(object): > def __init__(self, real): > self.real = real > def fileno(self): > print >> sys.__stdout__, 'FILENO' > return self.real.fileno() > def read(self): > print >> sys.__stdout__, 'READ' > return self.real.read() > def readline(self): > print >> sys.__stdout__, 'READLINE' > return self.real.readline() > > sys.stdin = FileLike(sys.stdin) > errors = stringio() > input = stringio() > output = stringio() > > def startrsp(status, headers): > print('---- STATUS') > print(status) > print('---- HEADERS') > print([i for i in headers if i[0] != 'ETag']) > print('---- DATA') > return output.write > > env = { > 'wsgi.version': (1, 0), > 'wsgi.url_scheme': 'http', > 'wsgi.errors': errors, > 'wsgi.input': input, > 'wsgi.multithread': False, > 'wsgi.multiprocess': False, > 'wsgi.run_once': False, > 'REQUEST_METHOD': 'GET', > 'SCRIPT_NAME': '', > 'PATH_INFO': '', > 'QUERY_STRING': '', > 'SERVER_NAME': '$LOCALIP', > 'SERVER_PORT': os.environ['HGPORT'], > 'SERVER_PROTOCOL': 'HTTP/1.0' > } > > i = hgweb_mod.hgweb(b'.') > for c in i(env, startrsp): > pass > sys.stdout.flush() > procutil.stdout.write(b'---- ERRORS\n') > procutil.stdout.write(b'%s\n' % errors.getvalue()) > print('---- OS.ENVIRON wsgi variables') > print(sorted([x for x in os.environ if x.startswith('wsgi')])) > print('---- request.ENVIRON wsgi variables') > with i._obtainrepo() as repo: > print(sorted([encoding.strfromlocal(x) for x in repo.ui.environ > if x.startswith(b'wsgi')])) > EOF $ "$PYTHON" request.py ---- STATUS 200 Script output follows ---- HEADERS [('Content-Type', 'text/html; charset=ascii')] ---- DATA ---- ERRORS ---- OS.ENVIRON wsgi variables [] ---- request.ENVIRON wsgi variables ['wsgi.errors', 'wsgi.input', 'wsgi.multiprocess', 'wsgi.multithread', 'wsgi.run_once', 'wsgi.url_scheme', 'wsgi.version'] $ cd ..