contrib/debugcmdserver.py
author Matt Harbison <matt_harbison@yahoo.com>
Fri, 17 Nov 2017 00:06:45 -0500
changeset 35476 417e8e040102
parent 28353 cd03fbd5ab57
child 43076 2372284d9457
permissions -rwxr-xr-x
lfs: verify lfs object content when transferring to and from the remote store This avoids inserting corrupt files into the usercache, and local and remote stores. One down side is that the bad file won't be available locally for forensic purposes after a remote download. I'm thinking about adding an 'incoming' directory to the local lfs store to handle the download, and then move it to the 'objects' directory after it passes verification. That would have the additional benefit of not concatenating each transfer chunk in memory until the full file is transferred. Verification isn't needed when the data is passed back through the revlog interface or when the oid was just calculated, but otherwise it is on by default. The additional overhead should be well worth avoiding problems with file based remote stores, or buggy lfs servers. Having two different verify functions is a little sad, but the full data of the blob is mostly passed around in memory, because that's what the revlog interface wants. The upload function, however, chunks up the data. It would be ideal if that was how the content is always handled, but that's probably a huge project. I don't really like printing the long hash, but `hg debugdata` isn't a public interface, and is the only way to get it. The filelog and revision info is nowhere near this area, so recommending `hg verify` is the easiest thing to do.

#!/usr/bin/env python
#
# Dumps output generated by Mercurial's command server in a formatted style to a
# given file or stderr if '-' is specified. Output is also written in its raw
# format to stdout.
#
# $ ./hg serve --cmds pipe | ./contrib/debugcmdserver.py -
# o, 52   -> 'capabilities: getencoding runcommand\nencoding: UTF-8'

from __future__ import absolute_import, print_function
import struct
import sys

if len(sys.argv) != 2:
    print('usage: debugcmdserver.py FILE')
    sys.exit(1)

outputfmt = '>cI'
outputfmtsize = struct.calcsize(outputfmt)

if sys.argv[1] == '-':
    log = sys.stderr
else:
    log = open(sys.argv[1], 'a')

def read(size):
    data = sys.stdin.read(size)
    if not data:
        raise EOFError
    sys.stdout.write(data)
    sys.stdout.flush()
    return data

try:
    while True:
        header = read(outputfmtsize)
        channel, length = struct.unpack(outputfmt, header)
        log.write('%s, %-4d' % (channel, length))
        if channel in 'IL':
            log.write(' -> waiting for input\n')
        else:
            data = read(length)
            log.write(' -> %r\n' % data)
        log.flush()
except EOFError:
    pass
finally:
    if log != sys.stderr:
        log.close()