view contrib/dumprevlog @ 33927:853574db5b12

encoding: add fast path of from/tolocal() for ASCII strings

This is micro optimization, but seems not bad since to/fromlocal() is called
lots of times and isasciistr() is cheap and simple.

We boldly assume that any non-ASCII characters have at least one 8-bit byte.
This isn't true for some email character sets (e.g. ISO-2022-JP and UTF-7),
but I believe no such encodings are used as a platform default. Shift_JIS, a
major crap, is okay as it should have a leading byte in 0x80-0xff range.

(with mercurial repo)
$ export HGRCPATH=/dev/null HGPLAIN=
$ hg log --time --config experimental.stabilization=all > /dev/null

(original)
time: real 7.460 secs (user 7.420+0.000 sys 0.030+0.000)
time: real 7.670 secs (user 7.590+0.000 sys 0.080+0.000)
time: real 7.560 secs (user 7.510+0.000 sys 0.040+0.000)

(this patch)
time: real 7.340 secs (user 7.260+0.000 sys 0.060+0.000)
time: real 7.260 secs (user 7.210+0.000 sys 0.030+0.000)
time: real 7.310 secs (user 7.260+0.000 sys 0.060+0.000)
author Yuya Nishihara <yuya@tcha.org>
date Sun, 23 Apr 2017 13:06:23 +0900
parents 6359b80f15fb
children a915465a731e

#!/usr/bin/env python
# Dump revlogs as raw data stream
# $ find .hg/store/ -name "*.i" | xargs dumprevlog > repo.dump

from __future__ import absolute_import, print_function

import sys
from mercurial import (
    node,
    revlog,
    util,
)

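# Switch the standard streams to binary mode so that revision data is
# written byte-for-byte (this matters on platforms that translate newlines).
for fp in (sys.stdin, sys.stdout, sys.stderr):
    util.setbinary(fp)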

def binopen(fn):
    # Opener handed to revlog: always read the files in binary mode.
    return open(fn, 'rb')

for f in sys.argv[1:]:
    r = revlog.revlog(binopen, f)
    print("file:", f)
    for i in r:
        n = r.node(i)      # binary node id of revision i
        p = r.parents(n)   # pair of parent node ids
        d = r.revision(n)  # full text of the revision
        print("node:", node.hex(n))
        print("linkrev:", r.linkrev(i))
        print("parents:", node.hex(p[0]), node.hex(p[1]))
        print("length:", len(d))
        print("-start-")
        print(d)
        print("-end-")
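The record layout written by the loop above (`file:`, `node:`, `linkrev:`, `parents:`, `length:`, then the data between `-start-` and `-end-`) can be read back with a small parser. The following is only a sketch, not part of Mercurial: `read_dump` and `_field` are hypothetical names, and it assumes Python 3 and a dump that uses plain `\n` line endings. It relies on the `length:` field instead of searching for `-end-`, because the revision data may itself contain that text.

```python
import io


def _field(stream, prefix):
    """Read one header line and return its value without the prefix."""
    line = stream.readline()
    if not line.startswith(prefix):
        raise ValueError("expected %r, got %r" % (prefix, line))
    return line[len(prefix):].rstrip(b"\n")


def read_dump(stream):
    """Yield one dict per revision from a binary dump stream (hypothetical helper)."""
    current_file = None
    while True:
        line = stream.readline()
        if not line:
            return  # end of dump
        if line.startswith(b"file: "):
            current_file = line[len(b"file: "):].rstrip(b"\n")
            continue
        if not line.startswith(b"node: "):
            raise ValueError("unexpected line: %r" % line)
        node = line[len(b"node: "):].rstrip(b"\n")
        linkrev = int(_field(stream, b"linkrev: "))
        p1, p2 = _field(stream, b"parents: ").split(b" ")
        length = int(_field(stream, b"length: "))
        if stream.readline() != b"-start-\n":
            raise ValueError("missing -start- marker")
        # Read exactly `length` bytes: the payload may contain newlines
        # or even the text "-end-", so scanning for the marker is unsafe.
        data = stream.read(length)
        if len(data) != length:
            raise ValueError("truncated revision data")
        # print(d) appended one newline; the -end- marker line follows it.
        if stream.readline() != b"\n" or stream.readline() != b"-end-\n":
            raise ValueError("missing -end- marker")
        yield {"file": current_file, "node": node, "linkrev": linkrev,
               "parents": (p1, p2), "length": length, "data": data}


# Hand-built sample record (made-up values) whose payload contains "-end-".
payload = b"a\n-end-\nb"
null = b"00" * 20
sample = b"".join([
    b"file: data/a.i\n",
    b"node: " + b"11" * 20 + b"\n",
    b"linkrev: 0\n",
    b"parents: " + null + b" " + null + b"\n",
    b"length: %d\n" % len(payload),
    b"-start-\n",
    payload + b"\n",
    b"-end-\n",
])
revs = list(read_dump(io.BytesIO(sample)))
```

To read a real dump, open it in binary mode, for example `read_dump(open("repo.dump", "rb"))`.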