view contrib/dumprevlog @ 35599:af25237be091

perf: add threading capability to perfbdiff Since we are releasing the GIL during diffing, it is interesting to see how a thread pool would perform on diffing. We add a new `--threads` argument to commands. Synchronizing the thread pool is a bit complex because we want to be able to reuse it from one run to another. On my computer (i7 with 4 cores + hyperthreading), I get the following data for about 12000 revisions: threads wall comb wall gain comb overhead none 31.596715 31.59 0.00% 0.00% 1 31.621228 31.62 -0.08% 0.09% 2 16.406202 32.8 48.08% 3.83% 3 11.598334 34.76 63.29% 10.03% 4 9.205421 36.77 70.87% 16.40% 5 8.517604 42.51 73.04% 34.57% 6 7.94645 47.58 74.85% 50.62% 7 7.434972 51.92 76.47% 64.36% 8 7.070638 55.34 77.62% 75.18% Compared to the feature disabled (threads=0), the overhead is negligible with the threading code (threads=1), and the gain is already 48% with two threads.
author Boris Feld <boris.feld@octobus.net>
date Sun, 17 Dec 2017 04:31:27 +0100
parents 6359b80f15fb
children a915465a731e
line wrap: on
line source

#!/usr/bin/env python
# Dump revlogs as raw data stream
# $ find .hg/store/ -name "*.i" | xargs dumprevlog > repo.dump

from __future__ import absolute_import, print_function

import sys
from mercurial import (
    node,
    revlog,
    util,
)

for fp in (sys.stdin, sys.stdout, sys.stderr):
    util.setbinary(fp)

for f in sys.argv[1:]:
    binopen = lambda fn: open(fn, 'rb')
    r = revlog.revlog(binopen, f)
    print("file:", f)
    for i in r:
        n = r.node(i)
        p = r.parents(n)
        d = r.revision(n)
        print("node:", node.hex(n))
        print("linkrev:", r.linkrev(i))
        print("parents:", node.hex(p[0]), node.hex(p[1]))
        print("length:", len(d))
        print("-start-")
        print(d)
        print("-end-")