Mercurial > hg
view contrib/dumprevlog @ 40978:42f59d3f714d
delta: exclude base candidate much smaller than the target
If a revision's full text is that much bigger than a base candidate full text,
we no longer consider that candidate.
This solves a pathological case we encountered on a very specify repository.
It contains a long series of changesets with a very small manifest (one file)
co-existing with others changesets using a very large manifest.
Without this filtering, we ended up considering a large number of tiny full
snapshots as a potential base. It resulted in very large delta (the size of
the full text) and mercurial spending 99% of its time compressing these
deltas.
The timing of a commit moved from about 400s to about 10s (still slow, but not
ridiculously slow).
author | Boris Feld <boris.feld@octobus.net> |
---|---|
date | Mon, 17 Dec 2018 10:42:19 +0100 |
parents | a063b84ce064 |
children | 3518da504303 |
line wrap: on
line source
#!/usr/bin/env python # Dump revlogs as raw data stream # $ find .hg/store/ -name "*.i" | xargs dumprevlog > repo.dump from __future__ import absolute_import, print_function import sys from mercurial import ( encoding, node, pycompat, revlog, ) from mercurial.utils import ( procutil, ) for fp in (sys.stdin, sys.stdout, sys.stderr): procutil.setbinary(fp) def binopen(path, mode=b'rb'): if b'b' not in mode: mode = mode + b'b' return open(path, pycompat.sysstr(mode)) def printb(data, end=b'\n'): sys.stdout.flush() pycompat.stdout.write(data + end) for f in sys.argv[1:]: r = revlog.revlog(binopen, encoding.strtolocal(f)) print("file:", f) for i in r: n = r.node(i) p = r.parents(n) d = r.revision(n) printb(b"node: %s" % node.hex(n)) printb(b"linkrev: %d" % r.linkrev(i)) printb(b"parents: %s %s" % (node.hex(p[0]), node.hex(p[1]))) printb(b"length: %d" % len(d)) printb(b"-start-") printb(d) printb(b"-end-")