streamclone: use backgroundfilecloser (issue4889)
Closing files that have been appended to is slow on Windows/NTFS.
CloseHandle() calls on this platform often take 1-10ms - and that's
on my i7-6700K Skylake processor with a modern and fast SSD. Contrast
with other I/O operations, such as writing data, which take <100us.
This means that creating/appending thousands of files can add
significant overhead. For example, cloning mozilla-central creates
~232,000 revlog files. Assuming 1ms per CloseHandle(), that yields
232s (3:52) of wall time waiting for file closes!
The impact of this overhead can be measured most directly when applying
stream clone bundles. Applying these files is effectively uncompressing
a tar archive (read: it's very fast).
Using a RAM disk (read: no I/O wait), the difference in wall time for a
`hg debugapplystreamclonebundle` for a ~1731 MB mozilla-central bundle
between Windows and Linux from the same machine is drastic:
Linux: ~12.8s (128MB/s)
Windows: ~352.0s (4.7MB/s)
Windows is ~27.5x slower. Yikes!
After this patch:
Linux: ~12.8s (128MB/s)
Windows: ~102.1s (16.1MB/s)
Windows is now ~3.4x faster. Unfortunately, it is still ~8x slower than
Linux. Profiling reveals a few hot code paths that could likely be
improved. But those are for other patches.
This patch introduces test-clone-uncompressed.t because existing tests
of `clone --uncompressed` are scattered about and adding a variation for
background thread closing to e.g. test-http.t doesn't feel correct.
import os
from mercurial import hg, ui, merge
u = ui.ui()
repo = hg.repository(u, 'test1', create=1)
os.chdir('test1')
def commit(text, time):
repo.commit(text=text, date="%d 0" % time)
def addcommit(name, time):
f = open(name, 'w')
f.write('%s\n' % name)
f.close()
repo[None].add([name])
commit(name, time)
def update(rev):
merge.update(repo, rev, False, True)
def merge_(rev):
merge.update(repo, rev, True, False)
if __name__ == '__main__':
addcommit("A", 0)
addcommit("B", 1)
update(0)
addcommit("C", 2)
merge_(1)
commit("D", 3)
update(2)
addcommit("E", 4)
addcommit("F", 5)
update(3)
addcommit("G", 6)
merge_(5)
commit("H", 7)
update(5)
addcommit("I", 8)
# Ancestors
print 'Ancestors of 5'
for r in repo.changelog.ancestors([5]):
print r,
print '\nAncestors of 6 and 5'
for r in repo.changelog.ancestors([6, 5]):
print r,
print '\nAncestors of 5 and 4'
for r in repo.changelog.ancestors([5, 4]):
print r,
print '\nAncestors of 7, stop at 6'
for r in repo.changelog.ancestors([7], 6):
print r,
print '\nAncestors of 7, including revs'
for r in repo.changelog.ancestors([7], inclusive=True):
print r,
print '\nAncestors of 7, 5 and 3, including revs'
for r in repo.changelog.ancestors([7, 5, 3], inclusive=True):
print r,
# Descendants
print '\n\nDescendants of 5'
for r in repo.changelog.descendants([5]):
print r,
print '\nDescendants of 5 and 3'
for r in repo.changelog.descendants([5, 3]):
print r,
print '\nDescendants of 5 and 4'
for r in repo.changelog.descendants([5, 4]):
print r,