graft: clarify in help that `-r` is not just optional
Positional parameters are also treated as revisions, but the order of revisions
matters, and the result will often be wrong if the user assumes that `-r` takes
multiple revisions, as in `-r REV1 REV2`.
(Alternatively, `-r` could be turned into a no-op flag as the documentation
suggests. That would however be less "semantic markup" and I agree with the
implementation in 55e7f352b1d3 but not the documentation.)
streamclone: use backgroundfilecloser (issue4889)
Closing files that have been appended to is slow on Windows/NTFS.
CloseHandle() calls on this platform often take 1-10ms - and that's
on my i7-6700K Skylake processor with a modern and fast SSD. Contrast
with other I/O operations, such as writing data, which take <100us.
This means that creating/appending thousands of files can add
significant overhead. For example, cloning mozilla-central creates
~232,000 revlog files. Assuming 1ms per CloseHandle(), that yields
232s (3:52) of wall time waiting for file closes!
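The estimate above is plain arithmetic. A minimal sketch of it follows; the function name is hypothetical and the 1ms figure is the assumed per-close cost from the text, not a new measurement:

```python
def close_overhead_seconds(file_count, close_ms):
    """Total wall time spent waiting on file closes, in seconds."""
    return file_count * close_ms / 1000.0


# ~232,000 revlog files at an assumed 1ms per CloseHandle() call.
total = close_overhead_seconds(232000, 1.0)
minutes, seconds = divmod(int(total), 60)
print("%ds (%d:%02d)" % (total, minutes, seconds))
```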
The impact of this overhead can be measured most directly when applying
stream clone bundles. Applying these files is effectively uncompressing
a tar archive (read: it's very fast).
Using a RAM disk (read: no I/O wait), the difference in wall time for a
`hg debugapplystreamclonebundle` for a ~1731 MB mozilla-central bundle
between Windows and Linux from the same machine is drastic:
Linux: ~12.8s (128MB/s)
Windows: ~352.0s (4.7MB/s)
Windows is ~27.5x slower. Yikes!
After this patch:
Linux: ~12.8s (128MB/s)
Windows: ~102.1s (16.1MB/s)
Windows is now ~3.4x faster. Unfortunately, it is still ~8x slower than
Linux. Profiling reveals a few hot code paths that could likely be
improved. But those are for other patches.
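For reference, the ratios quoted above can be recomputed from the wall times in this message. This is only a sanity check on the reported numbers; the variable names are made up for illustration:

```python
# Wall times in seconds, copied from the measurements above.
linux = 12.8
windows_before = 352.0
windows_after = 102.1

slowdown_before = windows_before / linux   # Windows vs Linux, before the patch
speedup = windows_before / windows_after   # Windows before vs after the patch
slowdown_after = windows_after / linux     # Windows vs Linux, after the patch

print("%.1fx %.1fx %.1fx" % (slowdown_before, speedup, slowdown_after))
```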
This patch introduces test-clone-uncompressed.t because existing tests
of `clone --uncompressed` are scattered about and adding a variation for
background thread closing to e.g. test-http.t doesn't feel correct.
scmutil: support background file closing
Closing files that have been appended to is relatively slow on
Windows/NTFS. This makes several Mercurial operations slower on
Windows.
The workaround to this issue is conceptually simple: use multiple
threads for I/O. Unfortunately, Python doesn't scale well to multiple
threads because of the GIL. And, refactoring our code to use threads
everywhere would be a huge undertaking. So, we decide to tackle this
problem by starting small: establishing a thread pool for closing
files.
This patch establishes a mechanism for closing file handles on separate
threads. The coordinator object is basically a queue of file handles to
operate on and a thread pool consuming from the queue.
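To illustrate the idea of a queue plus a pool of consumers, here is a rough sketch. It is not the code in this patch: the class name, method names, and default thread count are all hypothetical, it uses the Python 3 `queue` module, and it omits the error handling a real implementation would need.

```python
import io
import queue
import threading


class BackgroundCloser(object):
    """Hypothetical coordinator: a queue of file handles and worker threads."""

    def __init__(self, threadcount=4):
        self._queue = queue.Queue()
        self._threads = []
        for _ in range(threadcount):
            t = threading.Thread(target=self._worker)
            t.daemon = True
            t.start()
            self._threads.append(t)

    def _worker(self):
        # Consume handles until a None sentinel arrives.
        while True:
            fh = self._queue.get()
            if fh is None:
                break
            fh.close()

    def close(self, fh):
        """Schedule fh to be closed on a background thread."""
        self._queue.put(fh)

    def shutdown(self):
        """Send one sentinel per worker, then wait for all of them."""
        for _ in self._threads:
            self._queue.put(None)
        for t in self._threads:
            t.join()


# In-memory files stand in for real file handles here.
handles = [io.BytesIO(b"data") for _ in range(10)]
closer = BackgroundCloser(threadcount=2)
for fh in handles:
    closer.close(fh)
closer.shutdown()
print(all(fh.closed for fh in handles))
```

Because the queue is FIFO, the sentinels are only reached after every queued handle has been picked up, so `shutdown()` returns once all closes have finished.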
When files are opened through the VFS layer, the caller can specify
that delayed closing is allowed.
A proxy class for file handles has been added. We must use a proxy
because it isn't possible to modify __class__ on built-in types. This
adds some overhead. But as future patches will show, this overhead
is cancelled out by the benefit of closing file handles on background
threads.
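The proxy approach can be sketched roughly as follows. This is an illustration under assumptions, not the class added by this patch: the name `DelayClosedFile` and the `closer` callable are hypothetical, and only `close()` and the context manager protocol are intercepted.

```python
import io


class DelayClosedFile(object):
    """Hypothetical proxy: forwards attribute access to the real file
    handle, but hands close() to a caller-supplied callable."""

    def __init__(self, fh, closer):
        self._origfh = fh
        self._closer = closer

    def __getattr__(self, attr):
        # Only invoked for attributes not defined on the proxy itself,
        # so write(), read(), etc. reach the wrapped handle.
        return getattr(self._origfh, attr)

    def __enter__(self):
        self._origfh.__enter__()
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        self.close()

    def close(self):
        # Defer the real close instead of doing it inline.
        self._closer(self._origfh)


# A plain list stands in for the background queue in this sketch.
pending = []
raw = io.BytesIO()
with DelayClosedFile(raw, pending.append) as fh:
    fh.write(b"hello")

still_open = not raw.closed   # the real close has been deferred
pending.pop().close()         # what a background thread would do later
```

The extra `__getattr__` hop on every forwarded call is the overhead mentioned above.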
templatekw: add {namespaces} keyword
This provides a general-purpose interface to all custom namespaces.
The {namespaces} keyword honors the definition order of namespaces as they
are kept by sortdict.
templatekw: move shownames() helper to be sorted alphabetically
I'll add shownamespaces(), which is similar to this function. I want to put
them nearby.
templater: make get(dict, key) return a single value
This is necessary to obtain a _hybrid object from a dict. If get() yields
a value, it would be stringified.
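The difference between the two forms can be seen in plain Python. The sketch below does not reproduce the templater; `Hybrid`, `get_lazy`, and `get_eager` are hypothetical stand-ins that only show what a caller receives in each case:

```python
import types


class Hybrid(object):
    """Stand-in for a structured template value (hypothetical)."""

    def __init__(self, values):
        self.values = values


def get_lazy(d, key):
    # Generator function: the caller gets a generator, not the value.
    yield d.get(key)


def get_eager(d, key):
    # Plain function: the caller gets the stored object itself.
    return d.get(key)


data = {"extra": Hybrid(["a", "b"])}
lazy = get_lazy(data, "extra")
eager = get_eager(data, "extra")

print(isinstance(lazy, types.GeneratorType))
print(isinstance(eager, Hybrid))
```

With the generator form, the object is only reachable by iterating, which is where a consumer that flattens its input would lose the structure.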
I see no benefit in making get() lazy, so this patch just changes "yield" to
"return".