changeset 25402:0c2ded041d10

exchange: support transferring .hgtags fnodes mapping

On Mozilla's mozilla-beta repository .hgtags fnodes resolution takes
~18s from a clean cache on my machine. This means that the first time
a user runs `hg tags`, `hg log`, or any other command that displays or
accesses tags data, a ~18s pause will occur. There is no output during
this pause. This results in a poor user experience and perception
that Mercurial is slow.

The .hgtags changeset to filenode mapping is deterministic. This
patch takes advantage of that property by implementing support
for transferring .hgtags filenodes mappings in a dedicated bundle2
part. When a client advertising support for the "hgtagsfnodes"
capability requests a bundle, a mapping of changesets to .hgtags
filenodes will be sent to the client.

Only mappings of head changesets included in the bundle will be sent.
The transfer of this mapping effectively eliminates one time tags
cache related pauses after initial clone.

The mappings are sent as binary data. So, 40 bytes per pair of
SHA-1s. On the aforementioned mozilla-beta repository,
659 * 40 = 26,360 raw bytes of mappings are sent over the wire (in
addition to the bundle part headers). Assuming 18s to populate the
cache, we only need to transfer this extra data faster than 1.5 KB/s
for overall clone + tags cache population time to be shorter. Put
into perspective, the mozilla-beta repository is ~1 GB in size. So,
this additional data constitutes <0.01% of the cloned data. The
marginal overhead for a multi-second performance win on clones in my
opinion justifies an on-by-default behavior.
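
The payload layout and the size estimate in the description can be checked with a small standalone sketch. It is not Mercurial code: `encode_pairs`, `decode_pairs`, and `fake_node` are hypothetical helpers introduced here, and the node values are made up. Only the 20-byte node size, the 659-head count, the 18 s figure, and the ~1 GB clone size (taken as 10**9 bytes) come from the description.

```python
import hashlib

NODE_LEN = 20  # raw SHA-1 length in bytes


def encode_pairs(pairs):
    """Concatenate (changeset node, .hgtags filenode) pairs into raw bytes."""
    chunks = []
    for node, fnode in pairs:
        if len(node) != NODE_LEN or len(fnode) != NODE_LEN:
            raise ValueError("nodes must be 20 raw bytes")
        chunks.extend([node, fnode])
    return b"".join(chunks)


def decode_pairs(data):
    """Split raw part data back into (node, fnode) pairs."""
    record = 2 * NODE_LEN
    if len(data) % record:
        raise ValueError("payload is not a multiple of 40 bytes")
    return [
        (data[i:i + NODE_LEN], data[i + NODE_LEN:i + record])
        for i in range(0, len(data), record)
    ]


def fake_node(label):
    """Hypothetical stand-in for a real node: the SHA-1 of a label."""
    return hashlib.sha1(label.encode("ascii")).digest()


# 659 heads, the count quoted for mozilla-beta.
pairs = [(fake_node("cs%d" % i), fake_node("fn%d" % i)) for i in range(659)]
payload = encode_pairs(pairs)
assert decode_pairs(payload) == pairs  # the encoding round-trips

payload_size = len(payload)            # 659 * 40 bytes
breakeven_rate = payload_size / 18.0   # bytes/s needed to beat an 18 s cache fill
share_of_clone = payload_size / 1e9    # fraction of a ~1 GB clone
print(payload_size, round(breakeven_rate), "%.4f%%" % (share_of_clone * 100))
```

Under these assumptions the break-even transfer rate comes out just under the 1.5 KB/s quoted above, and the payload is well below 0.01% of the clone.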
author Gregory Szorc <gregory.szorc@gmail.com>
date Mon, 25 May 2015 17:14:11 -0700
parents d29201352af7
children 30ab130af221
files mercurial/exchange.py tests/test-tags.t
diffstat 2 files changed, 118 insertions(+), 0 deletions(-)
--- a/mercurial/exchange.py	Mon Jun 01 20:23:22 2015 -0700
+++ b/mercurial/exchange.py	Mon May 25 17:14:11 2015 -0700
@@ -12,6 +12,7 @@
 import util, scmutil, changegroup, base85, error, store
 import discovery, phases, obsolete, bookmarks as bookmod, bundle2, pushkey
 import lock as lockmod
+import tags
 
 def readbundle(ui, fh, fname, vfs=None):
     header = changegroup.readexactly(fh, 4)
@@ -1285,6 +1286,49 @@
         markers = sorted(markers)
         buildobsmarkerspart(bundler, markers)
 
+@getbundle2partsgenerator('hgtagsfnodes')
+def _getbundletagsfnodes(bundler, repo, source, bundlecaps=None,
+                         b2caps=None, heads=None, common=None,
+                         **kwargs):
+    """Transfer the .hgtags filenodes mapping.
+
+    Only values for heads in this bundle will be transferred.
+
+    The part data consists of pairs of 20 byte changeset node and .hgtags
+    filenodes raw values.
+    """
+    # Don't send unless:
+    # - changesets are being exchanged,
+    # - the client supports it.
+    if not (kwargs.get('cg', True) and 'hgtagsfnodes' in b2caps):
+        return
+
+    outgoing = changegroup.computeoutgoing(repo, heads, common)
+
+    if not outgoing.missingheads:
+        return
+
+    cache = tags.hgtagsfnodescache(repo.unfiltered())
+    chunks = []
+
+    # .hgtags fnodes are only relevant for head changesets. While we could
+    # transfer values for all known nodes, there will likely be little to
+    # no benefit.
+    #
+    # We don't bother using a generator to produce output data because
+    # a) we only have 40 bytes per head and even esoteric numbers of heads
+    # consume little memory (1M heads is 40MB) b) we don't want to send the
+    # part if we don't have entries and knowing if we have entries requires
+    # cache lookups.
+    for node in outgoing.missingheads:
+        # Don't compute missing, as this may slow down serving.
+        fnode = cache.getfnode(node, computemissing=False)
+        if fnode is not None:
+            chunks.extend([node, fnode])
+
+    if chunks:
+        bundler.newpart('hgtagsfnodes', data=''.join(chunks))
+
 def check_heads(repo, their_heads, context):
     """check if the heads of a repo have been modified
 
--- a/tests/test-tags.t	Mon Jun 01 20:23:22 2015 -0700
+++ b/tests/test-tags.t	Mon May 25 17:14:11 2015 -0700
@@ -625,3 +625,77 @@
   globaltag                          0:bbd179dfa0a7
 
   $ cd ..
+
+Create a repository with tags data to test .hgtags fnodes transfer
+
+  $ hg init tagsserver
+  $ cd tagsserver
+  $ cat > .hg/hgrc << EOF
+  > [experimental]
+  > bundle2-exp=True
+  > EOF
+  $ touch foo
+  $ hg -q commit -A -m initial
+  $ hg tag -m 'tag 0.1' 0.1
+  $ echo second > foo
+  $ hg commit -m second
+  $ hg tag -m 'tag 0.2' 0.2
+  $ hg tags
+  tip                                3:40f0358cb314
+  0.2                                2:f63cc8fe54e4
+  0.1                                0:96ee1d7354c4
+  $ cd ..
+
+Cloning should pull down hgtags fnodes mappings and write the cache file
+
+  $ hg --config experimental.bundle2-exp=True clone --pull tagsserver tagsclient
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 4 changesets with 4 changes to 2 files
+  updating to branch default
+  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+Missing tags2* files means the cache wasn't written through the normal mechanism.
+
+  $ ls tagsclient/.hg/cache
+  branch2-served
+  hgtagsfnodes1
+  rbc-names-v1
+  rbc-revs-v1
+
+Cache should contain the head only, even though other nodes have tags data
+
+  $ f --size --hexdump tagsclient/.hg/cache/hgtagsfnodes1
+  tagsclient/.hg/cache/hgtagsfnodes1: size=96
+  0000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
+  0010: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
+  0020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
+  0030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
+  0040: ff ff ff ff ff ff ff ff 40 f0 35 8c 19 e0 a7 d3 |........@.5.....|
+  0050: 8a 5c 6a 82 4d cf fb a5 87 d0 2f a3 1e 4f 2f 8a |.\j.M...../..O/.|
+
+Running hg tags should produce tags2* file and not change cache
+
+  $ hg -R tagsclient tags
+  tip                                3:40f0358cb314
+  0.2                                2:f63cc8fe54e4
+  0.1                                0:96ee1d7354c4
+
+  $ ls tagsclient/.hg/cache
+  branch2-served
+  hgtagsfnodes1
+  rbc-names-v1
+  rbc-revs-v1
+  tags2-visible
+
+  $ f --size --hexdump tagsclient/.hg/cache/hgtagsfnodes1
+  tagsclient/.hg/cache/hgtagsfnodes1: size=96
+  0000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
+  0010: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
+  0020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
+  0030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
+  0040: ff ff ff ff ff ff ff ff 40 f0 35 8c 19 e0 a7 d3 |........@.5.....|
+  0050: 8a 5c 6a 82 4d cf fb a5 87 d0 2f a3 1e 4f 2f 8a |.\j.M...../..O/.|
+
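
The hexdump in the test can be read as fixed-size records indexed by revision number. The sketch below assumes a 24-byte record (a 4-byte changeset node prefix followed by a 20-byte .hgtags filenode) with an all-0xff record meaning "no entry". That layout is inferred from the dump itself (96 bytes for 4 revisions, with the tip's `40 f0 35 8c` prefix at offset 0x48), and `parse_records` is a hypothetical helper, not part of Mercurial.

```python
RECORD_SIZE = 24               # assumed: 4-byte node prefix + 20-byte filenode
MISSING = b"\xff" * RECORD_SIZE

# Bytes copied from the hexdump above: 72 bytes of 0xff, then one record.
cache = b"\xff" * 72 + bytes.fromhex(
    "40f0358c"                                  # prefix of tip 3:40f0358cb314
    "19e0a7d38a5c6a824dcffba587d02fa31e4f2f8a"  # .hgtags filenode
)


def parse_records(data):
    """Return {rev: (node_prefix_hex, fnode_hex)} for populated records."""
    entries = {}
    for rev in range(len(data) // RECORD_SIZE):
        record = data[rev * RECORD_SIZE:(rev + 1) * RECORD_SIZE]
        if record == MISSING:
            continue  # no cached value for this revision
        entries[rev] = (record[:4].hex(), record[4:].hex())
    return entries


entries = parse_records(cache)
print(len(cache), entries)
```

Read this way, only revision 3 (the single head) has an entry, which matches the test's expectation that the cache contains the head only.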