view mercurial/cacheutil.py @ 35787:a84dbc87dae9

exchange: send bundle2 stream clones uncompressed

Stream clones don't compress well. Compression also undermines a point of stream clones, which is to accept a larger transfer size in exchange for significantly lower CPU usage.

Building upon our introduction of metadata to communicate bundle information back to callers of exchange.getbundlechunks(), we add an attribute to the bundler that communicates whether the bundle is best left uncompressed. We return this attribute as part of the bundle metadata, and the wire protocol honors it when determining whether to compress the wire protocol response.

The added test demonstrates that the raw result from the wire protocol is not compressed. It also demonstrates that the server will serve stream responses when the feature isn't enabled. We'll address that in another commit.

The effect of this change is that server-side CPU usage for bundle2 stream clones is significantly reduced by removing zstd compression. For the mozilla-unified repository:

before: 37.69 user 8.01 system
after:  27.38 user 7.34 system

Assuming things are CPU bound, that ~10s reduction would translate to faster clones on the client.

zstd can decompress at >1 GB/s, so the overhead from decompression on the client is small in the grand scheme of things. If zlib compression were being used, the overhead would be much greater.

Differential Revision: https://phab.mercurial-scm.org/D1926
author Gregory Szorc <gregory.szorc@gmail.com>
date Mon, 22 Jan 2018 12:12:29 -0800
parents 72fdd99eb526
children 57875cf423c9

# cacheutil.py - Mercurial cache utility functions
#
#  Copyright Matt Mackall <mpm@selenic.com> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import

from . import repoview

def cachetocopy(srcrepo):
    """Return the list of cache files worth copying during a clone."""
    # In local clones we're copying all nodes, not just served
    # ones. Therefore copy all branch caches over.
    cachefiles = ['branch2']
    cachefiles += ['branch2-%s' % f for f in repoview.filtertable]
    cachefiles += ['rbc-names-v1', 'rbc-revs-v1']
    cachefiles += ['tags2']
    cachefiles += ['tags2-%s' % f for f in repoview.filtertable]
    cachefiles += ['hgtagsfnodes1']
    return cachefiles
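
A minimal, standalone sketch of how a list like the one returned by cachetocopy() might be consumed. It does not import Mercurial: FILTER_NAMES is an assumed stand-in for the keys of repoview.filtertable, and cachetocopy_sketch() and copy_existing_caches() are hypothetical helpers written for illustration, not Mercurial APIs.

```python
import os
import shutil
import tempfile

# Assumption: stand-in for the keys of repoview.filtertable. The real
# filter names are defined by Mercurial and may differ.
FILTER_NAMES = ['visible', 'served', 'immutable', 'base']


def cachetocopy_sketch(filternames):
    """Build the cache file list the same way cachetocopy() does."""
    cachefiles = ['branch2']
    cachefiles += ['branch2-%s' % f for f in filternames]
    cachefiles += ['rbc-names-v1', 'rbc-revs-v1']
    cachefiles += ['tags2']
    cachefiles += ['tags2-%s' % f for f in filternames]
    cachefiles += ['hgtagsfnodes1']
    return cachefiles


def copy_existing_caches(srcdir, dstdir, names):
    """Copy each named file that exists in srcdir into dstdir.

    Hypothetical helper: caches are optional, so missing files are
    skipped rather than treated as errors.
    """
    copied = []
    for name in names:
        src = os.path.join(srcdir, name)
        if os.path.isfile(src):
            os.makedirs(dstdir, exist_ok=True)
            shutil.copy2(src, os.path.join(dstdir, name))
            copied.append(name)
    return copied


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, 'src-cache')
        dst = os.path.join(tmp, 'dst-cache')
        os.makedirs(src)
        # Only two of the candidate cache files exist in this demo.
        for name in ('branch2', 'tags2-served'):
            with open(os.path.join(src, name), 'w') as fh:
                fh.write('demo\n')
        names = cachetocopy_sketch(FILTER_NAMES)
        print(copy_existing_caches(src, dst, names))
```

The per-filter entries follow from the comment in cachetocopy(): a local clone copies all nodes, so the branch and tag caches for every repository view are candidates, not only the one for the served view.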