tests/wireprotosimplecache.py
author Gregory Szorc <gregory.szorc@gmail.com>
Wed, 26 Sep 2018 17:16:56 -0700
changeset 40022 c537144fdbef
child 40024 10cf8b116dd8
permissions -rw-r--r--
wireprotov2: support response caching One of the things I've learned from managing VCS servers over the years is that they are hard to scale. It is well known that some companies have very beefy (read: very expensive) servers to power their VCS needs. It is also known that specialized servers for various VCS exist in order to facilitate scaling servers. (Mercurial is in this boat.) One of the aspects that make a VCS server hard to scale is the high CPU load incurred by constant client clone/pull operations. To alleviate the scaling pain associated with data retrieval operations, I want to integrate caching into the Mercurial wire protocol server as robustly as possible such that servers can aggressively cache responses and defer as much server load as possible. This commit represents the initial implementation of a general caching layer in wire protocol version 2. We define a new interface and behavior for a wire protocol cacher in repository.py. (This is probably where a reviewer should look first to understand what is going on.) The bulk of the added code is in wireprotov2server.py, where we define how a command can opt in to being cached and integrate caching into command dispatching. From a very high-level: * A command can declare itself as cacheable by providing a callable that can be used to derive a cache key. * At dispatch time, if a command is cacheable, we attempt to construct a cacher and use it for serving the request and/or caching the request. * The dispatch layer handles the bulk of the business logic for caching, making cachers mostly "dumb content stores." * The mechanism for invalidating cached entries (one of the harder parts about caching in general) is by varying the cache key when state changes. As such, cachers don't need to be concerned with cache invalidation. Initially, we've hooked up support for caching "manifestdata" and "filedata" commands. These are the simplest to cache, as they should be immutable over time. Caching of commands related to changeset data is a bit harder (because cache validation is impacted by changes to bookmarks, phases, etc). This will be implemented later. (Strictly speaking, censoring a file should invalidate caches. I've added an inline TODO to track this edge case.) To prove it works, this commit implements a test-only extension providing in-memory caching backed by an lrucachedict. A new test showing this extension behaving properly is added. FWIW, the cacher is ~50 lines of code, demonstrating the relative ease with which a cache can be added to a server. While the test cacher is not suitable for production workloads, just for kicks I performed a clone of just the changeset and manifest data for the mozilla-unified repository. With a fully warmed cache (of just the manifest data since changeset data is not cached), server-side CPU usage dropped from ~73s to ~28s. That's pretty significant and demonstrates the potential that response caching has on server scalability! Differential Revision: https://phab.mercurial-scm.org/D4773
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
40022
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     1
# wireprotosimplecache.py - Extension providing in-memory wire protocol cache
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     2
#
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     3
# Copyright 2018 Gregory Szorc <gregory.szorc@gmail.com>
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     4
#
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     5
# This software may be used and distributed according to the terms of the
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     6
# GNU General Public License version 2 or any later version.
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     7
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     8
from __future__ import absolute_import
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     9
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    10
from mercurial import (
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    11
    extensions,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    12
    registrar,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    13
    repository,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    14
    util,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    15
    wireprototypes,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    16
    wireprotov2server,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    17
)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    18
from mercurial.utils import (
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    19
    interfaceutil,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    20
)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    21
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    22
CACHE = None
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    23
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    24
configtable = {}
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    25
configitem = registrar.configitem(configtable)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    26
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    27
configitem('simplecache', 'cacheobjects',
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    28
           default=False)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    29
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    30
@interfaceutil.implementer(repository.iwireprotocolcommandcacher)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    31
class memorycacher(object):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    32
    def __init__(self, ui, command, encodefn):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    33
        self.ui = ui
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    34
        self.encodefn = encodefn
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    35
        self.key = None
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    36
        self.cacheobjects = ui.configbool('simplecache', 'cacheobjects')
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    37
        self.buffered = []
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    38
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    39
        ui.log('simplecache', 'cacher constructed for %s\n', command)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    40
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    41
    def __enter__(self):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    42
        return self
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    43
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    44
    def __exit__(self, exctype, excvalue, exctb):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    45
        if exctype:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    46
            self.ui.log('simplecache', 'cacher exiting due to error\n')
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    47
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    48
    def adjustcachekeystate(self, state):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    49
        # Needed in order to make tests deterministic. Don't copy this
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    50
        # pattern for production caches!
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    51
        del state[b'repo']
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    52
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    53
    def setcachekey(self, key):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    54
        self.key = key
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    55
        return True
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    56
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    57
    def lookup(self):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    58
        if self.key not in CACHE:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    59
            self.ui.log('simplecache', 'cache miss for %s\n', self.key)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    60
            return None
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    61
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    62
        entry = CACHE[self.key]
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    63
        self.ui.log('simplecache', 'cache hit for %s\n', self.key)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    64
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    65
        if self.cacheobjects:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    66
            return {
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    67
                'objs': entry,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    68
            }
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    69
        else:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    70
            return {
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    71
                'objs': [wireprototypes.encodedresponse(entry)],
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    72
            }
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    73
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    74
    def onobject(self, obj):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    75
        if self.cacheobjects:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    76
            self.buffered.append(obj)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    77
        else:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    78
            self.buffered.extend(self.encodefn(obj))
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    79
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    80
        yield obj
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    81
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    82
    def onfinished(self):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    83
        self.ui.log('simplecache', 'storing cache entry for %s\n', self.key)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    84
        if self.cacheobjects:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    85
            CACHE[self.key] = self.buffered
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    86
        else:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    87
            CACHE[self.key] = b''.join(self.buffered)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    88
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    89
        return []
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    90
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    91
def makeresponsecacher(orig, repo, proto, command, args, objencoderfn):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    92
    return memorycacher(repo.ui, command, objencoderfn)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    93
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    94
def extsetup(ui):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    95
    global CACHE
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    96
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    97
    CACHE = util.lrucachedict(10000)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    98
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    99
    extensions.wrapfunction(wireprotov2server, 'makeresponsecacher',
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   100
                            makeresponsecacher)