annotate tests/wireprotosimplecache.py @ 40021:c537144fdbef

wireprotov2: support response caching One of the things I've learned from managing VCS servers over the years is that they are hard to scale. It is well known that some companies have very beefy (read: very expensive) servers to power their VCS needs. It is also known that specialized servers for various VCS exist in order to facilitate scaling servers. (Mercurial is in this boat.) One of the aspects that make a VCS server hard to scale is the high CPU load incurred by constant client clone/pull operations. To alleviate the scaling pain associated with data retrieval operations, I want to integrate caching into the Mercurial wire protocol server as robustly as possible such that servers can aggressively cache responses and defer as much server load as possible. This commit represents the initial implementation of a general caching layer in wire protocol version 2. We define a new interface and behavior for a wire protocol cacher in repository.py. (This is probably where a reviewer should look first to understand what is going on.) The bulk of the added code is in wireprotov2server.py, where we define how a command can opt in to being cached and integrate caching into command dispatching. From a very high-level: * A command can declare itself as cacheable by providing a callable that can be used to derive a cache key. * At dispatch time, if a command is cacheable, we attempt to construct a cacher and use it for serving the request and/or caching the request. * The dispatch layer handles the bulk of the business logic for caching, making cachers mostly "dumb content stores." * The mechanism for invalidating cached entries (one of the harder parts about caching in general) is by varying the cache key when state changes. As such, cachers don't need to be concerned with cache invalidation. Initially, we've hooked up support for caching "manifestdata" and "filedata" commands. These are the simplest to cache, as they should be immutable over time. Caching of commands related to changeset data is a bit harder (because cache validation is impacted by changes to bookmarks, phases, etc). This will be implemented later. (Strictly speaking, censoring a file should invalidate caches. I've added an inline TODO to track this edge case.) To prove it works, this commit implements a test-only extension providing in-memory caching backed by an lrucachedict. A new test showing this extension behaving properly is added. FWIW, the cacher is ~50 lines of code, demonstrating the relative ease with which a cache can be added to a server. While the test cacher is not suitable for production workloads, just for kicks I performed a clone of just the changeset and manifest data for the mozilla-unified repository. With a fully warmed cache (of just the manifest data since changeset data is not cached), server-side CPU usage dropped from ~73s to ~28s. That's pretty significant and demonstrates the potential that response caching has on server scalability! Differential Revision: https://phab.mercurial-scm.org/D4773
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 26 Sep 2018 17:16:56 -0700
parents
children 10cf8b116dd8
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
40021
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1 # wireprotosimplecache.py - Extension providing in-memory wire protocol cache
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
2 #
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
3 # Copyright 2018 Gregory Szorc <gregory.szorc@gmail.com>
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
4 #
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
5 # This software may be used and distributed according to the terms of the
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
6 # GNU General Public License version 2 or any later version.
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
7
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
8 from __future__ import absolute_import
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
9
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
10 from mercurial import (
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
11 extensions,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
12 registrar,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
13 repository,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
14 util,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
15 wireprototypes,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
16 wireprotov2server,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
17 )
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
18 from mercurial.utils import (
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
19 interfaceutil,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
20 )
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
21
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
22 CACHE = None
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
23
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
24 configtable = {}
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
25 configitem = registrar.configitem(configtable)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
26
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
27 configitem('simplecache', 'cacheobjects',
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
28 default=False)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
29
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
30 @interfaceutil.implementer(repository.iwireprotocolcommandcacher)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
31 class memorycacher(object):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
32 def __init__(self, ui, command, encodefn):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
33 self.ui = ui
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
34 self.encodefn = encodefn
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
35 self.key = None
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
36 self.cacheobjects = ui.configbool('simplecache', 'cacheobjects')
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
37 self.buffered = []
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
38
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
39 ui.log('simplecache', 'cacher constructed for %s\n', command)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
40
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
41 def __enter__(self):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
42 return self
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
43
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
44 def __exit__(self, exctype, excvalue, exctb):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
45 if exctype:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
46 self.ui.log('simplecache', 'cacher exiting due to error\n')
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
47
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
48 def adjustcachekeystate(self, state):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
49 # Needed in order to make tests deterministic. Don't copy this
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
50 # pattern for production caches!
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
51 del state[b'repo']
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
52
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
53 def setcachekey(self, key):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
54 self.key = key
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
55 return True
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
56
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
57 def lookup(self):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
58 if self.key not in CACHE:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
59 self.ui.log('simplecache', 'cache miss for %s\n', self.key)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
60 return None
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
61
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
62 entry = CACHE[self.key]
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
63 self.ui.log('simplecache', 'cache hit for %s\n', self.key)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
64
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
65 if self.cacheobjects:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
66 return {
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
67 'objs': entry,
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
68 }
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
69 else:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
70 return {
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
71 'objs': [wireprototypes.encodedresponse(entry)],
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
72 }
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
73
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
74 def onobject(self, obj):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
75 if self.cacheobjects:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
76 self.buffered.append(obj)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
77 else:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
78 self.buffered.extend(self.encodefn(obj))
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
79
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
80 yield obj
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
81
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
82 def onfinished(self):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
83 self.ui.log('simplecache', 'storing cache entry for %s\n', self.key)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
84 if self.cacheobjects:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
85 CACHE[self.key] = self.buffered
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
86 else:
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
87 CACHE[self.key] = b''.join(self.buffered)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
88
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
89 return []
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
90
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
91 def makeresponsecacher(orig, repo, proto, command, args, objencoderfn):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
92 return memorycacher(repo.ui, command, objencoderfn)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
93
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
94 def extsetup(ui):
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
95 global CACHE
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
96
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
97 CACHE = util.lrucachedict(10000)
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
98
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
99 extensions.wrapfunction(wireprotov2server, 'makeresponsecacher',
c537144fdbef wireprotov2: support response caching
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
100 makeresponsecacher)