Mercurial > hg
view hgext/acl.py @ 40021:c537144fdbef
wireprotov2: support response caching
One of the things I've learned from managing VCS servers over the
years is that they are hard to scale. It is well known that some
companies have very beefy (read: very expensive) servers to power
their VCS needs. It is also known that specialized servers for
various VCS exist in order to facilitate scaling servers. (Mercurial
is in this boat.)
One of the aspects that make a VCS server hard to scale is the
high CPU load incurred by constant client clone/pull operations.
To alleviate the scaling pain associated with data retrieval
operations, I want to integrate caching into the Mercurial wire
protocol server as robustly as possible such that servers can
aggressively cache responses and defer as much server load as
possible.
This commit represents the initial implementation of a general
caching layer in wire protocol version 2.
We define a new interface and behavior for a wire protocol cacher
in repository.py. (This is probably where a reviewer should look
first to understand what is going on.)
The bulk of the added code is in wireprotov2server.py, where we
define how a command can opt in to being cached and integrate
caching into command dispatching.
From a very high-level:
* A command can declare itself as cacheable by providing a callable
that can be used to derive a cache key.
* At dispatch time, if a command is cacheable, we attempt to
construct a cacher and use it for serving the request and/or
caching the request.
* The dispatch layer handles the bulk of the business logic for
caching, making cachers mostly "dumb content stores."
* The mechanism for invalidating cached entries (one of the harder
parts about caching in general) is by varying the cache key when
state changes. As such, cachers don't need to be concerned with
cache invalidation.
Initially, we've hooked up support for caching "manifestdata" and
"filedata" commands. These are the simplest to cache, as they should
be immutable over time. Caching of commands related to changeset
data is a bit harder (because cache validation is impacted by
changes to bookmarks, phases, etc). This will be implemented later.
(Strictly speaking, censoring a file should invalidate caches. I've
added an inline TODO to track this edge case.)
To prove it works, this commit implements a test-only extension
providing in-memory caching backed by an lrucachedict. A new test
showing this extension behaving properly is added. FWIW, the
cacher is ~50 lines of code, demonstrating the relative ease with
which a cache can be added to a server.
While the test cacher is not suitable for production workloads, just
for kicks I performed a clone of just the changeset and manifest data
for the mozilla-unified repository. With a fully warmed cache (of just
the manifest data since changeset data is not cached), server-side
CPU usage dropped from ~73s to ~28s. That's pretty significant and
demonstrates the potential that response caching has on server
scalability!
Differential Revision: https://phab.mercurial-scm.org/D4773
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Wed, 26 Sep 2018 17:16:56 -0700 |
parents | 3694c9aaf5e4 |
children | aaad36b88298 |
line wrap: on
line source
# acl.py - changeset access control for mercurial # # Copyright 2006 Vadim Gelfer <vadim.gelfer@gmail.com> # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. '''hooks for controlling repository access This hook makes it possible to allow or deny write access to given branches and paths of a repository when receiving incoming changesets via pretxnchangegroup and pretxncommit. The authorization is matched based on the local user name on the system where the hook runs, and not the committer of the original changeset (since the latter is merely informative). The acl hook is best used along with a restricted shell like hgsh, preventing authenticating users from doing anything other than pushing or pulling. The hook is not safe to use if users have interactive shell access, as they can then disable the hook. Nor is it safe if remote users share an account, because then there is no way to distinguish them. The order in which access checks are performed is: 1) Deny list for branches (section ``acl.deny.branches``) 2) Allow list for branches (section ``acl.allow.branches``) 3) Deny list for paths (section ``acl.deny``) 4) Allow list for paths (section ``acl.allow``) The allow and deny sections take key-value pairs. Branch-based Access Control --------------------------- Use the ``acl.deny.branches`` and ``acl.allow.branches`` sections to have branch-based access control. Keys in these sections can be either: - a branch name, or - an asterisk, to match any branch; The corresponding values can be either: - a comma-separated list containing users and groups, or - an asterisk, to match anyone; You can add the "!" prefix to a user or group name to invert the sense of the match. Path-based Access Control ------------------------- Use the ``acl.deny`` and ``acl.allow`` sections to have path-based access control. Keys in these sections accept a subtree pattern (with a glob syntax by default). The corresponding values follow the same syntax as the other sections above. Bookmark-based Access Control ----------------------------- Use the ``acl.deny.bookmarks`` and ``acl.allow.bookmarks`` sections to have bookmark-based access control. Keys in these sections can be either: - a bookmark name, or - an asterisk, to match any bookmark; The corresponding values can be either: - a comma-separated list containing users and groups, or - an asterisk, to match anyone; You can add the "!" prefix to a user or group name to invert the sense of the match. Note: for interactions between clients and servers using Mercurial 3.6+ a rejection will generally reject the entire push, for interactions involving older clients, the commit transactions will already be accepted, and only the bookmark movement will be rejected. Groups ------ Group names must be prefixed with an ``@`` symbol. Specifying a group name has the same effect as specifying all the users in that group. You can define group members in the ``acl.groups`` section. If a group name is not defined there, and Mercurial is running under a Unix-like system, the list of users will be taken from the OS. Otherwise, an exception will be raised. Example Configuration --------------------- :: [hooks] # Use this if you want to check access restrictions at commit time pretxncommit.acl = python:hgext.acl.hook # Use this if you want to check access restrictions for pull, push, # bundle and serve. pretxnchangegroup.acl = python:hgext.acl.hook [acl] # Allow or deny access for incoming changes only if their source is # listed here, let them pass otherwise. Source is "serve" for all # remote access (http or ssh), "push", "pull" or "bundle" when the # related commands are run locally. # Default: serve sources = serve [acl.deny.branches] # Everyone is denied to the frozen branch: frozen-branch = * # A bad user is denied on all branches: * = bad-user [acl.allow.branches] # A few users are allowed on branch-a: branch-a = user-1, user-2, user-3 # Only one user is allowed on branch-b: branch-b = user-1 # The super user is allowed on any branch: * = super-user # Everyone is allowed on branch-for-tests: branch-for-tests = * [acl.deny] # This list is checked first. If a match is found, acl.allow is not # checked. All users are granted access if acl.deny is not present. # Format for both lists: glob pattern = user, ..., @group, ... # To match everyone, use an asterisk for the user: # my/glob/pattern = * # user6 will not have write access to any file: ** = user6 # Group "hg-denied" will not have write access to any file: ** = @hg-denied # Nobody will be able to change "DONT-TOUCH-THIS.txt", despite # everyone being able to change all other files. See below. src/main/resources/DONT-TOUCH-THIS.txt = * [acl.allow] # if acl.allow is not present, all users are allowed by default # empty acl.allow = no users allowed # User "doc_writer" has write access to any file under the "docs" # folder: docs/** = doc_writer # User "jack" and group "designers" have write access to any file # under the "images" folder: images/** = jack, @designers # Everyone (except for "user6" and "@hg-denied" - see acl.deny above) # will have write access to any file under the "resources" folder # (except for 1 file. See acl.deny): src/main/resources/** = * .hgtags = release_engineer Examples using the "!" prefix ............................. Suppose there's a branch that only a given user (or group) should be able to push to, and you don't want to restrict access to any other branch that may be created. The "!" prefix allows you to prevent anyone except a given user or group to push changesets in a given branch or path. In the examples below, we will: 1) Deny access to branch "ring" to anyone but user "gollum" 2) Deny access to branch "lake" to anyone but members of the group "hobbit" 3) Deny access to a file to anyone but user "gollum" :: [acl.allow.branches] # Empty [acl.deny.branches] # 1) only 'gollum' can commit to branch 'ring'; # 'gollum' and anyone else can still commit to any other branch. ring = !gollum # 2) only members of the group 'hobbit' can commit to branch 'lake'; # 'hobbit' members and anyone else can still commit to any other branch. lake = !@hobbit # You can also deny access based on file paths: [acl.allow] # Empty [acl.deny] # 3) only 'gollum' can change the file below; # 'gollum' and anyone else can still change any other file. /misty/mountains/cave/ring = !gollum ''' from __future__ import absolute_import from mercurial.i18n import _ from mercurial import ( error, extensions, match, pycompat, registrar, util, ) from mercurial.utils import ( procutil, ) urlreq = util.urlreq # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should # be specifying the version(s) of Mercurial they are tested with, or # leave the attribute unspecified. testedwith = 'ships-with-hg-core' configtable = {} configitem = registrar.configitem(configtable) # deprecated config: acl.config configitem('acl', 'config', default=None, ) configitem('acl.groups', '.*', default=None, generic=True, ) configitem('acl.deny.branches', '.*', default=None, generic=True, ) configitem('acl.allow.branches', '.*', default=None, generic=True, ) configitem('acl.deny', '.*', default=None, generic=True, ) configitem('acl.allow', '.*', default=None, generic=True, ) configitem('acl', 'sources', default=lambda: ['serve'], ) def _getusers(ui, group): # First, try to use group definition from section [acl.groups] hgrcusers = ui.configlist('acl.groups', group) if hgrcusers: return hgrcusers ui.debug('acl: "%s" not defined in [acl.groups]\n' % group) # If no users found in group definition, get users from OS-level group try: return util.groupmembers(group) except KeyError: raise error.Abort(_("group '%s' is undefined") % group) def _usermatch(ui, user, usersorgroups): if usersorgroups == '*': return True for ug in usersorgroups.replace(',', ' ').split(): if ug.startswith('!'): # Test for excluded user or group. Format: # if ug is a user name: !username # if ug is a group name: !@groupname ug = ug[1:] if not ug.startswith('@') and user != ug \ or ug.startswith('@') and user not in _getusers(ui, ug[1:]): return True # Test for user or group. Format: # if ug is a user name: username # if ug is a group name: @groupname elif user == ug \ or ug.startswith('@') and user in _getusers(ui, ug[1:]): return True return False def buildmatch(ui, repo, user, key): '''return tuple of (match function, list enabled).''' if not ui.has_section(key): ui.debug('acl: %s not enabled\n' % key) return None pats = [pat for pat, users in ui.configitems(key) if _usermatch(ui, user, users)] ui.debug('acl: %s enabled, %d entries for user %s\n' % (key, len(pats), user)) # Branch-based ACL if not repo: if pats: # If there's an asterisk (meaning "any branch"), always return True; # Otherwise, test if b is in pats if '*' in pats: return util.always return lambda b: b in pats return util.never # Path-based ACL if pats: return match.match(repo.root, '', pats) return util.never def ensureenabled(ui): """make sure the extension is enabled when used as hook When acl is used through hooks, the extension is never formally loaded and enabled. This has some side effect, for example the config declaration is never loaded. This function ensure the extension is enabled when running hooks. """ if 'acl' in ui._knownconfig: return ui.setconfig('extensions', 'acl', '', source='internal') extensions.loadall(ui, ['acl']) def hook(ui, repo, hooktype, node=None, source=None, **kwargs): ensureenabled(ui) if hooktype not in ['pretxnchangegroup', 'pretxncommit', 'prepushkey']: raise error.Abort( _('config error - hook type "%s" cannot stop ' 'incoming changesets, commits, nor bookmarks') % hooktype) if (hooktype == 'pretxnchangegroup' and source not in ui.configlist('acl', 'sources')): ui.debug('acl: changes have source "%s" - skipping\n' % source) return user = None if source == 'serve' and r'url' in kwargs: url = kwargs[r'url'].split(':') if url[0] == 'remote' and url[1].startswith('http'): user = urlreq.unquote(url[3]) if user is None: user = procutil.getuser() ui.debug('acl: checking access for user "%s"\n' % user) if hooktype == 'prepushkey': _pkhook(ui, repo, hooktype, node, source, user, **kwargs) else: _txnhook(ui, repo, hooktype, node, source, user, **kwargs) def _pkhook(ui, repo, hooktype, node, source, user, **kwargs): if kwargs[r'namespace'] == 'bookmarks': bookmark = kwargs[r'key'] ctx = kwargs[r'new'] allowbookmarks = buildmatch(ui, None, user, 'acl.allow.bookmarks') denybookmarks = buildmatch(ui, None, user, 'acl.deny.bookmarks') if denybookmarks and denybookmarks(bookmark): raise error.Abort(_('acl: user "%s" denied on bookmark "%s"' ' (changeset "%s")') % (user, bookmark, ctx)) if allowbookmarks and not allowbookmarks(bookmark): raise error.Abort(_('acl: user "%s" not allowed on bookmark "%s"' ' (changeset "%s")') % (user, bookmark, ctx)) ui.debug('acl: bookmark access granted: "%s" on bookmark "%s"\n' % (ctx, bookmark)) def _txnhook(ui, repo, hooktype, node, source, user, **kwargs): # deprecated config: acl.config cfg = ui.config('acl', 'config') if cfg: ui.readconfig(cfg, sections=['acl.groups', 'acl.allow.branches', 'acl.deny.branches', 'acl.allow', 'acl.deny']) allowbranches = buildmatch(ui, None, user, 'acl.allow.branches') denybranches = buildmatch(ui, None, user, 'acl.deny.branches') allow = buildmatch(ui, repo, user, 'acl.allow') deny = buildmatch(ui, repo, user, 'acl.deny') for rev in pycompat.xrange(repo[node].rev(), len(repo)): ctx = repo[rev] branch = ctx.branch() if denybranches and denybranches(branch): raise error.Abort(_('acl: user "%s" denied on branch "%s"' ' (changeset "%s")') % (user, branch, ctx)) if allowbranches and not allowbranches(branch): raise error.Abort(_('acl: user "%s" not allowed on branch "%s"' ' (changeset "%s")') % (user, branch, ctx)) ui.debug('acl: branch access granted: "%s" on branch "%s"\n' % (ctx, branch)) for f in ctx.files(): if deny and deny(f): raise error.Abort(_('acl: user "%s" denied on "%s"' ' (changeset "%s")') % (user, f, ctx)) if allow and not allow(f): raise error.Abort(_('acl: user "%s" not allowed on "%s"' ' (changeset "%s")') % (user, f, ctx)) ui.debug('acl: path access granted: "%s"\n' % ctx)