view hgext/uncommit.py @ 38732:be4984261611
merge: mark file gets as not thread safe (issue5933)
In default installs, this has the effect of disabling the thread-based
worker on Windows when manifesting files in the working directory. My
measurements have shown that with revlog-based repositories, Mercurial
spends a lot of CPU time in revlog code resolving file data. This ends
up incurring a lot of context switching across threads and slows down
`hg update` operations when going from an empty working directory to
the tip of the repo.
On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):

  before: 487s wall
  after:  360s wall (equivalent to worker.enabled=false)
  cpus=2: 379s wall

Even with only 2 threads, the thread pool is still slower.
The introduction of the thread-based worker (02b36e860e0b) states that
it resulted in a "~50%" speedup for `hg sparse --enable-profile` and
`hg sparse --disable-profile`. This disagrees with my measurement
above. I theorize a few reasons for this:
1) Removal of files from the working directory is I/O - not CPU - bound
and should benefit from a thread pool (unless I/O is insanely fast
and the GIL release is near instantaneous). So tests like `hg sparse
--enable-profile` may exercise deletion throughput and aren't good
benchmarks for worker tasks that are CPU heavy.
2) The patch was authored by someone at Facebook. The results were
likely measured against a repository using remotefilelog. And I
believe that revision retrieval during working directory updates with
remotefilelog will often use a remote store, thus being I/O and not
CPU bound. This probably resulted in an overstated performance gain.
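The distinction drawn in the two points above, between CPU-bound and I/O-bound worker tasks, can be illustrated with a small standalone sketch. It is not Mercurial code: `cpu_task`, `io_task`, `run_serial`, `run_threaded` and `timed` are hypothetical names, `time.sleep` stands in for blocking I/O, and the example assumes CPython 3 with its global interpreter lock. On such an interpreter the threaded CPU run is not expected to beat the serial one, while the threaded I/O run usually does. Actual timings vary by machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_task(n):
    """Pure-Python arithmetic: the thread holds the GIL while it computes."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def io_task(delay):
    """Stand-in for blocking I/O: time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return delay


def run_serial(func, args):
    """Run every task one after another in the calling thread."""
    return [func(a) for a in args]


def run_threaded(func, args, threads=2):
    """Run the tasks on a small thread pool, preserving argument order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(func, args))


def timed(runner, func, args):
    """Return the wall-clock seconds taken by runner(func, args)."""
    start = time.perf_counter()
    runner(func, args)
    return time.perf_counter() - start


if __name__ == "__main__":
    workloads = [
        ("cpu", cpu_task, [500000] * 4),
        ("io", io_task, [0.1] * 4),
    ]
    for name, func, args in workloads:
        print(name,
              "serial %.2fs" % timed(run_serial, func, args),
              "threaded %.2fs" % timed(run_threaded, func, args))
```

A benchmark dominated by file deletion resembles `io_task`, whereas resolving file data from revlogs resembles `cpu_task`, which is one way the two measurements could disagree.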
Since there appears to be a need to enable the thread-based worker with
some stores, I've made the flagging of file gets as thread safe
configurable. I've made it experimental because I don't want to formalize
a boolean flag for this option and because this attribute is best
captured against the store implementation. But we don't have a proper
store API for this yet. I'd rather cross this bridge later.
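As a rough sketch of the gating described above, the decision could look something like the following. This is an illustration under stated assumptions, not the code from this changeset: `use_parallel_worker`, its parameters, and the `file-gets-thread-safe` key are all hypothetical names, and the real option name is not given in this commit message.

```python
def use_parallel_worker(enabled, threadsafe, thread_based_platform):
    """Decide whether tasks should go to a parallel worker.

    All parameters are hypothetical stand-ins:
    enabled               -- value of the worker.enabled setting
    threadsafe            -- whether the caller marked its task thread safe
    thread_based_platform -- True where the worker uses threads (Windows)
    """
    if not enabled:
        return False
    if thread_based_platform and not threadsafe:
        # Work flagged as not thread safe runs serially on thread-based workers.
        return False
    return True


def file_gets_threadsafe(experimental_config):
    """Read the (hypothetically named) experimental flag, defaulting to False."""
    return bool(experimental_config.get("file-gets-thread-safe", False))


if __name__ == "__main__":
    default_install = {}
    opted_in = {"file-gets-thread-safe": True}
    for label, cfg in [("default", default_install), ("opted in", opted_in)]:
        safe = file_gets_threadsafe(cfg)
        print(label, "windows:", use_parallel_worker(True, safe, True),
              "other:", use_parallel_worker(True, safe, False))
```

With the flag left at its default, only the thread-based platform falls back to serial execution, matching the "default installs" behavior described at the top of the message.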
It is possible there are revlog-based repositories that do benefit from
a thread-based worker. I didn't do very comprehensive testing. If there
are, we may want to devise a more proper algorithm for whether to use
the thread-based worker, including possibly config options to limit the
number of threads to use. But until I see evidence that justifies
complexity, simplicity wins.
Differential Revision: https://phab.mercurial-scm.org/D3963
| field | value |
|---|---|
| author | Gregory Szorc <gregory.szorc@gmail.com> |
| date | Wed, 18 Jul 2018 09:49:34 -0700 |
| parents | 32fba6fe893d |
| children | 3bd22c4d3711 |
# uncommit - undo the actions of a commit
#
# Copyright 2011 Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
#                Logilab SA        <contact@logilab.fr>
#                Pierre-Yves David <pierre-yves.david@ens-lyon.org>
#                Patrick Mezard <patrick@mezard.eu>
# Copyright 2016 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

"""uncommit part or all of a local changeset (EXPERIMENTAL)

This command undoes the effect of a local commit, returning the affected
files to their uncommitted state. This means that files modified, added or
removed in the changeset will be left unchanged, and so will remain modified,
added and removed in the working directory.
"""

from __future__ import absolute_import

from mercurial.i18n import _

from mercurial import (
    cmdutil,
    commands,
    context,
    copies,
    error,
    node,
    obsutil,
    pycompat,
    registrar,
    rewriteutil,
    scmutil,
)

cmdtable = {}
command = registrar.command(cmdtable)

configtable = {}
configitem = registrar.configitem(configtable)

configitem('experimental', 'uncommitondirtywdir',
    default=False,
)

# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
testedwith = 'ships-with-hg-core'

def _commitfiltered(repo, ctx, match, keepcommit):
    """Recommit ctx with changed files not in match. Return the new
    node identifier, or None if nothing changed.
    """
    base = ctx.p1()
    # ctx
    initialfiles = set(ctx.files())
    exclude = set(f for f in initialfiles if match(f))

    # No files matched commit, so nothing excluded
    if not exclude:
        return None

    files = (initialfiles - exclude)
    # return the p1 so that we don't create an obsmarker later
    if not keepcommit:
        return ctx.parents()[0].node()

    # Filter copies
    copied = copies.pathcopies(base, ctx)
    copied = dict((dst, src) for dst, src in copied.iteritems()
                  if dst in files)
    def filectxfn(repo, memctx, path, contentctx=ctx, redirect=()):
        if path not in contentctx:
            return None
        fctx = contentctx[path]
        mctx = context.memfilectx(repo, memctx, fctx.path(), fctx.data(),
                                  fctx.islink(),
                                  fctx.isexec(),
                                  copied=copied.get(path))
        return mctx

    new = context.memctx(repo,
                         parents=[base.node(), node.nullid],
                         text=ctx.description(),
                         files=files,
                         filectxfn=filectxfn,
                         user=ctx.user(),
                         date=ctx.date(),
                         extra=ctx.extra())
    return repo.commitctx(new)

def _fixdirstate(repo, oldctx, newctx, status):
    """ fix the dirstate after switching the working directory from oldctx to
    newctx which can be result of either unamend or uncommit.
    """
    ds = repo.dirstate
    copies = dict(ds.copies())
    s = status
    for f in s.modified:
        if ds[f] == 'r':
            # modified + removed -> removed
            continue
        ds.normallookup(f)

    for f in s.added:
        if ds[f] == 'r':
            # added + removed -> unknown
            ds.drop(f)
        elif ds[f] != 'a':
            ds.add(f)

    for f in s.removed:
        if ds[f] == 'a':
            # removed + added -> normal
            ds.normallookup(f)
        elif ds[f] != 'r':
            ds.remove(f)

    # Merge old parent and old working dir copies
    oldcopies = {}
    for f in (s.modified + s.added):
        src = oldctx[f].renamed()
        if src:
            oldcopies[f] = src[0]
    oldcopies.update(copies)
    copies = dict((dst, oldcopies.get(src, src))
                  for dst, src in oldcopies.iteritems())
    # Adjust the dirstate copies
    for dst, src in copies.iteritems():
        if (src not in newctx or dst in newctx or ds[dst] != 'a'):
            src = None
        ds.copy(src, dst)

@command('uncommit',
    [('', 'keep', False, _('allow an empty commit after uncommiting')),
    ] + commands.walkopts,
    _('[OPTION]... [FILE]...'))
def uncommit(ui, repo, *pats, **opts):
    """uncommit part or all of a local changeset

    This command undoes the effect of a local commit, returning the affected
    files to their uncommitted state. This means that files modified or
    deleted in the changeset will be left unchanged, and so will remain
    modified in the working directory.

    If no files are specified, the commit will be pruned, unless --keep is
    given.
    """
    opts = pycompat.byteskwargs(opts)

    with repo.wlock(), repo.lock():

        if not pats and not repo.ui.configbool('experimental',
                                                'uncommitondirtywdir'):
            cmdutil.bailifchanged(repo)
        old = repo['.']
        rewriteutil.precheck(repo, [old.rev()], 'uncommit')
        if len(old.parents()) > 1:
            raise error.Abort(_("cannot uncommit merge changeset"))

        with repo.transaction('uncommit'):
            match = scmutil.match(old, pats, opts)
            keepcommit = opts.get('keep') or pats
            newid = _commitfiltered(repo, old, match, keepcommit)
            if newid is None:
                ui.status(_("nothing to uncommit\n"))
                return 1

            mapping = {}
            if newid != old.p1().node():
                # Move local changes on filtered changeset
                mapping[old.node()] = (newid,)
            else:
                # Fully removed the old commit
                mapping[old.node()] = ()

            scmutil.cleanupnodes(repo, mapping, 'uncommit', fixphase=True)

            with repo.dirstate.parentchange():
                repo.dirstate.setparents(newid, node.nullid)
                s = repo.status(old.p1(), old, match=match)
                _fixdirstate(repo, old, repo[newid], s)

def predecessormarkers(ctx):
    """yields the obsolete markers marking the given changeset as a successor"""
    for data in ctx.repo().obsstore.predecessors.get(ctx.node(), ()):
        yield obsutil.marker(ctx.repo(), data)

@command('^unamend', [])
def unamend(ui, repo, **opts):
    """undo the most recent amend operation on a current changeset

    This command will roll back to the previous version of a changeset,
    leaving working directory in state in which it was before running
    `hg amend` (e.g. files modified as part of an amend will be
    marked as modified `hg status`)
    """

    unfi = repo.unfiltered()
    with repo.wlock(), repo.lock(), repo.transaction('unamend'):

        # identify the commit from which to unamend
        curctx = repo['.']

        rewriteutil.precheck(repo, [curctx.rev()], 'unamend')

        # identify the commit to which to unamend
        markers = list(predecessormarkers(curctx))
        if len(markers) != 1:
            e = _("changeset must have one predecessor, found %i predecessors")
            raise error.Abort(e % len(markers))

        prednode = markers[0].prednode()
        predctx = unfi[prednode]

        # add an extra so that we get a new hash
        # note: allowing unamend to undo an unamend is an intentional feature
        extras = predctx.extra()
        extras['unamend_source'] = curctx.hex()

        def filectxfn(repo, ctx_, path):
            try:
                return predctx.filectx(path)
            except KeyError:
                return None

        # Make a new commit same as predctx
        newctx = context.memctx(repo,
                                parents=(predctx.p1(), predctx.p2()),
                                text=predctx.description(),
                                files=predctx.files(),
                                filectxfn=filectxfn,
                                user=predctx.user(),
                                date=predctx.date(),
                                extra=extras)
        newprednode = repo.commitctx(newctx)
        newpredctx = repo[newprednode]
        dirstate = repo.dirstate

        with dirstate.parentchange():
            dirstate.setparents(newprednode, node.nullid)
            s = repo.status(predctx, curctx)
            _fixdirstate(repo, curctx, newpredctx, s)

        mapping = {curctx.node(): (newprednode,)}
        scmutil.cleanupnodes(repo, mapping, 'unamend', fixphase=True)
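
The per-file state transitions in the three status loops of `_fixdirstate` above can be traced with a toy model. `ToyDirstate` and `fix_states` are hypothetical names introduced only for this sketch; the class is an in-memory stand-in that mimics just the handful of dirstate methods the loops call and assumes `'?'` is reported for untracked files. Copy handling is left out.

```python
class ToyDirstate(object):
    """Hypothetical in-memory stand-in for a dirstate; not Mercurial's class."""

    def __init__(self, states):
        self.states = dict(states)

    def __getitem__(self, f):
        # Assumption: untracked files report '?'.
        return self.states.get(f, '?')

    def normallookup(self, f):
        self.states[f] = 'n'

    def add(self, f):
        self.states[f] = 'a'

    def remove(self, f):
        self.states[f] = 'r'

    def drop(self, f):
        self.states.pop(f, None)


def fix_states(ds, modified, added, removed):
    """Apply the same transitions as the three status loops in _fixdirstate."""
    for f in modified:
        if ds[f] == 'r':
            # modified + removed -> removed
            continue
        ds.normallookup(f)
    for f in added:
        if ds[f] == 'r':
            # added + removed -> unknown
            ds.drop(f)
        elif ds[f] != 'a':
            ds.add(f)
    for f in removed:
        if ds[f] == 'a':
            # removed + added -> normal
            ds.normallookup(f)
        elif ds[f] != 'r':
            ds.remove(f)


if __name__ == "__main__":
    # Keys are file names; values are the working-directory state beforehand.
    ds = ToyDirstate({'mod': 'n', 'mod-then-rm': 'r',
                      'new': 'n', 'new-then-rm': 'r',
                      'gone': 'n', 'gone-then-added': 'a'})
    fix_states(ds,
               modified=['mod', 'mod-then-rm'],
               added=['new', 'new-then-rm'],
               removed=['gone', 'gone-then-added'])
    print(sorted(ds.states.items()))
```

The status lists describe what the uncommitted changeset did, while the states held by the dirstate describe pending working-directory changes, so each branch combines the two into a single resulting state.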