Mercurial > hg
view mercurial/phases.py @ 43297:8a2925265402
sidedatacopies: fast path data fetching if revision has no sidedata
When using the side data mode, we know their won't be any copy information
sidedata. Skipping revision restoration give an important speed boost.
In the future, there will be other user of sidedata, reducing the efficiency of
this. We should consider adding a dedicated flag in revlog V2 to preserve this
optimisation. The current situation is good enough for now.
revision: large amount; added files: large amount; rename small amount; c3b14617fbd7 9ba6ab77fd29
before: ! wall 2.401569 comb 2.400000 user 2.390000 sys 0.010000 (median of 10)
after: ! wall 1.429294 comb 1.430000 user 1.410000 sys 0.020000 (median of 10)
revision: large amount; added files: small amount; rename small amount; c3b14617fbd7 f650a9b140d2
before: ! wall 3.519140 comb 3.520000 user 3.470000 sys 0.050000 (median of 10)
after: ! wall 1.963332 comb 1.960000 user 1.960000 sys 0.000000 (median of 10)
revision: large amount; added files: large amount; rename large amount; 08ea3258278e d9fa043f30c0
before: ! wall 0.593880 comb 0.600000 user 0.590000 sys 0.010000 (median of 15)
after: ! wall 0.251679 comb 0.250000 user 0.250000 sys 0.000000 (median of 38)
revision: small amount; added files: large amount; rename large amount; df6f7a526b60 a83dc6a2d56f
before: ! wall 0.013414 comb 0.020000 user 0.020000 sys 0.000000 (median of 220)
after: ! wall 0.013222 comb 0.020000 user 0.020000 sys 0.000000 (median of 223)
revision: small amount; added files: large amount; rename small amount; 4aa4e1f8e19a 169138063d63
before: ! wall 0.002711 comb 0.000000 user 0.000000 sys 0.000000 (median of 1000)
after: ! wall 0.001631 comb 0.000000 user 0.000000 sys 0.000000 (median of 1000)
revision: small amount; added files: small amount; rename small amount; 4bc173b045a6 964879152e2e
before: ! wall 0.000077 comb 0.000000 user 0.000000 sys 0.000000 (median of 12208)
after: ! wall 0.000078 comb 0.000000 user 0.000000 sys 0.000000 (median of 12012)
revision: medium amount; added files: large amount; rename medium amount; c95f1ced15f2 2c68e87c3efe
before: ! wall 0.410067 comb 0.410000 user 0.410000 sys 0.000000 (median of 23)
after: ! wall 0.207786 comb 0.200000 user 0.200000 sys 0.000000 (median of 46)
revision: medium amount; added files: medium amount; rename small amount; d343da0c55a8 d7746d32bf9d
before: ! wall 0.097004 comb 0.090000 user 0.090000 sys 0.000000 (median of 100)
after: ! wall 0.038495 comb 0.030000 user 0.030000 sys 0.000000 (median of 100)
Differential Revision: https://phab.mercurial-scm.org/D7074
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Wed, 02 Oct 2019 18:16:02 -0400 |
parents | d783f945a701 |
children | 9970412d2ce3 |
line wrap: on
line source
""" Mercurial phases support code --- Copyright 2011 Pierre-Yves David <pierre-yves.david@ens-lyon.org> Logilab SA <contact@logilab.fr> Augie Fackler <durin42@gmail.com> This software may be used and distributed according to the terms of the GNU General Public License version 2 or any later version. --- This module implements most phase logic in mercurial. Basic Concept ============= A 'changeset phase' is an indicator that tells us how a changeset is manipulated and communicated. The details of each phase is described below, here we describe the properties they have in common. Like bookmarks, phases are not stored in history and thus are not permanent and leave no audit trail. First, no changeset can be in two phases at once. Phases are ordered, so they can be considered from lowest to highest. The default, lowest phase is 'public' - this is the normal phase of existing changesets. A child changeset can not be in a lower phase than its parents. These phases share a hierarchy of traits: immutable shared public: X X draft: X secret: Local commits are draft by default. Phase Movement and Exchange =========================== Phase data is exchanged by pushkey on pull and push. Some servers have a publish option set, we call such a server a "publishing server". Pushing a draft changeset to a publishing server changes the phase to public. A small list of fact/rules define the exchange of phase: * old client never changes server states * pull never changes server states * publish and old server changesets are seen as public by client * any secret changeset seen in another repository is lowered to at least draft Here is the final table summing up the 49 possible use cases of phase exchange: server old publish non-publish N X N D P N D P old client pull N - X/X - X/D X/P - X/D X/P X - X/X - X/D X/P - X/D X/P push X X/X X/X X/P X/P X/P X/D X/D X/P new client pull N - P/X - P/D P/P - D/D P/P D - P/X - P/D P/P - D/D P/P P - P/X - P/D P/P - P/D P/P push D P/X P/X P/P P/P P/P D/D D/D P/P P P/X P/X P/P P/P P/P P/P P/P P/P Legend: A/B = final state on client / state on server * N = new/not present, * P = public, * D = draft, * X = not tracked (i.e., the old client or server has no internal way of recording the phase.) passive = only pushes A cell here can be read like this: "When a new client pushes a draft changeset (D) to a publishing server where it's not present (N), it's marked public on both sides (P/P)." Note: old client behave as a publishing server with draft only content - other people see it as public - content is pushed as draft """ from __future__ import absolute_import import errno import struct from .i18n import _ from .node import ( bin, hex, nullid, nullrev, short, ) from .pycompat import ( getattr, setattr, ) from . import ( error, pycompat, smartset, txnutil, util, ) _fphasesentry = struct.Struct(b'>i20s') INTERNAL_FLAG = 64 # Phases for mercurial internal usage only HIDEABLE_FLAG = 32 # Phases that are hideable # record phase index public, draft, secret = range(3) internal = INTERNAL_FLAG | HIDEABLE_FLAG archived = HIDEABLE_FLAG allphases = range(internal + 1) trackedphases = allphases[1:] # record phase names cmdphasenames = [b'public', b'draft', b'secret'] # known to `hg phase` command phasenames = [None] * len(allphases) phasenames[: len(cmdphasenames)] = cmdphasenames phasenames[archived] = b'archived' phasenames[internal] = b'internal' # record phase property mutablephases = tuple(allphases[1:]) remotehiddenphases = tuple(allphases[2:]) localhiddenphases = tuple(p for p in allphases if p & HIDEABLE_FLAG) def supportinternal(repo): """True if the internal phase can be used on a repository""" return b'internal-phase' in repo.requirements def _readroots(repo, phasedefaults=None): """Read phase roots from disk phasedefaults is a list of fn(repo, roots) callable, which are executed if the phase roots file does not exist. When phases are being initialized on an existing repository, this could be used to set selected changesets phase to something else than public. Return (roots, dirty) where dirty is true if roots differ from what is being stored. """ repo = repo.unfiltered() dirty = False roots = [set() for i in allphases] try: f, pending = txnutil.trypending(repo.root, repo.svfs, b'phaseroots') try: for line in f: phase, nh = line.split() roots[int(phase)].add(bin(nh)) finally: f.close() except IOError as inst: if inst.errno != errno.ENOENT: raise if phasedefaults: for f in phasedefaults: roots = f(repo, roots) dirty = True return roots, dirty def binaryencode(phasemapping): """encode a 'phase -> nodes' mapping into a binary stream Since phases are integer the mapping is actually a python list: [[PUBLIC_HEADS], [DRAFTS_HEADS], [SECRET_HEADS]] """ binarydata = [] for phase, nodes in enumerate(phasemapping): for head in nodes: binarydata.append(_fphasesentry.pack(phase, head)) return b''.join(binarydata) def binarydecode(stream): """decode a binary stream into a 'phase -> nodes' mapping Since phases are integer the mapping is actually a python list.""" headsbyphase = [[] for i in allphases] entrysize = _fphasesentry.size while True: entry = stream.read(entrysize) if len(entry) < entrysize: if entry: raise error.Abort(_(b'bad phase-heads stream')) break phase, node = _fphasesentry.unpack(entry) headsbyphase[phase].append(node) return headsbyphase def _trackphasechange(data, rev, old, new): """add a phase move the <data> dictionnary If data is None, nothing happens. """ if data is None: return existing = data.get(rev) if existing is not None: old = existing[0] data[rev] = (old, new) class phasecache(object): def __init__(self, repo, phasedefaults, _load=True): if _load: # Cheap trick to allow shallow-copy without copy module self.phaseroots, self.dirty = _readroots(repo, phasedefaults) self._loadedrevslen = 0 self._phasesets = None self.filterunknown(repo) self.opener = repo.svfs def getrevset(self, repo, phases, subset=None): """return a smartset for the given phases""" self.loadphaserevs(repo) # ensure phase's sets are loaded phases = set(phases) if public not in phases: # fast path: _phasesets contains the interesting sets, # might only need a union and post-filtering. if len(phases) == 1: [p] = phases revs = self._phasesets[p] else: revs = set.union(*[self._phasesets[p] for p in phases]) if repo.changelog.filteredrevs: revs = revs - repo.changelog.filteredrevs if subset is None: return smartset.baseset(revs) else: return subset & smartset.baseset(revs) else: phases = set(allphases).difference(phases) if not phases: return smartset.fullreposet(repo) if len(phases) == 1: [p] = phases revs = self._phasesets[p] else: revs = set.union(*[self._phasesets[p] for p in phases]) if subset is None: subset = smartset.fullreposet(repo) if not revs: return subset return subset.filter(lambda r: r not in revs) def copy(self): # Shallow copy meant to ensure isolation in # advance/retractboundary(), nothing more. ph = self.__class__(None, None, _load=False) ph.phaseroots = self.phaseroots[:] ph.dirty = self.dirty ph.opener = self.opener ph._loadedrevslen = self._loadedrevslen ph._phasesets = self._phasesets return ph def replace(self, phcache): """replace all values in 'self' with content of phcache""" for a in ( b'phaseroots', b'dirty', b'opener', b'_loadedrevslen', b'_phasesets', ): setattr(self, a, getattr(phcache, a)) def _getphaserevsnative(self, repo): repo = repo.unfiltered() nativeroots = [] for phase in trackedphases: nativeroots.append( pycompat.maplist(repo.changelog.rev, self.phaseroots[phase]) ) return repo.changelog.computephases(nativeroots) def _computephaserevspure(self, repo): repo = repo.unfiltered() cl = repo.changelog self._phasesets = [set() for phase in allphases] lowerroots = set() for phase in reversed(trackedphases): roots = pycompat.maplist(cl.rev, self.phaseroots[phase]) if roots: ps = set(cl.descendants(roots)) for root in roots: ps.add(root) ps.difference_update(lowerroots) lowerroots.update(ps) self._phasesets[phase] = ps self._loadedrevslen = len(cl) def loadphaserevs(self, repo): """ensure phase information is loaded in the object""" if self._phasesets is None: try: res = self._getphaserevsnative(repo) self._loadedrevslen, self._phasesets = res except AttributeError: self._computephaserevspure(repo) def invalidate(self): self._loadedrevslen = 0 self._phasesets = None def phase(self, repo, rev): # We need a repo argument here to be able to build _phasesets # if necessary. The repository instance is not stored in # phasecache to avoid reference cycles. The changelog instance # is not stored because it is a filecache() property and can # be replaced without us being notified. if rev == nullrev: return public if rev < nullrev: raise ValueError(_(b'cannot lookup negative revision')) if rev >= self._loadedrevslen: self.invalidate() self.loadphaserevs(repo) for phase in trackedphases: if rev in self._phasesets[phase]: return phase return public def write(self): if not self.dirty: return f = self.opener(b'phaseroots', b'w', atomictemp=True, checkambig=True) try: self._write(f) finally: f.close() def _write(self, fp): for phase, roots in enumerate(self.phaseroots): for h in sorted(roots): fp.write(b'%i %s\n' % (phase, hex(h))) self.dirty = False def _updateroots(self, phase, newroots, tr): self.phaseroots[phase] = newroots self.invalidate() self.dirty = True tr.addfilegenerator(b'phase', (b'phaseroots',), self._write) tr.hookargs[b'phases_moved'] = b'1' def registernew(self, repo, tr, targetphase, nodes): repo = repo.unfiltered() self._retractboundary(repo, tr, targetphase, nodes) if tr is not None and b'phases' in tr.changes: phasetracking = tr.changes[b'phases'] torev = repo.changelog.rev phase = self.phase for n in nodes: rev = torev(n) revphase = phase(repo, rev) _trackphasechange(phasetracking, rev, None, revphase) repo.invalidatevolatilesets() def advanceboundary(self, repo, tr, targetphase, nodes, dryrun=None): """Set all 'nodes' to phase 'targetphase' Nodes with a phase lower than 'targetphase' are not affected. If dryrun is True, no actions will be performed Returns a set of revs whose phase is changed or should be changed """ # Be careful to preserve shallow-copied values: do not update # phaseroots values, replace them. if tr is None: phasetracking = None else: phasetracking = tr.changes.get(b'phases') repo = repo.unfiltered() changes = set() # set of revisions to be changed delroots = [] # set of root deleted by this path for phase in pycompat.xrange(targetphase + 1, len(allphases)): # filter nodes that are not in a compatible phase already nodes = [ n for n in nodes if self.phase(repo, repo[n].rev()) >= phase ] if not nodes: break # no roots to move anymore olds = self.phaseroots[phase] affected = repo.revs(b'%ln::%ln', olds, nodes) changes.update(affected) if dryrun: continue for r in affected: _trackphasechange( phasetracking, r, self.phase(repo, r), targetphase ) roots = set( ctx.node() for ctx in repo.set(b'roots((%ln::) - %ld)', olds, affected) ) if olds != roots: self._updateroots(phase, roots, tr) # some roots may need to be declared for lower phases delroots.extend(olds - roots) if not dryrun: # declare deleted root in the target phase if targetphase != 0: self._retractboundary(repo, tr, targetphase, delroots) repo.invalidatevolatilesets() return changes def retractboundary(self, repo, tr, targetphase, nodes): oldroots = self.phaseroots[: targetphase + 1] if tr is None: phasetracking = None else: phasetracking = tr.changes.get(b'phases') repo = repo.unfiltered() if ( self._retractboundary(repo, tr, targetphase, nodes) and phasetracking is not None ): # find the affected revisions new = self.phaseroots[targetphase] old = oldroots[targetphase] affected = set(repo.revs(b'(%ln::) - (%ln::)', new, old)) # find the phase of the affected revision for phase in pycompat.xrange(targetphase, -1, -1): if phase: roots = oldroots[phase] revs = set(repo.revs(b'%ln::%ld', roots, affected)) affected -= revs else: # public phase revs = affected for r in revs: _trackphasechange(phasetracking, r, phase, targetphase) repo.invalidatevolatilesets() def _retractboundary(self, repo, tr, targetphase, nodes): # Be careful to preserve shallow-copied values: do not update # phaseroots values, replace them. if targetphase in (archived, internal) and not supportinternal(repo): name = phasenames[targetphase] msg = b'this repository does not support the %s phase' % name raise error.ProgrammingError(msg) repo = repo.unfiltered() currentroots = self.phaseroots[targetphase] finalroots = oldroots = set(currentroots) newroots = [ n for n in nodes if self.phase(repo, repo[n].rev()) < targetphase ] if newroots: if nullid in newroots: raise error.Abort(_(b'cannot change null revision phase')) currentroots = currentroots.copy() currentroots.update(newroots) # Only compute new roots for revs above the roots that are being # retracted. minnewroot = min(repo[n].rev() for n in newroots) aboveroots = [ n for n in currentroots if repo[n].rev() >= minnewroot ] updatedroots = repo.set(b'roots(%ln::)', aboveroots) finalroots = set( n for n in currentroots if repo[n].rev() < minnewroot ) finalroots.update(ctx.node() for ctx in updatedroots) if finalroots != oldroots: self._updateroots(targetphase, finalroots, tr) return True return False def filterunknown(self, repo): """remove unknown nodes from the phase boundary Nothing is lost as unknown nodes only hold data for their descendants. """ filtered = False nodemap = repo.changelog.nodemap # to filter unknown nodes for phase, nodes in enumerate(self.phaseroots): missing = sorted(node for node in nodes if node not in nodemap) if missing: for mnode in missing: repo.ui.debug( b'removing unknown node %s from %i-phase boundary\n' % (short(mnode), phase) ) nodes.symmetric_difference_update(missing) filtered = True if filtered: self.dirty = True # filterunknown is called by repo.destroyed, we may have no changes in # root but _phasesets contents is certainly invalid (or at least we # have not proper way to check that). related to issue 3858. # # The other caller is __init__ that have no _phasesets initialized # anyway. If this change we should consider adding a dedicated # "destroyed" function to phasecache or a proper cache key mechanism # (see branchmap one) self.invalidate() def advanceboundary(repo, tr, targetphase, nodes, dryrun=None): """Add nodes to a phase changing other nodes phases if necessary. This function move boundary *forward* this means that all nodes are set in the target phase or kept in a *lower* phase. Simplify boundary to contains phase roots only. If dryrun is True, no actions will be performed Returns a set of revs whose phase is changed or should be changed """ phcache = repo._phasecache.copy() changes = phcache.advanceboundary( repo, tr, targetphase, nodes, dryrun=dryrun ) if not dryrun: repo._phasecache.replace(phcache) return changes def retractboundary(repo, tr, targetphase, nodes): """Set nodes back to a phase changing other nodes phases if necessary. This function move boundary *backward* this means that all nodes are set in the target phase or kept in a *higher* phase. Simplify boundary to contains phase roots only.""" phcache = repo._phasecache.copy() phcache.retractboundary(repo, tr, targetphase, nodes) repo._phasecache.replace(phcache) def registernew(repo, tr, targetphase, nodes): """register a new revision and its phase Code adding revisions to the repository should use this function to set new changeset in their target phase (or higher). """ phcache = repo._phasecache.copy() phcache.registernew(repo, tr, targetphase, nodes) repo._phasecache.replace(phcache) def listphases(repo): """List phases root for serialization over pushkey""" # Use ordered dictionary so behavior is deterministic. keys = util.sortdict() value = b'%i' % draft cl = repo.unfiltered().changelog for root in repo._phasecache.phaseroots[draft]: if repo._phasecache.phase(repo, cl.rev(root)) <= draft: keys[hex(root)] = value if repo.publishing(): # Add an extra data to let remote know we are a publishing # repo. Publishing repo can't just pretend they are old repo. # When pushing to a publishing repo, the client still need to # push phase boundary # # Push do not only push changeset. It also push phase data. # New phase data may apply to common changeset which won't be # push (as they are common). Here is a very simple example: # # 1) repo A push changeset X as draft to repo B # 2) repo B make changeset X public # 3) repo B push to repo A. X is not pushed but the data that # X as now public should # # The server can't handle it on it's own as it has no idea of # client phase data. keys[b'publishing'] = b'True' return keys def pushphase(repo, nhex, oldphasestr, newphasestr): """List phases root for serialization over pushkey""" repo = repo.unfiltered() with repo.lock(): currentphase = repo[nhex].phase() newphase = abs(int(newphasestr)) # let's avoid negative index surprise oldphase = abs(int(oldphasestr)) # let's avoid negative index surprise if currentphase == oldphase and newphase < oldphase: with repo.transaction(b'pushkey-phase') as tr: advanceboundary(repo, tr, newphase, [bin(nhex)]) return True elif currentphase == newphase: # raced, but got correct result return True else: return False def subsetphaseheads(repo, subset): """Finds the phase heads for a subset of a history Returns a list indexed by phase number where each item is a list of phase head nodes. """ cl = repo.changelog headsbyphase = [[] for i in allphases] # No need to keep track of secret phase; any heads in the subset that # are not mentioned are implicitly secret. for phase in allphases[:secret]: revset = b"heads(%%ln & %s())" % phasenames[phase] headsbyphase[phase] = [cl.node(r) for r in repo.revs(revset, subset)] return headsbyphase def updatephases(repo, trgetter, headsbyphase): """Updates the repo with the given phase heads""" # Now advance phase boundaries of all but secret phase # # run the update (and fetch transaction) only if there are actually things # to update. This avoid creating empty transaction during no-op operation. for phase in allphases[:-1]: revset = b'%ln - _phase(%s)' heads = [c.node() for c in repo.set(revset, headsbyphase[phase], phase)] if heads: advanceboundary(repo, trgetter(), phase, heads) def analyzeremotephases(repo, subset, roots): """Compute phases heads and root in a subset of node from root dict * subset is heads of the subset * roots is {<nodeid> => phase} mapping. key and value are string. Accept unknown element input """ repo = repo.unfiltered() # build list from dictionary draftroots = [] nodemap = repo.changelog.nodemap # to filter unknown nodes for nhex, phase in pycompat.iteritems(roots): if nhex == b'publishing': # ignore data related to publish option continue node = bin(nhex) phase = int(phase) if phase == public: if node != nullid: repo.ui.warn( _( b'ignoring inconsistent public root' b' from remote: %s\n' ) % nhex ) elif phase == draft: if node in nodemap: draftroots.append(node) else: repo.ui.warn( _(b'ignoring unexpected root from remote: %i %s\n') % (phase, nhex) ) # compute heads publicheads = newheads(repo, subset, draftroots) return publicheads, draftroots class remotephasessummary(object): """summarize phase information on the remote side :publishing: True is the remote is publishing :publicheads: list of remote public phase heads (nodes) :draftheads: list of remote draft phase heads (nodes) :draftroots: list of remote draft phase root (nodes) """ def __init__(self, repo, remotesubset, remoteroots): unfi = repo.unfiltered() self._allremoteroots = remoteroots self.publishing = remoteroots.get(b'publishing', False) ana = analyzeremotephases(repo, remotesubset, remoteroots) self.publicheads, self.draftroots = ana # Get the list of all "heads" revs draft on remote dheads = unfi.set(b'heads(%ln::%ln)', self.draftroots, remotesubset) self.draftheads = [c.node() for c in dheads] def newheads(repo, heads, roots): """compute new head of a subset minus another * `heads`: define the first subset * `roots`: define the second we subtract from the first""" # prevent an import cycle # phases > dagop > patch > copies > scmutil > obsolete > obsutil > phases from . import dagop repo = repo.unfiltered() cl = repo.changelog rev = cl.nodemap.get if not roots: return heads if not heads or heads == [nullid]: return [] # The logic operated on revisions, convert arguments early for convenience new_heads = set(rev(n) for n in heads if n != nullid) roots = [rev(n) for n in roots] # compute the area we need to remove affected_zone = repo.revs(b"(%ld::%ld)", roots, new_heads) # heads in the area are no longer heads new_heads.difference_update(affected_zone) # revisions in the area have children outside of it, # They might be new heads candidates = repo.revs( b"parents(%ld + (%ld and merge())) and not null", roots, affected_zone ) candidates -= affected_zone if new_heads or candidates: # remove candidate that are ancestors of other heads new_heads.update(candidates) prunestart = repo.revs(b"parents(%ld) and not null", new_heads) pruned = dagop.reachableroots(repo, candidates, prunestart) new_heads.difference_update(pruned) return pycompat.maplist(cl.node, sorted(new_heads)) def newcommitphase(ui): """helper to get the target phase of new commit Handle all possible values for the phases.new-commit options. """ v = ui.config(b'phases', b'new-commit') try: return phasenames.index(v) except ValueError: try: return int(v) except ValueError: msg = _(b"phases.new-commit: not a valid phase name ('%s')") raise error.ConfigError(msg % v) def hassecret(repo): """utility function that check if a repo have any secret changeset.""" return bool(repo._phasecache.phaseroots[2]) def preparehookargs(node, old, new): if old is None: old = b'' else: old = phasenames[old] return {b'node': node, b'oldphase': old, b'phase': phasenames[new]}