Mercurial: mercurial/graphmod.py comparison

comparison mercurial/graphmod.py @ 23564:f7ce0837eefd

graphmod: add a function for topological iteration This changeset introduces a function to perform topological (one branch after the other) iteration over a set of changesets. This first version has a lot of limitations, but the approach should be flexible enough to allow many improvements in the future. This changeset aims to set the first stone more than providing a complete solution. The algorithm does not need to know the whole set of nodes involved before emitting revision. This makes it a good candidate for usage in place like `hg log` or graphical tools that need a fast first result time.

author	Pierre-Yves David <pierre-yves.david@fb.com>
date	Thu, 04 Sep 2014 18:19:32 +0200
parents	bb1bd9ee323d
children	996c01bfbec4

comparison

equal deleted inserted replaced

-:114992041625
+:f7ce0837eefd
 from mercurial.node import nullrev
 import util
 CHANGESET = 'C'
+def groupbranchiter(revs, parentsfunc):
+"""yield revision from heads to roots one (topo) branch after the other.
+This function aims to be used by a graph generator that wishes to minimize
+the amount of parallel branches and their interleaving.
+Example iteration order:
+o  4
+|
+o  1
+|
+| o  3
+| |
+| o  2
+|/
+o  0
+Currently does not handle non-contiguous <revs> input.
+Currently consider every changeset under a merge to be on the same branch
+using revision number to sort them.
+Could be easily extend to give priority to an initial branch."""
+### Quick summary of the algorithm
+#
+# This function is based around a "retention" principle. We keep revisions
+# in memory until we are ready to emit a whole branch that immediately
+# "merge" into an existing one. This reduce the number of branch "ongoing"
+# at the same time.
+#
+# During iteration revs are split into two groups:
+# A) revision already emitted
+# B) revision in "retention". They are stored as different subgroups.
+#
+# for each REV, we do the follow logic:
+#
+#   a) if REV is a parent of (A), we will emit it. But before emitting it,
+#   we'll "free" all the revs from subgroup in (B) that were waiting for
+#   REV to be available. So we emit all revision of such subgroup before
+#   emitting REV
+#
+#   b) else, we'll search for a subgroup in (B) awaiting for REV to be
+#   available, if such subgroup exist, we add REV to it and the subgroup is
+#   now awaiting for REV.parents() to be available.
+#
+#   c) finally if no such group existed in (B), we create a new subgroup.
+#
+#
+# To bootstrap the algorithm, we emit the tipmost revision.
+revs.sort(reverse=True)
+# Set of parents of revision that have been yield. They can be considered
+# unblocked as the graph generator is already aware of them so there is no
+# need to delay the one that reference them.
+unblocked = set()
+# list of group waiting to be displayed, each group is defined by:
+#
+#   (revs:    lists of revs waiting to be displayed,
+#    blocked: set of that cannot be displayed before those in 'revs')
+#
+# The second value ('blocked') correspond to parents of any revision in the
+# group ('revs') that is not itself contained in the group. The main idea
+# of this algorithm is to delay as much as possible the emission of any
+# revision.  This means waiting for the moment we are about to display
+# theses parents to display the revs in a group.
+#
+# This first implementation is smart until it meet a merge: it will emit
+# revs as soon as any parents is about to be emitted and can grow an
+# arbitrary number of revs in 'blocked'. In practice this mean we properly
+# retains new branches but give up on any special ordering for ancestors of
+# merges. The implementation can be improved to handle this better.
+#
+# The first subgroup is special. It correspond to all the revision that
+# were already emitted. The 'revs' lists is expected to be empty and the
+# 'blocked' set contains the parents revisions of already emitted revision.
+#
+# You could pre-seed the <parents> set of groups[0] to a specific
+# changesets to select what the first emitted branch should be.
+#
+# We do not support revisions will hole yet, but adding such support would
+# be easy. The iteration will have to be done using both input revision and
+# parents (see cl.ancestors function + a few tweaks) but only revisions
+# parts of the initial set should be emitted.
+groups = [([], unblocked)]
+for current in revs:
+# Look for a subgroup blocked, waiting for the current revision.
+matching = [i for i, g in enumerate(groups) if current in g[1]]
+if matching:
+# The main idea is to gather together all sets that await on the
+# same revision.
+#
+# This merging is done at the time we are about to add this common
+# awaited to the subgroup for simplicity purpose. Such merge could
+# happen sooner when we update the "blocked" set of revision.
+#
+# We also always keep the oldest subgroup first. We can probably
+# improve the behavior by having the longuest set first. That way,
+# graph algorythms could minimise the length of parallele lines
+# their draw. This is currently not done.
+targetidx = matching.pop(0)
+trevs, tparents = groups[targetidx]
+for i in matching:
+gr = groups[i]
+trevs.extend(gr[0])
+tparents |= gr[1]
+# delete all merged subgroups (but the one we keep)
+# (starting from the last subgroup for performance and sanity reason)
+for i in reversed(matching):
+del groups[i]
+else:
+# This is a new head. We create a new subgroup for it.
+targetidx = len(groups)
+groups.append(([], set([current])))
+gr = groups[targetidx]
+# We now adds the current nodes to this subgroups. This is done after
+# the subgroup merging because all elements from a subgroup that relied
+# on this rev must preceed it.
+#
+# we also update the <parents> set to includes the parents on the
+# new nodes.
+gr[0].append(current)
+gr[1].remove(current)
+gr[1].update([p for p in parentsfunc(current) if p > nullrev])
+# Look for a subgroup to display
+#
+# When unblocked is empty (if clause), We are not waiting over any
+# revision during the first iteration (if no priority was given) or if
+# we outputed a whole disconnected sets of the graph (reached a root).
+# In that case we arbitrarily takes the oldest known subgroup. The
+# heuristique could probably be better.
+#
+# Otherwise (elif clause) this mean we have some emitted revision.  if
+# the subgroup awaits on the same revision that the outputed ones, we
+# can safely output it.
+if not unblocked:
+if len(groups) > 1:  # display other subset
+targetidx = 1
+gr = groups[1]
+elif not gr[1] & unblocked:
+gr = None
+if gr is not None:
+# update the set of awaited revisions with the one from the
+# subgroup
+unblocked |= gr[1]
+# output all revisions in the subgroup
+for r in gr[0]:
+yield r
+# delete the subgroup that you just output
+# unless it is groups[0] in which case you just empty it.
+if targetidx:
+del groups[targetidx]
+else:
+gr[0][:] = []
 def dagwalker(repo, revs):
 """cset DAG generator yielding (id, CHANGESET, ctx, [parentids]) tuples
 This generator function walks through revisions (which should be ordered

Mercurial > hg

comparison mercurial/graphmod.py @ 23564:f7ce0837eefd