comparison mercurial/graphmod.py @ 23564:f7ce0837eefd

graphmod: add a function for topological iteration This changeset introduces a function to perform topological (one branch after the other) iteration over a set of changesets. This first version has a lot of limitations, but the approach should be flexible enough to allow many improvements in the future. This changeset aims to set the first stone more than providing a complete solution. The algorithm does not need to know the whole set of nodes involved before emitting revision. This makes it a good candidate for usage in place like `hg log` or graphical tools that need a fast first result time.
author Pierre-Yves David <pierre-yves.david@fb.com>
date Thu, 04 Sep 2014 18:19:32 +0200
parents bb1bd9ee323d
children 996c01bfbec4
comparison
equal deleted inserted replaced
23563:114992041625 23564:f7ce0837eefd
19 19
20 from mercurial.node import nullrev 20 from mercurial.node import nullrev
21 import util 21 import util
22 22
23 CHANGESET = 'C' 23 CHANGESET = 'C'
24
25 def groupbranchiter(revs, parentsfunc):
26 """yield revision from heads to roots one (topo) branch after the other.
27
28 This function aims to be used by a graph generator that wishes to minimize
29 the amount of parallel branches and their interleaving.
30
31 Example iteration order:
32
33 o 4
34 |
35 o 1
36 |
37 | o 3
38 | |
39 | o 2
40 |/
41 o 0
42
43 Currently does not handle non-contiguous <revs> input.
44
45 Currently consider every changeset under a merge to be on the same branch
46 using revision number to sort them.
47
48 Could be easily extend to give priority to an initial branch."""
49 ### Quick summary of the algorithm
50 #
51 # This function is based around a "retention" principle. We keep revisions
52 # in memory until we are ready to emit a whole branch that immediately
53 # "merge" into an existing one. This reduce the number of branch "ongoing"
54 # at the same time.
55 #
56 # During iteration revs are split into two groups:
57 # A) revision already emitted
58 # B) revision in "retention". They are stored as different subgroups.
59 #
60 # for each REV, we do the follow logic:
61 #
62 # a) if REV is a parent of (A), we will emit it. But before emitting it,
63 # we'll "free" all the revs from subgroup in (B) that were waiting for
64 # REV to be available. So we emit all revision of such subgroup before
65 # emitting REV
66 #
67 # b) else, we'll search for a subgroup in (B) awaiting for REV to be
68 # available, if such subgroup exist, we add REV to it and the subgroup is
69 # now awaiting for REV.parents() to be available.
70 #
71 # c) finally if no such group existed in (B), we create a new subgroup.
72 #
73 #
74 # To bootstrap the algorithm, we emit the tipmost revision.
75
76 revs.sort(reverse=True)
77
78 # Set of parents of revision that have been yield. They can be considered
79 # unblocked as the graph generator is already aware of them so there is no
80 # need to delay the one that reference them.
81 unblocked = set()
82
83 # list of group waiting to be displayed, each group is defined by:
84 #
85 # (revs: lists of revs waiting to be displayed,
86 # blocked: set of that cannot be displayed before those in 'revs')
87 #
88 # The second value ('blocked') correspond to parents of any revision in the
89 # group ('revs') that is not itself contained in the group. The main idea
90 # of this algorithm is to delay as much as possible the emission of any
91 # revision. This means waiting for the moment we are about to display
92 # theses parents to display the revs in a group.
93 #
94 # This first implementation is smart until it meet a merge: it will emit
95 # revs as soon as any parents is about to be emitted and can grow an
96 # arbitrary number of revs in 'blocked'. In practice this mean we properly
97 # retains new branches but give up on any special ordering for ancestors of
98 # merges. The implementation can be improved to handle this better.
99 #
100 # The first subgroup is special. It correspond to all the revision that
101 # were already emitted. The 'revs' lists is expected to be empty and the
102 # 'blocked' set contains the parents revisions of already emitted revision.
103 #
104 # You could pre-seed the <parents> set of groups[0] to a specific
105 # changesets to select what the first emitted branch should be.
106 #
107 # We do not support revisions will hole yet, but adding such support would
108 # be easy. The iteration will have to be done using both input revision and
109 # parents (see cl.ancestors function + a few tweaks) but only revisions
110 # parts of the initial set should be emitted.
111 groups = [([], unblocked)]
112 for current in revs:
113 # Look for a subgroup blocked, waiting for the current revision.
114 matching = [i for i, g in enumerate(groups) if current in g[1]]
115
116 if matching:
117 # The main idea is to gather together all sets that await on the
118 # same revision.
119 #
120 # This merging is done at the time we are about to add this common
121 # awaited to the subgroup for simplicity purpose. Such merge could
122 # happen sooner when we update the "blocked" set of revision.
123 #
124 # We also always keep the oldest subgroup first. We can probably
125 # improve the behavior by having the longuest set first. That way,
126 # graph algorythms could minimise the length of parallele lines
127 # their draw. This is currently not done.
128 targetidx = matching.pop(0)
129 trevs, tparents = groups[targetidx]
130 for i in matching:
131 gr = groups[i]
132 trevs.extend(gr[0])
133 tparents |= gr[1]
134 # delete all merged subgroups (but the one we keep)
135 # (starting from the last subgroup for performance and sanity reason)
136 for i in reversed(matching):
137 del groups[i]
138 else:
139 # This is a new head. We create a new subgroup for it.
140 targetidx = len(groups)
141 groups.append(([], set([current])))
142
143 gr = groups[targetidx]
144
145 # We now adds the current nodes to this subgroups. This is done after
146 # the subgroup merging because all elements from a subgroup that relied
147 # on this rev must preceed it.
148 #
149 # we also update the <parents> set to includes the parents on the
150 # new nodes.
151 gr[0].append(current)
152 gr[1].remove(current)
153 gr[1].update([p for p in parentsfunc(current) if p > nullrev])
154
155 # Look for a subgroup to display
156 #
157 # When unblocked is empty (if clause), We are not waiting over any
158 # revision during the first iteration (if no priority was given) or if
159 # we outputed a whole disconnected sets of the graph (reached a root).
160 # In that case we arbitrarily takes the oldest known subgroup. The
161 # heuristique could probably be better.
162 #
163 # Otherwise (elif clause) this mean we have some emitted revision. if
164 # the subgroup awaits on the same revision that the outputed ones, we
165 # can safely output it.
166 if not unblocked:
167 if len(groups) > 1: # display other subset
168 targetidx = 1
169 gr = groups[1]
170 elif not gr[1] & unblocked:
171 gr = None
172
173 if gr is not None:
174 # update the set of awaited revisions with the one from the
175 # subgroup
176 unblocked |= gr[1]
177 # output all revisions in the subgroup
178 for r in gr[0]:
179 yield r
180 # delete the subgroup that you just output
181 # unless it is groups[0] in which case you just empty it.
182 if targetidx:
183 del groups[targetidx]
184 else:
185 gr[0][:] = []
24 186
25 def dagwalker(repo, revs): 187 def dagwalker(repo, revs):
26 """cset DAG generator yielding (id, CHANGESET, ctx, [parentids]) tuples 188 """cset DAG generator yielding (id, CHANGESET, ctx, [parentids]) tuples
27 189
28 This generator function walks through revisions (which should be ordered 190 This generator function walks through revisions (which should be ordered