copies: iterate over children directly (instead of parents)
Before this change we would gather all parent → child edges and iterate on
all parent, gathering copy information for children and aggregating them from
there.
They are not strict requirement for edges to be processed in that specific
order. We could also simply iterate over all "children" revision and aggregate
data from both parents at the same time. This patch does that.
It make various things simpler:
* since both parents are processed at the same time, we no longer need to
cache data for merge (see next changeset for details),
* we no longer need nested loop to process data,
* we no longer need to store partial merge data for a rev from distinct loop
interaction to another when processing merges,
* we no longer need to build a full parent -> children mapping (we only rely on
a simpler "parent -> number of children" map (for memory efficiency),
* the data access pattern is now simpler (from lower revisions to higher
revisions) and entirely predicable. That predictability open the way to
prefetching and parallel processing.
So that new iterations order requires simpler code and open the way to
interesting optimisation.
The effect on performance is quite good. In the worse case, we don't see any
significant negative impact. And in the best case, the reduction of roundtrip
to Python provide us with a significant speed. Some example below:
Repo Case Source-Rev Dest-Rev # of revisions old time new time Difference Factor time per rev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
mozilla-try x00000_revs_x00000_added_0_copies
dc8a3ca7010e d16fde900c9c : 34414 revs, 0.962867 s, 0.502584 s, -0.460283 s, × 0.5220, 14 µs/rev
mozilla-try x0000_revs_xx000_added_x_copies
156f6e2674f2 4d0f2c178e66 : 8598 revs, 0.110717 s, 0.076323 s, -0.034394 s, × 0.6894, 8 µs/rev
# full comparison between the previous changeset and this one
Repo Case Source-Rev Dest-Rev # of revisions old time new time Difference Factor time per rev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
mercurial x_revs_x_added_0_copies
ad6b123de1c7 39cfcef4f463 : 1 revs, 0.000048 s, 0.000041 s, -0.000007 s, × 0.8542, 41 µs/rev
mercurial x_revs_x_added_x_copies
2b1c78674230 0c1d10351869 : 6 revs, 0.000153 s, 0.000102 s, -0.000051 s, × 0.6667, 17 µs/rev
mercurial x000_revs_x000_added_x_copies
81f8ff2a9bf2 dd3267698d84 : 1032 revs, 0.004209 s, 0.004254 s, +0.000045 s, × 1.0107, 4 µs/rev
pypy x_revs_x_added_0_copies
aed021ee8ae8 099ed31b181b : 9 revs, 0.000203 s, 0.000282 s, +0.000079 s, × 1.3892, 31 µs/rev
pypy x_revs_x000_added_0_copies
4aa4e1f8e19a 359343b9ac0e : 1 revs, 0.000059 s, 0.000048 s, -0.000011 s, × 0.8136, 48 µs/rev
pypy x_revs_x_added_x_copies
ac52eb7bbbb0 72e022663155 : 7 revs, 0.000194 s, 0.000211 s, +0.000017 s, × 1.0876, 30 µs/rev
pypy x_revs_x00_added_x_copies
c3b14617fbd7 ace7255d9a26 : 1 revs, 0.000380 s, 0.000375 s, -0.000005 s, × 0.9868, 375 µs/rev
pypy x_revs_x000_added_x000_copies
df6f7a526b60 a83dc6a2d56f : 6 revs, 0.010588 s, 0.010574 s, -0.000014 s, × 0.9987, 1762 µs/rev
pypy x000_revs_xx00_added_0_copies
89a76aede314 2f22446ff07e : 4785 revs, 0.048961 s, 0.049974 s, +0.001013 s, × 1.0207, 10 µs/rev
pypy x000_revs_x000_added_x_copies
8a3b5bfd266e 2c68e87c3efe : 6780 revs, 0.083612 s, 0.084300 s, +0.000688 s, × 1.0082, 12 µs/rev
pypy x000_revs_x000_added_x000_copies
89a76aede314 7b3dda341c84 : 5441 revs, 0.058579 s, 0.060128 s, +0.001549 s, × 1.0264, 11 µs/rev
pypy x0000_revs_x_added_0_copies
d1defd0dc478 c9cb1334cc78 : 43645 revs, 0.736783 s, 0.686542 s, -0.050241 s, × 0.9318, 15 µs/rev
pypy x0000_revs_xx000_added_0_copies
bf2c629d0071 4ffed77c095c : 2 revs, 0.022050 s, 0.009277 s, -0.012773 s, × 0.4207, 4638 µs/rev
pypy x0000_revs_xx000_added_x000_copies
08ea3258278e d9fa043f30c0 : 11316 revs, 0.120800 s, 0.114733 s, -0.006067 s, × 0.9498, 10 µs/rev
netbeans x_revs_x_added_0_copies
fb0955ffcbcd a01e9239f9e7 : 2 revs, 0.000140 s, 0.000081 s, -0.000059 s, × 0.5786, 40 µs/rev
netbeans x_revs_x000_added_0_copies
6f360122949f 20eb231cc7d0 : 2 revs, 0.000114 s, 0.000107 s, -0.000007 s, × 0.9386, 53 µs/rev
netbeans x_revs_x_added_x_copies
1ada3faf6fb6 5a39d12eecf4 : 3 revs, 0.000224 s, 0.000173 s, -0.000051 s, × 0.7723, 57 µs/rev
netbeans x_revs_x00_added_x_copies
35be93ba1e2c 9eec5e90c05f : 9 revs, 0.000723 s, 0.000698 s, -0.000025 s, × 0.9654, 77 µs/rev
netbeans x000_revs_xx00_added_0_copies
eac3045b4fdd 51d4ae7f1290 : 1421 revs, 0.009665 s, 0.009248 s, -0.000417 s, × 0.9569, 6 µs/rev
netbeans x000_revs_x000_added_x_copies
e2063d266acd 6081d72689dc : 1533 revs, 0.014820 s, 0.015446 s, +0.000626 s, × 1.0422, 10 µs/rev
netbeans x000_revs_x000_added_x000_copies
ff453e9fee32 411350406ec2 : 5750 revs, 0.076049 s, 0.074373 s, -0.001676 s, × 0.9780, 12 µs/rev
netbeans x0000_revs_xx000_added_x000_copies
588c2d1ced70 1aad62e59ddd : 66949 revs, 0.683603 s, 0.639870 s, -0.043733 s, × 0.9360, 9 µs/rev
mozilla-central x_revs_x_added_0_copies
3697f962bb7b 7015fcdd43a2 : 2 revs, 0.000161 s, 0.000088 s, -0.000073 s, × 0.5466, 44 µs/rev
mozilla-central x_revs_x000_added_0_copies
dd390860c6c9 40d0c5bed75d : 8 revs, 0.000234 s, 0.000199 s, -0.000035 s, × 0.8504, 24 µs/rev
mozilla-central x_revs_x_added_x_copies
8d198483ae3b 14207ffc2b2f : 9 revs, 0.000247 s, 0.000171 s, -0.000076 s, × 0.6923, 19 µs/rev
mozilla-central x_revs_x00_added_x_copies
98cbc58cc6bc 446a150332c3 : 7 revs, 0.000630 s, 0.000592 s, -0.000038 s, × 0.9397, 84 µs/rev
mozilla-central x_revs_x000_added_x000_copies
3c684b4b8f68 0a5e72d1b479 : 3 revs, 0.003286 s, 0.003151 s, -0.000135 s, × 0.9589, 1050 µs/rev
mozilla-central x_revs_x0000_added_x0000_copies
effb563bb7e5 c07a39dc4e80 : 6 revs, 0.062441 s, 0.061612 s, -0.000829 s, × 0.9867, 10268 µs/rev
mozilla-central x000_revs_xx00_added_0_copies
6100d773079a 04a55431795e : 1593 revs, 0.005423 s, 0.005381 s, -0.000042 s, × 0.9923, 3 µs/rev
mozilla-central x000_revs_x000_added_x_copies
9f17a6fc04f9 2d37b966abed : 41 revs, 0.005919 s, 0.003742 s, -0.002177 s, × 0.6322, 91 µs/rev
mozilla-central x000_revs_x000_added_x000_copies
7c97034feb78 4407bd0c6330 : 7839 revs, 0.062597 s, 0.061983 s, -0.000614 s, × 0.9902, 7 µs/rev
mozilla-central x0000_revs_xx000_added_0_copies
9eec5917337d 67118cc6dcad : 615 revs, 0.043551 s, 0.019861 s, -0.023690 s, × 0.4560, 32 µs/rev
mozilla-central x0000_revs_xx000_added_x000_copies
f78c615a656c 96a38b690156 : 30263 revs, 0.192475 s, 0.188101 s, -0.004374 s, × 0.9773, 6 µs/rev
mozilla-central x00000_revs_x0000_added_x0000_copies
6832ae71433c 4c222a1d9a00 : 153721 revs, 1.955575 s, 1.806696 s, -0.148879 s, × 0.9239, 11 µs/rev
mozilla-central x00000_revs_x00000_added_x000_copies
76caed42cf7c 1daa622bbe42 : 204976 revs, 2.886501 s, 2.682987 s, -0.203514 s, × 0.9295, 13 µs/rev
mozilla-try x_revs_x_added_0_copies
aaf6dde0deb8 9790f499805a : 2 revs, 0.001181 s, 0.000852 s, -0.000329 s, × 0.7214, 426 µs/rev
mozilla-try x_revs_x000_added_0_copies
d8d0222927b4 5bb8ce8c7450 : 2 revs, 0.001189 s, 0.000859 s, -0.000330 s, × 0.7225, 429 µs/rev
mozilla-try x_revs_x_added_x_copies
092fcca11bdb 936255a0384a : 4 revs, 0.000563 s, 0.000150 s, -0.000413 s, × 0.2664, 37 µs/rev
mozilla-try x_revs_x00_added_x_copies
b53d2fadbdb5 017afae788ec : 2 revs, 0.001548 s, 0.001158 s, -0.000390 s, × 0.7481, 579 µs/rev
mozilla-try x_revs_x000_added_x000_copies
20408ad61ce5 6f0ee96e21ad : 1 revs, 0.027782 s, 0.027240 s, -0.000542 s, × 0.9805, 27240 µs/rev
mozilla-try x_revs_x0000_added_x0000_copies
effb563bb7e5 c07a39dc4e80 : 6 revs, 0.062781 s, 0.062824 s, +0.000043 s, × 1.0007, 10470 µs/rev
mozilla-try x000_revs_xx00_added_0_copies
6100d773079a 04a55431795e : 1593 revs, 0.005778 s, 0.005463 s, -0.000315 s, × 0.9455, 3 µs/rev
mozilla-try x000_revs_x000_added_x_copies
9f17a6fc04f9 2d37b966abed : 41 revs, 0.006192 s, 0.004238 s, -0.001954 s, × 0.6844, 103 µs/rev
mozilla-try x000_revs_x000_added_x000_copies
1346fd0130e4 4c65cbdabc1f : 6657 revs, 0.065391 s, 0.064113 s, -0.001278 s, × 0.9805, 9 µs/rev
mozilla-try x0000_revs_x_added_0_copies
63519bfd42ee a36a2a865d92 : 40314 revs, 0.317216 s, 0.294063 s, -0.023153 s, × 0.9270, 7 µs/rev
mozilla-try x0000_revs_x_added_x_copies
9fe69ff0762d bcabf2a78927 : 38690 revs, 0.303119 s, 0.281493 s, -0.021626 s, × 0.9287, 7 µs/rev
mozilla-try x0000_revs_xx000_added_x_copies
156f6e2674f2 4d0f2c178e66 : 8598 revs, 0.110717 s, 0.076323 s, -0.034394 s, × 0.6894, 8 µs/rev
mozilla-try x0000_revs_xx000_added_0_copies
9eec5917337d 67118cc6dcad : 615 revs, 0.045739 s, 0.020390 s, -0.025349 s, × 0.4458, 33 µs/rev
mozilla-try x0000_revs_xx000_added_x000_copies
89294cd501d9 7ccb2fc7ccb5 : 97052 revs, 3.098021 s, 3.023879 s, -0.074142 s, × 0.9761, 31 µs/rev
mozilla-try x0000_revs_x0000_added_x0000_copies
e928c65095ed e951f4ad123a : 52031 revs, 0.771480 s, 0.735549 s, -0.035931 s, × 0.9534, 14 µs/rev
mozilla-try x00000_revs_x_added_0_copies
6a320851d377 1ebb79acd503 : 363753 revs, 18.813422 s, 18.568900 s, -0.244522 s, × 0.9870, 51 µs/rev
mozilla-try x00000_revs_x00000_added_0_copies
dc8a3ca7010e d16fde900c9c : 34414 revs, 0.962867 s, 0.502584 s, -0.460283 s, × 0.5220, 14 µs/rev
mozilla-try x00000_revs_x_added_x_copies
5173c4b6f97c 95d83ee7242d : 362229 revs, 18.684923 s, 18.356645 s, -0.328278 s, × 0.9824, 50 µs/rev
mozilla-try x00000_revs_x000_added_x_copies
9126823d0e9c ca82787bb23c : 359344 revs, 18.296305 s, 18.250393 s, -0.045912 s, × 0.9975, 50 µs/rev
mozilla-try x00000_revs_x0000_added_x0000_copies
8d3fafa80d4b eb884023b810 : 192665 revs, 3.061887 s, 2.792459 s, -0.269428 s, × 0.9120, 14 µs/rev
mozilla-try x00000_revs_x00000_added_x0000_copies
1b661134e2ca 1ae03d022d6d : 228985 revs, 103.869641 s, 107.697264 s, +3.827623 s, × 1.0369, 470 µs/rev
mozilla-try x00000_revs_x00000_added_x000_copies
9b2a99adc05e 8e29777b48e6 : 382065 revs, 64.262957 s, 63.961040 s, -0.301917 s, × 0.9953, 167 µs/rev
Differential Revision: https://phab.mercurial-scm.org/D9422
# smartset.py - data structure for revision set
#
# Copyright 2010 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import
from .pycompat import getattr
from . import (
encoding,
error,
pycompat,
util,
)
from .utils import stringutil
def _typename(o):
return pycompat.sysbytes(type(o).__name__).lstrip(b'_')
class abstractsmartset(object):
def __nonzero__(self):
"""True if the smartset is not empty"""
raise NotImplementedError()
__bool__ = __nonzero__
def __contains__(self, rev):
"""provide fast membership testing"""
raise NotImplementedError()
def __iter__(self):
"""iterate the set in the order it is supposed to be iterated"""
raise NotImplementedError()
# Attributes containing a function to perform a fast iteration in a given
# direction. A smartset can have none, one, or both defined.
#
# Default value is None instead of a function returning None to avoid
# initializing an iterator just for testing if a fast method exists.
fastasc = None
fastdesc = None
def isascending(self):
"""True if the set will iterate in ascending order"""
raise NotImplementedError()
def isdescending(self):
"""True if the set will iterate in descending order"""
raise NotImplementedError()
def istopo(self):
"""True if the set will iterate in topographical order"""
raise NotImplementedError()
def min(self):
"""return the minimum element in the set"""
if self.fastasc is None:
v = min(self)
else:
for v in self.fastasc():
break
else:
raise ValueError(b'arg is an empty sequence')
self.min = lambda: v
return v
def max(self):
"""return the maximum element in the set"""
if self.fastdesc is None:
return max(self)
else:
for v in self.fastdesc():
break
else:
raise ValueError(b'arg is an empty sequence')
self.max = lambda: v
return v
def first(self):
"""return the first element in the set (user iteration perspective)
Return None if the set is empty"""
raise NotImplementedError()
def last(self):
"""return the last element in the set (user iteration perspective)
Return None if the set is empty"""
raise NotImplementedError()
def __len__(self):
"""return the length of the smartsets
This can be expensive on smartset that could be lazy otherwise."""
raise NotImplementedError()
def reverse(self):
"""reverse the expected iteration order"""
raise NotImplementedError()
def sort(self, reverse=False):
"""get the set to iterate in an ascending or descending order"""
raise NotImplementedError()
def __and__(self, other):
"""Returns a new object with the intersection of the two collections.
This is part of the mandatory API for smartset."""
if isinstance(other, fullreposet):
return self
return self.filter(other.__contains__, condrepr=other, cache=False)
def __add__(self, other):
"""Returns a new object with the union of the two collections.
This is part of the mandatory API for smartset."""
return addset(self, other)
def __sub__(self, other):
"""Returns a new object with the substraction of the two collections.
This is part of the mandatory API for smartset."""
c = other.__contains__
return self.filter(
lambda r: not c(r), condrepr=(b'<not %r>', other), cache=False
)
def filter(self, condition, condrepr=None, cache=True):
"""Returns this smartset filtered by condition as a new smartset.
`condition` is a callable which takes a revision number and returns a
boolean. Optional `condrepr` provides a printable representation of
the given `condition`.
This is part of the mandatory API for smartset."""
# builtin cannot be cached. but do not needs to
if cache and util.safehasattr(condition, b'__code__'):
condition = util.cachefunc(condition)
return filteredset(self, condition, condrepr)
def slice(self, start, stop):
"""Return new smartset that contains selected elements from this set"""
if start < 0 or stop < 0:
raise error.ProgrammingError(b'negative index not allowed')
return self._slice(start, stop)
def _slice(self, start, stop):
# sub classes may override this. start and stop must not be negative,
# but start > stop is allowed, which should be an empty set.
ys = []
it = iter(self)
for x in pycompat.xrange(start):
y = next(it, None)
if y is None:
break
for x in pycompat.xrange(stop - start):
y = next(it, None)
if y is None:
break
ys.append(y)
return baseset(ys, datarepr=(b'slice=%d:%d %r', start, stop, self))
class baseset(abstractsmartset):
"""Basic data structure that represents a revset and contains the basic
operation that it should be able to perform.
Every method in this class should be implemented by any smartset class.
This class could be constructed by an (unordered) set, or an (ordered)
list-like object. If a set is provided, it'll be sorted lazily.
>>> x = [4, 0, 7, 6]
>>> y = [5, 6, 7, 3]
Construct by a set:
>>> xs = baseset(set(x))
>>> ys = baseset(set(y))
>>> [list(i) for i in [xs + ys, xs & ys, xs - ys]]
[[0, 4, 6, 7, 3, 5], [6, 7], [0, 4]]
>>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]]
['addset', 'baseset', 'baseset']
Construct by a list-like:
>>> xs = baseset(x)
>>> ys = baseset(i for i in y)
>>> [list(i) for i in [xs + ys, xs & ys, xs - ys]]
[[4, 0, 7, 6, 5, 3], [7, 6], [4, 0]]
>>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]]
['addset', 'filteredset', 'filteredset']
Populate "_set" fields in the lists so set optimization may be used:
>>> [1 in xs, 3 in ys]
[False, True]
Without sort(), results won't be changed:
>>> [list(i) for i in [xs + ys, xs & ys, xs - ys]]
[[4, 0, 7, 6, 5, 3], [7, 6], [4, 0]]
>>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]]
['addset', 'filteredset', 'filteredset']
With sort(), set optimization could be used:
>>> xs.sort(reverse=True)
>>> [list(i) for i in [xs + ys, xs & ys, xs - ys]]
[[7, 6, 4, 0, 5, 3], [7, 6], [4, 0]]
>>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]]
['addset', 'baseset', 'baseset']
>>> ys.sort()
>>> [list(i) for i in [xs + ys, xs & ys, xs - ys]]
[[7, 6, 4, 0, 3, 5], [7, 6], [4, 0]]
>>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]]
['addset', 'baseset', 'baseset']
istopo is preserved across set operations
>>> xs = baseset(set(x), istopo=True)
>>> rs = xs & ys
>>> type(rs).__name__
'baseset'
>>> rs._istopo
True
"""
def __init__(self, data=(), datarepr=None, istopo=False):
"""
datarepr: a tuple of (format, obj, ...), a function or an object that
provides a printable representation of the given data.
"""
self._ascending = None
self._istopo = istopo
if isinstance(data, set):
# converting set to list has a cost, do it lazily
self._set = data
# set has no order we pick one for stability purpose
self._ascending = True
else:
if not isinstance(data, list):
data = list(data)
self._list = data
self._datarepr = datarepr
@util.propertycache
def _set(self):
return set(self._list)
@util.propertycache
def _asclist(self):
asclist = self._list[:]
asclist.sort()
return asclist
@util.propertycache
def _list(self):
# _list is only lazily constructed if we have _set
assert '_set' in self.__dict__
return list(self._set)
def __iter__(self):
if self._ascending is None:
return iter(self._list)
elif self._ascending:
return iter(self._asclist)
else:
return reversed(self._asclist)
def fastasc(self):
return iter(self._asclist)
def fastdesc(self):
return reversed(self._asclist)
@util.propertycache
def __contains__(self):
return self._set.__contains__
def __nonzero__(self):
return bool(len(self))
__bool__ = __nonzero__
def sort(self, reverse=False):
self._ascending = not bool(reverse)
self._istopo = False
def reverse(self):
if self._ascending is None:
self._list.reverse()
else:
self._ascending = not self._ascending
self._istopo = False
def __len__(self):
if '_list' in self.__dict__:
return len(self._list)
else:
return len(self._set)
def isascending(self):
"""Returns True if the collection is ascending order, False if not.
This is part of the mandatory API for smartset."""
if len(self) <= 1:
return True
return self._ascending is not None and self._ascending
def isdescending(self):
"""Returns True if the collection is descending order, False if not.
This is part of the mandatory API for smartset."""
if len(self) <= 1:
return True
return self._ascending is not None and not self._ascending
def istopo(self):
"""Is the collection is in topographical order or not.
This is part of the mandatory API for smartset."""
if len(self) <= 1:
return True
return self._istopo
def first(self):
if self:
if self._ascending is None:
return self._list[0]
elif self._ascending:
return self._asclist[0]
else:
return self._asclist[-1]
return None
def last(self):
if self:
if self._ascending is None:
return self._list[-1]
elif self._ascending:
return self._asclist[-1]
else:
return self._asclist[0]
return None
def _fastsetop(self, other, op):
# try to use native set operations as fast paths
if (
type(other) is baseset
and '_set' in other.__dict__
and '_set' in self.__dict__
and self._ascending is not None
):
s = baseset(
data=getattr(self._set, op)(other._set), istopo=self._istopo
)
s._ascending = self._ascending
else:
s = getattr(super(baseset, self), op)(other)
return s
def __and__(self, other):
return self._fastsetop(other, b'__and__')
def __sub__(self, other):
return self._fastsetop(other, b'__sub__')
def _slice(self, start, stop):
# creating new list should be generally cheaper than iterating items
if self._ascending is None:
return baseset(self._list[start:stop], istopo=self._istopo)
data = self._asclist
if not self._ascending:
start, stop = max(len(data) - stop, 0), max(len(data) - start, 0)
s = baseset(data[start:stop], istopo=self._istopo)
s._ascending = self._ascending
return s
@encoding.strmethod
def __repr__(self):
d = {None: b'', False: b'-', True: b'+'}[self._ascending]
s = stringutil.buildrepr(self._datarepr)
if not s:
l = self._list
# if _list has been built from a set, it might have a different
# order from one python implementation to another.
# We fallback to the sorted version for a stable output.
if self._ascending is not None:
l = self._asclist
s = pycompat.byterepr(l)
return b'<%s%s %s>' % (_typename(self), d, s)
class filteredset(abstractsmartset):
"""Duck type for baseset class which iterates lazily over the revisions in
the subset and contains a function which tests for membership in the
revset
"""
def __init__(self, subset, condition=lambda x: True, condrepr=None):
"""
condition: a function that decide whether a revision in the subset
belongs to the revset or not.
condrepr: a tuple of (format, obj, ...), a function or an object that
provides a printable representation of the given condition.
"""
self._subset = subset
self._condition = condition
self._condrepr = condrepr
def __contains__(self, x):
return x in self._subset and self._condition(x)
def __iter__(self):
return self._iterfilter(self._subset)
def _iterfilter(self, it):
cond = self._condition
for x in it:
if cond(x):
yield x
@property
def fastasc(self):
it = self._subset.fastasc
if it is None:
return None
return lambda: self._iterfilter(it())
@property
def fastdesc(self):
it = self._subset.fastdesc
if it is None:
return None
return lambda: self._iterfilter(it())
def __nonzero__(self):
fast = None
candidates = [
self.fastasc if self.isascending() else None,
self.fastdesc if self.isdescending() else None,
self.fastasc,
self.fastdesc,
]
for candidate in candidates:
if candidate is not None:
fast = candidate
break
if fast is not None:
it = fast()
else:
it = self
for r in it:
return True
return False
__bool__ = __nonzero__
def __len__(self):
# Basic implementation to be changed in future patches.
# until this gets improved, we use generator expression
# here, since list comprehensions are free to call __len__ again
# causing infinite recursion
l = baseset(r for r in self)
return len(l)
def sort(self, reverse=False):
self._subset.sort(reverse=reverse)
def reverse(self):
self._subset.reverse()
def isascending(self):
return self._subset.isascending()
def isdescending(self):
return self._subset.isdescending()
def istopo(self):
return self._subset.istopo()
def first(self):
for x in self:
return x
return None
def last(self):
it = None
if self.isascending():
it = self.fastdesc
elif self.isdescending():
it = self.fastasc
if it is not None:
for x in it():
return x
return None # empty case
else:
x = None
for x in self:
pass
return x
@encoding.strmethod
def __repr__(self):
xs = [pycompat.byterepr(self._subset)]
s = stringutil.buildrepr(self._condrepr)
if s:
xs.append(s)
return b'<%s %s>' % (_typename(self), b', '.join(xs))
def _iterordered(ascending, iter1, iter2):
"""produce an ordered iteration from two iterators with the same order
The ascending is used to indicated the iteration direction.
"""
choice = max
if ascending:
choice = min
val1 = None
val2 = None
try:
# Consume both iterators in an ordered way until one is empty
while True:
if val1 is None:
val1 = next(iter1)
if val2 is None:
val2 = next(iter2)
n = choice(val1, val2)
yield n
if val1 == n:
val1 = None
if val2 == n:
val2 = None
except StopIteration:
# Flush any remaining values and consume the other one
it = iter2
if val1 is not None:
yield val1
it = iter1
elif val2 is not None:
# might have been equality and both are empty
yield val2
for val in it:
yield val
class addset(abstractsmartset):
"""Represent the addition of two sets
Wrapper structure for lazily adding two structures without losing much
performance on the __contains__ method
If the ascending attribute is set, that means the two structures are
ordered in either an ascending or descending way. Therefore, we can add
them maintaining the order by iterating over both at the same time
>>> xs = baseset([0, 3, 2])
>>> ys = baseset([5, 2, 4])
>>> rs = addset(xs, ys)
>>> bool(rs), 0 in rs, 1 in rs, 5 in rs, rs.first(), rs.last()
(True, True, False, True, 0, 4)
>>> rs = addset(xs, baseset([]))
>>> bool(rs), 0 in rs, 1 in rs, rs.first(), rs.last()
(True, True, False, 0, 2)
>>> rs = addset(baseset([]), baseset([]))
>>> bool(rs), 0 in rs, rs.first(), rs.last()
(False, False, None, None)
iterate unsorted:
>>> rs = addset(xs, ys)
>>> # (use generator because pypy could call len())
>>> list(x for x in rs) # without _genlist
[0, 3, 2, 5, 4]
>>> assert not rs._genlist
>>> len(rs)
5
>>> [x for x in rs] # with _genlist
[0, 3, 2, 5, 4]
>>> assert rs._genlist
iterate ascending:
>>> rs = addset(xs, ys, ascending=True)
>>> # (use generator because pypy could call len())
>>> list(x for x in rs), list(x for x in rs.fastasc()) # without _asclist
([0, 2, 3, 4, 5], [0, 2, 3, 4, 5])
>>> assert not rs._asclist
>>> len(rs)
5
>>> [x for x in rs], [x for x in rs.fastasc()]
([0, 2, 3, 4, 5], [0, 2, 3, 4, 5])
>>> assert rs._asclist
iterate descending:
>>> rs = addset(xs, ys, ascending=False)
>>> # (use generator because pypy could call len())
>>> list(x for x in rs), list(x for x in rs.fastdesc()) # without _asclist
([5, 4, 3, 2, 0], [5, 4, 3, 2, 0])
>>> assert not rs._asclist
>>> len(rs)
5
>>> [x for x in rs], [x for x in rs.fastdesc()]
([5, 4, 3, 2, 0], [5, 4, 3, 2, 0])
>>> assert rs._asclist
iterate ascending without fastasc:
>>> rs = addset(xs, generatorset(ys), ascending=True)
>>> assert rs.fastasc is None
>>> [x for x in rs]
[0, 2, 3, 4, 5]
iterate descending without fastdesc:
>>> rs = addset(generatorset(xs), ys, ascending=False)
>>> assert rs.fastdesc is None
>>> [x for x in rs]
[5, 4, 3, 2, 0]
"""
def __init__(self, revs1, revs2, ascending=None):
self._r1 = revs1
self._r2 = revs2
self._iter = None
self._ascending = ascending
self._genlist = None
self._asclist = None
def __len__(self):
return len(self._list)
def __nonzero__(self):
return bool(self._r1) or bool(self._r2)
__bool__ = __nonzero__
@util.propertycache
def _list(self):
if not self._genlist:
self._genlist = baseset(iter(self))
return self._genlist
def __iter__(self):
"""Iterate over both collections without repeating elements
If the ascending attribute is not set, iterate over the first one and
then over the second one checking for membership on the first one so we
dont yield any duplicates.
If the ascending attribute is set, iterate over both collections at the
same time, yielding only one value at a time in the given order.
"""
if self._ascending is None:
if self._genlist:
return iter(self._genlist)
def arbitraryordergen():
for r in self._r1:
yield r
inr1 = self._r1.__contains__
for r in self._r2:
if not inr1(r):
yield r
return arbitraryordergen()
# try to use our own fast iterator if it exists
self._trysetasclist()
if self._ascending:
attr = b'fastasc'
else:
attr = b'fastdesc'
it = getattr(self, attr)
if it is not None:
return it()
# maybe half of the component supports fast
# get iterator for _r1
iter1 = getattr(self._r1, attr)
if iter1 is None:
# let's avoid side effect (not sure it matters)
iter1 = iter(sorted(self._r1, reverse=not self._ascending))
else:
iter1 = iter1()
# get iterator for _r2
iter2 = getattr(self._r2, attr)
if iter2 is None:
# let's avoid side effect (not sure it matters)
iter2 = iter(sorted(self._r2, reverse=not self._ascending))
else:
iter2 = iter2()
return _iterordered(self._ascending, iter1, iter2)
def _trysetasclist(self):
"""populate the _asclist attribute if possible and necessary"""
if self._genlist is not None and self._asclist is None:
self._asclist = sorted(self._genlist)
@property
def fastasc(self):
self._trysetasclist()
if self._asclist is not None:
return self._asclist.__iter__
iter1 = self._r1.fastasc
iter2 = self._r2.fastasc
if None in (iter1, iter2):
return None
return lambda: _iterordered(True, iter1(), iter2())
@property
def fastdesc(self):
self._trysetasclist()
if self._asclist is not None:
return self._asclist.__reversed__
iter1 = self._r1.fastdesc
iter2 = self._r2.fastdesc
if None in (iter1, iter2):
return None
return lambda: _iterordered(False, iter1(), iter2())
def __contains__(self, x):
return x in self._r1 or x in self._r2
def sort(self, reverse=False):
"""Sort the added set
For this we use the cached list with all the generated values and if we
know they are ascending or descending we can sort them in a smart way.
"""
self._ascending = not reverse
def isascending(self):
return self._ascending is not None and self._ascending
def isdescending(self):
return self._ascending is not None and not self._ascending
def istopo(self):
# not worth the trouble asserting if the two sets combined are still
# in topographical order. Use the sort() predicate to explicitly sort
# again instead.
return False
def reverse(self):
if self._ascending is None:
self._list.reverse()
else:
self._ascending = not self._ascending
def first(self):
for x in self:
return x
return None
def last(self):
self.reverse()
val = self.first()
self.reverse()
return val
@encoding.strmethod
def __repr__(self):
d = {None: b'', False: b'-', True: b'+'}[self._ascending]
return b'<%s%s %r, %r>' % (_typename(self), d, self._r1, self._r2)
class generatorset(abstractsmartset):
"""Wrap a generator for lazy iteration
Wrapper structure for generators that provides lazy membership and can
be iterated more than once.
When asked for membership it generates values until either it finds the
requested one or has gone through all the elements in the generator
>>> xs = generatorset([0, 1, 4], iterasc=True)
>>> assert xs.last() == xs.last()
>>> xs.last() # cached
4
"""
def __new__(cls, gen, iterasc=None):
if iterasc is None:
typ = cls
elif iterasc:
typ = _generatorsetasc
else:
typ = _generatorsetdesc
return super(generatorset, cls).__new__(typ)
def __init__(self, gen, iterasc=None):
"""
gen: a generator producing the values for the generatorset.
"""
self._gen = gen
self._asclist = None
self._cache = {}
self._genlist = []
self._finished = False
self._ascending = True
def __nonzero__(self):
# Do not use 'for r in self' because it will enforce the iteration
# order (default ascending), possibly unrolling a whole descending
# iterator.
if self._genlist:
return True
for r in self._consumegen():
return True
return False
__bool__ = __nonzero__
def __contains__(self, x):
if x in self._cache:
return self._cache[x]
# Use new values only, as existing values would be cached.
for l in self._consumegen():
if l == x:
return True
self._cache[x] = False
return False
def __iter__(self):
if self._ascending:
it = self.fastasc
else:
it = self.fastdesc
if it is not None:
return it()
# we need to consume the iterator
for x in self._consumegen():
pass
# recall the same code
return iter(self)
def _iterator(self):
if self._finished:
return iter(self._genlist)
# We have to use this complex iteration strategy to allow multiple
# iterations at the same time. We need to be able to catch revision
# removed from _consumegen and added to genlist in another instance.
#
# Getting rid of it would provide an about 15% speed up on this
# iteration.
genlist = self._genlist
nextgen = self._consumegen()
_len, _next = len, next # cache global lookup
def gen():
i = 0
while True:
if i < _len(genlist):
yield genlist[i]
else:
try:
yield _next(nextgen)
except StopIteration:
return
i += 1
return gen()
def _consumegen(self):
cache = self._cache
genlist = self._genlist.append
for item in self._gen:
cache[item] = True
genlist(item)
yield item
if not self._finished:
self._finished = True
asc = self._genlist[:]
asc.sort()
self._asclist = asc
self.fastasc = asc.__iter__
self.fastdesc = asc.__reversed__
def __len__(self):
for x in self._consumegen():
pass
return len(self._genlist)
def sort(self, reverse=False):
self._ascending = not reverse
def reverse(self):
self._ascending = not self._ascending
def isascending(self):
return self._ascending
def isdescending(self):
return not self._ascending
def istopo(self):
# not worth the trouble asserting if the two sets combined are still
# in topographical order. Use the sort() predicate to explicitly sort
# again instead.
return False
def first(self):
if self._ascending:
it = self.fastasc
else:
it = self.fastdesc
if it is None:
# we need to consume all and try again
for x in self._consumegen():
pass
return self.first()
return next(it(), None)
def last(self):
if self._ascending:
it = self.fastdesc
else:
it = self.fastasc
if it is None:
# we need to consume all and try again
for x in self._consumegen():
pass
return self.last()
return next(it(), None)
@encoding.strmethod
def __repr__(self):
d = {False: b'-', True: b'+'}[self._ascending]
return b'<%s%s>' % (_typename(self), d)
class _generatorsetasc(generatorset):
"""Special case of generatorset optimized for ascending generators."""
fastasc = generatorset._iterator
def __contains__(self, x):
if x in self._cache:
return self._cache[x]
# Use new values only, as existing values would be cached.
for l in self._consumegen():
if l == x:
return True
if l > x:
break
self._cache[x] = False
return False
class _generatorsetdesc(generatorset):
"""Special case of generatorset optimized for descending generators."""
fastdesc = generatorset._iterator
def __contains__(self, x):
if x in self._cache:
return self._cache[x]
# Use new values only, as existing values would be cached.
for l in self._consumegen():
if l == x:
return True
if l < x:
break
self._cache[x] = False
return False
def spanset(repo, start=0, end=None):
"""Create a spanset that represents a range of repository revisions
start: first revision included the set (default to 0)
end: first revision excluded (last+1) (default to len(repo))
Spanset will be descending if `end` < `start`.
"""
if end is None:
end = len(repo)
ascending = start <= end
if not ascending:
start, end = end + 1, start + 1
return _spanset(start, end, ascending, repo.changelog.filteredrevs)
class _spanset(abstractsmartset):
"""Duck type for baseset class which represents a range of revisions and
can work lazily and without having all the range in memory
Note that spanset(x, y) behave almost like xrange(x, y) except for two
notable points:
- when x < y it will be automatically descending,
- revision filtered with this repoview will be skipped.
"""
def __init__(self, start, end, ascending, hiddenrevs):
self._start = start
self._end = end
self._ascending = ascending
self._hiddenrevs = hiddenrevs
def sort(self, reverse=False):
self._ascending = not reverse
def reverse(self):
self._ascending = not self._ascending
def istopo(self):
# not worth the trouble asserting if the two sets combined are still
# in topographical order. Use the sort() predicate to explicitly sort
# again instead.
return False
def _iterfilter(self, iterrange):
s = self._hiddenrevs
for r in iterrange:
if r not in s:
yield r
def __iter__(self):
if self._ascending:
return self.fastasc()
else:
return self.fastdesc()
def fastasc(self):
iterrange = pycompat.xrange(self._start, self._end)
if self._hiddenrevs:
return self._iterfilter(iterrange)
return iter(iterrange)
def fastdesc(self):
iterrange = pycompat.xrange(self._end - 1, self._start - 1, -1)
if self._hiddenrevs:
return self._iterfilter(iterrange)
return iter(iterrange)
def __contains__(self, rev):
hidden = self._hiddenrevs
return (self._start <= rev < self._end) and not (
hidden and rev in hidden
)
def __nonzero__(self):
for r in self:
return True
return False
__bool__ = __nonzero__
def __len__(self):
if not self._hiddenrevs:
return abs(self._end - self._start)
else:
count = 0
start = self._start
end = self._end
for rev in self._hiddenrevs:
if (end < rev <= start) or (start <= rev < end):
count += 1
return abs(self._end - self._start) - count
def isascending(self):
return self._ascending
def isdescending(self):
return not self._ascending
def first(self):
if self._ascending:
it = self.fastasc
else:
it = self.fastdesc
for x in it():
return x
return None
def last(self):
if self._ascending:
it = self.fastdesc
else:
it = self.fastasc
for x in it():
return x
return None
def _slice(self, start, stop):
if self._hiddenrevs:
# unoptimized since all hidden revisions in range has to be scanned
return super(_spanset, self)._slice(start, stop)
if self._ascending:
x = min(self._start + start, self._end)
y = min(self._start + stop, self._end)
else:
x = max(self._end - stop, self._start)
y = max(self._end - start, self._start)
return _spanset(x, y, self._ascending, self._hiddenrevs)
@encoding.strmethod
def __repr__(self):
d = {False: b'-', True: b'+'}[self._ascending]
return b'<%s%s %d:%d>' % (_typename(self), d, self._start, self._end)
class fullreposet(_spanset):
"""a set containing all revisions in the repo
This class exists to host special optimization and magic to handle virtual
revisions such as "null".
"""
def __init__(self, repo):
super(fullreposet, self).__init__(
0, len(repo), True, repo.changelog.filteredrevs
)
def __and__(self, other):
"""As self contains the whole repo, all of the other set should also be
in self. Therefore `self & other = other`.
This boldly assumes the other contains valid revs only.
"""
# other not a smartset, make is so
if not util.safehasattr(other, b'isascending'):
# filter out hidden revision
# (this boldly assumes all smartset are pure)
#
# `other` was used with "&", let's assume this is a set like
# object.
other = baseset(other - self._hiddenrevs)
other.sort(reverse=self.isdescending())
return other