Mercurial > hg
changeset 20828:3210b7930899
revset: improve performance of _generatorset.__contains__ (issue 4201)
_generatorset.__contains__ and __contains__ from child classes were
calling into __iter__ to look for values. Since all
previously-encountered values from the generator were cached and checked
in __contains__ before this iteration, __contains__ was effectively
performing iteration busy work which could lead to an explosion of
redundant work.
This patch changes __contains__ to be more intelligent. Instead of
looking at all values via __iter__, __contains__ will instead go
straight to "new" values from the underlying generator.
On a clone of the Firefox repository with around 200,000 changesets,
this patch decreases the execution time of the revset '::(200067)::'
from ~100s to ~4s on the author's machine. Rebase operations (which use
the aforementioned revset), speed up accordingly.
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Mon, 24 Mar 2014 20:00:18 -0700 |
parents | ca5dd216cb62 |
children | 9a09a625bc93 |
files | mercurial/revset.py |
diffstat | 1 files changed, 12 insertions(+), 9 deletions(-) [+] |
line wrap: on
line diff
--- a/mercurial/revset.py Sat Mar 22 23:39:51 2014 +0900 +++ b/mercurial/revset.py Mon Mar 24 20:00:18 2014 -0700 @@ -2630,8 +2630,8 @@ if x in self._cache: return self._cache[x] - # Use __iter__ which caches values and stores them into self._genlist - for l in self: + # Use new values only, as existing values would be cached. + for l in self._consumegen(): if l == x: return True @@ -2645,17 +2645,18 @@ # started over the generatorset. for l in self._genlist: yield l - else: - # Starting iteration over the generatorset. - self._iterated = True + + for item in self._consumegen(): + yield item + + def _consumegen(self): + self._iterated = True for item in self._gen: self._cache[item] = True self._genlist.append(item) yield item - # Iteration over the generator has finished. Whole value list should be - # cached in self._genlist self._finished = True def set(self): @@ -2680,7 +2681,8 @@ if x in self._cache: return self._cache[x] - for l in self: + # Use new values only, as existing values would be cached. + for l in self._consumegen(): if l == x: return True if l > x: @@ -2702,7 +2704,8 @@ if x in self._cache: return self._cache[x] - for l in self: + # Use new values only, as existing values would be cached. + for l in self._consumegen(): if l == x: return True if l < x: