revset: make internal _list() expression remove duplicated revisions
This allows us to optimize chained 'or' operations into a single _list()
expression.
Unlike _intlist() or _hexlist(), it's difficult to remove duplicates by the
caller of _list() because different symbols can point to the same revision.
If the caller knows all symbols are unique, that probably means the revisions
or nodes are already known, in which case _intlist() or _hexlist() should be
used instead. So it makes sense to check for duplicates in the _list()
function itself.
Since '%ls' is no longer used in core, this won't cause a performance
regression.
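As a rough illustration of the behavior described above, here is a minimal
standalone sketch, not the actual revset code. The helper dedup_list(), the
resolve callback, and the symbol table are hypothetical stand-ins for
repo[t].rev() and the real repository. It assumes a revision is marked as seen
whether or not it is in subset.

```python
def dedup_list(symbols, resolve, subset):
    """Resolve each symbol to a revision and keep the first occurrence
    of every revision that is also a member of subset."""
    result = []
    seen = set()
    for sym in symbols:
        rev = resolve(sym)  # stand-in for repo[sym].rev()
        if rev in seen:
            # another symbol already resolved to this revision
            continue
        if rev in subset:
            result.append(rev)
        seen.add(rev)
    return result


# Hypothetical symbol table: 'tip' and 'default' name the same revision.
symtab = {'0': 0, '1': 1, 'tip': 2, 'default': 2}
revs = dedup_list(['tip', '0', 'default', '1'], symtab.__getitem__, {0, 1, 2})
print(revs)  # [2, 0, 1]
```

The caller cannot tell from the strings alone that 'tip' and 'default' are the
same revision, which is why the check has to happen after resolution.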
--- a/mercurial/revset.py Sun May 24 14:34:12 2015 +0900
+++ b/mercurial/revset.py Sun May 24 14:49:41 2015 +0900
@@ -1920,9 +1920,18 @@
s = getstring(x, "internal error")
if not s:
return baseset()
- ls = [repo[r].rev() for r in s.split('\0')]
- s = subset
- return baseset([r for r in ls if r in s])
+ # remove duplicates here. it's difficult for caller to deduplicate sets
+ # because different symbols can point to the same rev.
+ ls = []
+ seen = set()
+ for t in s.split('\0'):
+ r = repo[t].rev()
+ if r in seen:
+ continue
+ if r in subset:
+ ls.append(r)
+ seen.add(r)
+ return baseset(ls)
# for internal use
def _intlist(repo, subset, x):