changeset 25341:9d6cc87bd507

revset: make internal _list() expression remove duplicated revisions

This allows us to optimize chained 'or' operations to a _list() expression.

Unlike _intlist() or _hexlist(), it's difficult for the caller of _list() to remove duplicates because different symbols can point to the same revision. If the caller knows all symbols are unique, that probably means the revisions or nodes are known, so _intlist() or _hexlist() should be used instead. It therefore makes sense for the _list() function itself to check for duplicates.

'%ls' is no longer used in core, so this won't cause a performance regression.

author Yuya Nishihara <yuya@tcha.org>
date Sun, 24 May 2015 14:49:41 +0900
parents 28800ab40395
children 5dde117269b6
files mercurial/revset.py
diffstat 1 files changed, 12 insertions(+), 3 deletions(-)
--- a/mercurial/revset.py	Sun May 24 14:34:12 2015 +0900
+++ b/mercurial/revset.py	Sun May 24 14:49:41 2015 +0900
@@ -1920,9 +1920,18 @@
     s = getstring(x, "internal error")
     if not s:
         return baseset()
-    ls = [repo[r].rev() for r in s.split('\0')]
-    s = subset
-    return baseset([r for r in ls if r in s])
+    # remove duplicates here. it's difficult for caller to deduplicate sets
+    # because different symbols can point to the same rev.
+    ls = []
+    seen = set()
+    for t in s.split('\0'):
+        r = repo[t].rev()
+        if r in seen:
+            continue
+        if r in subset:
+            ls.append(r)
+        seen.add(r)
+    return baseset(ls)
 
 # for internal use
 def _intlist(repo, subset, x):
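
The de-duplication loop in the hunk above can be tried outside Mercurial with a minimal sketch. The names `dedup_list`, `resolve`, and the `symbols` table are hypothetical stand-ins introduced here for illustration: `resolve` plays the role of `repo[t].rev()`, and a plain Python set plays the role of `subset`. This is not the code in `mercurial/revset.py`, only an approximation of its behavior under those assumptions.

```python
def dedup_list(resolve, subset, spec):
    """Resolve NUL-separated symbols to revisions, keeping first-seen order.

    Hypothetical stand-in for the loop in _list(): `resolve` maps a symbol
    to a revision number, `subset` is any container supporting `in`.
    """
    if not spec:
        return []
    revs = []
    seen = set()
    for symbol in spec.split('\0'):
        rev = resolve(symbol)
        if rev in seen:
            # A different symbol already resolved to this revision.
            continue
        if rev in subset:
            revs.append(rev)
        # Record the revision even when it is outside the subset, so a
        # later alias of it skips the membership test.
        seen.add(rev)
    return revs


# Hypothetical symbol table: 'tip', '2' and 'default' all name revision 2.
symbols = {'0': 0, '1': 1, '2': 2, 'tip': 2, 'default': 2}
spec = '\0'.join(['tip', '0', '2', '1', 'default'])
result = dedup_list(symbols.__getitem__, {0, 2}, spec)
print(result)  # [2, 0]
```

The caller cannot do this filtering on the symbol strings alone, because `'tip'`, `'2'` and `'default'` are distinct strings that only turn out to be the same revision after resolution.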