changeset 7883:c63c30ae9e39

revlog: faster hash computation when one of the parent node is null Because we often compute sha1(nullid), it's interesting to copy a precomputed hash of nullid instead of computing everytime the same hash. Similarly, when one of the parents is null, we can avoid a < comparison (sort). Overall, this change adds a string equality comparison on each hash() call, but when p2 is null, we drop one string < comparison, and copy a hash instead of computing it. Since it is common to have revisions with only one parent, this change makes hash() 25% faster when cloning a big repository.
author Nicolas Dumazet <nicdumz.commits@gmail.com>
date Mon, 23 Mar 2009 15:32:29 +0100
parents 8d78fc991b71
children 38de4b36bcdd
files mercurial/revlog.py
diffstat 1 files changed, 13 insertions(+), 4 deletions(-) [+]
line wrap: on
line diff
--- a/mercurial/revlog.py	Mon Mar 23 15:36:30 2009 +0100
+++ b/mercurial/revlog.py	Mon Mar 23 15:32:29 2009 +0100
@@ -42,6 +42,8 @@
 def offset_type(offset, type):
     return long(long(offset) << 16 | type)
 
+nullhash = _sha(nullid)
+
 def hash(text, p1, p2):
     """generate a hash from the given text and its parent hashes
 
@@ -49,10 +51,17 @@
     in a manner that makes it easy to distinguish nodes with the same
     content in the revision graph.
     """
-    l = [p1, p2]
-    l.sort()
-    s = _sha(l[0])
-    s.update(l[1])
+    # As of now, if one of the parent node is null, p2 is null
+    if p2 == nullid:
+        # deep copy of a hash is faster than creating one
+        s = nullhash.copy()
+        s.update(p1)
+    else:
+        # none of the parent nodes are nullid
+        l = [p1, p2]
+        l.sort()
+        s = _sha(l[0])
+        s.update(l[1])
     s.update(text)
     return s.digest()