filelog: cmp: don't read data if hashes are identical (issue2273)
filelog.renamed() is an expensive call: when p1 == nullid, it has to read
the revision from the filelog. It is more efficient to compute the hash
first, and to bail out early if the computed hash matches the stored nodeid.
The 'samehashes' variable is not strictly necessary, but it aids comprehension.
--- a/mercurial/filelog.py Mon Jul 05 18:43:46 2010 +0900
+++ b/mercurial/filelog.py Mon Jul 05 19:49:54 2010 +0900
@@ -62,9 +62,18 @@
         returns True if text is different than what is stored.
         """
-        # for renames, we have to go the slow way
-        if text.startswith('\1\n') or self.renamed(node):
+        t = text
+        if text.startswith('\1\n'):
+            t = '\1\n\1\n' + text
+
+        samehashes = not revlog.revlog.cmp(self, node, t)
+        if samehashes:
+            return False
+
+        # renaming a file produces a different hash, even if the data
+        # remains unchanged. Check if it's the case (slow):
+        if self.renamed(node):
             t2 = self.read(node)
             return t2 != text
-        return revlog.revlog.cmp(self, node, text)
+        return True
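
To illustrate the hash-first ordering of the patch, here is a minimal standalone sketch. It is not Mercurial code. ToyFilelog, node_hash and the copymeta parameter are hypothetical names invented for this example, and it uses Python 3 bytes, whereas the patch targets Python 2 str. It assumes a nodeid is the SHA-1 of the two sorted parent ids followed by the stored text, and that copy metadata sits between two '\1\n' markers at the start of the stored text.

```python
import hashlib

NULLID = b"\0" * 20
META_MARKER = b"\x01\n"


def node_hash(text, p1=NULLID, p2=NULLID):
    """SHA-1 over the sorted parent ids followed by the text (assumed scheme)."""
    a, b = sorted([p1, p2])
    s = hashlib.sha1(a + b)
    s.update(text)
    return s.digest()


class ToyFilelog:
    """Hypothetical in-memory stand-in for a filelog; counts data reads."""

    def __init__(self):
        self._store = {}  # node -> (stored_text, p1, p2)
        self.reads = 0

    def add(self, text, copymeta=None, p1=NULLID, p2=NULLID):
        # Wrap the text in a metadata block if there is metadata, or if the
        # text itself starts with the marker and must be escaped.
        stored = text
        if copymeta is not None or text.startswith(META_MARKER):
            stored = META_MARKER + (copymeta or b"") + META_MARKER + text
        node = node_hash(stored, p1, p2)
        self._store[node] = (stored, p1, p2)
        return node

    def read(self, node):
        """Return the file text with any metadata block stripped."""
        self.reads += 1
        stored = self._store[node][0]
        if not stored.startswith(META_MARKER):
            return stored
        end = stored.index(META_MARKER, 2)
        return stored[end + 2:]

    def renamed(self, node):
        """Slow path: needs the stored data whenever p1 is the null id."""
        stored, p1, _ = self._store[node]
        if p1 != NULLID:
            return False
        self.reads += 1
        if not stored.startswith(META_MARKER):
            return False
        end = stored.index(META_MARKER, 2)
        return b"copy: " in stored[2:end]

    def cmp(self, node, text):
        """Return True if text differs from what is stored for node."""
        t = text
        if text.startswith(META_MARKER):
            # Stored form carries an empty metadata block in this case.
            t = META_MARKER + META_MARKER + text
        _, p1, p2 = self._store[node]
        if node_hash(t, p1, p2) == node:
            return False  # same hash: no data read needed
        if self.renamed(node):
            return self.read(node) != text
        return True


log = ToyFilelog()
plain = log.add(b"hello\n")
same = log.cmp(plain, b"hello\n")      # fast path, hashes match
reads_after_fast_path = log.reads
copied = log.add(b"hello\n", copymeta=b"copy: old.txt\n")
same_after_rename = log.cmp(copied, b"hello\n")  # slow path, data is read
```

In this sketch the unchanged, non-renamed file is answered from the hash alone, while the renamed revision still falls back to reading the data, which mirrors the order of checks introduced by the patch.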