changelog: keep track of file end in appender (issue5444)
author Durham Goode <durham@fb.com>
Thu, 15 Dec 2016 11:00:18 -0800
changeset 30596 be520fe3a3e9
parent 30595 99bd5479d58b
child 30597 fa2d2c8ac398
Previously, changelog.appender.end() would compute the end of the file by joining all the currently appended data and checking the length. This is an O(n) operation. e240e914d226 introduced a seek call before every revlog write, which means we hit this O(n) behavior n times, making changelog writes during a pull O(n^2).

In our large repo, this caused pulling 100k commits to go from 17s to 130s. With this fix, it's back to 17s.
mercurial/changelog.py
--- a/mercurial/changelog.py	Thu Dec 15 11:14:00 2016 -0500
+++ b/mercurial/changelog.py	Thu Dec 15 11:00:18 2016 -0800
@@ -79,9 +79,10 @@
         self.fp = fp
         self.offset = fp.tell()
         self.size = vfs.fstat(fp).st_size
+        self._end = self.size
 
     def end(self):
-        return self.size + len("".join(self.data))
+        return self._end
     def tell(self):
         return self.offset
     def flush(self):
@@ -121,6 +122,7 @@
     def write(self, s):
         self.data.append(str(s))
         self.offset += len(s)
+        self._end += len(s)
 
 def _divertopener(opener, target):
     """build an opener that writes in 'target.a' instead of 'target'"""