revlog: use single file handle when de-inlining revlog

_getsegmentforrevs() will eventually call into _datareadfp() to
resolve a file handle to read revision data. If no file handle
is passed into _getsegmentforrevs(), it opens a new one.

Explicit is better than implicit.

This commit changes _enforceinlinesize() to open a file handle
explicitly when converting inline revlogs to split revlogs and
to pass this file handle into _getsegmentforrevs().

I haven't measured, but this change should improve performance,
as we no longer reopen the revlog for reading for every revision
in the revlog when it is converted from inline to split. Instead,
we open it at most once and use it for the duration of the
operation. That said, I /think/ the chunk cache may already reduce
the number of file opens required.
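As a rough illustration of the open-per-call versus open-once pattern
described above, here is a minimal sketch. ToyRevlog, readsegment() and
count_opens() are hypothetical names invented for this example; they are
not Mercurial's API, and the toy has no chunk cache, so it shows the
worst case only.

```python
import contextlib
import os
import tempfile


class ToyRevlog:
    """Hypothetical stand-in for an inline revlog; not Mercurial's API."""

    def __init__(self, path, segments):
        self.path = path
        self.opens = 0      # how many times the file was opened for reading
        self.offsets = []   # (offset, length) for each revision
        pos = 0
        with open(path, 'wb') as fh:
            for seg in segments:
                fh.write(seg)
                self.offsets.append((pos, len(seg)))
                pos += len(seg)

    def _open(self):
        self.opens += 1
        return open(self.path, 'rb')

    def readsegment(self, rev, fh=None):
        """Read one revision, opening a new handle only if none is passed."""
        offset, length = self.offsets[rev]
        if fh is None:
            ctx = self._open()                 # implicit: open and close here
        else:
            ctx = contextlib.nullcontext(fh)   # explicit: caller owns the handle
        with ctx as f:
            f.seek(offset)
            return f.read(length)


def count_opens(reuse):
    """Read every revision and report the data and the number of opens."""
    with tempfile.TemporaryDirectory() as tmp:
        log = ToyRevlog(os.path.join(tmp, 'toy.i'), [b'aa', b'bbb', b'c'])
        if reuse:
            with log._open() as fh:
                data = [log.readsegment(r, fh) for r in range(3)]
        else:
            data = [log.readsegment(r) for r in range(3)]
        return data, log.opens
```

In this toy, count_opens(False) opens the file once per revision, while
count_opens(True) opens it once for the whole loop; both return the same
data.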

Differential Revision: https://phab.mercurial-scm.org/D5265

--- a/mercurial/revlog.py Tue Nov 13 18:44:09 2018 +0300
+++ b/mercurial/revlog.py Tue Oct 30 16:50:05 2018 -0700
@@ -1732,9 +1732,9 @@
             fp.flush()
             fp.close()
 
-        with self._datafp('w') as df:
+        with self._indexfp('r') as ifh, self._datafp('w') as dfh:
             for r in self:
-                df.write(self._getsegmentforrevs(r, r)[1])
+                dfh.write(self._getsegmentforrevs(r, r, df=ifh)[1])
 
         with self._indexfp('w') as fp:
             self.version &= ~FLAG_INLINE_DATA