revlog: split the content verification of a node into a separate method
LFS will use this hook point to tune which nodes are skipped during verification.
In the future, this could also be used by LFS to indicate which nodes tagged
with `skipread` are simply in need of a blob fetch, so that they can be done in
a batch later. (Currently, `skipread` also indicates censored data and errors.)
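
As a rough illustration of that idea, an extension could wrap the new function
and record which skipped nodes only need a blob fetch. This is a sketch, not
the LFS implementation: `FakeRevlog`, `needs_blob_fetch` and the
`b'fetchlater'` state key are hypothetical names invented for the example. The
wrapper takes the original function as its first argument, in the style of
`extensions.wrapfunction`.

```python
# Sketch only: FakeRevlog, needs_blob_fetch and the b'fetchlater' key are
# hypothetical stand-ins, not part of Mercurial or the LFS extension.


def _verify_revision(rl, skipflags, state, node):
    """Same logic as the function introduced by this change."""
    if skipflags:
        state[b'skipread'].add(node)
    else:
        # Side-effect: read content and verify hash.
        rl.revision(node)


class FakeRevlog(object):
    """Minimal stand-in exposing only what the sketch needs."""

    def __init__(self, missing_blobs):
        self.missing_blobs = set(missing_blobs)
        self.read = []

    def revision(self, node):
        # A real revlog would read the content and check its hash here.
        self.read.append(node)

    def needs_blob_fetch(self, node):
        # Hypothetical query: is the blob for this node absent locally?
        return node in self.missing_blobs


def lfs_verify_revision(orig, rl, skipflags, state, node):
    """Wrapper that receives the wrapped function as ``orig``."""
    if skipflags and rl.needs_blob_fetch(node):
        # Remember the node so its blob can be fetched in a batch later.
        state.setdefault(b'fetchlater', set()).add(node)
    return orig(rl, skipflags, state, node)


def demo():
    rl = FakeRevlog(missing_blobs=[b'n2'])
    state = {b'skipread': set()}
    for node, flags in [(b'n1', 0), (b'n2', 1), (b'n3', 1)]:
        lfs_verify_revision(_verify_revision, rl, flags, state, node)
    return rl.read, state
```

Here `skipread` keeps its current meaning, and the extra set only narrows it
down to the nodes that a later batch fetch could resolve.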
Additionally, it could be used to cache the sha1 hash value for each blob so
that large blobs don't need to be re-read and re-hashed if they are used by
multiple nodes.
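
The caching idea could look roughly like the following. It is a minimal sketch
under stated assumptions: `BlobHashCache`, the `readblob` callable and the
blob ids are hypothetical, and nothing here exists in Mercurial or LFS today.

```python
import hashlib


class BlobHashCache(object):
    """Hypothetical cache mapping a blob id to its sha1 hex digest."""

    def __init__(self, readblob):
        # ``readblob`` is an assumed callable returning the blob's bytes.
        self._readblob = readblob
        self._digests = {}

    def sha1(self, oid):
        # Read and hash the blob only the first time it is requested.
        if oid not in self._digests:
            data = self._readblob(oid)
            self._digests[oid] = hashlib.sha1(data).hexdigest()
        return self._digests[oid]


def demo_cache():
    blobs = {b'oid1': b'large blob content'}
    reads = []

    def readblob(oid):
        reads.append(oid)
        return blobs[oid]

    cache = BlobHashCache(readblob)
    # Three nodes share the same blob; it should be read a single time.
    digests = [cache.sha1(b'oid1') for _node in (b'n1', b'n2', b'n3')]
    return digests, reads
```

With such a cache, verifying several nodes that reference one blob costs a
single read and a single hash computation.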
Differential Revision: https://phab.mercurial-scm.org/D7710
--- a/mercurial/revlog.py Sun Dec 22 00:47:33 2019 -0500
+++ b/mercurial/revlog.py Sun Dec 22 16:36:09 2019 -0500
@@ -148,6 +148,16 @@
     return int(int(offset) << 16 | type)
 
 
+def _verify_revision(rl, skipflags, state, node):
+    """Verify the integrity of the given revlog ``node`` while providing a hook
+    point for extensions to influence the operation."""
+    if skipflags:
+        state[b'skipread'].add(node)
+    else:
+        # Side-effect: read content and verify hash.
+        rl.revision(node)
+
+
 @attr.s(slots=True, frozen=True)
 class _revisioninfo(object):
     """Information about a revision that allows building its fulltext
@@ -2914,11 +2924,7 @@
                 if skipflags:
                     skipflags &= self.flags(rev)
 
-                if skipflags:
-                    state[b'skipread'].add(node)
-                else:
-                    # Side-effect: read content and verify hash.
-                    self.revision(node)
+                _verify_revision(self, skipflags, state, node)
 
                 l1 = self.rawsize(rev)
                 l2 = len(self.rawdata(node))