verify: allow the storage to signal when renames can be tested on `skipread`
This applies the new marker in the lfs handler to show it in action, and adds
the test mentioned at the beginning of the series to show that fulltext isn't
necessary in the LFS case.
The existing `skipread` isn't enough, because it is also set if an error occurs
reading the revlog data, or the data is censored. It could probably be cleared,
but then it technically violates the interface contract. That wouldn't matter
for the existing verify algorithm, but it isn't clear how that will change as
alternate storage support is added.
The flag is probably pretty revlog specific, given the comments in verify.py.
But there's already filelog specific stuff in there and I'm not sure what future
storage will bring, so I don't want to over-engineer this. Likewise, I'm not
sure that we want the verify method for each storage type to completely drive
the bus when it comes to detecting renames, so I don't want to go down the
rabbithole of having verifyintegrity() return metadata hints at this point.
Differential Revision: https://phab.mercurial-scm.org/D7713
--- a/hgext/lfs/wrapper.py Sun Dec 22 23:50:19 2019 -0500
+++ b/hgext/lfs/wrapper.py Mon Dec 23 01:12:20 2019 -0500
@@ -236,6 +236,10 @@
# the revlog.
if rl.opener.lfslocalblobstore.has(metadata.oid()):
skipflags &= ~revlog.REVIDX_EXTSTORED
+ elif skipflags & revlog.REVIDX_EXTSTORED:
+ # The wrapped method will set `skipread`, but there's enough local
+ # info to check renames.
+ state[b'safe_renamed'].add(node)
orig(rl, skipflags, state, node)
--- a/mercurial/interfaces/repository.py Sun Dec 22 23:50:19 2019 -0500
+++ b/mercurial/interfaces/repository.py Mon Dec 23 01:12:20 2019 -0500
@@ -878,7 +878,9 @@
If individual revisions cannot have their revision content resolved,
the method is expected to set the ``skipread`` key to a set of nodes
- that encountered problems.
+ that encountered problems. If set, the method can also add the node(s)
+ to ``safe_renamed`` in order to indicate nodes that may perform the
+ rename checks with currently accessible data.
The method yields objects conforming to the ``iverifyproblem``
interface.
--- a/mercurial/revlog.py Sun Dec 22 23:50:19 2019 -0500
+++ b/mercurial/revlog.py Mon Dec 23 01:12:20 2019 -0500
@@ -2874,6 +2874,7 @@
)
state[b'skipread'] = set()
+ state[b'safe_renamed'] = set()
for rev in self:
node = self.node(rev)
--- a/mercurial/verify.py Sun Dec 22 23:50:19 2019 -0500
+++ b/mercurial/verify.py Mon Dec 23 01:12:20 2019 -0500
@@ -529,6 +529,8 @@
else:
# Guard against implementations not setting this.
state[b'skipread'] = set()
+ state[b'safe_renamed'] = set()
+
for problem in fl.verifyintegrity(state):
if problem.node is not None:
linkrev = fl.linkrev(fl.rev(problem.node))
@@ -560,7 +562,7 @@
else:
del filenodes[f][n]
- if n in state[b'skipread']:
+ if n in state[b'skipread'] and n not in state[b'safe_renamed']:
continue
# check renames
--- a/tests/test-lfs.t Sun Dec 22 23:50:19 2019 -0500
+++ b/tests/test-lfs.t Mon Dec 23 01:12:20 2019 -0500
@@ -218,6 +218,15 @@
R large
$ hg commit -m 'renames'
+ $ hg cat -r . l -T '{rawdata}\n'
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e
+ size 39
+ x-hg-copy large
+ x-hg-copyrev 2c531e0992ff3107c511b53cb82a91b6436de8b2
+ x-is-binary 0
+
+
$ hg files -r . 'set:copied()'
l
s
@@ -796,27 +805,57 @@
$ test -f fromcorrupt/.hg/store/lfs/objects/66/100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e
[1]
-Verify will not try to download lfs blobs, if told not to process lfs content
+Verify will not try to download lfs blobs, if told not to process lfs content.
+The extension makes sure that the filelog.renamed() path is taken on a missing
+blob, and the output shows that it isn't fetched.
- $ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v --no-lfs
+ $ cat > $TESTTMP/lfsrename.py <<EOF
+ > from mercurial import (
+ > exthelper,
+ > )
+ >
+ > from hgext.lfs import (
+ > pointer,
+ > wrapper,
+ > )
+ >
+ > eh = exthelper.exthelper()
+ > uisetup = eh.finaluisetup
+ >
+ > @eh.wrapfunction(wrapper, b'filelogrenamed')
+ > def filelogrenamed(orig, orig1, self, node):
+ > ret = orig(orig1, self, node)
+ > if wrapper._islfs(self._revlog, node) and ret:
+ > rawtext = self._revlog.rawdata(node)
+ > metadata = pointer.deserialize(rawtext)
+ > print('lfs blob %s renamed %s -> %s'
+ > % (metadata[b'oid'], ret[0], self._revlog.filename))
+ > return ret
+ > EOF
+
+ $ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v --no-lfs \
+ > --config extensions.x=$TESTTMP/lfsrename.py
repository uses revlog format 1
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
lfs: found 22f66a3fc0b9bf3f012c814303995ec07099b3a9ce02a7af84b5970811074a3b in the local lfs store
+ lfs blob sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e renamed large -> l
checked 5 changesets with 10 changes to 4 files
Verify will not try to download lfs blobs, if told not to by the config option
$ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v \
- > --config verify.skipflags=8192
+ > --config verify.skipflags=8192 \
+ > --config extensions.x=$TESTTMP/lfsrename.py
repository uses revlog format 1
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
lfs: found 22f66a3fc0b9bf3f012c814303995ec07099b3a9ce02a7af84b5970811074a3b in the local lfs store
+ lfs blob sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e renamed large -> l
checked 5 changesets with 10 changes to 4 files
Verify will copy/link all lfs objects into the local store that aren't already