revlog: update data file record before index rename
author Joerg Sonnenberger <joerg@bec.de>
Wed, 19 May 2021 13:46:19 +0200
changeset 47285 46b828b85eb7
parent 47284 21ed126bab53
child 47286 18415fc918a1
When migrating from inline to non-inline data storage, the data file is
recorded initially as zero sized so that it is removed on failure. But
the record has to be updated before the index is renamed, otherwise
data is lost on rollback.

Differential Revision: https://phab.mercurial-scm.org/D10725
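
The ordering problem described in the commit message can be sketched with a toy model. This is not Mercurial's transaction API: the `rollback` and `abort_after_index_replace` helpers, the in-memory `files` and `journal` dictionaries, and the 4096-byte new revision are all hypothetical. The model assumes a journal offset of 0 means "remove the file on rollback", and it does not model the later update of the index record. The sizes 1174, 1046 and 128 are taken from the test output in this changeset.

```python
# Toy model only: NOT Mercurial's transaction implementation.

def rollback(files, journal):
    """Restore every journaled file to its last recorded size.

    Assumption: a recorded offset of 0 means the file is removed.
    """
    for path, offset in journal.items():
        if offset == 0:
            files.pop(path, None)
        elif path in files:
            files[path] = files[path][:offset]
    return files


def abort_after_index_replace(update_data_record_first):
    """Simulate an abort right after the index file has been replaced."""
    old_data = b"d" * 1046
    # Inline revlog: 128 bytes of index entries followed by the data.
    files = {"file.i": b"i" * 128 + old_data}
    journal = {"file.i": 1174}

    # Start of the split: the data file is journaled as zero sized.
    journal["file.d"] = 0
    # Hypothetical: old data plus 4096 bytes for the incoming revision.
    files["file.d"] = old_data + b"n" * 4096

    if update_data_record_first:
        # The ordering this changeset introduces: record the real
        # pre-transaction data size before the index is replaced.
        journal["file.d"] = 1046

    # The index is replaced; it now holds entries only (size illustrative).
    files["file.i"] = b"i" * 192

    # The transaction aborts before any further journal update.
    return rollback(files, journal)


if __name__ == "__main__":
    lost = abort_after_index_replace(update_data_record_first=False)
    kept = abort_after_index_replace(update_data_record_first=True)
    print("file.d" in lost)     # data file removed: its content is gone
    print(len(kept["file.d"]))  # data file truncated to its old size
```

In the first run the non-inline index survives but the data it refers to has been removed. In the second run the data file is cut back to the 1046 bytes that existed before the transaction.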
mercurial/revlog.py
tests/test-transaction-rollback-on-revlog-split.t
--- a/mercurial/revlog.py	Tue May 18 02:35:27 2021 +0200
+++ b/mercurial/revlog.py	Wed May 19 13:46:19 2021 +0200
@@ -2190,6 +2190,13 @@
                     fp.write(e)
                 if self._docket is not None:
                     self._docket.index_end = fp.tell()
+
+                # There is a small transactional race here. If the rename of
+                # the index fails, we should remove the datafile. It is more
+                # important to ensure that the data file is not truncated
+                # when the index is replaced as otherwise data is lost.
+                tr.replace(self._datafile, self.start(trindex))
+
                 # the temp file replace the real index when we exit the context
                 # manager
 
--- a/tests/test-transaction-rollback-on-revlog-split.t	Tue May 18 02:35:27 2021 +0200
+++ b/tests/test-transaction-rollback-on-revlog-split.t	Wed May 19 13:46:19 2021 +0200
@@ -1,7 +1,27 @@
 Test correctness of revlog inline -> non-inline transition
 ----------------------------------------------------------
 
+Helper extension to intercept renames.
+
+  $ cat > $TESTTMP/intercept_rename.py << EOF
+  > import os
+  > import sys
+  > from mercurial import extensions, util
+  > 
+  > def extsetup(ui):
+  >     def close(orig, *args, **kwargs):
+  >         path = args[0]._atomictempfile__name
+  >         if path.endswith(b'/.hg/store/data/file.i'):
+  >             os._exit(80)
+  >         return orig(*args, **kwargs)
+  >     extensions.wrapfunction(util.atomictempfile, 'close', close)
+  > EOF
+
+
 Test offset computation to correctly factor in the index entries themselve.
+Also test that the new data size has the correct size if the transaction is aborted
+after the index has been replaced.
+
 Test repo has one small, one moderate and one big change. The clone has
 the small and moderate change and will transition to non-inline storage when
 adding the big change.
@@ -18,6 +38,12 @@
 
   $ hg clone -r 1 troffset-computation troffset-computation-copy --config format.revlog-compression=none -q
   $ cd troffset-computation-copy
+
+Reference size:
+
+  $ f -s .hg/store/data/file*
+  .hg/store/data/file.i: size=1174
+
   $ cat > .hg/hgrc <<EOF
   > [hooks]
   > pretxnchangegroup = python:$TESTDIR/helper-killhook.py:killme
@@ -33,3 +59,42 @@
 #endif
   $ cat .hg/store/journal | tr -s '\000' ' ' | grep data/file | tail -1
   data/file.i 128
+
+The first file.i entry should match the size above.
+The first file.d entry is the temporary record during the split,
+the second entry after the split happened. The sum of the second file.d
+and the second file.i entry should match the first file.i entry.
+
+  $ cat .hg/store/journal | tr -s '\000' ' ' | grep data/file
+  data/file.i 1174
+  data/file.d 0
+  data/file.d 1046
+  data/file.i 128
+  $ cd ..
+
+Now retry the same but intercept the rename of the index and check that
+the journal does not contain the new index size. This demonstrates the edge case
+where the data file is left as garbage.
+
+  $ hg clone -r 1 troffset-computation troffset-computation-copy2 --config format.revlog-compression=none -q
+  $ cd troffset-computation-copy2
+  $ cat > .hg/hgrc <<EOF
+  > [extensions]
+  > intercept_rename = $TESTTMP/intercept_rename.py
+  > [hooks]
+  > pretxnchangegroup = python:$TESTDIR/helper-killhook.py:killme
+  > EOF
+#if chg
+  $ hg pull ../troffset-computation
+  pulling from ../troffset-computation
+  [255]
+#else
+  $ hg pull ../troffset-computation
+  pulling from ../troffset-computation
+  [80]
+#endif
+  $ cat .hg/store/journal | tr -s '\000' ' ' | grep data/file
+  data/file.i 1174
+  data/file.d 0
+  data/file.d 1046
+  $ cd ..