view tests/test-audit-subrepo.t @ 46607:e9901d01d135

revlog: add a mechanism to verify expected file position before appending If someone uses `hg debuglocks`, or some non-hg process writes to the .hg directory without respecting the locks, or if the repo's on a networked filesystem, it's possible for the revlog code to write out corrupted data. The form of this corruption can vary depending on what data was written and how that happened. We are in the "networked filesystem" case (though I've had users also do this to themselves with the "`hg debuglocks`" scenario), and most often see this with the changelog. What ends up happening is we produce two items (let's call them rev1 and rev2) in the .i file that have the same linkrev, baserev, and offset into the .d file, while the data in the .d file is appended properly. rev2's compressed_size is accurate for rev2, but when we go to decompress the data in the .d file, we use the offset that's recorded in the index file, which is the same as rev1, and attempt to decompress rev2.compressed_size bytes of rev1's data. This usually does not succeed. :) When using inline data, this also fails, though I haven't investigated why too closely. This shows up as a "patch decode" error. I believe what's happening there is that we're basically ignoring the offset field, getting the data properly, but since baserev != rev, it thinks this is a delta based on rev (instead of a full text) and can't actually apply it as such. For now, I'm going to make this an optional component and default it to entirely off. I may increase the default severity of this in the future, once I've enabled it for my users and we gain more experience with it. Luckily, most of my users have a versioned filesystem and can roll back to before the corruption has been written, it's just a hassle to do so and not everyone knows how (so it's a support burden). Users on other filesystems will not have that luxury, and this can cause them to have a corrupted repository that they are unlikely to know how to resolve, and they'll see this as a data-loss event. Refusing to create the corruption is a much better user experience. This mechanism is not perfect. There may be false-negatives (racy writes that are not detected). There should not be any false-positives (non-racy writes that are detected as such). This is not a mechanism that makes putting a repo on a networked filesystem "safe" or "supported", just *less* likely to cause corruption. Differential Revision: https://phab.mercurial-scm.org/D9952
author Kyle Lippincott <spectral@google.com>
date Wed, 03 Feb 2021 16:33:10 -0800
parents dc00324e80f4
children 35f1ecd84bd0
line wrap: on
line source

Test illegal name
-----------------

on commit:

  $ hg init hgname
  $ cd hgname
  $ mkdir sub
  $ hg init sub/.hg
  $ echo 'sub/.hg = sub/.hg' >> .hgsub
  $ hg ci -qAm 'add subrepo "sub/.hg"'
  abort: path 'sub/.hg' is inside nested repo 'sub'
  [255]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo "sub/.hg"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +sub/.hg = sub/.hg
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 sub/.hg
  > EOF
  $ cd ..

on clone (and update):

  $ hg clone -q hgname hgname2
  abort: path 'sub/.hg' is inside nested repo 'sub'
  [255]

Test absolute path
------------------

on commit:

  $ hg init absolutepath
  $ cd absolutepath
  $ hg init sub
  $ echo '/sub = sub' >> .hgsub
  $ hg ci -qAm 'add subrepo "/sub"'
  abort: path contains illegal component: /sub
  [255]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo "/sub"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +/sub = sub
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 /sub
  > EOF
  $ cd ..

on clone (and update):

  $ hg clone -q absolutepath absolutepath2
  abort: path contains illegal component: /sub
  [255]

Test root path
--------------

on commit:

  $ hg init rootpath
  $ cd rootpath
  $ hg init sub
  $ echo '/ = sub' >> .hgsub
  $ hg ci -qAm 'add subrepo "/"'
  abort: path ends in directory separator: /
  [255]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo "/"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +/ = sub
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 /
  > EOF
  $ cd ..

on clone (and update):

  $ hg clone -q rootpath rootpath2
  abort: path ends in directory separator: /
  [255]

Test empty path
---------------

on commit:

  $ hg init emptypath
  $ cd emptypath
  $ hg init sub
  $ echo '= sub' >> .hgsub
  $ hg ci -qAm 'add subrepo ""'
  config error at .hgsub:1: = sub
  [30]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo ""' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > += sub
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000
  > EOF
  $ cd ..

on clone (and update):

  $ hg clone -q emptypath emptypath2
  config error at .hgsub:1: = sub
  [30]

Test current path
-----------------

on commit:

  $ hg init currentpath
  $ cd currentpath
  $ hg init sub
  $ echo '. = sub' >> .hgsub
  $ hg ci -qAm 'add subrepo "."'
  abort: subrepo path contains illegal component: .
  [255]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo "."' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +.= sub
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 .
  > EOF
  $ cd ..

on clone (and update):

  $ hg clone -q currentpath currentpath2
  abort: subrepo path contains illegal component: .
  [255]

Test outer path
---------------

on commit:

  $ mkdir outerpath
  $ cd outerpath
  $ hg init main
  $ cd main
  $ hg init ../sub
  $ echo '../sub = ../sub' >> .hgsub
  $ hg ci -qAm 'add subrepo "../sub"'
  abort: path contains illegal component: ../sub
  [255]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo "../sub"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +../sub = ../sub
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 ../sub
  > EOF
  $ cd ..

on clone (and update):

  $ hg clone -q main main2
  abort: path contains illegal component: ../sub
  [255]
  $ cd ..

Test variable expansion
-----------------------

Subrepository paths shouldn't be expanded, but we fail to handle them
properly. Any local repository paths are expanded.

on commit:

  $ mkdir envvar
  $ cd envvar
  $ hg init main
  $ cd main
  $ hg init sub1
  $ cat <<'EOF' > sub1/hgrc
  > [hooks]
  > log = echo pwned
  > EOF
  $ hg -R sub1 ci -qAm 'add sub1 files'
  $ hg -R sub1 log -r. -T '{node}\n'
  39eb4b4d3e096527668784893a9280578a8f38b8
  $ echo '$SUB = sub1' >> .hgsub
  $ SUB=sub1 hg ci -qAm 'add subrepo "$SUB"'
  abort: subrepo path contains illegal component: $SUB
  [255]

prepare tampered repo (including the changes above as two commits):

  $ hg import --bypass -qm 'add subrepo "$SUB"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +$SUB = sub1
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 $SUB
  > EOF
  $ hg debugsetparents 0
  $ hg import --bypass -qm 'update subrepo "$SUB"' - <<'EOF'
  > diff --git a/.hgsubstate b/.hgsubstate
  > --- a/.hgsubstate
  > +++ b/.hgsubstate
  > @@ -1,1 +1,1 @@
  > -0000000000000000000000000000000000000000 $SUB
  > +39eb4b4d3e096527668784893a9280578a8f38b8 $SUB
  > EOF
  $ cd ..

on clone (and update) with various substitutions:

  $ hg clone -q main main2
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A main2
  .hg
  .hgsub
  .hgsubstate

  $ SUB=sub1 hg clone -q main main3
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A main3
  .hg
  .hgsub
  .hgsubstate

  $ SUB=sub2 hg clone -q main main4
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A main4
  .hg
  .hgsub
  .hgsubstate

on clone empty subrepo into .hg, then pull (and update), which at least fails:

  $ SUB=.hg hg clone -qr0 main main5
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A main5
  .hg
  .hgsub
  .hgsubstate
  $ test -d main5/.hg/.hg
  [1]
  $ SUB=.hg hg -R main5 pull -u
  pulling from $TESTTMP/envvar/main
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  new changesets 7a2f0e59146f
  .hgsubstate: untracked file differs
  abort: untracked files in working directory differ from files in requested revision
  [20]
  $ cat main5/.hg/hgrc | grep pwned
  [1]

on clone (and update) into .hg, which at least fails:

  $ SUB=.hg hg clone -q main main6
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A main6
  .hg
  .hgsub
  .hgsubstate
  $ cat main6/.hg/hgrc | grep pwned
  [1]

on clone (and update) into .hg/* subdir:

  $ SUB=.hg/foo hg clone -q main main7
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A main7
  .hg
  .hgsub
  .hgsubstate
  $ test -d main7/.hg/.hg
  [1]

on clone (and update) into outer tree:

  $ SUB=../out-of-tree-write hg clone -q main main8
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A main8
  .hg
  .hgsub
  .hgsubstate

on clone (and update) into e.g. $HOME, which doesn't work since subrepo paths
are concatenated prior to variable expansion:

  $ SUB="$TESTTMP/envvar/fakehome" hg clone -q main main9
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A main9 | wc -l
  \s*3 (re)

  $ ls
  main
  main2
  main3
  main4
  main5
  main6
  main7
  main8
  main9
  $ cd ..

Test tilde
----------

The leading tilde may be expanded to $HOME, but it can be a valid subrepo
path in theory. However, we want to prohibit it as there might be unsafe
handling of such paths.

on commit:

  $ hg init tilde
  $ cd tilde
  $ hg init './~'
  $ echo '~ = ~' >> .hgsub
  $ hg ci -qAm 'add subrepo "~"'
  abort: subrepo path contains illegal component: ~
  [255]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo "~"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +~ = ~
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 ~
  > EOF
  $ cd ..

on clone (and update):

  $ hg clone -q tilde tilde2
  abort: subrepo path contains illegal component: ~
  [255]

Test direct symlink traversal
-----------------------------

#if symlink

on commit:

  $ mkdir hgsymdir
  $ hg init hgsymdir/root
  $ cd hgsymdir/root
  $ ln -s ../out
  $ hg ci -qAm 'add symlink "out"'
  $ hg init ../out
  $ echo 'out = out' >> .hgsub
  $ hg ci -qAm 'add subrepo "out"'
  abort: subrepo 'out' traverses symbolic link
  [255]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo "out"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +out = out
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 out
  > EOF
  $ cd ../..

on clone (and update):

  $ mkdir hgsymdir2
  $ hg clone -q hgsymdir/root hgsymdir2/root
  abort: subrepo 'out' traverses symbolic link
  [255]
  $ ls hgsymdir2
  root

#endif

Test indirect symlink traversal
-------------------------------

#if symlink

on commit:

  $ mkdir hgsymin
  $ hg init hgsymin/root
  $ cd hgsymin/root
  $ ln -s ../out
  $ hg ci -qAm 'add symlink "out"'
  $ mkdir ../out
  $ hg init ../out/sub
  $ echo 'out/sub = out/sub' >> .hgsub
  $ hg ci -qAm 'add subrepo "out/sub"'
  abort: path 'out/sub' traverses symbolic link 'out'
  [255]

prepare tampered repo (including the commit above):

  $ hg import --bypass -qm 'add subrepo "out/sub"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +out/sub = out/sub
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 out/sub
  > EOF
  $ cd ../..

on clone (and update):

  $ mkdir hgsymin2
  $ hg clone -q hgsymin/root hgsymin2/root
  abort: path 'out/sub' traverses symbolic link 'out'
  [255]
  $ ls hgsymin2
  root

#endif

Test symlink traversal by variable expansion
--------------------------------------------

#if symlink

  $ FAKEHOME="$TESTTMP/envvarsym/fakehome"

on commit:

  $ mkdir envvarsym
  $ cd envvarsym
  $ hg init main
  $ cd main
  $ ln -s "`echo "$FAKEHOME" | sed 's|\(.\)/.*|\1|'`"
  $ hg ci -qAm 'add symlink to top-level system directory'

  $ hg init sub1
  $ echo pwned > sub1/pwned
  $ hg -R sub1 ci -qAm 'add sub1 files'
  $ hg -R sub1 log -r. -T '{node}\n'
  f40c9134ba1b6961e12f250868823f0092fb68a8
  $ echo '$SUB = sub1' >> .hgsub
  $ SUB="$FAKEHOME" hg ci -qAm 'add subrepo "$SUB"'
  abort: subrepo path contains illegal component: $SUB
  [255]

prepare tampered repo (including the changes above as two commits):

  $ hg import --bypass -qm 'add subrepo "$SUB"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +$SUB = sub1
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 $SUB
  > EOF
  $ hg debugsetparents 1
  $ hg import --bypass -qm 'update subrepo "$SUB"' - <<'EOF'
  > diff --git a/.hgsubstate b/.hgsubstate
  > --- a/.hgsubstate
  > +++ b/.hgsubstate
  > @@ -1,1 +1,1 @@
  > -0000000000000000000000000000000000000000 $SUB
  > +f40c9134ba1b6961e12f250868823f0092fb68a8 $SUB
  > EOF
  $ cd ..

on clone (and update) without fakehome directory:

  $ rm -fR "$FAKEHOME"
  $ SUB="$FAKEHOME" hg clone -q main main2
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ test -d "$FAKEHOME"
  [1]

on clone (and update) with empty fakehome directory:

  $ rm -fR "$FAKEHOME"
  $ mkdir "$FAKEHOME"
  $ SUB="$FAKEHOME" hg clone -q main main3
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls "$FAKEHOME"

on clone (and update) with non-empty fakehome directory:

  $ rm -fR "$FAKEHOME"
  $ mkdir "$FAKEHOME"
  $ touch "$FAKEHOME/a"
  $ SUB="$FAKEHOME" hg clone -q main main4
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls "$FAKEHOME"
  a

on clone empty subrepo with non-empty fakehome directory,
then pull (and update):

  $ rm -fR "$FAKEHOME"
  $ mkdir "$FAKEHOME"
  $ touch "$FAKEHOME/a"
  $ SUB="$FAKEHOME" hg clone -qr1 main main5
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls "$FAKEHOME"
  a
  $ test -d "$FAKEHOME/.hg"
  [1]
  $ SUB="$FAKEHOME" hg -R main5 pull -u
  pulling from $TESTTMP/envvarsym/main
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  new changesets * (glob)
  .hgsubstate: untracked file differs
  abort: untracked files in working directory differ from files in requested revision
  [20]
  $ ls "$FAKEHOME"
  a
  $ test -d "$FAKEHOME/.hg"
  [1]

on clone empty subrepo with hg-managed fakehome directory,
then pull (and update):

  $ rm -fR "$FAKEHOME"
  $ hg init "$FAKEHOME"
  $ touch "$FAKEHOME/a"
  $ hg -R "$FAKEHOME" ci -qAm 'add fakehome file'
  $ SUB="$FAKEHOME" hg clone -qr1 main main6
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A "$FAKEHOME"
  .hg
  a
  $ SUB="$FAKEHOME" hg -R main6 pull -u
  pulling from $TESTTMP/envvarsym/main
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  new changesets * (glob)
  .hgsubstate: untracked file differs
  abort: untracked files in working directory differ from files in requested revision
  [20]
  $ ls -A "$FAKEHOME"
  .hg
  a

on clone only symlink with hg-managed fakehome directory,
then pull (and update):

  $ rm -fR "$FAKEHOME"
  $ hg init "$FAKEHOME"
  $ touch "$FAKEHOME/a"
  $ hg -R "$FAKEHOME" ci -qAm 'add fakehome file'
  $ SUB="$FAKEHOME" hg clone -qr0 main main7
  $ ls -A "$FAKEHOME"
  .hg
  a
  $ SUB="$FAKEHOME" hg -R main7 pull -uf
  pulling from $TESTTMP/envvarsym/main
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 2 changesets with 3 changes to 2 files
  new changesets * (glob)
  abort: subrepo path contains illegal component: $SUB
  [255]
  $ ls -A "$FAKEHOME"
  .hg
  a

  $ cd ..

#endif

Test drive letter
-----------------

Windows has a weird relative path that can change the drive letter, which
should also be prohibited on Windows.

prepare tampered repo:

  $ hg init driveletter
  $ cd driveletter
  $ hg import --bypass -qm 'add subrepo "X:"' - <<'EOF'
  > diff --git a/.hgsub b/.hgsub
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsub
  > @@ -0,0 +1,1 @@
  > +X: = foo
  > diff --git a/.hgsubstate b/.hgsubstate
  > new file mode 100644
  > --- /dev/null
  > +++ b/.hgsubstate
  > @@ -0,0 +1,1 @@
  > +0000000000000000000000000000000000000000 X:
  > EOF
  $ cd ..

on clone (and update):

#if windows

  $ hg clone -q driveletter driveletter2
  abort: path contains illegal component: X:
  [255]

#else

  $ hg clone -q driveletter driveletter2
  $ ls -A driveletter2
  .hg
  .hgsub
  .hgsubstate
  X:

#endif