tests/test-pull-pull-corruption.t
author Gregory Szorc <gregory.szorc@gmail.com>
Wed, 04 Apr 2018 21:27:02 -0700
changeset 37438 7b7ca9ba2de5
parent 34661 eb586ed5d8ce
child 39489 f1186c292d03
permissions -rw-r--r--
commands: don't violate storage abstractions in `manifest --all` Previously, we asked the store to emit its data files. For modern repos, this would use fncache to resolve the set of files then would stat() each file. For my copy of the mozilla-unified repository, this took 3.3-10s depending on the state of my filesystem cache to render 449,790 items. The previous behavior was a massive layering violation because it assumed tracked files would have specific filenames in specific directories. Alternate storage backends would violate this assumption. The new behavior scans the changelog entries for the set of files changed by each commit. It aggregates them into a set and then sorts and prints the result. This reliably takes ~16.3s on my machine. ~80% of the time is spent in zlib decompression. The performance regression is unfortunate. If we want to claw it back, we can create a proper storage API to query for the set of tracked files. I'm not opposed to doing that. But I'm in no hurry because I suspect ~0 people care about the performance of `hg manifest --all`. .. perf:: `hg manifest --all` is likely slower due to changing its implementation to respect storage interface boundaries. If you are impacted by this regression in a meaningful way, please make noise on the development mailing list and it can be dealt with. Differential Revision: https://phab.mercurial-scm.org/D3119

Corrupt an hg repo with two pulls.
create one repo with a long history

  $ hg init source1
  $ cd source1
  $ touch foo
  $ hg add foo
  $ for i in 1 2 3 4 5 6 7 8 9 10; do
  >     echo $i >> foo
  >     hg ci -m $i
  > done
  $ cd ..

create one repo with a shorter history

  $ hg clone -r 0 source1 source2
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  new changesets 495a0ec48aaf
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ cd source2
  $ echo a >> foo
  $ hg ci -m a
  $ cd ..

create a third repo to pull both other repos into it

  $ hg init corrupted
  $ cd corrupted

use a hook to make the second pull start while the first one is still running

  $ echo '[hooks]' >> .hg/hgrc
  $ echo 'prechangegroup = sleep 5' >> .hg/hgrc

start a pull...

  $ hg pull ../source1 > pull.out 2>&1 &

... and start another pull before the first one has finished

  $ sleep 1
  $ hg pull ../source2 2>/dev/null
  pulling from ../source2
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files (+1 heads)
  new changesets ca3c05af513e
  (run 'hg heads' to see heads, 'hg merge' to merge)
  $ cat pull.out
  pulling from ../source1
  requesting all changes
  adding changesets
  adding manifests
  adding file changes
  added 10 changesets with 10 changes to 1 files
  new changesets 495a0ec48aaf:1e7b6c812ca8
  (run 'hg update' to get a working copy)

see the result

  $ wait
  $ hg verify
  checking changesets
  checking manifests
  crosschecking files in changesets and manifests
  checking files
  1 files, 11 changesets, 11 total revisions

  $ cd ..