tests/test-convert-darcs.t
author Paul Morelle <paul.morelle@octobus.net>
Tue, 05 Jun 2018 08:19:35 +0200
changeset 38718 f8762ea73e0d
parent 30296 ab929a174f7b
permissions -rw-r--r--
sparse-revlog: implement algorithm to write sparse delta chains (issue5480)

The classic behavior of revlog._isgooddeltainfo is to consider the span size
of the whole delta chain, and limit it to 4 * textlen.

Once sparse-revlog writing is allowed (and enforced with a requirement),
revlog._isgooddeltainfo uses the span of the largest chunk as the distance in
this check, instead of the span of the whole delta chain.

To compute the span of the largest chunk, we take a chain with the new
revision at the top of the revlog, slice it into chunks, and take the maximal
span of these chunks. The sparse read density is a parameter to the slicing,
which stops when the global read density reaches this threshold. For
instance, a density of 50% means that 2 of every 4 bytes read are actually
used to reconstruct the revision (the others are part of other chains).

This allows a new revision to be stored as a diff against another revision
anywhere in the history, instead of forcing it into the last 4 * textlen
bytes. The result is much better compression on repositories that have many
concurrent branches.

Here is a comparison between using deltas from current upstream
(aggressive-merge-deltas on by default) and deltas from a sparse-revlog.

Comparison of `.hg/store/` size:

    mercurial (6.74% merges):
        before:     46,831,873 bytes
        after:      46,795,992 bytes (no relevant change)
    pypy (8.30% merges):
        before:    333,524,651 bytes
        after:     308,417,511 bytes -8%
    netbeans (34.21% merges):
        before:  1,141,847,554 bytes
        after:   1,131,093,161 bytes -1%
    mozilla-central (4.84% merges):
        before:  2,344,248,850 bytes
        after:   2,328,459,258 bytes -1%
    large-private-repo-A (19.73% merges):
        before: 41,510,550,163 bytes
        after:   8,121,763,428 bytes -80%
    large-private-repo-B (23.77% merges):
        before: 58,702,221,709 bytes
        after:   8,351,588,828 bytes -76%

Comparison of `00manifest.d` size:

    mercurial (6.74% merges):
        before:      6,143,044 bytes
        after:       6,107,163 bytes
    pypy (8.30% merges):
        before:     52,941,780 bytes
        after:      27,834,082 bytes -48%
    netbeans (34.21% merges):
        before:    130,088,982 bytes
        after:     119,337,636 bytes -10%
    mozilla-central (4.84% merges):
        before:    215,096,339 bytes
        after:     199,496,863 bytes -8%
    large-private-repo-A (19.73% merges):
        before: 33,725,285,081 bytes
        after:     390,302,545 bytes -99%
    large-private-repo-B (23.77% merges):
        before: 49,457,701,645 bytes
        after:   1,366,752,187 bytes -97%

The better delta chains provide a performance boost in relevant repositories:

    pypy, bundling 1000 revisions:
        before: 1.670s
        after:  1.149s -31%

Unbundling got a bit slower, probably because the sparse algorithm is still
pure Python.

    pypy, unbundling 1000 revisions:
        before: 4.062s
        after:  4.507s +10%

Performance of bundle/unbundle in repositories with few concurrent branches
(e.g. mercurial) is unaffected.

No significant differences have been noticed when timing `hg push` and
`hg pull` locally. More stable timings are being gathered.

As with aggressive-merge-deltas, better deltas come with longer delta chains,
and longer chains have a performance impact. For example, the length of the
chain needed to get the manifest of pypy's tip moves from 82 items to 1929
items. This moves the restore time from 3.88ms to 11.3ms.

Delta chain length is an independent issue that also affects repositories
without this change. It will be dealt with independently.

No significant differences have been observed on repositories where
`sparse-revlog` does not have much effect (mercurial, unity, netbeans).

On pypy, small differences have been observed on some operations affected by
delta chain building and retrieval.

    pypy, perfmanifest:
        before: 0.006162s
        after:  0.017899s +190%
    pypy, commit:
        before: 0.382
        after:  0.376 -1%
    pypy, status:
        before: 0.157
        after:  0.168 +7%

More comprehensive and stable timing comparisons are in progress.
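
The difference between the classic span check and the sparse one described in this commit message can be illustrated with a small Python sketch. This is a simplified model, not Mercurial's revlog code: the `(start, length)` chunk representation and the names `span`, `density`, `slice_chain`, and `is_good_span` are hypothetical, and the slicing strategy (split at the largest gap until the density target is met) is an assumption chosen for clarity. Only the `4 * textlen` limit and the meaning of read density come from the text above.

```python
# Simplified illustration only; NOT Mercurial's actual revlog implementation.
# A chain is a sorted, non-overlapping list of hypothetical (start, length)
# byte ranges in the revlog data file.

def span(chunks):
    """Bytes from the start of the first chunk to the end of the last one."""
    return chunks[-1][0] + chunks[-1][1] - chunks[0][0]


def density(chunks):
    """Fraction of the bytes in the span that belong to the chain."""
    total = span(chunks)
    payload = sum(length for _, length in chunks)
    return payload / total if total else 1.0


def slice_chain(chunks, target_density):
    """Split at the largest gaps until the global read density hits the target."""
    slices = [list(chunks)]

    def global_density():
        read = sum(span(s) for s in slices)
        payload = sum(length for s in slices for _, length in s)
        return payload / read if read else 1.0

    while global_density() < target_density:
        best = None  # (gap size, slice index, split position)
        for si, s in enumerate(slices):
            for i in range(1, len(s)):
                gap = s[i][0] - (s[i - 1][0] + s[i - 1][1])
                if gap > 0 and (best is None or gap > best[0]):
                    best = (gap, si, i)
        if best is None:  # no gaps left to remove
            break
        _, si, i = best
        s = slices[si]
        slices[si:si + 1] = [s[:i], s[i:]]
    return slices


def is_good_span(chunks, textlen, target_density=0.5):
    """Sparse variant of the check: only the largest slice must fit 4 * textlen."""
    largest = max(span(s) for s in slice_chain(chunks, target_density))
    return largest <= 4 * textlen


# Two 1-byte chunks spread over 4 bytes: 2 of 4 read bytes are used.
print(density([(0, 1), (3, 1)]))  # 0.5

# A base far back in the file fails the classic whole-chain check
# (span 1100 > 4 * 50) but passes once the chain is sliced.
chain = [(0, 100), (1000, 100)]
print(span(chain) <= 4 * 50)    # False
print(is_good_span(chain, 50))  # True
```

In this model, slicing removes the 900-byte gap from the distance being checked, which is why a delta against a revision far back in the history becomes acceptable.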

#require darcs

  $ echo "[extensions]" >> $HGRCPATH
  $ echo "convert=" >> $HGRCPATH
  $ DARCS_EMAIL='test@example.org'; export DARCS_EMAIL

initialize darcs repo

  $ mkdir darcs-repo
  $ cd darcs-repo
  $ darcs init -q
  $ echo a > a
  $ darcs record -a -l -m p0
  Finished recording patch 'p0'
  $ cd ..

branch and update

  $ darcs get -q darcs-repo darcs-clone >/dev/null
  $ cd darcs-clone
  $ echo c >> a
  $ echo c > c
  $ darcs record -a -l -m p1.1
  Finished recording patch 'p1.1'
  $ cd ..

skip if we can't import elementtree

  $ if hg convert darcs-repo darcs-dummy 2>&1 | grep ElementTree > /dev/null; then
  >     echo 'skipped: missing feature: elementtree module'
  >     exit 80
  > fi

update source

  $ cd darcs-repo
  $ echo b >> a
  $ echo b > b
  $ darcs record -a -l -m p1.2
  Finished recording patch 'p1.2'

  $ darcs pull -q -a --no-set-default ../darcs-clone
  Backing up ./a(*) (glob)
  We have conflicts in the following files:
  ./a
   (?)
  $ sleep 1
  $ echo e > a
  $ echo f > f
  $ mkdir dir
  $ echo d > dir/d
  $ echo d > dir/d2
  $ darcs record -a -l -m p2
  Finished recording patch 'p2'

test file and directory move

  $ darcs mv -q f ff

Test remove + move

  $ darcs remove -q dir/d2
  $ rm dir/d2
  $ darcs mv -q dir dir2
  $ darcs record -a -l -m p3
  Finished recording patch 'p3'

The converter does not currently handle patch conflicts very well.
When they occur, it reverts *all* changes and moves forward,
letting the conflict-resolving patch fix collisions.
Unfortunately, non-conflicting changes, such as the addition of the
"c" file in the p1.1 patch, are reverted too.
In other words, the manifest not listing "c" here is a bug.

  $ cd ..
  $ hg convert darcs-repo darcs-repo-hg
  initializing destination darcs-repo-hg repository
  scanning source...
  sorting...
  converting...
  4 p0
  3 p1.2
  2 p1.1
  1 p2
  0 p3
  $ hg log -R darcs-repo-hg -g --template '{rev} "{desc|firstline}" ({author}) files: {files}\n' "$@"
  4 "p3" (test@example.org) files: dir/d dir/d2 dir2/d f ff
  3 "p2" (test@example.org) files: a dir/d dir/d2 f
  2 "p1.1" (test@example.org) files: 
  1 "p1.2" (test@example.org) files: a b
  0 "p0" (test@example.org) files: a

  $ hg up -q -R darcs-repo-hg
  $ hg -R darcs-repo-hg manifest --debug
  7225b30cdf38257d5cc7780772c051b6f33e6d6b 644   a
  1e88685f5ddec574a34c70af492f95b6debc8741 644   b
  37406831adc447ec2385014019599dfec953c806 644   dir2/d
  b783a337463792a5c7d548ad85a7d3253c16ba8c 644   ff

#if no-outer-repo

try converting darcs1 repository

  $ hg clone -q "$TESTDIR/bundles/darcs1.hg" darcs
  $ hg convert -s darcs darcs/darcs1 2>&1 | grep darcs-1.0
  darcs-1.0 repository format is unsupported, please upgrade

#endif