view tests/test-narrow-expanddirstate.t @ 42743:8c9a6adec67a

rust-discovery: using the children cache in add_missing

The DAG range computation often needs to get back to very old revisions, and turns out to be disproportionately long, given that the end goal is to remove the descendants of the given missing revisions from the undecided set.

The fast iteration capabilities available in the Rust case make it possible to avoid the DAG range entirely, at the cost of precomputing the children cache, and to simply iterate on children of the given missing revisions. This is a case where staying on the same side of the interface between the two languages has clear benefits.

On discoveries with initial undecided sets small enough to bypass sampling entirely, the total cost of computing the children cache and the subsequent iteration becomes lower than that of the Python + C counterpart, which relies on reachableroots2.

For example, on a repo with more than one million revisions with an initial undecided set of 11 elements, we get these figures:

Rust version with simple iteration
  addcommons: 57.287us
  first undecided computation: 184.278334ms
  first children cache computation: 131.056us
  addmissings iteration: 42.766us
  first addinfo total: 185.24 ms

Python + C version
  first addcommons: 0.29 ms
  addcommons 0.21 ms
  first undecided computation 191.35 ms
  addmissings 45.75 ms
  first addinfo total: 237.77 ms

On discoveries with large undecided sets, the initial price paid makes the first addinfo slower than the Python + C version, but that's more than compensated by the gain in sampling and subsequent iterations.

Here's an extreme example with an undecided set of a million revisions:

Rust version:
  first undecided computation: 293.842629ms
  first children cache computation: 407.911297ms
  addmissings iteration: 34.312869ms
  first addinfo total: 776.02 ms
  taking initial sample
  query 2: sampling time: 1318.38 ms
  query 2; still undecided: 1005013, sample size is: 200
  addmissings: 143.062us

Python + C version:
  first undecided computation 298.13 ms
  addmissings 80.13 ms
  first addinfo total: 399.62 ms
  taking initial sample
  query 2: sampling time: 3957.23 ms
  query 2; still undecided: 1005013, sample size is: 200
  addmissings 52.88 ms

Differential Revision: https://phab.mercurial-scm.org/D6428
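
The children-cache approach described in this changeset message can be illustrated with a minimal Python sketch. This is not the Rust implementation: the names `build_children_cache` and `add_missing`, the `parents` mapping and the toy DAG are all hypothetical, chosen only to show the idea of walking children instead of computing a DAG range. The sketch assumes the cache only needs to cover revisions in the undecided set, which holds when every revision lying between a missing revision and one of its undecided descendants is itself undecided.

```python
from collections import deque

NULL_REV = -1  # conventional "no parent" marker (assumption for this sketch)


def build_children_cache(parents, undecided):
    """Map each parent revision to its children found in the undecided set."""
    children = {}
    for rev in undecided:
        for p in parents[rev]:
            if p != NULL_REV:
                children.setdefault(p, []).append(rev)
    return children


def add_missing(undecided, children, missing):
    """Drop the missing revisions and their descendants from undecided.

    Returns the set of revisions that were actually removed.
    """
    removed = set()
    seen = set(missing)
    queue = deque(missing)
    while queue:
        rev = queue.popleft()
        if rev in undecided:
            undecided.discard(rev)
            removed.add(rev)
        # Follow child links only; no ancestor walk or DAG range is needed.
        for child in children.get(rev, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return removed


# Toy DAG (hypothetical): 0-1-2-3-4 with a side branch 2-5.
parents = {
    0: (NULL_REV, NULL_REV),
    1: (0, NULL_REV),
    2: (1, NULL_REV),
    3: (2, NULL_REV),
    4: (3, NULL_REV),
    5: (2, NULL_REV),
}
undecided = {1, 2, 3, 4, 5}
children = build_children_cache(parents, undecided)
removed = add_missing(undecided, children, [3])
print(sorted(removed), sorted(undecided))  # prints [3, 4] [1, 2, 5]
```

The cache is built once and reused for every later call, which is where the up-front cost reported in the figures above would be recovered on later iterations.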
author Georges Racinet <georges.racinet@octobus.net>
date Tue, 16 Apr 2019 01:16:39 +0200
parents 44a51c1c8e17
children 28d5e05c139a

  $ . "$TESTDIR/narrow-library.sh"

  $ hg init master
  $ cd master

  $ mkdir inside
  $ echo inside > inside/f1
  $ mkdir outside
  $ echo outside > outside/f2
  $ mkdir patchdir
  $ echo patch_this > patchdir/f3
  $ hg ci -Aqm 'initial'

  $ cd ..

  $ hg clone --narrow ssh://user@dummy/master narrow --include inside
  requesting all changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  new changesets dff6a2a6d433
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved

  $ cd narrow

  $ mkdir outside
  $ echo other_contents > outside/f2
  $ hg tracked | grep outside
  [1]
  $ hg files | grep outside
  [1]
  $ hg status

`hg status` did not add outside.
  $ hg tracked | grep outside
  [1]
  $ hg files | grep outside
  [1]

Unfortunately this is not really a candidate for adding to narrowhg proper,
since it depends on some other source, such as a virtual filesystem and/or
remotefilelog, for providing the manifests (when using treemanifests) and file
contents. We want to be useful when not using those systems, so we do not have
this method available in narrowhg proper at the moment.
  $ cat > "$TESTTMP/expand_extension.py" <<EOF
  > import os
  > import sys
  > 
  > from mercurial import encoding
  > from mercurial import extensions
  > from mercurial import localrepo
  > from mercurial import match as matchmod
  > from mercurial import narrowspec
  > from mercurial import patch
  > from mercurial import util as hgutil
  > 
  > def expandnarrowspec(ui, repo, newincludes=None):
  >   # Nothing to do when no extra include pattern was requested.
  >   if not newincludes:
  >     return
  >   # Expand at most once per repo object.
  >   if getattr(repo, '_narrowspecexpanded', False):
  >     return
  >   repo._narrowspecexpanded = True
  >   newincludes = set([newincludes])
  >   includes, excludes = repo.narrowpats
  >   currentmatcher = narrowspec.match(repo.root, includes, excludes)
  >   includes = includes | newincludes
  >   if not repo.currenttransaction():
  >     ui.develwarn(b'expandnarrowspec called outside of transaction!')
  >   repo.setnarrowpats(includes, excludes)
  >   narrowspec.copytoworkingcopy(repo)
  >   newmatcher = narrowspec.match(repo.root, includes, excludes)
  >   added = matchmod.differencematcher(newmatcher, currentmatcher)
  >   for f in repo[b'.'].manifest().walk(added):
  >     repo.dirstate.normallookup(f)
  > 
  > def reposetup(ui, repo):
  >   class expandingrepo(repo.__class__):
  >     def narrowmatch(self, *args, **kwargs):
  >       with repo.wlock(), repo.lock(), repo.transaction(
  >           b'expandnarrowspec'):
  >         expandnarrowspec(ui, repo,
  >                          encoding.environ.get(b'DIRSTATEINCLUDES'))
  >       return super(expandingrepo, self).narrowmatch(*args, **kwargs)
  >   repo.__class__ = expandingrepo
  > 
  > def extsetup(unused_ui):
  >   def overridepatch(orig, ui, repo, *args, **kwargs):
  >     with repo.wlock():
  >       expandnarrowspec(ui, repo, encoding.environ.get(b'PATCHINCLUDES'))
  >       return orig(ui, repo, *args, **kwargs)
  > 
  >   extensions.wrapfunction(patch, b'patch', overridepatch)
  > EOF
  $ cat >> ".hg/hgrc" <<EOF
  > [extensions]
  > expand_extension = $TESTTMP/expand_extension.py
  > EOF

Since we do not have the ability to rely on a virtual filesystem or
remotefilelog in the test, we just fake it by copying the data from the 'master'
repo.
  $ cp -a ../master/.hg/store/data/* .hg/store/data
Do that for patchdir as well.
  $ cp -a ../master/patchdir .

`hg status` will now add outside, but not patchdir.
  $ DIRSTATEINCLUDES=path:outside hg status
  M outside/f2
  $ hg tracked | grep outside
  I path:outside
  $ hg files | grep outside > /dev/null
  $ hg tracked | grep patchdir
  [1]
  $ hg files | grep patchdir
  [1]

Get rid of the modification to outside/f2.
  $ hg update -C .
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved

This patch will not apply cleanly at the moment, so `hg import` will break.
  $ cat > "$TESTTMP/foo.patch" <<EOF
  > --- patchdir/f3
  > +++ patchdir/f3
  > @@ -1,1 +1,1 @@
  > -this should be "patch_this", but its not, so patch fails
  > +this text is irrelevant
  > EOF
  $ PATCHINCLUDES=path:patchdir hg import -p0 -e "$TESTTMP/foo.patch" -m ignored
  applying $TESTTMP/foo.patch
  patching file patchdir/f3
  Hunk #1 FAILED at 0
  1 out of 1 hunks FAILED -- saving rejects to file patchdir/f3.rej
  abort: patch failed to apply
  [255]
  $ hg tracked | grep patchdir
  [1]
  $ hg files | grep patchdir > /dev/null
  [1]

Let's make it apply cleanly and see that it *did* expand properly
  $ cat > "$TESTTMP/foo.patch" <<EOF
  > --- patchdir/f3
  > +++ patchdir/f3
  > @@ -1,1 +1,1 @@
  > -patch_this
  > +patched_this
  > EOF
  $ PATCHINCLUDES=path:patchdir hg import -p0 -e "$TESTTMP/foo.patch" -m message
  applying $TESTTMP/foo.patch
  $ cat patchdir/f3
  patched_this
  $ hg tracked | grep patchdir
  I path:patchdir
  $ hg files | grep patchdir > /dev/null