dirstate: when calling rebuild(), avoid some N^2 codepaths
I had a user repo with 200k files in it. Calling `hg debugrebuilddirstate` took
tens of minutes (I didn't wait for it to finish). In that situation,
changedfiles==allfiles, and both are lists. This meant that we ran an average
of 100k comparisons for each of 200k files, just to check whether a file needed
to have normallookup called (it always did) or needed to be dropped.
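A minimal sketch of the difference, not the actual dirstate code: the function names are hypothetical, and the per-file normallookup/drop calls are represented here by two returned collections. It assumes the fix amounts to replacing list membership tests with set operations.

```python
def partition_with_lists(allfiles, changedfiles):
    """Quadratic variant: every `in` test scans the allfiles list."""
    to_lookup = [f for f in changedfiles if f in allfiles]
    to_drop = [f for f in changedfiles if f not in allfiles]
    return to_lookup, to_drop


def partition_with_sets(allfiles, changedfiles):
    """Set-based variant: hashing makes each membership test O(1) on average."""
    changed = set(changedfiles)
    to_lookup = changed & set(allfiles)  # files that would get normallookup
    to_drop = changed - to_lookup        # files that would be dropped
    return to_lookup, to_drop
```

Both variants pick the same files. The set-based one pays a fixed cost to build the sets, which may explain why it can lose when changedfiles is tiny.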
While it's probably not a huge issue, in a very crude synthetic benchmark I
wrote (not using a benchmark library or anything), I saw some slowdowns for
invocations with a small changedfiles and a very large allfiles, with an
inflection point somewhere around 10 items in changedfiles (regardless of the
size of allfiles); above 10 items in changedfiles, the new code appears to
always be faster. For the case of 50k files in changedfiles and the same items
in allfiles, I'm seeing 15s spent just running comparisons with the old code
vs. 0.003793s with the new code. I haven't bothered to run a comparison of 200k
items in changedfiles and allfiles. :)
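For anyone wanting to reproduce this kind of measurement, here is a rough harness using only the standard library. It is not the benchmark described above: `time_membership`, the file names, and the sizes are all made up for illustration, and the timings will vary by machine.

```python
import timeit


def time_membership(n_all, n_changed, repeat=3):
    """Time list-based vs. set-based membership checks on synthetic names."""
    allfiles = ["file%d" % i for i in range(n_all)]
    # Spread the changed files across allfiles so list scans are not
    # biased toward early hits.
    step = max(1, n_all // n_changed)
    changedfiles = allfiles[::step][:n_changed]

    def with_list():
        return [f for f in changedfiles if f in allfiles]

    def with_set():
        return set(changedfiles) & set(allfiles)

    # Sanity check: both approaches must select the same files.
    assert set(with_list()) == with_set()

    list_time = min(timeit.repeat(with_list, number=1, repeat=repeat))
    set_time = min(timeit.repeat(with_set, number=1, repeat=repeat))
    return list_time, set_time


if __name__ == "__main__":
    for n_changed in (1, 10, 100, 1000):
        list_time, set_time = time_membership(10000, n_changed)
        print("changed=%d list=%.6fs set=%.6fs"
              % (n_changed, list_time, set_time))
```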
Differential Revision: https://phab.mercurial-scm.org/D7665
#require no-windows
$ . "$TESTDIR/remotefilelog-library.sh"
$ hg init master
$ cd master
$ cat >> .hg/hgrc <<EOF
> [remotefilelog]
> server=True
> EOF
$ echo x > x
$ hg commit -qAm x
$ cd ..
# shallow clone from full
$ hgcloneshallow ssh://user@dummy/master shallow --noupdate
streaming all changes
2 files to transfer, 227 bytes of data
transferred 227 bytes in * seconds (*/sec) (glob)
searching for changes
no changes found
$ cd shallow
$ cat .hg/requires
dotencode
exp-remotefilelog-repo-req-1
fncache
generaldelta
revlogv1
sparserevlog
store
$ hg update
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
$ cat x
x
$ ls .hg/store/data
$ echo foo > f
$ hg add f
$ hg ci -m 'local content'
$ ls .hg/store/data
4a0a19218e082a343a1b17e5333409af9d98f0f5
$ cd ..
# shallow clone from shallow
$ hgcloneshallow ssh://user@dummy/shallow shallow2 --noupdate
streaming all changes
3 files to transfer, 564 bytes of data
transferred 564 bytes in * seconds (*/sec) (glob)
searching for changes
no changes found
$ cd shallow2
$ cat .hg/requires
dotencode
exp-remotefilelog-repo-req-1
fncache
generaldelta
revlogv1
sparserevlog
store
$ ls .hg/store/data
4a0a19218e082a343a1b17e5333409af9d98f0f5
$ hg update
2 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ cat x
x
$ cd ..
# full clone from shallow
Note: the output to STDERR comes from a different process to the output on
STDOUT and their relative ordering is not deterministic. As a result, the test
was failing sporadically. To avoid this, we capture STDERR to a file and
check its contents separately.
$ TEMP_STDERR=full-clone-from-shallow.stderr.tmp
$ hg clone --noupdate ssh://user@dummy/shallow full 2>$TEMP_STDERR
streaming all changes
remote: abort: Cannot clone from a shallow repo to a full repo.
[255]
$ cat $TEMP_STDERR
abort: pull failed on remote
$ rm $TEMP_STDERR
# getbundle full clone
$ printf '[server]\npreferuncompressed=False\n' >> master/.hg/hgrc
$ hgcloneshallow ssh://user@dummy/master shallow3
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 0 changes to 0 files
new changesets b292c1e3311f
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ ls shallow3/.hg/store/data
$ cat shallow3/.hg/requires
dotencode
exp-remotefilelog-repo-req-1
fncache
generaldelta
revlogv1
sparserevlog
store