copies: do full filtering at end of _changesetforwardcopies()
As mentioned earlier, pathcopies() is very slow when copies are stored
in the changeset. Most of the cost comes from calling _chain() for
every changeset, which is slow because it needs to read manifests. It
needs to read manifests to be able to filter out copies that were
created in one commit and then deleted. (It also filters out copies
that were created from a file that didn't exist in the starting
revision, but that's a fixed revision across calls to _chain(), so
it's much cheaper.)
This patch changes from _chainandfilter() to just _chain() in the main
loop in _changesetforwardcopies(). It instead removes copies that have
subsequently been removed by using ctx.filesremoved(). We thus rely on
that to be fast.
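A minimal sketch of the idea, not the actual Mercurial code: the names
chain() and forward_copies() are hypothetical, and each changeset is
assumed to be reduced to a pair of (copies, removed files), where
copies maps destination to source. Copies are chained naively at each
step, and any destination the changeset removes is dropped, so no
manifest has to be read inside the loop.

```python
def chain(a, b):
    """Naively chain two copy mappings of the form {dst: src}.

    'a' covers the older range and 'b' the newer one. No filtering is
    done here; this is only an illustration of the chaining step.
    """
    t = dict(a)
    for dst, src in b.items():
        # If src was itself copied in 'a', follow it back to its origin.
        t[dst] = a.get(src, src)
    return t


def forward_copies(changesets):
    """Accumulate copies over (copies, removed) pairs, oldest first.

    'removed' stands in for what ctx.filesremoved() would report for
    that changeset (an assumption made for this sketch).
    """
    copies = {}
    for step_copies, removed in changesets:
        copies = chain(copies, step_copies)
        # Drop copies whose destination was removed in this changeset.
        for f in removed:
            copies.pop(f, None)
    return copies
```

For example, forward_copies([({"b": "a"}, []), ({}, ["b"])]) yields an
empty mapping, because the copy to "b" was later deleted.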
I timed this command in mozilla-unified:
hg debugpathcopies FIREFOX_59_0b3_BUILD2 FIREFOX_BETA_59_END
It took 18s before and 1.1s after. It's still faster when copy
information is stored in filelogs: 0.70s. It also still gets slow when
there are merge commits involved, because we read manifests there
too. We'll deal with that later.
Differential Revision: https://phab.mercurial-scm.org/D6419
Enable obsolete markers
$ cat >> $HGRCPATH << EOF
> [experimental]
> evolution.createmarkers=True
> [phases]
> publish=False
> EOF
Build a repo with some cacheable bits:
$ hg init a
$ cd a
$ echo a > a
$ hg ci -qAm0
$ hg tag t1
$ hg book -i bk1
$ hg branch -q b2
$ hg ci -Am1
$ hg tag t2
$ echo dumb > dumb
$ hg ci -qAmdumb
$ hg debugobsolete b1174d11b69e63cb0c5726621a43c859f0858d7f
obsoleted 1 changesets
$ hg phase -pr t1
$ hg phase -fsr t2
Make a helper function to check cache damage invariants:
- command output shouldn't change
- cache should be present after first use
- corruption/repair should be silent (no exceptions or warnings)
- cache should survive deletion, overwrite, and append
- unreadable / unwriteable caches should be ignored
- cache should be rebuilt after corruption
$ damage() {
> CMD=$1
> CACHE=.hg/cache/$2
> CLEAN=$3
> hg $CMD > before
> test -f $CACHE || echo "not present"
> echo bad > $CACHE
> test -z "$CLEAN" || $CLEAN
> hg $CMD > after
> "$RUNTESTDIR/pdiff" before after || echo "*** overwrite corruption"
> echo corruption >> $CACHE
> test -z "$CLEAN" || $CLEAN
> hg $CMD > after
> "$RUNTESTDIR/pdiff" before after || echo "*** append corruption"
> rm $CACHE
> mkdir $CACHE
> test -z "$CLEAN" || $CLEAN
> hg $CMD > after
> "$RUNTESTDIR/pdiff" before after || echo "*** read-only corruption"
> test -d $CACHE || echo "*** directory clobbered"
> rmdir $CACHE
> test -z "$CLEAN" || $CLEAN
> hg $CMD > after
> "$RUNTESTDIR/pdiff" before after || echo "*** missing corruption"
> test -f $CACHE || echo "not rebuilt"
> }
Beat up tags caches:
$ damage "tags --hidden" tags2
$ damage tags tags2-visible
$ damage "tag -f t3" hgtagsfnodes1
1 new orphan changesets
1 new orphan changesets
1 new orphan changesets
1 new orphan changesets
1 new orphan changesets
Beat up branch caches:
$ damage branches branch2-base "rm .hg/cache/branch2-[vs]*"
$ damage branches branch2-served "rm .hg/cache/branch2-[bv]*"
$ damage branches branch2-visible
$ damage "log -r branch(.)" rbc-names-v1
$ damage "log -r branch(default)" rbc-names-v1
$ damage "log -r branch(b2)" rbc-revs-v1
We currently can't detect an rbc cache with unknown names:
$ damage "log -qr branch(b2)" rbc-names-v1
--- before * (glob)
+++ after * (glob)
@@ -1,8 +?,0 @@ (glob)
-2:5fb7d38b9dc4
-3:60b597ffdafa
-4:b1174d11b69e
-5:6354685872c0
-6:5ebc725f1bef
-7:7b76eec2f273
-8:ef3428d9d644
-9:ba7a936bc03c
*** append corruption