Mercurial > hg
view tests/test-patch-offset.t @ 34179:036d47d7cf39
copytrace: move fast heuristic copytracing algorithm to core
copytrace extension in fb-hgext has a heuristic implementation of copy tracing
which is faster than the current copy tracing. The heuristic limits the search
of copies to just files that are either:
1) Renames in the same directory
2) Moved to other directory with same name
The default copytrace implementation is very slow as it finds all the new files
that were added from merge base up to the head commit and for each file it
checks whether it this was copied or moved version of a different file.
Stash@fb did analysis for the above heuristics on the fb repo and found that
among 2,443,768 moves/copies there are only 32,234 moves/copies which does not
fall under the above heuristics which is approx. 0.013 of total copies.
This patch moves the heuristics algorithm under config
`experimental.copytrace=heuristics`.
While moving fbext to core, this patch removes couple of less useful config
options named `sourcecommitlimit` and `maxmovescandidatestocheck`.
Tests are also added for the heuristics algorithm, which are basically copied
from fbext/tests/test-copytrace.t. The tests follow a pattern creating a server
repo and then cloning to a local repo to create public and draft changesets, the
distinction which will be useful in upcoming patches.
After this patch `experimental.copytrace` has the following behaviour:
1) `off`: turns off copytracing
2) `heuristics`: use the heuristic algorithm added in this patch.
3) everything else: use the full copytracing algorithm
.. feature::
A new fast heuristic algorithm for copytracing which assumes that the files
moves are either::
1) Renames in the same directory
2) Moves in other directories with same names
You can use this algorithm by setting `experimental.copytrace=heuristics`.
Differential Revision: https://phab.mercurial-scm.org/D623
author | Pulkit Goyal <7895pulkit@gmail.com> |
---|---|
date | Sun, 03 Sep 2017 03:49:15 +0530 |
parents | 75be14993fda |
children | bfc9ab6c1bec |
line wrap: on
line source
$ cat > writepatterns.py <<EOF > import sys > > path = sys.argv[1] > patterns = sys.argv[2:] > > fp = file(path, 'wb') > for pattern in patterns: > count = int(pattern[0:-1]) > char = pattern[-1] + '\n' > fp.write(char*count) > fp.close() > EOF prepare repo $ hg init a $ cd a These initial lines of Xs were not in the original file used to generate the patch. So all the patch hunks need to be applied to a constant offset within this file. If the offset isn't tracked then the hunks can be applied to the wrong lines of this file. $ $PYTHON ../writepatterns.py a 34X 10A 1B 10A 1C 10A 1B 10A 1D 10A 1B 10A 1E 10A 1B 10A $ hg commit -Am adda adding a This is a cleaner patch generated via diff In this case it reproduces the problem when the output of hg export does not import patch $ hg import -v -m 'b' -d '2 0' - <<EOF > --- a/a 2009-12-08 19:26:17.000000000 -0800 > +++ b/a 2009-12-08 19:26:17.000000000 -0800 > @@ -9,7 +9,7 @@ > A > A > B > -A > +a > A > A > A > @@ -53,7 +53,7 @@ > A > A > B > -A > +a > A > A > A > @@ -75,7 +75,7 @@ > A > A > B > -A > +a > A > A > A > EOF applying patch from stdin patching file a Hunk #1 succeeded at 43 (offset 34 lines). Hunk #2 succeeded at 87 (offset 34 lines). Hunk #3 succeeded at 109 (offset 34 lines). committing files: a committing manifest committing changelog created 189885cecb41 compare imported changes against reference file $ $PYTHON ../writepatterns.py aref 34X 10A 1B 1a 9A 1C 10A 1B 10A 1D 10A 1B 1a 9A 1E 10A 1B 1a 9A $ diff aref a $ cd ..