match: add a subclass for dirstate normalizing of the matched patterns
This class is only needed on case insensitive filesystems, and only
for wdir context matches. It allows the user to not match the case of
the items in the filesystem- especially for naming directories, which
dirstate doesn't handle[1]. Making dirstate handle mismatched
directory cases is too expensive[2].
Since dirstate doesn't apply to committed csets, this is only created by
overriding basectx.match() in workingctx, and only on icasefs. The default
arguments have been dropped, because the ctx must be passed to the matcher in
order to function.
For operations that can apply to both wdir and some other context, this ends up
normalizing the filename to the case as it exists in the filesystem, and using
that case for the lookup in the other context. See the diff example in the
test.
Previously, given a directory with an inexact case:
- add worked as expected
- diff, forget and status would silently ignore the request
- files would exit with 1
- commit, revert and remove would fail (even when the commands leading up to
them worked):
$ hg ci -m "AbCDef" capsdir1/capsdir
abort: CapsDir1/CapsDir: no match under directory!
$ hg revert -r '.^' capsdir1/capsdir
capsdir1\capsdir: no such file in rev
64dae27060b7
$ hg remove capsdir1/capsdir
not removing capsdir1\capsdir: no tracked files
[1]
Globs are normalized, so that the -I and -X don't need to be specified with a
case match. Without that, the second last remove (with -X) removes the files,
leaving nothing for the last remove. However, specifying the files as
'glob:**.Txt' does not work. Perhaps this requires 're.IGNORECASE'?
There are only a handful of places that create matchers directly, instead of
being routed through the context.match() method. Some may benefit from changing
over to using ctx.match() as a factory function:
revset.checkstatus()
revset.contains()
revset.filelog()
revset._matchfiles()
localrepository._loadfilter()
ignore.ignore()
fileset.subrepo()
filemerge._picktool()
overrides.addlargefiles()
lfcommands.lfconvert()
kwtemplate.__init__()
eolfile.__init__()
eolfile.checkrev()
acl.buildmatch()
Currently, a toplevel subrepo can be named with an inexact case. However, the
path auditor gets in the way of naming _anything_ in the subrepo if the top
level case doesn't match. That is trickier to handle, because there's the user
provided case, the case in the filesystem, and the case stored in .hgsub. This
can be fixed next cycle.
--- a/tests/test-subrepo-deep-nested-change.t
+++ b/tests/test-subrepo-deep-nested-change.t
@@ -170,8 +170,15 @@
R sub1/sub2/test.txt
$ hg update -Cq
$ touch sub1/sub2/folder/bar
+#if icasefs
+ $ hg addremove Sub1/sub2
+ abort: path 'Sub1\sub2' is inside nested repo 'Sub1'
+ [255]
+ $ hg -q addremove sub1/sub2
+#else
$ hg addremove sub1/sub2
adding sub1/sub2/folder/bar (glob)
+#endif
$ hg status -S
A sub1/sub2/folder/bar
? foo/bar/abc
The narrowmatcher class may need to be tweaked when that is fixed.
[1] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068183.html
[2] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068191.html
match: move _normalize() into the match class
This will be overridden in an upcoming patch to also deal with dirstate
normalization on case insensitive filesystems.
largefiles: always consider updatelfiles 'checked' parameter set
mergeupdate already set the flag to update all. This will thus only change
overriderevert and scmutilmarktouched ... where the flag effectually also were
true. The test coverage thus shows no change.
As the flag always is set, it is removed.
This is mainly a change for keeping the code simple and consistent and correct,
but it should also make it faster in many cases.
largefiles: for update -C, only update largefiles when necessary
Before, a --clean update with largefiles would use the "optimization" that it
didn't read hashes from standin files before and after the update. Instead of
trusting the content of the standin files, it would rehash all the actual
largefiles that lfdirstate reported clean and update the standins that didn't
have the expected content. It could thus in some "impossible" situations
automatically recover from some "largefile got out sync with its standin"
issues (even there apparently still were weird corner cases where it could
fail). This extra checking is similar to what core --clean intentionally do
not do, and it made update --clean unbearable slow.
Usually in core Mercurial, --clean will rely on the dirstate to find the files
it should update. (It is thus intentionally possible (when trying to trick the
system or if there should be bugs) to end up in situations where --clean not
will restore the working directory content correctly.) Checking every file when
we "know" it is ok is however not an option - that would be too slow.
Instead, trust the content of the standin files. Use the same logic for --clean
as for linear updates and trust the dirstate and that our "logic" will keep
them in sync. It is much cheaper to just rehash the largefiles reported dirty
by a status walk and read all standins than to hash largefiles.
Most of the changes are just a change of indentation now when the different
kinds of updates no longer are handled that differently. Standins for added
files are however only written when doing a normal update, while deleted and
removed files only will be updated for --clean updates.
subrepo: calculate _relpath for hgsubrepo based on self instead of parent
Prior to
105758d1b37b, the subrelpath() (now _relpath) for hgsubrepo was
calculated by removing the root path of the outermost repo from the root path of
the subrepo. Since the root paths use platform specific separators, and the
relative path is printed by various commands, the output of these commands
require a glob (and check-code.py enforces this).
In an effort to be generic to all subrepos,
105758d1b37b started calculating
this path based on the parent repo, and then joining the subrepo path in .hgsub.
One of the tests in test-subrepo.t creates a subrepo inside a directory, so the
path being joined contained '/' instead of '\'. This made the test fail with a
'~' status, because the glob is unnecessary[1]. Removing them made the test
work, but then check-code complains. We can't just drop the check-code rule,
because sub-subrepos are still joined with '\'. Presumably the other subrepo
types have this issue as well, but there likely isn't a test with git or svn
repos inside a subdirectory.
This simply restores the exact _relpath value (and output) for hgsubrepos prior
to
105758d1b37b.
[1] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068720.html
subrepo: backout
93b0e0db7929 to restore reporelpath()
The path for hgsubrepo needs to be calculated slightly differently from other
subrepo types, but can reuse this. See the next patch for details.
rollback: clear resolve state (
issue4593)
diff: pass the diff matcher to the copy logic
This passes the existing diff matcher instance down to the copy logic so copy
tracing can be more efficient when possible and only trace copies for matching
files.
This only actually affects forwardcopies (i.e. Given A<-B<-C<-D, it works for
'hg diff -r B -r D foo.txt', but not for 'hg diff -r D -r B foo.txt') since
backward copies require walking all histories, and not just the individual
file's.
This reduces 'hg diff -r A -r B foo.txt' time from 15s to 1s when A and B have
80,000 files different.