changeset 41282:4fab8a7d2d72
match: support rooted globs in hgignore
In a .hgignore, "glob:foo" always means "**/foo". This cannot be
avoided because there is no syntax like "^" in regexes to say you
don't want the implied "**/" (of course one can use regexes, but glob
syntax is nice).
When you have a long list of fairly specific globs like
path/to/some/thing, this has two consequences:
1. unintended files may be ignored (not too common though)
2. matching performance can suffer significantly
Here is vanilla hg status timing on a private repository:
Using syntax:glob everywhere
real 0m2.199s
user 0m1.545s
sys 0m0.619s
When rooting the appropriate globs
real 0m1.434s
user 0m0.847s
sys 0m0.565s
(Tangentially, none of this shows up in --profile's output; it
seems that C code doesn't play well with profiling.)
The matching code already supports rooted globs, but there is no
hgignore syntax to make use of them, so it seems reasonable to create
such syntax. This change adds a new hgignore syntax, "rootglob".
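
The difference between the two syntaxes can be illustrated with a small sketch. This is not Mercurial's implementation: glob_to_regex and ignored are hypothetical helpers that handle only "*" and "?", and the implied "**/" is modelled as an optional regex prefix. The file names are the ones used in the new test in tests/test-hgignore.t.

```python
import re

# Hypothetical, simplified translator: only '*' and '?' are supported.
# Mercurial's real glob translation handles more syntax ('**', '{a,b}', ...).
def glob_to_regex(pat):
    out = []
    for ch in pat:
        if ch == '*':
            out.append('[^/]*')   # any run of characters within one path component
        elif ch == '?':
            out.append('[^/]')    # a single character within one path component
        else:
            out.append(re.escape(ch))
    return ''.join(out)

# A pattern matches a file or a directory prefix of a path.
SUFFIX = r'(?:/|$)'

def ignored(files, pat, rooted):
    # Unrooted globs behave as if prefixed with '**/': they may match
    # starting at any directory level. Rooted globs match only from the top.
    prefix = '' if rooted else r'(?:.*/)?'
    rx = re.compile(prefix + glob_to_regex(pat) + SUFFIX)
    return [f for f in files if rx.match(f)]

files = ['a/b.ext', 'b/a/b.ext', 'aa/b.ext']
rooted = ignored(files, 'a/*.ext', rooted=True)
unrooted = ignored(files, 'a/*.ext', rooted=False)
print(rooted)    # ['a/b.ext']
print(unrooted)  # ['a/b.ext', 'b/a/b.ext']
```

With the rooted form only a/b.ext is ignored; the unrooted form also ignores b/a/b.ext, which is the kind of unintended match described in point 1 above.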
Differential Revision: https://phab.mercurial-scm.org/D5493
author | Valentin Gatien-Baron <vgatien-baron@janestreet.com> |
---|---|
date | Thu, 03 Jan 2019 19:02:46 -0500 |
parents | 183df3df6031 |
children | 4948b327d3b9 |
files | mercurial/help/hgignore.txt mercurial/help/patterns.txt mercurial/match.py tests/test-hgignore.t |
diffstat | 4 files changed, 45 insertions(+), 12 deletions(-) |
--- a/mercurial/help/hgignore.txt Wed Nov 07 15:45:09 2018 -0800
+++ b/mercurial/help/hgignore.txt Thu Jan 03 19:02:46 2019 -0500
@@ -59,14 +59,17 @@
   Regular expression, Python/Perl syntax.
 ``glob``
   Shell-style glob.
+``rootglob``
+  A variant of ``glob`` that is rooted (see below).
 
 The chosen syntax stays in effect when parsing all patterns that
 follow, until another syntax is selected.
 
-Neither glob nor regexp patterns are rooted. A glob-syntax pattern of
-the form ``*.c`` will match a file ending in ``.c`` in any directory,
-and a regexp pattern of the form ``\.c$`` will do the same. To root a
-regexp pattern, start it with ``^``.
+Neither ``glob`` nor regexp patterns are rooted. A glob-syntax
+pattern of the form ``*.c`` will match a file ending in ``.c`` in any
+directory, and a regexp pattern of the form ``\.c$`` will do the
+same. To root a regexp pattern, start it with ``^``. To get the same
+effect with glob-syntax, you have to use ``rootglob``.
 
 Subdirectories can have their own .hgignore settings by adding
 ``subinclude:path/to/subdir/.hgignore`` to the root ``.hgignore``. See
--- a/mercurial/help/patterns.txt Wed Nov 07 15:45:09 2018 -0800
+++ b/mercurial/help/patterns.txt Thu Jan 03 19:02:46 2019 -0500
@@ -20,7 +20,9 @@
 
 To use an extended glob, start a name with ``glob:``. Globs are rooted
 at the current directory; a glob such as ``*.c`` will only match files
-in the current directory ending with ``.c``.
+in the current directory ending with ``.c``. ``rootglob:`` can be used
+instead of ``glob:`` for a glob that is rooted at the root of the
+repository.
 
 The supported glob syntax extensions are ``**`` to match any string
 across path separators and ``{a,b}`` to mean "a or b".
@@ -64,6 +66,7 @@
   foo/*.c        any name ending in ".c" in the directory foo
   foo/**.c       any name ending in ".c" in any subdirectory of foo
                  including itself.
+  rootglob:*.c   any name ending in ".c" in the root of the repository
 
 Regexp examples::
 
--- a/mercurial/match.py Wed Nov 07 15:45:09 2018 -0800
+++ b/mercurial/match.py Thu Jan 03 19:02:46 2019 -0500
@@ -25,6 +25,7 @@
 )
 
 allpatternkinds = ('re', 'glob', 'path', 'relglob', 'relpath', 'relre',
+                   'rootglob',
                    'listfile', 'listfile0', 'set', 'include', 'subinclude',
                    'rootfilesin')
 cwdrelativepatternkinds = ('relpath', 'glob')
@@ -221,7 +222,7 @@
     for kind, pat in [_patsplit(p, default) for p in patterns]:
         if kind in cwdrelativepatternkinds:
             pat = pathutil.canonpath(root, cwd, pat, auditor)
-        elif kind in ('relglob', 'path', 'rootfilesin'):
+        elif kind in ('relglob', 'path', 'rootfilesin', 'rootglob'):
             pat = util.normpath(pat)
         elif kind in ('listfile', 'listfile0'):
             try:
@@ -1137,7 +1138,7 @@
         if pat.startswith('^'):
             return pat
         return '.*' + pat
-    if kind == 'glob':
+    if kind in ('glob', 'rootglob'):
         return _globre(pat) + globsuffix
     raise error.ProgrammingError('not a regex pattern: %s:%s' % (kind, pat))
 
@@ -1252,7 +1253,7 @@
     r = []
     d = []
     for kind, pat, source in kindpats:
-        if kind == 'glob': # find the non-glob prefix
+        if kind in ('glob', 'rootglob'): # find the non-glob prefix
             root = []
             for p in pat.split('/'):
                 if '[' in p or '{' in p or '*' in p or '?' in p:
@@ -1351,6 +1352,7 @@
     syntax: glob   # defaults following lines to non-rooted globs
     re:pattern     # non-rooted regular expression
     glob:pattern   # non-rooted glob
+    rootglob:pat   # rooted glob (same root as ^ in regexps)
     pattern        # pattern of the current default type
 
     if sourceinfo is set, returns a list of tuples:
@@ -1361,6 +1363,7 @@
         're': 'relre:',
         'regexp': 'relre:',
         'glob': 'relglob:',
+        'rootglob': 'rootglob:',
         'include': 'include',
         'subinclude': 'subinclude',
     }
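
As a rough illustration of the last hunk of the match.py diff, the sketch below shows how an hgignore syntax name could be mapped to an internal pattern kind. Only the four table entries are taken from the diff; parse_ignore_lines is a hypothetical helper, not Mercurial's reader, and it ignores comments, escaping, and the include/subinclude entries.

```python
# Table entries copied from the match.py diff; everything else is illustrative.
SYNTAXES = {
    're': 'relre:',
    'regexp': 'relre:',
    'glob': 'relglob:',
    'rootglob': 'rootglob:',
}

def parse_ignore_lines(lines, default='re'):
    """Hypothetical helper: prefix each pattern with its internal kind."""
    current = SYNTAXES[default]
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith('syntax:'):
            # "syntax: NAME" changes the default for the lines that follow.
            current = SYNTAXES[line[len('syntax:'):].strip()]
            continue
        for name, prefix in SYNTAXES.items():
            if line.startswith(name + ':'):
                # An explicit per-line prefix overrides the current default.
                result.append(prefix + line[len(name) + 1:])
                break
        else:
            result.append(current + line)
    return result

parsed = parse_ignore_lines([
    'syntax: glob',
    '*.o',
    'rootglob:a/*.ext',
    'syntax: rootglob',
    'build',
    're:^tmp',
])
print(parsed)
# ['relglob:*.o', 'rootglob:a/*.ext', 'rootglob:build', 'relre:^tmp']
```

In this sketch "glob" becomes the unrooted "relglob:" kind, while "rootglob" maps to itself, which is how the rooted behaviour is selected.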
--- a/tests/test-hgignore.t Wed Nov 07 15:45:09 2018 -0800
+++ b/tests/test-hgignore.t Thu Jan 03 19:02:46 2019 -0500
@@ -239,6 +239,17 @@
   dir/c.o is ignored
   (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 2: 'dir/**/c.o') (glob)
 
+Check rooted globs
+
+  $ hg purge --all --config extensions.purge=
+  $ echo "syntax: rootglob" > .hgignore
+  $ echo "a/*.ext" >> .hgignore
+  $ for p in a b/a aa; do mkdir -p $p; touch $p/b.ext; done
+  $ hg status -A 'set:**.ext'
+  ? aa/b.ext
+  ? b/a/b.ext
+  I a/b.ext
+
 Check using 'include:' in ignore file
 
   $ hg purge --all --config extensions.purge=
@@ -257,10 +268,15 @@
 Check recursive uses of 'include:'
 
   $ echo "include:nested/ignore" >> otherignore
-  $ mkdir nested
+  $ mkdir nested nested/more
   $ echo "glob:*ignore" > nested/ignore
+  $ echo "rootglob:a" >> nested/ignore
+  $ touch a nested/a nested/more/a
   $ hg status
   A dir/b.o
+  ? nested/a
+  ? nested/more/a
+  $ rm a nested/a nested/more/a
 
   $ cp otherignore goodignore
   $ echo "include:badignore" >> otherignore
@@ -291,18 +307,26 @@
   ? dir1/file2
   ? dir2/file1
 
-Check including subincludes with regexs
+Check including subincludes with other patterns
 
   $ echo "subinclude:dir1/.hgignore" >> .hgignore
+
+  $ mkdir dir1/subdir
+  $ touch dir1/subdir/file1
+  $ echo "rootglob:f?le1" > dir1/.hgignore
+  $ hg status
+  ? dir1/file2
+  ? dir1/subdir/file1
+  ? dir2/file1
+  $ rm dir1/subdir/file1
+
   $ echo "regexp:f.le1" > dir1/.hgignore
-
   $ hg status
   ? dir1/file2
   ? dir2/file1
 
 Check multiple levels of sub-ignores
 
-  $ mkdir dir1/subdir
   $ touch dir1/subdir/subfile1 dir1/subdir/subfile3 dir1/subdir/subfile4
   $ echo "subinclude:subdir/.hgignore" >> dir1/.hgignore
   $ echo "glob:subfil*3" >> dir1/subdir/.hgignore