match: document that visitchildrenset might return files
With includematcher, and probably with most other matchers, we cannot tell
whether a/b/f refers to a file 'f' in a/b or to a subdirectory 'f' in a/b, so
most matchers return {'f'} for visitchildrenset('a/b'). Arguably all matchers
could, and perhaps should, do the same: exactmatcher knows that 'f' is a file,
but that is no reason to return 'this' for visitchildrenset('a/b'), which
would cause callers to investigate unrelated entries such as 'a/b/x'.
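As a rough illustration of the behavior described above, the sketch below
uses two hypothetical helpers, visit_children_set and entries_to_visit. They
are not Mercurial's implementation: the first is a toy stand-in that
understands only 'path:'-style patterns, and the second mimics the kind of
filtering a caller can do without knowing whether a name is a file or a
directory.

```python
def visit_children_set(paths, directory):
    """Toy stand-in for visitchildrenset(), limited to 'path:' patterns.

    Hypothetical helper for illustration only. It cannot know whether the
    last component of a pattern is a file or a directory.
    """
    dir_parts = directory.split('/') if directory else []
    children = set()
    for path in paths:
        parts = path.split('/')
        if dir_parts[:len(parts)] == parts:
            # The directory is the pattern itself or lies beneath it.
            return 'all'
        if parts[:len(dir_parts)] == dir_parts:
            # The pattern lies beneath the directory: report the next
            # path component, which may name a file or a subdirectory.
            children.add(parts[len(dir_parts)])
    return children


def entries_to_visit(entries, visitentries):
    """Filter (name, kind) pairs by name only, never by kind.

    Hypothetical caller-side helper: when a set is returned, entries that
    are not in it can be skipped whether they are files or directories.
    """
    if visitentries in ('this', 'all'):
        return list(entries)
    return [(name, kind) for name, kind in entries if name in visitentries]


visitentries = visit_children_set(['a/b/f'], 'a/b')
listing = [('f', 'file'), ('x', 'file'), ('sub', 'dir')]
selected = entries_to_visit(listing, visitentries)
```

Here 'f' is kept and 'x' and 'sub' are skipped, and the result would be the
same if 'f' turned out to be a directory, since only the name is compared.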
Differential Revision: https://phab.mercurial-scm.org/D4364
--- a/mercurial/dirstate.py Fri Aug 24 10:13:27 2018 -0700
+++ b/mercurial/dirstate.py Thu Aug 23 18:04:15 2018 -0700
@@ -912,11 +912,14 @@
continue
raise
for f, kind, st in entries:
- # If we needed to inspect any files, visitentries would have
- # been 'this' or 'all', and we would have set it to None
- # above. If we have visitentries populated here, we don't
- # care about any files in this directory, so no need to
- # check the type of `f`.
+ # Some matchers may return files in the visitentries set,
+ # instead of 'this', if the matcher explicitly mentions them
+ # and is not an exactmatcher. This is acceptable; we do not
+ # make any hard assumptions about file-or-directory below
+ # based on the presence of `f` in visitentries. If
+ # visitchildrenset returned a set, we can always skip the
+ # entries *not* in the set it provided regardless of whether
+ # they're actually a file or a directory.
if visitentries and f not in visitentries:
continue
if normalizefile:
--- a/mercurial/match.py Fri Aug 24 10:13:27 2018 -0700
+++ b/mercurial/match.py Thu Aug 23 18:04:15 2018 -0700
@@ -346,7 +346,7 @@
----------+-------------------
False | set()
'all' | 'all'
- True | 'this' OR non-empty set of subdirs to visit
+ True | 'this' OR non-empty set of subdirs -or files- to visit
Example:
Assume matchers ['path:foo/bar', 'rootfilesin:qux'], we would return
@@ -357,10 +357,21 @@
'baz' -> set()
'foo' -> {'bar'}
# Ideally this would be 'all', but since the prefix nature of matchers
- # is applied to the entire matcher, we have to downgrade to this
- # 'this' due to the non-prefix 'rootfilesin'-kind matcher.
+ # is applied to the entire matcher, we have to downgrade this to
+ # 'this' due to the non-prefix 'rootfilesin'-kind matcher being mixed
+ # in.
'foo/bar' -> 'this'
'qux' -> 'this'
+
+ Important:
+ Most matchers do not know whether their patterns name files or
+ directories. Given ['path:dir/f'], a matcher cannot tell whether 'f'
+ is a file or a directory, so visitchildrenset('dir') returns {'f'} for
+ most matchers. A matcher that knows 'f' is a file (as exactmatcher
+ does) may return 'this' instead. Do not assume that a set return value
+ means there are no files in this dir to investigate, or equivalently
+ that 'this' is always returned when 'dir' contains files to
+ investigate.
'''
return 'this'