subrepo: normalize path in the specific way for problematic encodings
Before this patch, "reporelpath()" uses "rstrip(os.sep)" to trim
"os.sep" at the end of "parent.root" path.
But it doesn't work correctly with some problematic encodings on
Windows, because some multi-byte characters in such encodings contain
'\\' (0x5c) as the tail byte of them.
In such cases, "reporelpath()" leaves unexpected '\\' at the beginning
of the path returned to callers.
"lcalrepository.root" seems not to have tail "os.sep", because it is
always normalized by "os.path.realpath()" in "vfs.__init__()", but in
fact it has tail "os.sep", if it is a root (of the drive): path
normalization trims tail "os.sep" off "/foo/bar/", but doesn't trim
one off "/".
So, just avoiding "rstrip(os.sep)" in "reporelpath()" causes
regression around
issue3033 fixed by
fccd350acf79.
This patch introduces "pathutil.normasprefix" to normalize specified
path in the specific way for problematic encodings without regression
around
issue3033.
subrepo: avoid sanitizing ".hg/hgrc" in meta data area for non-hg subrepos
Before this patch, sanitizing ".hg/hgrc" scans directories and files
also in meta data area for non-hg subrepos: under ".svn" for
Subversion subrepo, for example.
This may cause not only performance impact (especially in large scale
subrepos) but also unexpected removing meta data files.
This patch avoids sanitizing ".hg/hgrc" in meta data area for non-hg
subrepos.
This patch stops checking "ignore" target at the first
(case-insensitive) appearance of it, because continuation of scanning
is meaningless in almost all cases.
subrepo: make "_sanitize()" take absolute path to the root of subrepo
Before this patch, "hg update" doesn't sanitize ".hg/hgrc" in non-hg
subrepos correctly, if "hg update" is executed not at the root of the
parent repository.
"_sanitize()" takes relative path to subrepo from the root of the
parent repository, and passes it to "os.walk()". In this case,
"os.walk()" expects CWD to be equal to the root of the parent
repository.
So, "os.walk()" can't find specified path (or may scan unexpected
path), if CWD isn't equal to the root of the parent repository.
Non-hg subrepo under nested hg-subrepos may cause same problem, too:
CWD may be equal to the root of the outer most repository, or so.
This patch makes "_sanitize()" take absolute path to the root of
subrepo to sanitize correctly in such cases.
This patch doesn't normalize the path to hostile files as the one
relative to CWD (or the root of the outer most repository), to fix the
problem in the simple way suitable for "stable".
Normalizing should be done in the future: maybe as a part of the
migration to vfs.
subrepo: invoke "_sanitize()" also after "git merge --ff"
Before this patch, sanitizing ".hg/hgrc" in git subrepo doesn't work,
when the working directory is updated by "git merge --ff".
"_sanitize()" is not invoked after checking target revision out into
the working directory in this case, even though it is invoked
indirectly via "checkout" (or "rawcheckout") in other cases.
This patch invokes "_sanitize()" explicitly also after "git merge
--ff" execution.
subrepo: make "_sanitize()" work
"_sanitize()" was introduced by
224e96078708 on "stable" branch, but
it has done nothing for sanitizing since
224e96078708.
"_sanitize()" assumes "Visitor" design pattern:
"os.walk()" should invoke specified function ("v" in this case)
for each directory elements under specified path
but "os.walk()" assumes "Iterator" design pattern:
callers of it should drive loop to scan each directory elements
under specified path by themselves with the returned generator
object
Because of this mismatching, "_sanitize()" just discards the generator
object returned by "os.walk()" and does nothing for sanitizing.
This patch makes "_sanitize()" work.
This patch also changes the format of warning message to show each
unlinked files, for multiple appearances of "potentially hostile
.hg/hgrc".