convert: correctly convert paths to UTF-8 for Subversion
The previous code using encoding.tolocal() only worked by chance in these
situations:
* The string is ASCII: The fast path was triggered and the string was returned
unmodified.
* The local encoding is UTF-8: The source and target encodings are the same.
* The string is not valid UTF-8 and the native encoding is ISO-8859-1: If the
string doesn’t decode using UTF-8, ISO-8859-1 is tried as a fallback. During
`hg convert`, the local encoding is always UTF-8. Ironically, in this case
encoding.tolocal() behaves the way one would expect the reverse function,
encoding.fromlocal(), to behave.
When the locale encoding is ISO-8859-15, trying to convert an SVN repo at
`/tmp/a€` previously failed like this:
file:///tmp/a%C2%A4 does not look like a Subversion repository to libsvn version 1.14.0
The correct URL is `file:///tmp/a%E2%82%AC`.
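The two URLs above can be reproduced with a few lines of standalone Python.
This is only an illustrative sketch using the standard library, not the
Mercurial code itself; the helper name `to_svn_url` is hypothetical. It shows
that decoding the byte 0xA4 as ISO-8859-1 (the old fallback) yields `¤`, while
decoding it as ISO-8859-15 (the actual locale encoding) yields `€`.

```python
from urllib.parse import quote

# Bytes of the path as they exist on disk under an ISO-8859-15 locale.
local_path = "/tmp/a€".encode("iso-8859-15")  # b"/tmp/a\xa4"


def to_svn_url(path_bytes, source_encoding):
    """Hypothetical helper: decode, re-encode as UTF-8, percent-encode."""
    utf8_path = path_bytes.decode(source_encoding).encode("utf-8")
    return "file://" + quote(utf8_path)


# Old behaviour: 0xA4 is treated as ISO-8859-1, giving U+00A4 (currency sign).
wrong_url = to_svn_url(local_path, "iso-8859-1")

# Fixed behaviour: 0xA4 is decoded with the locale encoding, giving U+20AC.
right_url = to_svn_url(local_path, "iso-8859-15")

print(wrong_url)  # file:///tmp/a%C2%A4
print(right_url)  # file:///tmp/a%E2%82%AC
```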
Unlike previously (with the ISO-8859-1 fallback), decoding the path using the
locale encoding can fail. In this case, we have to bail out, as Subversion
won’t be able to do anything useful with the path.
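A minimal sketch of that failure mode, assuming a hypothetical helper
`path_to_utf8` and a hypothetical exception type `PathEncodingError` (neither
is a Mercurial API): ISO-8859-1 assigns a character to every byte value, so
the old fallback could never fail, whereas a strict decode with the locale
encoding can.

```python
class PathEncodingError(Exception):
    """Hypothetical error for paths that cannot be converted to UTF-8."""


def path_to_utf8(path_bytes, locale_encoding):
    """Decode a path with the locale encoding and re-encode it as UTF-8.

    Raises PathEncodingError instead of guessing when the bytes are not
    valid in the locale encoding.
    """
    try:
        return path_bytes.decode(locale_encoding).encode("utf-8")
    except UnicodeDecodeError as exc:
        raise PathEncodingError(
            "cannot decode %r using %s" % (path_bytes, locale_encoding)
        ) from exc


# Valid in ISO-8859-15: 0xA4 is the euro sign.
converted = path_to_utf8(b"/tmp/a\xa4", "iso-8859-15")

# Not valid UTF-8: a lone 0xA4 byte cannot be decoded, so we bail out.
try:
    path_to_utf8(b"/tmp/a\xa4", "utf-8")
    failed = False
except PathEncodingError:
    failed = True
```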
create verbosemmap.py
$ cat << EOF > verbosemmap.py
> # extension to make util.mmapread verbose
>
> from __future__ import absolute_import
>
> from mercurial import (
>     extensions,
>     pycompat,
>     util,
> )
>
> def extsetup(ui):
>     def mmapread(orig, fp):
>         ui.write(b"mmapping %s\n" % pycompat.bytestr(fp.name))
>         ui.flush()
>         return orig(fp)
>
>     extensions.wrapfunction(util, 'mmapread', mmapread)
> EOF
setting up base repo
$ hg init a
$ cd a
$ touch a
$ hg add a
$ hg commit -qm base
$ for i in `$TESTDIR/seq.py 1 100` ; do
>   echo $i > a
>   hg commit -qm $i
> done
set up verbosemmap extension
$ cat << EOF >> $HGRCPATH
> [extensions]
> verbosemmap=$TESTTMP/verbosemmap.py
> EOF
mmap index which is now more than 4k long
$ hg log -l 5 -T '{rev}\n' --config experimental.mmapindexthreshold=4k
mmapping $TESTTMP/a/.hg/store/00changelog.i
100
99
98
97
96
do not mmap index which is still less than 32k
$ hg log -l 5 -T '{rev}\n' --config experimental.mmapindexthreshold=32k
100
99
98
97
96
$ cd ..
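The behaviour exercised by the two `hg log` runs above can be sketched in
standalone Python. This is an illustration only, not Mercurial's
implementation: the function name `read_index` is hypothetical, and treating
a size equal to the threshold as "large enough" is an assumption.

```python
import mmap
import os


def read_index(path, mmap_threshold):
    """Return (data, used_mmap) for the file at path.

    The file is memory-mapped when its size reaches mmap_threshold bytes
    (assumed comparison) and read into memory otherwise.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as fp:
        # mmap cannot map an empty file, so always read those normally.
        if size > 0 and size >= mmap_threshold:
            mapped = mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ)
            return mapped, True
        return fp.read(), False
```

With an index of roughly 5 kB, a 4k threshold would take the mmap branch and
a 32k threshold would take the plain read, which mirrors the presence and
absence of the `mmapping` line in the test output.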