Mon, 29 Jun 2020 15:03:36 +0200 convert: correctly convert paths to UTF-8 for Subversion stable
Manuel Jacob <me@manueljacob.de> [Mon, 29 Jun 2020 15:03:36 +0200] rev 45022
convert: correctly convert paths to UTF-8 for Subversion The previous code using encoding.tolocal() only worked by chance in these situations: * The string is ASCII: The fast path was triggered and the string was returned unmodified. * The local encoding is UTF-8: The source and target encoding is the same. * The string is not valid UTF-8 and the native encoding is ISO-8859-1: If the string doesn’t decode using UTF-8, ISO-8859-1 is tried as a fallback. During `hg convert`, the local encoding is always UTF-8. The irony is that in this case, encoding.tolocal() behaves like what someone would expect the reverse function, encoding.fromlocal(), to do. When the locale encoding is ISO-8859-15, trying to convert a SVN repo `/tmp/a€` failed before like this: file:///tmp/a%C2%A4 does not look like a Subversion repository to libsvn version 1.14.0 The correct URL is `file:///tmp/a%E2%82%AC`. Unlike previously (with the ISO-8859-1 fallback), decoding the path using the locale encoding can fail. In this case, we have to bail out, as Subversion won’t be able to do anything useful with the path.
Tue, 30 Jun 2020 05:04:36 +0200 py3: pass URL as str stable
Manuel Jacob <me@manueljacob.de> [Tue, 30 Jun 2020 05:04:36 +0200] rev 45021
py3: pass URL as str Before the patch, HTTP(S) URLs were never recognized as a Subversion repository on Python 3.
Tue, 30 Jun 2020 04:55:52 +0200 convert: bail out in Subversion source if encountering non-ASCII HTTP(S) URL stable
Manuel Jacob <me@manueljacob.de> [Tue, 30 Jun 2020 04:55:52 +0200] rev 45020
convert: bail out in Subversion source if encountering non-ASCII HTTP(S) URL Before this patch, in the tested case, urllib raised `httplib.InvalidURL: URL can't contain control characters. '/\xff/!svn/ver/0/.svn' (found at least '\xff')`, which resulted in that the URL was never recognized as a Subversion repository. This patch adds a check that bails out if the URL contains non-ASCII characters. The warning is not overly user-friendly, but giving the user something to type into a search engine is definitively better than not explaining why the repository was not recognized. We could support non-ASCII chracters by quoting them before passing them to urllib. However, we would want to be compatible with what the `svn` command does, which converts the URL from the locale encoding to UTF-8, percent-encodes it and sends it to the server. If the locale encoding is not UTF-8, the behavior is IMHO not very intuitive, as the `svn` command may send different (percent-encoded) octets than what was passed on the console. Instead of copying this behavior, we better leave it forbidden.
(0) -30000 -10000 -3000 -1000 -300 -100 -30 -10 -3 +3 +10 +30 +100 +300 +1000 +3000 tip