encoding: introduce tagging type for non-lossy non-ASCII string
This fixes the weird behavior of toutf8b(), which would convert a local
string back to UTF-8 *only if* it was lossy in the system encoding.
Before
b7b26e54e37a "encoding: avoid localstr when a string can be encoded
losslessly (
issue2763)", all local strings were wrapped by the localstr
class. I think this would justify the round-trip behavior of toutf8b().
ASCII strings are special-cased, so the cost of wrapping with safelocalstr
is negligible.
(with mercurial repo)
$ export HGRCPATH=/dev/null HGPLAIN= HGENCODING=latin-1
$ hg log --time --config experimental.evolution=all > /dev/null
(original)
time: real 11.340 secs (user 11.290+0.000 sys 0.060+0.000)
time: real 11.390 secs (user 11.300+0.000 sys 0.080+0.000)
time: real 11.430 secs (user 11.360+0.000 sys 0.070+0.000)
(this patch)
time: real 11.200 secs (user 11.100+0.000 sys 0.100+0.000)
time: real 11.370 secs (user 11.300+0.000 sys 0.070+0.000)
time: real 11.190 secs (user 11.130+0.000 sys 0.060+0.000)
encoding: fix toutf8b() to resurrect lossy characters even if "\xed" in it
If 's' is a localstr, 's._utf8' must be returned to get the original UTF-8
sequence back. Because of this, it was totally wrong to test if '"\xed" not
in s', which should be either '"\xed" not in s._utf8' or just omitted.
This patch moves the localstr handling to top as the validity of 's._utf8'
should be pre-checked by encoding.tolocal().
sshserver: redirect stdin/stdout early and use duplicated streams
This is what we achieved with hook.redirect(True) plus ui.fout = ui.ferr.
The hook.redirect() function can't be completely removed yet since hgweb
still depends on it. I'm not sure if it is necessary for WSGI servers. CGI
needs it, but does WSGI communicate over stdin/stdout channels?
sshserver: do setbinary() by caller (API)
In most cases, stdio should be set to binary mode by the dispatcher, so
the sshserver does not have to take care of that. The only exception was
hg-ssh, which is fixed by this patch.
.. api::
``sshserver()`` no longer sets stdin and stdout to binary mode.
test-ssh: add some flush() to make output deterministic
We shouldn't rely on buffering mode/state of file handles.
stringutil: flip the default of pprint() to bprefix=False
If we use pprint() as a drop-in replacement for repr(), bprefix=False is more
appropriate. Let's make it the default to remove bprefix=False noise.