diff mercurial/templatekw.py @ 37947:3ea3c96ada54

encoding: introduce tagging type for non-lossy non-ASCII string

This fixes the weird behavior of toutf8b(), which would convert a local
string back to UTF-8 *only if* it was lossy in the system encoding.

Before b7b26e54e37a "encoding: avoid localstr when a string can be encoded
losslessly (issue2763)", all local strings were wrapped by the localstr
class. I think this would justify the round-trip behavior of toutf8b().

ASCII strings are special-cased, so the cost of wrapping with safelocalstr
is negligible.

(with mercurial repo)
$ export HGRCPATH=/dev/null HGPLAIN= HGENCODING=latin-1
$ hg log --time --config experimental.evolution=all > /dev/null

(original)
time: real 11.340 secs (user 11.290+0.000 sys 0.060+0.000)
time: real 11.390 secs (user 11.300+0.000 sys 0.080+0.000)
time: real 11.430 secs (user 11.360+0.000 sys 0.070+0.000)

(this patch)
time: real 11.200 secs (user 11.100+0.000 sys 0.100+0.000)
time: real 11.370 secs (user 11.300+0.000 sys 0.070+0.000)
time: real 11.190 secs (user 11.130+0.000 sys 0.060+0.000)
author Yuya Nishihara <yuya@tcha.org>
date Sun, 23 Apr 2017 13:15:30 +0900
parents 8808d5d401ee
children 009b424c9cb6
--- a/mercurial/templatekw.py	Sun Apr 22 11:38:53 2018 +0900
+++ b/mercurial/templatekw.py	Sun Apr 23 13:15:30 2017 +0900
@@ -278,6 +278,8 @@
     if isinstance(s, encoding.localstr):
         # try hard to preserve utf-8 bytes
         return encoding.tolocal(encoding.fromlocal(s).strip())
+    elif isinstance(s, encoding.safelocalstr):
+        return encoding.safelocalstr(s.strip())
     else:
         return s.strip()
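
The new branch is needed because methods of a bytes subclass return plain
bytes, so the safelocalstr tag would be dropped by strip(). The following is a
minimal Python 3.7+ sketch of that idea, not Mercurial's actual code: the
classes are simplified stand-ins for the ones in mercurial/encoding.py, and
LOCAL_ENCODING, the _utf8 attribute layout, and strip_preserving_tag are
hypothetical names chosen for illustration.

```python
# Simplified stand-ins for Mercurial's encoding types (illustrative only).
LOCAL_ENCODING = "latin-1"  # assumed local encoding, as in the benchmark above


class localstr(bytes):
    """Lossy local-encoded bytes that remember the original UTF-8 bytes."""

    def __new__(cls, utf8, local):
        s = bytes.__new__(cls, local)
        s._utf8 = utf8
        return s


class safelocalstr(bytes):
    """Tag for non-ASCII local bytes that convert back to UTF-8 losslessly."""


def tolocal(utf8):
    text = utf8.decode("utf-8")
    if text.isascii():
        return utf8  # ASCII is special-cased: no wrapper at all
    try:
        return safelocalstr(text.encode(LOCAL_ENCODING))
    except UnicodeEncodeError:
        # Lossy conversion: keep the UTF-8 original alongside the local form.
        return localstr(utf8, text.encode(LOCAL_ENCODING, "replace"))


def fromlocal(s):
    if isinstance(s, localstr):
        return s._utf8
    return s.decode(LOCAL_ENCODING).encode("utf-8")


def strip_preserving_tag(s):
    """Mirror of the patched branch structure in the hunk above."""
    if isinstance(s, localstr):
        # try hard to preserve utf-8 bytes
        return tolocal(fromlocal(s).strip())
    elif isinstance(s, safelocalstr):
        # bytes.strip() returns plain bytes, so re-apply the tag
        return safelocalstr(s.strip())
    else:
        return s.strip()
```

In this sketch, calling strip() directly on a safelocalstr yields a plain bytes
object, while strip_preserving_tag() keeps the tag so that later code can still
tell the value is safe to convert back to UTF-8.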