Thu, 05 Nov 2015 17:30:10 -0600 encoding: re-escape U+DCxx characters in toutf8b input (issue4927)
Matt Mackall <mpm@selenic.com> [Thu, 05 Nov 2015 17:30:10 -0600] rev 26879
encoding: re-escape U+DCxx characters in toutf8b input (issue4927) This is the final missing piece in fully round-tripping random byte strings through UTF-8b. While this issue means that UTF-8 <-> UTF-8b isn't fully bijective, we don't expect to ever see U+DCxx codepoints in "real" UTF-8 data, so it should remain bijective in practice.
(0) -10000 -3000 -1000 -300 -100 -30 -10 -1 +1 +10 +30 +100 +300 +1000 +3000 +10000 tip