changeset 23613:7b8ff3fd11d3

highlight: ignore Unicode's extra linebreaks (issue4291)

Unicode and Python's unicode.splitlines() treat several extra legacy ASCII codepoints as linebreaks, even though the vast bulk of computing and Python's own str.splitlines() do not. Rather than introduce line numbering confusion, we filter them out when highlighting.
author Matt Mackall <mpm@selenic.com>
date Wed, 17 Dec 2014 13:25:24 -0600
parents 6006cad5e7a9
children cd79fb4d75fd
files hgext/highlight/highlight.py
diffstat 1 files changed, 5 insertions(+), 0 deletions(-)
--- a/hgext/highlight/highlight.py	Thu Dec 18 21:53:55 2014 +0100
+++ b/hgext/highlight/highlight.py	Wed Dec 17 13:25:24 2014 -0600
@@ -32,6 +32,11 @@
     if util.binary(text):
         return
 
+    # str.splitlines() != unicode.splitlines() because "reasons"
+    for c in "\x0c\x1c\x1d\x1e":
+        if c in text:
+            text = text.replace(c, '')
+
     # Pygments is best used with Unicode strings:
     # <http://pygments.org/docs/unicode/>
     text = text.decode(encoding.encoding, 'replace')
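
The mismatch described in the commit message can be reproduced with a small sketch. It is written for Python 3, where bytes and str play the roles that str and unicode play in the Python 2 code above; that mapping is an assumption made for illustration. The helper name strip_extra_linebreaks, the constant EXTRA_LINEBREAKS, and the sample data are hypothetical and not part of the patch.

```python
# Sketch only: Python 3 bytes/str stand in for the Python 2 str/unicode
# types that the patch deals with.

# Form feed, file separator, group separator, record separator.
EXTRA_LINEBREAKS = b"\x0c\x1c\x1d\x1e"


def strip_extra_linebreaks(data: bytes) -> bytes:
    """Hypothetical helper mirroring the loop added in the patch."""
    for i in range(len(EXTRA_LINEBREAKS)):
        c = EXTRA_LINEBREAKS[i:i + 1]  # one-byte slice, not an int
        if c in data:
            data = data.replace(c, b"")
    return data


sample = b"one\x0ctwo\nthree\x1cfour\n"

# Byte-oriented splitting only breaks on \n, \r and \r\n.
byte_lines = sample.splitlines()

# Text-oriented splitting also breaks on the four codepoints above.
text_lines = sample.decode("latin-1").splitlines()

# Filtering before decoding brings the two line counts back in step.
filtered_lines = strip_extra_linebreaks(sample).decode("latin-1").splitlines()

print(len(byte_lines), len(text_lines), len(filtered_lines))
```

Without the filter, the decoded text has more lines than the raw bytes, which is the line numbering confusion the commit message refers to. Dropping the four bytes before decoding keeps both counts equal, at the cost of not showing those control characters in the highlighted output.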