py3: work around weird handling of bytes/unicode in decode_header()
author Yuya Nishihara <yuya@tcha.org>
Sun, 08 Apr 2018 15:22:30 +0900
changeset 37469 7edf68862fe3
parent 37468 ef661ce45cdb
child 37470 d658cbef8041
py3: work around weird handling of bytes/unicode in decode_header()

Basically decode_header() works as follows, and on Python 3, email headers
ARE UNICODE.

  def decode_header(header):
      if not ecre.search(header):  # ecre is unicode regexp
          return [(header, None)]  # so header is unicode string
      ... decode header into [(bytes_data, unicode_charset_name)]
      return collapsed
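The behavior described above can be observed directly with the standard library on Python 3. This is a small illustrative sketch, not part of the changeset; the sample header values are made up for the demonstration.

```python
from email.header import decode_header

# No RFC 2047 encoded word: the input str is returned untouched.
plain = decode_header("Hello world")

# An encoded word is present: the payload comes back as bytes,
# paired with the (lowercased) charset name as a str.
encoded = decode_header("=?utf-8?q?caf=C3=A9?=")

for label, parts in (("plain", plain), ("encoded", encoded)):
    for data, charset in parts:
        print(label, type(data).__name__, repr(data), charset)
```

The first call yields a `str` part with charset `None`, while the second yields a `bytes` part, so a caller that unconditionally calls `.decode()` on every part fails on the plain case.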
mercurial/mail.py
--- a/mercurial/mail.py	Sun Apr 08 15:03:00 2018 +0900
+++ b/mercurial/mail.py	Sun Apr 08 15:22:30 2018 +0900
@@ -332,6 +332,11 @@
                 continue
             except UnicodeDecodeError:
                 pass
+        # On Python 3, decode_header() may return either bytes or unicode
+        # depending on whether the header has =?<charset>? or not
+        if isinstance(part, type(u'')):
+            uparts.append(part)
+            continue
         try:
             uparts.append(part.decode('UTF-8'))
             continue
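
For readers who want to try the workaround outside Mercurial, the same idea can be sketched as a standalone Python 3 function. The name `decode_header_to_text`, the fallback charset order, and joining parts with a single space are assumptions made for this illustration; the sketch does not reproduce the surrounding code in mercurial/mail.py.

```python
from email.header import decode_header


def decode_header_to_text(header):
    """Hypothetical helper: decode an RFC 2047 header into one str."""
    uparts = []
    for part, charset in decode_header(header):
        # decode_header() returns str when the header has no encoded
        # word, so there is nothing to decode in that case.
        if isinstance(part, str):
            uparts.append(part)
            continue
        # Try the declared charset first, if any.
        if charset is not None:
            try:
                uparts.append(part.decode(charset))
                continue
            except (UnicodeDecodeError, LookupError):
                pass
        # Fall back to UTF-8, then ISO-8859-1 (which never fails).
        try:
            uparts.append(part.decode("utf-8"))
        except UnicodeDecodeError:
            uparts.append(part.decode("iso-8859-1"))
    return " ".join(uparts)


print(decode_header_to_text("Hello world"))
print(decode_header_to_text("=?iso-8859-1?q?p=F6stal?="))
```

Checking for `str` before any `.decode()` call is the key step: on Python 3 a `str` has no `decode` method, so the unencoded case would otherwise raise `AttributeError`.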