changeset 29155:aaabed77791a stable 3.8.2

help: search section of help topic by translated section name correctly

Before this patch, "hg help topic.section" might show an unexpected section of a help topic in some encodings.

The help code applied str.lower() instead of encoding.lower(str) to the translated message in order to search sections case-insensitively. However, some encodings use 0x41(A) - 0x5a(Z) as the second or later byte of a multi-byte character (for example, ja_JP.cp932), so str.lower() alters those bytes and produces an unexpected result.

To search sections of a help topic by translated section name correctly, this patch replaces str.lower() with encoding.lower(str) for both the query string (in commands.help()) and the translated help text (in minirst.getsections()).
author FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
date Fri, 13 May 2016 07:19:59 +0900
parents 9d38a2061fd8
children 3622469fd82f
files mercurial/commands.py mercurial/minirst.py tests/test-help.t
diffstat 3 files changed, 74 insertions(+), 2 deletions(-)
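The failure described in the commit message can be reproduced outside Mercurial. The sketch below is written for Python 3, where bytes.lower() lowercases ASCII bytes one at a time, much as str.lower() does on the byte strings this patch deals with. The helper encoding_aware_lower() is a hypothetical, simplified stand-in for encoding.lower() (decode, lowercase, re-encode); it is not Mercurial's actual implementation.

```python
# Bytes of the cp932 translation of "record", taken from the test in this patch.
upper = b"\x8bL\x98^"

# Byte-wise lowercasing: 0x4c ('L') is the trail byte of a two-byte
# character here, but it is rewritten to 0x6c ('l') anyway.
naive = upper.lower()


def encoding_aware_lower(data, enc="cp932"):
    """Hypothetical stand-in for encoding.lower(): lowercase characters, not bytes."""
    return data.decode(enc).lower().encode(enc)


# Character-wise lowercasing leaves the two multi-byte characters untouched.
safe = encoding_aware_lower(upper)

print("original bytes:", upper)
print("byte-wise     :", naive)
print("encoding-aware:", safe)
print("byte-wise result is a different string:",
      naive.decode("cp932") != upper.decode("cp932"))
```

The byte-wise result is still decodable as cp932, but it names different characters, which is why a section lookup based on it can match the wrong section.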
--- a/mercurial/commands.py	Fri May 13 07:19:59 2016 +0900
+++ b/mercurial/commands.py	Fri May 13 07:19:59 2016 +0900
@@ -4590,7 +4590,7 @@
     subtopic = None
     if name and '.' in name:
         name, section = name.split('.', 1)
-        section = section.lower()
+        section = encoding.lower(section)
         if '.' in section:
             subtopic, section = section.split('.', 1)
         else:
--- a/mercurial/minirst.py	Fri May 13 07:19:59 2016 +0900
+++ b/mercurial/minirst.py	Fri May 13 07:19:59 2016 +0900
@@ -724,7 +724,7 @@
             x = b['key']
         else:
             x = b['lines'][0]
-        x = x.lower().strip('"')
+        x = encoding.lower(x).strip('"')
         if '(' in x:
             x = x.split('(')[0]
         return x
--- a/tests/test-help.t	Fri May 13 07:19:59 2016 +0900
+++ b/tests/test-help.t	Fri May 13 07:19:59 2016 +0900
@@ -1524,6 +1524,78 @@
       files         List of strings. All files modified, added, or removed by
                     this changeset.
 
+Test section lookup by translated message
+
+str.lower() instead of encoding.lower(str) on translated message might
+make message meaningless, because some encoding uses 0x41(A) - 0x5a(Z)
+as the second or later byte of multi-byte character.
+
+For example, "\x8bL\x98^" (translation of "record" in ja_JP.cp932)
+contains 0x4c (L). str.lower() replaces 0x4c(L) by 0x6c(l) and this
+replacement makes message meaningless.
+
+This tests that section lookup by translated string isn't broken by
+such str.lower().
+
+  $ python <<EOF
+  > def escape(s):
+  >     return ''.join('\u%x' % ord(uc) for uc in s.decode('cp932'))
+  > # translation of "record" in ja_JP.cp932
+  > upper = "\x8bL\x98^"
+  > # str.lower()-ed section name should be treated as different one
+  > lower = "\x8bl\x98^"
+  > with open('ambiguous.py', 'w') as fp:
+  >     fp.write("""# ambiguous section names in ja_JP.cp932
+  > u'''summary of extension
+  > 
+  > %s
+  > ----
+  > 
+  > Upper name should show only this message
+  > 
+  > %s
+  > ----
+  > 
+  > Lower name should show only this message
+  > 
+  > subsequent section
+  > ------------------
+  > 
+  > This should be hidden at "hg help ambiguous" with section name.
+  > '''
+  > """ % (escape(upper), escape(lower)))
+  > EOF
+
+  $ cat >> $HGRCPATH <<EOF
+  > [extensions]
+  > ambiguous = ./ambiguous.py
+  > EOF
+
+  $ python <<EOF | sh
+  > upper = "\x8bL\x98^"
+  > print "hg --encoding cp932 help -e ambiguous.%s" % upper
+  > EOF
+  \x8bL\x98^ (esc)
+  ----
+  
+  Upper name should show only this message
+  
+
+  $ python <<EOF | sh
+  > lower = "\x8bl\x98^"
+  > print "hg --encoding cp932 help -e ambiguous.%s" % lower
+  > EOF
+  \x8bl\x98^ (esc)
+  ----
+  
+  Lower name should show only this message
+  
+
+  $ cat >> $HGRCPATH <<EOF
+  > [extensions]
+  > ambiguous = !
+  > EOF
+
 Test dynamic list of merge tools only shows up once
   $ hg help merge-tools
   Merge Tools