contrib: fix a subtle bug in check-code's regex rewriting
author Augie Fackler <augie@google.com>
Wed, 14 Mar 2018 14:05:45 -0400
changeset 36957 a8d540d2628c
parent 36956 b710fdebd0db
child 36958 644a02f6b34f
We rewrite `\s` to `[ \t]` when preparing our regular expressions, but we previously made no effort to avoid producing nested sets. Python used to let this slide without incident, but Python 3.7 wants to make sure you meant an actual [ in a set, and so it warns.

This turns out to be fortunate for us, because `[\s(]` was getting rewritten to `[[ \t](]`, which doesn't actually match what we expected. See the preceding changes, which were revealed to be necessary after implementing this fix.

Differential Revision: https://phab.mercurial-scm.org/D2866
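To make the nested-set problem concrete, here is a minimal sketch that applies the old one-step rewrite and the new two-step rewrite to small patterns. The two `re.sub` calls are taken from the diff in this change; the helper names `rewrite_old` and `rewrite_new` and the sample pattern `[\s,]x` are hypothetical and exist only for illustration, not in check-code.py itself.

```python
import re
import warnings


def rewrite_old(p):
    # Hypothetical helper: the single-step rewrite used before this change.
    return re.sub(r'(?<!\\)\\s', r'[ \\t]', p)


def rewrite_new(p):
    # Hypothetical helper: the two-step rewrite introduced by this change.
    # First, replace a \s that directly follows the opening [ of a set.
    p = re.sub(r'\[\\s', r'[ \\t', p)
    # Then replace the remaining \s instances outside that position.
    return re.sub(r'(?<!(\\|\[))\\s', r'[ \\t]', p)


# The example from the commit message.
old_text = rewrite_old(r'[\s(]')   # nested set: [[ \t](]
new_text = rewrite_new(r'[\s(]')   # flat set:   [ \t(]

# A pattern that still compiles after the old rewrite, to show the mismatch.
with warnings.catch_warnings():
    # Python 3.7+ emits a FutureWarning about a possible nested set here.
    warnings.simplefilter('ignore')
    old_re = re.compile(rewrite_old(r'[\s,]x'))
new_re = re.compile(rewrite_new(r'[\s,]x'))

print(old_text, new_text)
print(bool(old_re.match(',x')), bool(new_re.match(',x')))
```

With the old rewrite, the set ends at the first `]`, so the pattern means "one of `[`, space, or tab, followed by the literal text `,]x`" and no longer matches `,x`. The new rewrite keeps a single flat set, which matches a space, a tab, or a comma before `x`, and still excludes a newline.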
contrib/check-code.py
--- a/contrib/check-code.py	Tue Mar 13 17:55:03 2018 -0400
+++ b/contrib/check-code.py	Wed Mar 14 14:05:45 2018 -0400
@@ -542,8 +542,11 @@
             for i, pseq in enumerate(pats):
                 # fix-up regexes for multi-line searches
                 p = pseq[0]
-                # \s doesn't match \n
-                p = re.sub(r'(?<!\\)\\s', r'[ \\t]', p)
+                # \s doesn't match \n (done in two steps)
+                # first, we replace \s that appears in a set already
+                p = re.sub(r'\[\\s', r'[ \\t', p)
+                # now we replace other \s instances.
+                p = re.sub(r'(?<!(\\|\[))\\s', r'[ \\t]', p)
                 # [^...] doesn't match newline
                 p = re.sub(r'(?<!\\)\[\^', r'[^\\n', p)