contrib: fix a subtle bug in check-code's regex rewriting
We rewrite `\s` to `[ \t]` when preparing our regular expressions, but
we weren't taking care to avoid nested sets. Older versions of
Python let this slide without incident, but Python 3.7 wants to
make sure you meant a literal [ in a set, and so it warns. This
turns out to be fortunate for us, because `[\s(]` was getting rewritten
to `[[ \t](]`, which closes the set early and doesn't match what we
expected. See the preceding changes, which were revealed to be
necessary after implementing this fix.
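
As a rough illustration (not part of the patch itself), the sketch
below compares the old one-step rewrite with the new two-step rewrite.
The helper names `rewrite_old` and `rewrite_new` and the `[\s,]` test
pattern are made up for this example; only the `re.sub` calls are taken
from the diff.

```python
import re
import warnings


def rewrite_old(p):
    # Old behaviour: every unescaped \s becomes [ \t], even inside a set.
    return re.sub(r'(?<!\\)\\s', r'[ \\t]', p)


def rewrite_new(p):
    # Step 1: a \s right after an opening [ is expanded in place.
    p = re.sub(r'\[\\s', r'[ \\t', p)
    # Step 2: any other \s not preceded by \ or [ becomes its own set.
    return re.sub(r'(?<!(\\|\[))\\s', r'[ \\t]', p)


old = rewrite_old(r'[\s(]')   # nested set: [[ \t](]
new = rewrite_new(r'[\s(]')   # flat set:   [ \t(]

# Hypothetical pattern, chosen so that both rewrites still compile.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # Python 3.7+ warns about the nested set
    old_re = re.compile(rewrite_old(r'[\s,]'))
new_re = re.compile(rewrite_new(r'[\s,]'))

# The old rewrite needs three characters (set, then ',', then ']'),
# so a lone comma no longer matches; the new rewrite still accepts it.
old_matches_comma = old_re.fullmatch(',') is not None
new_matches_comma = new_re.fullmatch(',') is not None
```

Note that the first step only covers a `\s` that immediately follows
the opening [ of a set.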
Differential Revision: https://phab.mercurial-scm.org/D2866
--- a/contrib/check-code.py Tue Mar 13 17:55:03 2018 -0400
+++ b/contrib/check-code.py Wed Mar 14 14:05:45 2018 -0400
@@ -542,8 +542,11 @@
for i, pseq in enumerate(pats):
# fix-up regexes for multi-line searches
p = pseq[0]
- # \s doesn't match \n
- p = re.sub(r'(?<!\\)\\s', r'[ \\t]', p)
+ # \s doesn't match \n (done in two steps)
+ # first, we replace \s that appears in a set already
+ p = re.sub(r'\[\\s', r'[ \\t', p)
+ # now we replace other \s instances.
+ p = re.sub(r'(?<!(\\|\[))\\s', r'[ \\t]', p)
# [^...] doesn't match newline
p = re.sub(r'(?<!\\)\[\^', r'[^\\n', p)