hgweb: fix websub regex flag syntax on Python 3
The `websub` config section for hgweb is broken under Python 3
when using the regex flags syntax (i.e. the optional `i` in the example
from `hg help config.websub`):
patternname = s/SEARCH_REGEX/REPLACE_EXPRESSION/[i]
Flags are pulled out of the specified byte-string using a regular
expression, and uppercased. The flags are then iterated over and
passed to the `re` module using `re.__dict__[item]`, to get the
object attribute of the same name from the `re` module. So on Python
2 if the `il` flags are passed, this transition looks like:
`'il'` -> `'IL'` -> `'I'` -> `re.__dict__['I']` -> `re.I`
However, on Python 3 these are bytes objects. When we iterate over
a bytes object in Python 3, instead of getting the individual characters
as string objects of length one, we get the integer value of each
byte. So the same transition looks like:
`b'il'` -> `b'IL'` -> `73` -> `re.__dict__[73]` -> `KeyError`
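The failure can be reproduced outside hgweb with a few lines of Python 3. This is only a sketch of the behaviour described above: the variable names are illustrative and are not taken from `webutil.py`.

```python
import re

# Flags as they would be captured from the websub pattern on Python 3 (bytes).
flagin = b'il'

# Iterating over bytes yields integers, not one-character strings.
items = list(flagin.upper())
print(items)  # [73, 76]

# The re module's namespace is keyed by str names, so an int lookup fails.
try:
    re.__dict__[items[0]]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)  # True
```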
This commit fixes the type mismatch by converting the bytes to a
system string before iterating over each element to pass to `re`.
The transition will now look like:
`b'il'` -> `u'IL'` -> `u'I'` -> `re.__dict__[u'I']` -> `re.I`
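A minimal standalone sketch of the corrected loop follows. It assumes that decoding the bytes as Latin-1 is an acceptable stand-in for `pycompat.sysstr` on Python 3, and `parse_flags` is a hypothetical helper name used only for illustration; the real code lives inline in `webutil.py`.

```python
import re


def parse_flags(flagin):
    """Hypothetical helper: OR together the re flags named in a bytes string."""
    flags = 0
    if flagin:
        # Assumption: decoding as Latin-1 approximates pycompat.sysstr on
        # Python 3, giving a native str whose items are one-character strings.
        for flag in flagin.upper().decode('latin-1'):
            flags |= re.__dict__[flag]
    return flags


flags = parse_flags(b'i')
# With the IGNORECASE flag set, 'ticket' also matches 'Ticket456'.
matched = re.search(r'ticket(\d+)', 'Ticket456', flags) is not None
```

As in the original loop, an unknown flag letter still raises `KeyError` in this sketch; only the bytes-versus-str mismatch is addressed.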
In addition we expand `test-websub.t` to cover the regex flag case
(for both the `websub` section and `interhg`).
Differential Revision: https://phab.mercurial-scm.org/D6788
--- a/mercurial/hgweb/webutil.py Mon Sep 09 17:26:17 2019 -0400
+++ b/mercurial/hgweb/webutil.py Mon Sep 09 13:25:00 2019 -0400
@@ -791,7 +791,7 @@
flagin = match.group(3)
flags = 0
if flagin:
- for flag in flagin.upper():
+ for flag in pycompat.sysstr(flagin.upper()):
flags |= re.__dict__[flag]
try:
--- a/tests/test-websub.t Mon Sep 09 17:26:17 2019 -0400
+++ b/tests/test-websub.t Mon Sep 09 13:25:00 2019 -0400
@@ -11,16 +11,18 @@
>
> [websub]
> issues = s|Issue(\d+)|<a href="http://bts.example.org/issue\1">Issue\1</a>|
+ > tickets = s|ticket(\d+)|<a href="http://ticket.example.org/issue\1">Ticket\1</a>|i
>
> [interhg]
> # check that we maintain some interhg backwards compatibility...
> # yes, 'x' is a weird delimiter...
> markbugs = sxbugx<i class="\x">bug</i>x
+ > problems = sxPROBLEMx<i class="\x">problem</i>xi
> EOF
$ touch foo
$ hg add foo
- $ hg commit -d '1 0' -m 'Issue123: fixed the bug!'
+ $ hg commit -d '1 0' -m 'Issue123: fixed the bug! Ticket456 and problem789 too'
$ hg serve -n test -p $HGPORT -d --pid-file=hg.pid -A access.log -E errors.log
$ cat hg.pid >> $DAEMON_PIDS
@@ -28,7 +30,7 @@
log
$ get-with-headers.py localhost:$HGPORT "rev/tip" | grep bts
- <div class="description"><a href="http://bts.example.org/issue123">Issue123</a>: fixed the <i class="x">bug</i>!</div>
+ <div class="description"><a href="http://bts.example.org/issue123">Issue123</a>: fixed the <i class="x">bug</i>! <a href="http://ticket.example.org/issue456">Ticket456</a> and <i class="x">problem</i>789 too</div>
errors
$ cat errors.log