revset: support raw string literals

This adds support for r'...' and r"..." as string literals. Escape
sequences in strings with the "r" prefix are not interpreted. A
backslash still keeps the following quote from closing the string, so
r"\" is reported as unterminated.

This is especially useful for grep(), where, with regular string
literals, \number is interpreted as an octal escape code and \b is
interpreted as the backspace character (\x08).
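
The effect on grep() patterns can be sketched in standalone Python 3.
This is only an illustration under stated assumptions, not the code in
this patch: read_string() and unescape() are hypothetical helpers, and
unescape() uses a small hand-written escape table in place of the
Python 2 'string-escape' codec that the tokenizer relies on.

```python
import re

# Hypothetical, simplified escape table. The real tokenizer uses
# Python 2's 'string-escape' codec, which handles many more escapes.
_ESCAPES = {"n": "\n", "t": "\t", "b": "\x08", "\\": "\\", "'": "'", '"': '"'}


def unescape(s):
    """Interpret a few backslash escapes; leave unknown ones untouched."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s):
            nxt = s[i + 1]
            out.append(_ESCAPES.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)


def read_string(program, pos):
    """Return the string literal starting at pos, honoring an r prefix."""
    raw = program[pos] == "r"
    if raw:
        pos += 1  # skip the prefix; the quote character follows
    quote = program[pos]
    pos += 1
    start = pos
    while pos < len(program):
        if program[pos] == "\\":  # a backslash always skips the next character
            pos += 2
            continue
        if program[pos] == quote:
            body = program[start:pos]
            return body if raw else unescape(body)
        pos += 1
    raise ValueError("unterminated string")


# Revset text "\bissue\d+" and r"\bissue\d+", written as Python literals.
plain = read_string('"\\bissue\\d+"', 0)
raw = read_string('r"\\bissue\\d+"', 0)

text = "fix issue1234 in the parser"
plain_match = re.search(plain, text)  # pattern starts with a literal backspace
raw_match = re.search(raw, text)      # pattern starts with a word boundary
```

With the regular literal the regex engine receives a backspace
character and finds nothing in ordinary text, while the raw literal
delivers \b intact, so the word-boundary match succeeds.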
--- a/mercurial/help/revsets.txt Sun Sep 26 13:11:31 2010 -0500
+++ b/mercurial/help/revsets.txt Fri Sep 24 15:36:53 2010 -0500
@@ -7,8 +7,11 @@
Identifiers such as branch names must be quoted with single or double
quotes if they contain characters outside of
``[._a-zA-Z0-9\x80-\xff]`` or if they match one of the predefined
-predicates. Special characters can be used in quoted identifiers by
-escaping them, e.g., ``\n`` is interpreted as a newline.
+predicates.
+
+Special characters can be used in quoted identifiers by escaping them,
+e.g., ``\n`` is interpreted as a newline. To prevent them from being
+interpreted, strings can be prefixed with ``r``, e.g. ``r'...'``.
There is a single prefix operator:
@@ -82,7 +85,8 @@
An alias for ``::.`` (ancestors of the working copy's first parent).
``grep(regex)``
- Like ``keyword(string)`` but accepts a regex.
+ Like ``keyword(string)`` but accepts a regex. Use ``grep(r'...')``
+ to ensure special escape characters are handled correctly.
``head()``
Changeset is a head.
--- a/mercurial/revset.py Sun Sep 26 13:11:31 2010 -0500
+++ b/mercurial/revset.py Fri Sep 24 15:36:53 2010 -0500
@@ -48,7 +48,14 @@
pos += 1 # skip ahead
elif c in "():,-|&+!": # handle simple operators
yield (c, None, pos)
- elif c in '"\'': # handle quoted strings
+ elif (c in '"\'' or c == 'r' and
+ program[pos:pos + 2] in ("r'", 'r"')): # handle quoted strings
+ if c == 'r':
+ pos += 1
+ c = program[pos]
+ decode = lambda x: x
+ else:
+ decode = lambda x: x.decode('string-escape')
pos += 1
s = pos
while pos < l: # find closing quote
@@ -57,7 +64,7 @@
pos += 2
continue
if d == c:
- yield ('string', program[s:pos].decode('string-escape'), s)
+ yield ('string', decode(program[s:pos]), s)
break
pos += 1
else:
--- a/tests/test-revset.t Sun Sep 26 13:11:31 2010 -0500
+++ b/tests/test-revset.t Fri Sep 24 15:36:53 2010 -0500
@@ -215,6 +215,14 @@
('func', ('symbol', 'grep'), ('string', '('))
hg: parse error: invalid match pattern: unbalanced parenthesis
[255]
+ $ try 'grep("\bissue\d+")'
+ ('func', ('symbol', 'grep'), ('string', '\x08issue\\d+'))
+ $ try 'grep(r"\bissue\d+")'
+ ('func', ('symbol', 'grep'), ('string', '\\bissue\\d+'))
+ 6
+ $ try 'grep(r"\")'
+ hg: parse error at 7: unterminated string
+ [255]
$ log 'head()'
0
1