contrib/byteify-strings.py
author Ian Moody <moz-ian@perix.co.uk>
Tue, 11 Jun 2019 19:37:19 +0100
changeset 42443 d3c81439e2ee
parent 42247 970aaf38c3fc
child 42674 70bd1965bd07
permissions -rwxr-xr-x
phabricator: auto-sanitise API tokens and HTTP cookies from VCR recordings Currently when making VCR recordings one needs to manually sanitise sensitive credentials before committing and submitting them as part of tests. It is easy to imagine this being accidentally missed one time by a fallible human and said credentials being leaked. It is also possible that it wouldn't be noticed to alert the user to the leak since the recording files are so large and practically unreviewable. Thus do so automatically, so the only place that needs checking is in the test-phabricator.t file. Differential Revision: https://phab.mercurial-scm.org/D6513
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
     1
#!/usr/bin/env python3
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
     2
#
38384
1d9c97db465f byteify-strings: fork py3 code transformer to make it a standalone command
Yuya Nishihara <yuya@tcha.org>
parents: 36617
diff changeset
     3
# byteify-strings.py - transform string literals to be Python 3 safe
27220
4374d819ccd5 mercurial: implement import hook for handling C/Python modules
Gregory Szorc <gregory.szorc@gmail.com>
parents: 0
diff changeset
     4
#
4374d819ccd5 mercurial: implement import hook for handling C/Python modules
Gregory Szorc <gregory.szorc@gmail.com>
parents: 0
diff changeset
     5
# Copyright 2015 Gregory Szorc <gregory.szorc@gmail.com>
4374d819ccd5 mercurial: implement import hook for handling C/Python modules
Gregory Szorc <gregory.szorc@gmail.com>
parents: 0
diff changeset
     6
#
4374d819ccd5 mercurial: implement import hook for handling C/Python modules
Gregory Szorc <gregory.szorc@gmail.com>
parents: 0
diff changeset
     7
# This software may be used and distributed according to the terms of the
4374d819ccd5 mercurial: implement import hook for handling C/Python modules
Gregory Szorc <gregory.szorc@gmail.com>
parents: 0
diff changeset
     8
# GNU General Public License version 2 or any later version.
4374d819ccd5 mercurial: implement import hook for handling C/Python modules
Gregory Szorc <gregory.szorc@gmail.com>
parents: 0
diff changeset
     9
42247
970aaf38c3fc contrib: have byteify-strings explode if run in Python 2
Augie Fackler <augie@google.com>
parents: 39103
diff changeset
    10
from __future__ import absolute_import, print_function
27220
4374d819ccd5 mercurial: implement import hook for handling C/Python modules
Gregory Szorc <gregory.szorc@gmail.com>
parents: 0
diff changeset
    11
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
    12
import argparse
38386
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
    13
import contextlib
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
    14
import errno
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
    15
import os
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
    16
import sys
38386
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
    17
import tempfile
38384
1d9c97db465f byteify-strings: fork py3 code transformer to make it a standalone command
Yuya Nishihara <yuya@tcha.org>
parents: 36617
diff changeset
    18
import token
1d9c97db465f byteify-strings: fork py3 code transformer to make it a standalone command
Yuya Nishihara <yuya@tcha.org>
parents: 36617
diff changeset
    19
import tokenize
29550
1c22400db72d mercurial: implement a source transforming module loader on Python 3
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29490
diff changeset
    20
38390
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
    21
def adjusttokenpos(t, ofs):
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
    22
    """Adjust start/end column of the given token"""
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
    23
    return t._replace(start=(t.start[0], t.start[1] + ofs),
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
    24
                      end=(t.end[0], t.end[1] + ofs))
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
    25
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    26
def replacetokens(tokens, opts):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    27
    """Transform a stream of tokens from raw to Python 3.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    28
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    29
    Returns a generator of possibly rewritten tokens.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    30
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    31
    The input token list may be mutated as part of processing. However,
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    32
    its changes do not necessarily match the output token stream.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    33
    """
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    34
    sysstrtokens = set()
29550
1c22400db72d mercurial: implement a source transforming module loader on Python 3
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29490
diff changeset
    35
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    36
    # The following utility functions access the tokens list and i index of
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    37
    # the for i, t enumerate(tokens) loop below
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    38
    def _isop(j, *o):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    39
        """Assert that tokens[j] is an OP with one of the given values"""
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    40
        try:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    41
            return tokens[j].type == token.OP and tokens[j].string in o
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    42
        except IndexError:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    43
            return False
29550
1c22400db72d mercurial: implement a source transforming module loader on Python 3
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29490
diff changeset
    44
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    45
    def _findargnofcall(n):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    46
        """Find arg n of a call expression (start at 0)
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    47
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    48
        Returns index of the first token of that argument, or None if
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    49
        there is not that many arguments.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    50
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    51
        Assumes that token[i + 1] is '('.
38389
1d68fd5f614a byteify-strings: do not rewrite system string literals to u''
Yuya Nishihara <yuya@tcha.org>
parents: 38388
diff changeset
    52
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    53
        """
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    54
        nested = 0
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    55
        for j in range(i + 2, len(tokens)):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    56
            if _isop(j, ')', ']', '}'):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    57
                # end of call, tuple, subscription or dict / set
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    58
                nested -= 1
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    59
                if nested < 0:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    60
                    return None
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    61
            elif n == 0:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    62
                # this is the starting position of arg
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    63
                return j
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    64
            elif _isop(j, '(', '[', '{'):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    65
                nested += 1
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    66
            elif _isop(j, ',') and nested == 0:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    67
                n -= 1
30165
423377290a3a py3: refactor token parsing to handle call args properly
Martijn Pieters <mjpieters@fb.com>
parents: 30118
diff changeset
    68
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    69
        return None
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    70
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    71
    def _ensuresysstr(j):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    72
        """Make sure the token at j is a system string
30165
423377290a3a py3: refactor token parsing to handle call args properly
Martijn Pieters <mjpieters@fb.com>
parents: 30118
diff changeset
    73
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    74
        Remember the given token so the string transformer won't add
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    75
        the byte prefix.
30165
423377290a3a py3: refactor token parsing to handle call args properly
Martijn Pieters <mjpieters@fb.com>
parents: 30118
diff changeset
    76
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    77
        Ignores tokens that are not strings. Assumes bounds checking has
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    78
        already been done.
30165
423377290a3a py3: refactor token parsing to handle call args properly
Martijn Pieters <mjpieters@fb.com>
parents: 30118
diff changeset
    79
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    80
        """
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    81
        st = tokens[j]
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    82
        if st.type == token.STRING and st.string.startswith(("'", '"')):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    83
            sysstrtokens.add(st)
30165
423377290a3a py3: refactor token parsing to handle call args properly
Martijn Pieters <mjpieters@fb.com>
parents: 30118
diff changeset
    84
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    85
    coldelta = 0  # column increment for new opening parens
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    86
    coloffset = -1  # column offset for the current line (-1: TBD)
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    87
    parens = [(0, 0, 0)]  # stack of (line, end-column, column-offset)
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    88
    for i, t in enumerate(tokens):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    89
        # Compute the column offset for the current line, such that
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    90
        # the current line will be aligned to the last opening paren
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    91
        # as before.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    92
        if coloffset < 0:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    93
            if t.start[1] == parens[-1][1]:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    94
                coloffset = parens[-1][2]
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    95
            elif t.start[1] + 1 == parens[-1][1]:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    96
                # fix misaligned indent of s/util.Abort/error.Abort/
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    97
                coloffset = parens[-1][2] + (parens[-1][1] - t.start[1])
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    98
            else:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
    99
                coloffset = 0
30165
423377290a3a py3: refactor token parsing to handle call args properly
Martijn Pieters <mjpieters@fb.com>
parents: 30118
diff changeset
   100
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   101
        # Reset per-line attributes at EOL.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   102
        if t.type in (token.NEWLINE, tokenize.NL):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   103
            yield adjusttokenpos(t, coloffset)
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   104
            coldelta = 0
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   105
            coloffset = -1
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   106
            continue
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   107
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   108
        # Remember the last paren position.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   109
        if _isop(i, '(', '[', '{'):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   110
            parens.append(t.end + (coloffset + coldelta,))
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   111
        elif _isop(i, ')', ']', '}'):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   112
            parens.pop()
30165
423377290a3a py3: refactor token parsing to handle call args properly
Martijn Pieters <mjpieters@fb.com>
parents: 30118
diff changeset
   113
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   114
        # Convert most string literals to byte literals. String literals
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   115
        # in Python 2 are bytes. String literals in Python 3 are unicode.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   116
        # Most strings in Mercurial are bytes and unicode strings are rare.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   117
        # Rather than rewrite all string literals to use ``b''`` to indicate
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   118
        # byte strings, we apply this token transformer to insert the ``b``
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   119
        # prefix nearly everywhere.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   120
        if t.type == token.STRING and t not in sysstrtokens:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   121
            s = t.string
38390
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
   122
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   123
            # Preserve docstrings as string literals. This is inconsistent
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   124
            # with regular unprefixed strings. However, the
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   125
            # "from __future__" parsing (which allows a module docstring to
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   126
            # exist before it) doesn't properly handle the docstring if it
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   127
            # is b''' prefixed, leading to a SyntaxError. We leave all
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   128
            # docstrings as unprefixed to avoid this. This means Mercurial
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   129
            # components touching docstrings need to handle unicode,
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   130
            # unfortunately.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   131
            if s[0:3] in ("'''", '"""'):
38390
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
   132
                yield adjusttokenpos(t, coloffset)
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
   133
                continue
47dd23e6b116 byteify-strings: try to preserve column alignment
Yuya Nishihara <yuya@tcha.org>
parents: 38389
diff changeset
   134
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   135
            # If the first character isn't a quote, it is likely a string
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   136
            # prefixing character (such as 'b', 'u', or 'r'. Ignore.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   137
            if s[0] not in ("'", '"'):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   138
                yield adjusttokenpos(t, coloffset)
29550
1c22400db72d mercurial: implement a source transforming module loader on Python 3
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29490
diff changeset
   139
                continue
1c22400db72d mercurial: implement a source transforming module loader on Python 3
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29490
diff changeset
   140
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   141
            # String literal. Prefix to make a b'' string.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   142
            yield adjusttokenpos(t._replace(string='b%s' % t.string),
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   143
                                 coloffset)
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   144
            coldelta += 1
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   145
            continue
29550
1c22400db72d mercurial: implement a source transforming module loader on Python 3
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29490
diff changeset
   146
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   147
        # This looks like a function call.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   148
        if t.type == token.NAME and _isop(i + 1, '('):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   149
            fn = t.string
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   150
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   151
            # *attr() builtins don't accept byte strings to 2nd argument.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   152
            if (fn in ('getattr', 'setattr', 'hasattr', 'safehasattr') and
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   153
                    not _isop(i - 1, '.')):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   154
                arg1idx = _findargnofcall(1)
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   155
                if arg1idx is not None:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   156
                    _ensuresysstr(arg1idx)
29550
1c22400db72d mercurial: implement a source transforming module loader on Python 3
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29490
diff changeset
   157
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   158
            # .encode() and .decode() on str/bytes/unicode don't accept
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   159
            # byte strings on Python 3.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   160
            elif fn in ('encode', 'decode') and _isop(i - 1, '.'):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   161
                for argn in range(2):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   162
                    argidx = _findargnofcall(argn)
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   163
                    if argidx is not None:
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   164
                        _ensuresysstr(argidx)
29550
1c22400db72d mercurial: implement a source transforming module loader on Python 3
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29490
diff changeset
   165
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   166
            # It changes iteritems/values to items/values as they are not
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   167
            # present in Python 3 world.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   168
            elif opts['dictiter'] and fn in ('iteritems', 'itervalues'):
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   169
                yield adjusttokenpos(t._replace(string=fn[4:]), coloffset)
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   170
                continue
30052
eaaedad68011 py3: switch to .items() using transformer
Pulkit Goyal <7895pulkit@gmail.com>
parents: 30051
diff changeset
   171
39103
da130c5cef90 byteify-strings: prevent "__name__ == '__main__'" from being transformed
Yuya Nishihara <yuya@tcha.org>
parents: 38391
diff changeset
   172
        # Looks like "if __name__ == '__main__'".
da130c5cef90 byteify-strings: prevent "__name__ == '__main__'" from being transformed
Yuya Nishihara <yuya@tcha.org>
parents: 38391
diff changeset
   173
        if (t.type == token.NAME and t.string == '__name__'
da130c5cef90 byteify-strings: prevent "__name__ == '__main__'" from being transformed
Yuya Nishihara <yuya@tcha.org>
parents: 38391
diff changeset
   174
            and _isop(i + 1, '==')):
da130c5cef90 byteify-strings: prevent "__name__ == '__main__'" from being transformed
Yuya Nishihara <yuya@tcha.org>
parents: 38391
diff changeset
   175
            _ensuresysstr(i + 2)
da130c5cef90 byteify-strings: prevent "__name__ == '__main__'" from being transformed
Yuya Nishihara <yuya@tcha.org>
parents: 38391
diff changeset
   176
38391
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   177
        # Emit unmodified token.
f77bbd34a1df byteify-strings: remove superfluous "if True" block
Yuya Nishihara <yuya@tcha.org>
parents: 38390
diff changeset
   178
        yield adjusttokenpos(t, coloffset)
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   179
38388
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   180
def process(fin, fout, opts):
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   181
    tokens = tokenize.tokenize(fin.readline)
38388
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   182
    tokens = replacetokens(list(tokens), opts)
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   183
    fout.write(tokenize.untokenize(tokens))
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   184
38386
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   185
def tryunlink(fname):
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   186
    try:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   187
        os.unlink(fname)
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   188
    except OSError as err:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   189
        if err.errno != errno.ENOENT:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   190
            raise
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   191
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   192
@contextlib.contextmanager
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   193
def editinplace(fname):
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   194
    n = os.path.basename(fname)
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   195
    d = os.path.dirname(fname)
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   196
    fp = tempfile.NamedTemporaryFile(prefix='.%s-' % n, suffix='~', dir=d,
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   197
                                     delete=False)
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   198
    try:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   199
        yield fp
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   200
        fp.close()
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   201
        if os.name == 'nt':
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   202
            tryunlink(fname)
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   203
        os.rename(fp.name, fname)
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   204
    finally:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   205
        fp.close()
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   206
        tryunlink(fp.name)
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   207
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   208
def main():
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   209
    ap = argparse.ArgumentParser()
38386
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   210
    ap.add_argument('-i', '--inplace', action='store_true', default=False,
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   211
                    help='edit files in place')
38388
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   212
    ap.add_argument('--dictiter', action='store_true', default=False,
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   213
                    help='rewrite iteritems() and itervalues()'),
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   214
    ap.add_argument('files', metavar='FILE', nargs='+', help='source file')
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   215
    args = ap.parse_args()
38388
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   216
    opts = {
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   217
        'dictiter': args.dictiter,
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   218
    }
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   219
    for fname in args.files:
38386
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   220
        if args.inplace:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   221
            with editinplace(fname) as fout:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   222
                with open(fname, 'rb') as fin:
38388
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   223
                    process(fin, fout, opts)
38386
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   224
        else:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   225
            with open(fname, 'rb') as fin:
9f42e4a83676 byteify-strings: add --inplace option to write back result
Yuya Nishihara <yuya@tcha.org>
parents: 38385
diff changeset
   226
                fout = sys.stdout.buffer
38388
f701bc936e7f byteify-strings: do not rewrite iteritems() and itervalues() by default
Yuya Nishihara <yuya@tcha.org>
parents: 38387
diff changeset
   227
                process(fin, fout, opts)
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   228
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   229
if __name__ == '__main__':
42247
970aaf38c3fc contrib: have byteify-strings explode if run in Python 2
Augie Fackler <augie@google.com>
parents: 39103
diff changeset
   230
    if sys.version_info.major < 3:
970aaf38c3fc contrib: have byteify-strings explode if run in Python 2
Augie Fackler <augie@google.com>
parents: 39103
diff changeset
   231
        print('This script must be run under Python 3.')
970aaf38c3fc contrib: have byteify-strings explode if run in Python 2
Augie Fackler <augie@google.com>
parents: 39103
diff changeset
   232
        sys.exit(3)
38385
a2976c27dac4 byteify-strings: add basic command interface
Yuya Nishihara <yuya@tcha.org>
parents: 38384
diff changeset
   233
    main()