i18n/hggettext
author Steve Losh <steve@stevelosh.com>
Tue, 16 Feb 2010 09:31:35 -0500
branchstable
changeset 10487 7a6b5f85c3ab
parent 10263 25e572394f5c
child 12823 80deae3bc5ea
permissions -rwxr-xr-x
util: use the built-in any() and all() methods if they are available
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     1
#!/usr/bin/env python
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     2
#
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     3
# hggettext - carefully extract docstrings for Mercurial
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     4
#
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     5
# Copyright 2009 Matt Mackall <mpm@selenic.com> and others
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     6
#
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     7
# This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 9539
diff changeset
     8
# GNU General Public License version 2 or any later version.
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     9
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    10
# The normalize function is taken from pygettext which is distributed
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    11
# with Python under the Python License, which is GPL compatible.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    12
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    13
"""Extract docstrings from Mercurial commands.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    14
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    15
Compared to pygettext, this script knows about the cmdtable and table
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    16
dictionaries used by Mercurial, and will only extract docstrings from
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    17
functions mentioned therein.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    18
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    19
Use xgettext like normal to extract strings marked as translatable and
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    20
join the message cataloges to get the final catalog.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    21
"""
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    22
8626
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
    23
import os, sys, inspect
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    24
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    25
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    26
def escape(s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    27
    # The order is important, the backslash must be escaped first
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    28
    # since the other replacements introduce new backslashes
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    29
    # themselves.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    30
    s = s.replace('\\', '\\\\')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    31
    s = s.replace('\n', '\\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    32
    s = s.replace('\r', '\\r')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    33
    s = s.replace('\t', '\\t')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    34
    s = s.replace('"', '\\"')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    35
    return s
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    36
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    37
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    38
def normalize(s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    39
    # This converts the various Python string types into a format that
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    40
    # is appropriate for .po files, namely much closer to C style.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    41
    lines = s.split('\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    42
    if len(lines) == 1:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    43
        s = '"' + escape(s) + '"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    44
    else:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    45
        if not lines[-1]:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    46
            del lines[-1]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    47
            lines[-1] = lines[-1] + '\n'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    48
        lines = map(escape, lines)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    49
        lineterm = '\\n"\n"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    50
        s = '""\n"' + lineterm.join(lines) + '"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    51
    return s
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    52
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    53
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    54
def poentry(path, lineno, s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    55
    return ('#: %s:%d\n' % (path, lineno) +
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    56
            'msgid %s\n' % normalize(s) +
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    57
            'msgstr ""\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    58
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    59
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    60
def offset(src, doc, name, default):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    61
    """Compute offset or issue a warning on stdout."""
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    62
    # Backslashes in doc appear doubled in src.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    63
    end = src.find(doc.replace('\\', '\\\\'))
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    64
    if end == -1:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    65
        # This can happen if the docstring contains unnecessary escape
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    66
        # sequences such as \" in a triple-quoted string. The problem
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    67
        # is that \" is turned into " and so doc wont appear in src.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    68
        sys.stderr.write("warning: unknown offset in %s, assuming %d lines\n"
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    69
                         % (name, default))
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    70
        return default
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    71
    else:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    72
        return src.count('\n', 0, end)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    73
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    74
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    75
def importpath(path):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    76
    """Import a path like foo/bar/baz.py and return the baz module."""
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    77
    if path.endswith('.py'):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    78
        path = path[:-3]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    79
    if path.endswith('/__init__'):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    80
        path = path[:-9]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    81
    path = path.replace('/', '.')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    82
    mod = __import__(path)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    83
    for comp in path.split('.')[1:]:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    84
        mod = getattr(mod, comp)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    85
    return mod
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    86
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    87
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    88
def docstrings(path):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    89
    """Extract docstrings from path.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    90
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    91
    This respects the Mercurial cmdtable/table convention and will
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    92
    only extract docstrings from functions mentioned in these tables.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    93
    """
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    94
    mod = importpath(path)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    95
    if mod.__doc__:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    96
        src = open(path).read()
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    97
        lineno = 1 + offset(src, mod.__doc__, path, 7)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    98
        print poentry(path, lineno, mod.__doc__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    99
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   100
    cmdtable = getattr(mod, 'cmdtable', {})
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   101
    if not cmdtable:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   102
        # Maybe we are processing mercurial.commands?
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   103
        cmdtable = getattr(mod, 'table', {})
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   104
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   105
    for entry in cmdtable.itervalues():
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   106
        func = entry[0]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   107
        if func.__doc__:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   108
            src = inspect.getsource(func)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   109
            name = "%s.%s" % (path, func.__name__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   110
            lineno = func.func_code.co_firstlineno
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   111
            lineno += offset(src, func.__doc__, name, 1)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   112
            print poentry(path, lineno, func.__doc__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   113
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   114
9539
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   115
def rawtext(path):
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   116
    src = open(path).read()
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   117
    print poentry(path, 1, src)
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   118
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   119
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   120
if __name__ == "__main__":
8626
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
   121
    # It is very important that we import the Mercurial modules from
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
   122
    # the source tree where hggettext is executed. Otherwise we might
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
   123
    # accidentally import and extract strings from a Mercurial
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
   124
    # installation mentioned in PYTHONPATH.
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
   125
    sys.path.insert(0, os.getcwd())
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
   126
    from mercurial import demandimport; demandimport.enable()
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   127
    for path in sys.argv[1:]:
9539
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   128
        if path.endswith('.txt'):
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   129
            rawtext(path)
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   130
        else:
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
   131
            docstrings(path)