annotate i18n/hggettext @ 12717:89df79b3c011 stable

convert/darcs: support changelogs with bytes 0x7F-0xFF (issue2411) This is a followup to 4481f8a93c7a, which only fixed the conversion of patches with UTF-8 metadata. This patch allows a changelog to have any bytes with values 0x7F-0xFF. It parses the XML changelog as Latin-1 and uses converter_source.recode() to decode the data as UTF-8/Latin-1. Caveats: - Since the convert extension doesn't provide any way to specify the source encoding, users are still limited to UTF-8 and Latin-1. - etree will still complain if the changelog has bytes with values 0x00-0x19. XML only allows printable characters.
author Brodie Rao <brodie@bitheap.org>
date Fri, 01 Oct 2010 10:15:04 -0500
parents 25e572394f5c
children 80deae3bc5ea
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
1 #!/usr/bin/env python
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
2 #
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
3 # hggettext - carefully extract docstrings for Mercurial
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
4 #
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
5 # Copyright 2009 Matt Mackall <mpm@selenic.com> and others
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
6 #
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
7 # This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 9539
diff changeset
8 # GNU General Public License version 2 or any later version.
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
9
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
10 # The normalize function is taken from pygettext which is distributed
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
11 # with Python under the Python License, which is GPL compatible.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
12
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
13 """Extract docstrings from Mercurial commands.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
14
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
15 Compared to pygettext, this script knows about the cmdtable and table
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
16 dictionaries used by Mercurial, and will only extract docstrings from
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
17 functions mentioned therein.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
18
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
19 Use xgettext like normal to extract strings marked as translatable and
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
20 join the message cataloges to get the final catalog.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
21 """
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
22
8626
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
23 import os, sys, inspect
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
24
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
25
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
26 def escape(s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
27 # The order is important, the backslash must be escaped first
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
28 # since the other replacements introduce new backslashes
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
29 # themselves.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
30 s = s.replace('\\', '\\\\')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
31 s = s.replace('\n', '\\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
32 s = s.replace('\r', '\\r')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
33 s = s.replace('\t', '\\t')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
34 s = s.replace('"', '\\"')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
35 return s
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
36
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
37
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
38 def normalize(s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
39 # This converts the various Python string types into a format that
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
40 # is appropriate for .po files, namely much closer to C style.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
41 lines = s.split('\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
42 if len(lines) == 1:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
43 s = '"' + escape(s) + '"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
44 else:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
45 if not lines[-1]:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
46 del lines[-1]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
47 lines[-1] = lines[-1] + '\n'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
48 lines = map(escape, lines)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
49 lineterm = '\\n"\n"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
50 s = '""\n"' + lineterm.join(lines) + '"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
51 return s
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
52
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
53
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
54 def poentry(path, lineno, s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
55 return ('#: %s:%d\n' % (path, lineno) +
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
56 'msgid %s\n' % normalize(s) +
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
57 'msgstr ""\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
58
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
59
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
60 def offset(src, doc, name, default):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
61 """Compute offset or issue a warning on stdout."""
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
62 # Backslashes in doc appear doubled in src.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
63 end = src.find(doc.replace('\\', '\\\\'))
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
64 if end == -1:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
65 # This can happen if the docstring contains unnecessary escape
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
66 # sequences such as \" in a triple-quoted string. The problem
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
67 # is that \" is turned into " and so doc wont appear in src.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
68 sys.stderr.write("warning: unknown offset in %s, assuming %d lines\n"
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
69 % (name, default))
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
70 return default
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
71 else:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
72 return src.count('\n', 0, end)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
73
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
74
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
75 def importpath(path):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
76 """Import a path like foo/bar/baz.py and return the baz module."""
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
77 if path.endswith('.py'):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
78 path = path[:-3]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
79 if path.endswith('/__init__'):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
80 path = path[:-9]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
81 path = path.replace('/', '.')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
82 mod = __import__(path)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
83 for comp in path.split('.')[1:]:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
84 mod = getattr(mod, comp)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
85 return mod
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
86
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
87
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
88 def docstrings(path):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
89 """Extract docstrings from path.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
90
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
91 This respects the Mercurial cmdtable/table convention and will
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
92 only extract docstrings from functions mentioned in these tables.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
93 """
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
94 mod = importpath(path)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
95 if mod.__doc__:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
96 src = open(path).read()
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
97 lineno = 1 + offset(src, mod.__doc__, path, 7)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
98 print poentry(path, lineno, mod.__doc__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
99
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
100 cmdtable = getattr(mod, 'cmdtable', {})
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
101 if not cmdtable:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
102 # Maybe we are processing mercurial.commands?
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
103 cmdtable = getattr(mod, 'table', {})
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
104
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
105 for entry in cmdtable.itervalues():
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
106 func = entry[0]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
107 if func.__doc__:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
108 src = inspect.getsource(func)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
109 name = "%s.%s" % (path, func.__name__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
110 lineno = func.func_code.co_firstlineno
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
111 lineno += offset(src, func.__doc__, name, 1)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
112 print poentry(path, lineno, func.__doc__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
113
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
114
9539
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
115 def rawtext(path):
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
116 src = open(path).read()
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
117 print poentry(path, 1, src)
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
118
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
119
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
120 if __name__ == "__main__":
8626
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
121 # It is very important that we import the Mercurial modules from
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
122 # the source tree where hggettext is executed. Otherwise we might
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
123 # accidentally import and extract strings from a Mercurial
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
124 # installation mentioned in PYTHONPATH.
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
125 sys.path.insert(0, os.getcwd())
1fc1c77d4863 hggettext: ensure correct Mercurial is imported
Martin Geisler <mg@lazybytes.net>
parents: 8542
diff changeset
126 from mercurial import demandimport; demandimport.enable()
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
127 for path in sys.argv[1:]:
9539
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
128 if path.endswith('.txt'):
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
129 rawtext(path)
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
130 else:
c904e76e3834 help: move help topics from mercurial/help.py to help/*.txt
Martin Geisler <mg@lazybytes.net>
parents: 8626
diff changeset
131 docstrings(path)