annotate mercurial/i18n.py @ 45637:ad6ebb6f0dfe
copies: make two versions of the changeset-centric algorithm
There are two main ways to run the changeset-centric copy-tracing algorithm: one
fed from data stored in side-data, which is still in development, and one based on
data stored in extra (with a "compatibility" mode).
The `extra`-based version is used in production at Google, but is still experimental
in the code. It is mostly unsuitable for other users because it affects the hash.
The side-data based storage and algorithm have been evolving to store more data,
cover more cases (mostly around merges, which Google does not really care about)
and use lower-level storage for efficiency.
All these changes make it increasingly hard to maintain the common code base
without impacting code complexity and performance. For example, the
compatibility mode requires keeping things at a different level than what we
need for side-data.
So, I am duplicating the involved functions. The newly added `_extra` variants
will be kept as they are today, while I will do some deeper rework of the
side-data versions.
In the long term, the side-data version should be more featureful and performant
than the extra-based version, so I expect the duplicated `_extra` functions to
eventually get dropped.
Differential Revision: https://phab.mercurial-scm.org/D9114
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Fri, 25 Sep 2020 14:39:04 +0200 |
parents | f0bee3b1b847 |
children | b9f40b743627 |
rev | line source |
---|---|
8226 | 1 # i18n.py - internationalization support for mercurial |
8226 | 2 # |
8226 | 3 # Copyright 2005, 2006 Matt Mackall <mpm@selenic.com> |
8226 | 4 # |
8226 | 5 # This software may be used and distributed according to the terms of the |
10263 | 6 # GNU General Public License version 2 or any later version. |
1400 | 7 |
25955 | 8 from __future__ import absolute_import |
25955 | 9 |
25955 | 10 import gettext as gettextmod |
25955 | 11 import locale |
25955 | 12 import os |
25955 | 13 import sys |
25955 | 14 |
43089 | 15 from .pycompat import getattr |
43673 | 16 from .utils import resourceutil |
30050 | 17 from . import ( |
30050 | 18     encoding, |
30050 | 19     pycompat, |
30050 | 20 ) |
7650 | 21 |
7650 | 22 # modelled after templater.templatepath: |
14975 | 23 if getattr(sys, 'frozen', None) is not None: |
30669 | 24     module = pycompat.sysexecutable |
7650 | 25 else: |
31074 | 26     module = pycompat.fsencode(__file__) |
7650 | 27 |
21987 | 28 _languages = None |
43076 | 29 if ( |
43076 | 30     pycompat.iswindows |
43077 | 31     and b'LANGUAGE' not in encoding.environ |
43077 | 32     and b'LC_ALL' not in encoding.environ |
43077 | 33     and b'LC_MESSAGES' not in encoding.environ |
43077 | 34     and b'LANG' not in encoding.environ |
43076 | 35 ): |
21987 | 36     # Try to detect UI language by "User Interface Language Management" API |
21987 | 37     # if no locale variables are set. Note that locale.getdefaultlocale() |
21987 | 38     # uses GetLocaleInfo(), which may be different from UI language. |
21987 | 39     # (See http://msdn.microsoft.com/en-us/library/dd374098(v=VS.85).aspx ) |
21987 | 40     try: |
21987 | 41         import ctypes |
43076 | 42 |
21987 | 43         langid = ctypes.windll.kernel32.GetUserDefaultUILanguage() |
21987 | 44         _languages = [locale.windows_locale[langid]] |
21987 | 45     except (ImportError, AttributeError, KeyError): |
21987 | 46         # ctypes not found or unknown langid |
21987 | 47         pass |
21987 | 48 |
22638 | 49 |
43673 | 50 datapath = pycompat.fsdecode(resourceutil.datapath) |
43673 | 51 localedir = os.path.join(datapath, 'locale') |
43673 | 52 t = gettextmod.translation('hg', localedir, _languages, fallback=True) |
43673 | 53 try: |
43673 | 54     _ugettext = t.ugettext |
43673 | 55 except AttributeError: |
43673 | 56     _ugettext = t.gettext |
7651 | 57 |
43076 | 58 |
34660 | 59 _msgcache = {}  # encoding: {message: translation} |
23031 | 60 |
43076 | 61 |
7651 | 62 def gettext(message): |
7651 | 63     """Translate message. |
7651 | 64 |
7651 | 65     The message is looked up in the catalog to get a Unicode string, |
7651 | 66     which is encoded in the local encoding before being returned. |
7651 | 67 |
7651 | 68     Important: message is restricted to characters in the encoding |
7651 | 69     given by sys.getdefaultencoding() which is most likely 'ascii'. |
7651 | 70     """ |
7651 | 71     # If message is None, t.ugettext will return u'None' as the |
7651 | 72     # translation whereas our callers expect us to return None. |
22638 | 73     if message is None or not _ugettext: |
7651 | 74         return message |
7651 | 75 |
34660 | 76     cache = _msgcache.setdefault(encoding.encoding, {}) |
34660 | 77     if message not in cache: |
38312 | 78         if type(message) is pycompat.unicode: |
23031 | 79             # goofy unicode docstrings in test |
23031 | 80             paragraphs = message.split(u'\n\n') |
23031 | 81         else: |
40254 | 82             # should be ascii, but we have unicode docstrings in test, which |
40254 | 83             # are converted to utf-8 bytes on Python 3. |
43077 | 84             paragraphs = [p.decode("utf-8") for p in message.split(b'\n\n')] |
23031 | 85         # Be careful not to translate the empty string -- it holds the |
23031 | 86         # meta data of the .po file. |
29415 | 87         u = u'\n\n'.join([p and _ugettext(p) or u'' for p in paragraphs]) |
23031 | 88         try: |
23031 | 89             # encoding.tolocal cannot be used since it will first try to |
23031 | 90             # decode the Unicode string. Calling u.decode(enc) really |
23031 | 91             # means u.encode(sys.getdefaultencoding()).decode(enc). Since |
23031 | 92             # the Python encoding defaults to 'ascii', this fails if the |
23031 | 93             # translated string use non-ASCII characters. |
30050 | 94             encodingstr = pycompat.sysstr(encoding.encoding) |
34660 | 95             cache[message] = u.encode(encodingstr, "replace") |
23031 | 96         except LookupError: |
23031 | 97             # An unknown encoding results in a LookupError. |
34660 | 98             cache[message] = message |
34660 | 99     return cache[message] |
7651 | 100 |
43076 | 101 |
13849 | 102 def _plain(): |
43076 | 103     if ( |
43077 | 104         b'HGPLAIN' not in encoding.environ |
43077 | 105         and b'HGPLAINEXCEPT' not in encoding.environ |
43076 | 106     ): |
13849 | 107         return False |
43077 | 108     exceptions = encoding.environ.get(b'HGPLAINEXCEPT', b'').strip().split(b',') |
43077 | 109     return b'i18n' not in exceptions |
13849 | 110 |
43076 | 111 |
13849 | 112 if _plain(): |
10455 | 113     _ = lambda message: message |
10455 | 114 else: |
10455 | 115     _ = gettext |
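
The paragraph-wise lookup, the per-encoding cache, and the `HGPLAIN`/`HGPLAINEXCEPT` check in the source above can be tried outside Mercurial with a small standalone sketch. This is an illustration under assumptions, not the module's implementation: `fake_catalog`, `translate_paragraphs`, and `plain_disables_i18n` are hypothetical names, the dictionary stands in for the `.mo` catalog behind `_ugettext`, and a plain `dict` with bytes keys stands in for `encoding.environ`.

```python
# Hypothetical stand-in for the .mo catalog; the real module asks gettext.
fake_catalog = {u'hello': u'bonjour', u'world': u'monde'}

_cache = {}  # encoding name -> {message bytes: translated bytes}


def translate_paragraphs(message, encoding_name, lookup=fake_catalog.get):
    """Translate a bytes message paragraph by paragraph, cached per encoding."""
    if message is None:
        return None
    cache = _cache.setdefault(encoding_name, {})
    if message not in cache:
        paragraphs = [p.decode('utf-8') for p in message.split(b'\n\n')]
        # Never look up the empty string: in a real catalog it maps to the
        # .po metadata, which must not leak into the output.
        translated = u'\n\n'.join(
            [lookup(p, p) if p else u'' for p in paragraphs]
        )
        try:
            cache[message] = translated.encode(encoding_name, 'replace')
        except LookupError:
            # Unknown encoding: fall back to the untranslated message.
            cache[message] = message
    return cache[message]


def plain_disables_i18n(environ):
    """Mirror the _plain() decision using a plain dict with bytes keys."""
    if b'HGPLAIN' not in environ and b'HGPLAINEXCEPT' not in environ:
        return False
    exceptions = environ.get(b'HGPLAINEXCEPT', b'').strip().split(b',')
    return b'i18n' not in exceptions
```

With this sketch, `plain_disables_i18n({b'HGPLAIN': b'1'})` reports that translation is switched off, while listing `i18n` in `HGPLAINEXCEPT` keeps it on. Keying the cache by encoding name means a change of output encoding does not return bytes that were encoded for a previous one.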