annotate hgext/win32mbcs.py @ 9102:bbc78cb1bf15

Merge with stable
author Matt Mackall <mpm@selenic.com>
date Thu, 09 Jul 2009 19:49:02 -0500
parents 22a33e207a7d 623f96ae3a26
children 47ce7a3a1fb0
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
1 # win32mbcs.py -- MBCS filename support for Mercurial
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
2 #
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
3 # Copyright (c) 2008 Shun-ichi Goto <shunichi.goto@gmail.com>
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
4 #
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
5 # Version: 0.2
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
6 # Author: Shun-ichi Goto <shunichi.goto@gmail.com>
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
7 #
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 8001
diff changeset
8 # This software may be used and distributed according to the terms of the
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 8001
diff changeset
9 # GNU General Public License version 2, incorporated herein by reference.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
10 #
8228
eee2319c5895 add blank line after copyright notices and after header
Martin Geisler <mg@lazybytes.net>
parents: 8225
diff changeset
11
8932
f87884329419 extensions: fix up description lines some more
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 8894
diff changeset
12 '''allow the use of MBCS paths with problematic encodings
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
13
9077
22a33e207a7d win32mbcs: wrapped docstrings at 78 characters
Martin Geisler <mg@lazybytes.net>
parents: 8932
diff changeset
14 Some MBCS encodings are not good for some path operations (i.e. splitting
22a33e207a7d win32mbcs: wrapped docstrings at 78 characters
Martin Geisler <mg@lazybytes.net>
parents: 8932
diff changeset
15 path, case conversion, etc.) with its encoded bytes. We call such a encoding
22a33e207a7d win32mbcs: wrapped docstrings at 78 characters
Martin Geisler <mg@lazybytes.net>
parents: 8932
diff changeset
16 (i.e. shift_jis and big5) as "problematic encoding". This extension can be
22a33e207a7d win32mbcs: wrapped docstrings at 78 characters
Martin Geisler <mg@lazybytes.net>
parents: 8932
diff changeset
17 used to fix the issue with those encodings by wrapping some functions to
22a33e207a7d win32mbcs: wrapped docstrings at 78 characters
Martin Geisler <mg@lazybytes.net>
parents: 8932
diff changeset
18 convert to Unicode string before path operation.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
19
8668
aea3a23151bd fixed typos found in translatable strings
Martin Geisler <mg@lazybytes.net>
parents: 8667
diff changeset
20 This extension is useful for:
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
21 * Japanese Windows users using shift_jis encoding.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
22 * Chinese Windows users using big5 encoding.
8001
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
23 * All users who use a repository with one of problematic encodings on
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
24 case-insensitive file system.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
25
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
26 This extension is not needed for:
8667
594507755800 graphlog, win32mbcs: capitalize ASCII
Martin Geisler <mg@lazybytes.net>
parents: 8665
diff changeset
27 * Any user who use only ASCII chars in path.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
28 * Any user who do not use any of problematic encodings.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
29
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
30 Note that there are some limitations on using this extension:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
31 * You should use single encoding in one repository.
9077
22a33e207a7d win32mbcs: wrapped docstrings at 78 characters
Martin Geisler <mg@lazybytes.net>
parents: 8932
diff changeset
32 * You should set same encoding for the repository by locale or HGENCODING.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
33
9077
22a33e207a7d win32mbcs: wrapped docstrings at 78 characters
Martin Geisler <mg@lazybytes.net>
parents: 8932
diff changeset
34 Path encoding conversion are done between Unicode and encoding.encoding which
22a33e207a7d win32mbcs: wrapped docstrings at 78 characters
Martin Geisler <mg@lazybytes.net>
parents: 8932
diff changeset
35 is decided by Mercurial from current locale setting or HGENCODING.
8894
868670dbc237 extensions: improve the consistency of synopses
Cédric Duval <cedricduval@free.fr>
parents: 8866
diff changeset
36 '''
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
37
9098
5e4654f5522d win32mbcs: look up modules using sys.modules (issue1729)
Brodie Rao <me+hg@dackz.net>
parents: 8932
diff changeset
38 import os, sys
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
39 from mercurial.i18n import _
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
40 from mercurial import util, encoding
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
41
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
42 def decode(arg):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
43 if isinstance(arg, str):
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
44 uarg = arg.decode(encoding.encoding)
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
45 if arg == uarg.encode(encoding.encoding):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
46 return uarg
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
47 raise UnicodeError("Not local encoding")
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
48 elif isinstance(arg, tuple):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
49 return tuple(map(decode, arg))
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
50 elif isinstance(arg, list):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
51 return map(decode, arg)
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
52 return arg
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
53
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
54 def encode(arg):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
55 if isinstance(arg, unicode):
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
56 return arg.encode(encoding.encoding)
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
57 elif isinstance(arg, tuple):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
58 return tuple(map(encode, arg))
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
59 elif isinstance(arg, list):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
60 return map(encode, arg)
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
61 return arg
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
62
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
63 def wrapper(func, args):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
64 # check argument is unicode, then call original
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
65 for arg in args:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
66 if isinstance(arg, unicode):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
67 return func(*args)
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
68
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
69 try:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
70 # convert arguments to unicode, call func, then convert back
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
71 return encode(func(*decode(args)))
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
72 except UnicodeError:
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
73 # If not encoded with encoding.encoding, report it then
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
74 # continue with calling original function.
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
75 raise util.Abort(_("[win32mbcs] filename conversion fail with"
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
76 " %s encoding\n") % (encoding.encoding))
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
77
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
78 def wrapname(name):
9098
5e4654f5522d win32mbcs: look up modules using sys.modules (issue1729)
Brodie Rao <me+hg@dackz.net>
parents: 8932
diff changeset
79 module, name = name.rsplit('.', 1)
5e4654f5522d win32mbcs: look up modules using sys.modules (issue1729)
Brodie Rao <me+hg@dackz.net>
parents: 8932
diff changeset
80 module = sys.modules[module]
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
81 func = getattr(module, name)
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
82 def f(*args):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
83 return wrapper(func, args)
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
84 try:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
85 f.__name__ = func.__name__ # fail with python23
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
86 except Exception:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
87 pass
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
88 setattr(module, name, f)
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
89
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
90 # List of functions to be wrapped.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
91 # NOTE: os.path.dirname() and os.path.basename() are safe because
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
92 # they use result of os.path.split()
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
93 funcs = '''os.path.join os.path.split os.path.splitext
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
94 os.path.splitunc os.path.normpath os.path.normcase os.makedirs
9098
5e4654f5522d win32mbcs: look up modules using sys.modules (issue1729)
Brodie Rao <me+hg@dackz.net>
parents: 8932
diff changeset
95 mercurial.util.endswithsep mercurial.util.splitpath mercurial.util.checkcase
9100
623f96ae3a26 win32mbcs: also wrap windows.pconvert()
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents: 9098
diff changeset
96 mercurial.util.fspath mercurial.windows.pconvert'''
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
97
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
98 # codec and alias names of sjis and big5 to be faked.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
99 problematic_encodings = '''big5 big5-tw csbig5 big5hkscs big5-hkscs
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
100 hkscs cp932 932 ms932 mskanji ms-kanji shift_jis csshiftjis shiftjis
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
101 sjis s_jis shift_jis_2004 shiftjis2004 sjis_2004 sjis2004
8714
505a96cbc923 Add cp950 as problematic encoding which is used in chinese windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents: 8668
diff changeset
102 shift_jisx0213 shiftjisx0213 sjisx0213 s_jisx0213 950 cp950 ms950 '''
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
103
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
104 def reposetup(ui, repo):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
105 # TODO: decide use of config section for this extension
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
106 if not os.path.supports_unicode_filenames:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
107 ui.warn(_("[win32mbcs] cannot activate on this platform.\n"))
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
108 return
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
109
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
110 # fake is only for relevant environment.
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
111 if encoding.encoding.lower() in problematic_encodings.split():
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
112 for f in funcs.split():
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
113 wrapname(f)
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
114 ui.debug(_("[win32mbcs] activated with encoding: %s\n")
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
115 % encoding.encoding)
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
116