hgext/win32mbcs.py
author Christian Ebert <blacktrash@gmx.net>
Wed, 22 Oct 2008 11:57:20 +0200
changeset 7241 421f4cbddd68
parent 6887 304484c7e0ba
child 7598 26adfaccdf73
permissions -rw-r--r--
hgrc.5: explain order of mail.charsets TODO: add mail.charsets section to hgrc.5.ja.txt
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
     1
# win32mbcs.py -- MBCS filename support for Mercurial
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     2
#
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     3
# Copyright (c) 2008 Shun-ichi Goto <shunichi.goto@gmail.com>
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     4
#
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
     5
# Version: 0.2
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     6
# Author:  Shun-ichi Goto <shunichi.goto@gmail.com>
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     7
#
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     8
# This software may be used and distributed according to the terms
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     9
# of the GNU General Public License, incorporated herein by reference.
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    10
#
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    11
"""Allow to use MBCS path with problematic encoding.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    12
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    13
Some MBCS encodings are not good for some path operations
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    14
(i.e. splitting path, case conversion, etc.) with its encoded bytes.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    15
We call such a encoding (i.e. shift_jis and big5) as "problematic
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    16
encoding".  This extension can be used to fix the issue with those
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    17
encodings by wrapping some functions to convert to unicode string
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    18
before path operation.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    19
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    20
This extension is usefull for:
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    21
 * Japanese Windows users using shift_jis encoding.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    22
 * Chinese Windows users using big5 encoding.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    23
 * All users who use a repository with one of problematic encodings
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    24
   on case-insensitive file system.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    25
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    26
This extension is not needed for:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    27
 * Any user who use only ascii chars in path.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    28
 * Any user who do not use any of problematic encodings.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    29
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    30
Note that there are some limitations on using this extension:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    31
 * You should use single encoding in one repository.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    32
 * You should set same encoding for the repository by locale or HGENCODING.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    33
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    34
To use this extension, enable the extension in .hg/hgrc or ~/.hgrc:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    35
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    36
  [extensions]
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    37
  hgext.win32mbcs =
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    38
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    39
Path encoding conversion are done between unicode and util._encoding
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    40
which is decided by mercurial from current locale setting or HGENCODING.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    41
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    42
"""
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    43
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    44
import os
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    45
from mercurial.i18n import _
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    46
from mercurial import util
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    47
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    48
def decode(arg):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    49
   if isinstance(arg, str):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    50
       uarg = arg.decode(util._encoding)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    51
       if arg == uarg.encode(util._encoding):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    52
           return uarg
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    53
       raise UnicodeError("Not local encoding")
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    54
   elif isinstance(arg, tuple):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    55
       return tuple(map(decode, arg))
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    56
   elif isinstance(arg, list):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    57
       return map(decode, arg)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    58
   return arg
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    59
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    60
def encode(arg):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    61
   if isinstance(arg, unicode):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    62
       return arg.encode(util._encoding)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    63
   elif isinstance(arg, tuple):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    64
       return tuple(map(encode, arg))
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    65
   elif isinstance(arg, list):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    66
       return map(encode, arg)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    67
   return arg
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    68
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    69
def wrapper(func, args):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    70
   # check argument is unicode, then call original
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    71
   for arg in args:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    72
       if isinstance(arg, unicode):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    73
           return func(*args)
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    74
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    75
   try:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    76
       # convert arguments to unicode, call func, then convert back
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    77
       return encode(func(*decode(args)))
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    78
   except UnicodeError:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    79
       # If not encoded with util._encoding, report it then
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    80
       # continue with calling original function.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    81
      raise util.Abort(_("[win32mbcs] filename conversion fail with"
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    82
                         " %s encoding\n") % (util._encoding))
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    83
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    84
def wrapname(name):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    85
   idx = name.rfind('.')
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    86
   module = name[:idx]
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    87
   name = name[idx+1:]
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    88
   module = eval(module)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    89
   func = getattr(module, name)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    90
   def f(*args):
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    91
       return wrapper(func, args)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    92
   try:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    93
      f.__name__ = func.__name__                # fail with python23
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    94
   except Exception:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    95
      pass
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    96
   setattr(module, name, f)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    97
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    98
# List of functions to be wrapped.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    99
# NOTE: os.path.dirname() and os.path.basename() are safe because
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   100
#       they use result of os.path.split()
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   101
funcs = '''os.path.join os.path.split os.path.splitext
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   102
 os.path.splitunc os.path.normpath os.path.normcase os.makedirs
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   103
 util.endswithsep util.splitpath util.checkcase util.fspath'''
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   104
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   105
# codec and alias names of sjis and big5 to be faked.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   106
problematic_encodings = '''big5 big5-tw csbig5 big5hkscs big5-hkscs
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   107
 hkscs cp932 932 ms932 mskanji ms-kanji shift_jis csshiftjis shiftjis
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   108
 sjis s_jis shift_jis_2004 shiftjis2004 sjis_2004 sjis2004
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   109
 shift_jisx0213 shiftjisx0213 sjisx0213 s_jisx0213'''
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   110
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   111
def reposetup(ui, repo):
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   112
   # TODO: decide use of config section for this extension
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   113
   if not os.path.supports_unicode_filenames:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   114
       ui.warn(_("[win32mbcs] cannot activate on this platform.\n"))
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   115
       return
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   116
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   117
   # fake is only for relevant environment.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   118
   if util._encoding.lower() in problematic_encodings.split():
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   119
       for f in funcs.split():
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   120
           wrapname(f)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   121
       ui.debug(_("[win32mbcs] activated with encoding: %s\n") % util._encoding)
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   122