changeset 43871:5e0f6451e2d2
chg: fix chg to work with py3.7+ "coercing" the locale
When the environment is empty (specifically: it doesn't contain LC_ALL,
LC_CTYPE, or LANG), Python 3.7+ will "coerce" the locale to a UTF-8 capable
one. It does this by setting LC_CTYPE in the environment, and this breaks chg,
since chg operates as follows:
- chg starts hg, using whatever environment the user has when chg starts
- hg stores a hash of this "original" environment, but python has already set
  LC_CTYPE even though the user doesn't have it in their environment
- chg calls setenv over the commandserver. This clears the environment inside of
  hg and sets it to be exactly what the environment in chg is (without
  LC_CTYPE).
- chg calls validate to ensure that the environment hg is using (after the
  setenv call) is the one that the chg process has - if not, it is assumed the
  user changed their environment and we should use a different server. The
  environments will *never* match in this situation because LC_CTYPE was
  removed.
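
The sequence above can be modelled with plain dictionaries. This is only a sketch of the failure mode: `env_hash` and `coerce_locale` are hypothetical helpers written for illustration, not functions from Mercurial or chg, and the coercion rule is simplified to the condition described in this commit message.

```python
import hashlib

LOCALE_KEYS = ("LC_ALL", "LC_CTYPE", "LANG")


def env_hash(env):
    """Hypothetical stand-in for the environment hash the server stores."""
    items = sorted(env.items())
    return hashlib.sha1(repr(items).encode("utf-8")).hexdigest()


def coerce_locale(env):
    """Simplified model of locale coercion: add LC_CTYPE=C.UTF-8 when no
    locale variable has a value and coercion is not disabled."""
    coerced = dict(env)
    disabled = env.get("PYTHONCOERCECLOCALE") == "0"
    if not disabled and not any(env.get(k) for k in LOCALE_KEYS):
        coerced["LC_CTYPE"] = "C.UTF-8"
    return coerced


# The user's environment as chg sees it: no locale variables at all.
user_env = {"HOME": "/home/user", "PATH": "/usr/bin"}

# hg starts; python coerces the locale before hg hashes its environment.
server_hash = env_hash(coerce_locale(user_env))

# setenv replaces the server environment with chg's copy (no LC_CTYPE).
server_env_after_setenv = dict(user_env)

# validate compares the two; they differ only by the coerced LC_CTYPE.
matches = env_hash(server_env_after_setenv) == server_hash
print(matches)  # False
```

Because the only difference between the two environments is the variable that Python itself added, restarting the server cannot resolve the mismatch: every new server would coerce the locale again.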
Differential Revision: https://phab.mercurial-scm.org/D7550
author | Kyle Lippincott <spectral@google.com> |
---|---|
date | Thu, 05 Dec 2019 14:28:21 -0800 |
parents | 8766728dbce6 |
children | aac921f54554 |
files | mercurial/chgserver.py tests/test-chg.t |
diffstat | 2 files changed, 57 insertions(+), 0 deletions(-) |
--- a/mercurial/chgserver.py Mon Dec 09 22:20:35 2019 -0500
+++ b/mercurial/chgserver.py Thu Dec 05 14:28:21 2019 -0800
@@ -549,6 +549,41 @@
         except ValueError:
             raise ValueError(b'unexpected value in setenv request')
         self.ui.log(b'chgserver', b'setenv: %r\n', sorted(newenv.keys()))
+
+        # Python3 has some logic to "coerce" the C locale to a UTF-8 capable
+        # one, and it sets LC_CTYPE in the environment to C.UTF-8 if none of
+        # 'LC_CTYPE', 'LC_ALL' or 'LANG' are set (to any value). This can be
+        # disabled with PYTHONCOERCECLOCALE=0 in the environment.
+        #
+        # When fromui is called via _inithashstate, python has already set
+        # this, so that's in the environment right when we start up the hg
+        # process. Then chg will call us and tell us to set the environment to
+        # the one it has; this might NOT have LC_CTYPE, so we'll need to
+        # carry-forward the LC_CTYPE that was coerced in these situations.
+        #
+        # If this is not handled, we will fail config+env validation and fail
+        # to start chg. If this is just ignored instead of carried forward, we
+        # may have different behavior between chg and non-chg.
+        if pycompat.ispy3:
+            # Rename for wordwrapping purposes
+            oldenv = encoding.environ
+            if not any(
+                e.get(b'PYTHONCOERCECLOCALE') == b'0' for e in [oldenv, newenv]
+            ):
+                keys = [b'LC_CTYPE', b'LC_ALL', b'LANG']
+                old_keys = [k for k, v in oldenv.items() if k in keys and v]
+                new_keys = [k for k, v in newenv.items() if k in keys and v]
+                # If the user's environment (from chg) doesn't have ANY of the
+                # keys that python looks for, and the environment (from
+                # initialization) has ONLY LC_CTYPE and it's set to C.UTF-8,
+                # carry it forward.
+                if (
+                    not new_keys
+                    and old_keys == [b'LC_CTYPE']
+                    and oldenv[b'LC_CTYPE'] == b'C.UTF-8'
+                ):
+                    newenv[b'LC_CTYPE'] = oldenv[b'LC_CTYPE']
+
         encoding.environ.clear()
         encoding.environ.update(newenv)
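
The carry-forward rule in the chgserver.py hunk can be tried outside Mercurial with ordinary dictionaries. The sketch below is an illustration, not the code that was committed: `carry_forward_lc_ctype` is a hypothetical name, it returns a copy instead of mutating `newenv`, and it omits the `pycompat.ispy3` guard.

```python
LOCALE_KEYS = [b'LC_CTYPE', b'LC_ALL', b'LANG']


def carry_forward_lc_ctype(oldenv, newenv):
    """Return a copy of newenv, adding the coerced LC_CTYPE from oldenv
    when the conditions from the patch hold."""
    result = dict(newenv)
    # Coercion was explicitly disabled on either side: leave newenv alone.
    if any(e.get(b'PYTHONCOERCECLOCALE') == b'0' for e in (oldenv, newenv)):
        return result
    # Only variables with a non-empty value count as "set".
    old_keys = [k for k, v in oldenv.items() if k in LOCALE_KEYS and v]
    new_keys = [k for k, v in newenv.items() if k in LOCALE_KEYS and v]
    if (
        not new_keys
        and old_keys == [b'LC_CTYPE']
        and oldenv[b'LC_CTYPE'] == b'C.UTF-8'
    ):
        result[b'LC_CTYPE'] = oldenv[b'LC_CTYPE']
    return result


# Server environment after python coerced the locale at startup.
server_env = {b'PATH': b'/usr/bin', b'LC_CTYPE': b'C.UTF-8'}

# chg sends an environment with no locale variables: LC_CTYPE is kept.
carried = carry_forward_lc_ctype(server_env, {b'PATH': b'/usr/bin'})
print(carried[b'LC_CTYPE'])  # b'C.UTF-8'

# chg sends an environment with LANG set: nothing is carried forward.
untouched = carry_forward_lc_ctype(server_env, {b'LANG': b'en_US.UTF-8'})
print(b'LC_CTYPE' in untouched)  # False
```

Note that the check treats empty values as unset, so an environment sent with `LC_CTYPE=` (empty) still receives the coerced value.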
--- a/tests/test-chg.t Mon Dec 09 22:20:35 2019 -0500
+++ b/tests/test-chg.t Thu Dec 05 14:28:21 2019 -0800
@@ -331,3 +331,25 @@
   YYYY/MM/DD HH:MM:SS (PID)> loaded repo into cache: $TESTTMP/cached2 (in ...s)
   YYYY/MM/DD HH:MM:SS (PID)> log -R cached
   YYYY/MM/DD HH:MM:SS (PID)> loaded repo into cache: $TESTTMP/cached (in ...s)
+
+Test that chg works even when python "coerces" the locale (py3.7+, which is done
+by default if none of LC_ALL, LC_CTYPE, or LANG are set in the environment)
+
+  $ cat > $TESTTMP/debugenv.py <<EOF
+  > from mercurial import encoding
+  > from mercurial import registrar
+  > cmdtable = {}
+  > command = registrar.command(cmdtable)
+  > @command(b'debugenv', [], b'', norepo=True)
+  > def debugenv(ui):
+  >     for k in [b'LC_ALL', b'LC_CTYPE', b'LANG']:
+  >         v = encoding.environ.get(k)
+  >         if v is not None:
+  >             ui.write(b'%s=%s\n' % (k, encoding.environ[k]))
+  > EOF
+  $ LANG= LC_ALL= LC_CTYPE= chg \
+  >    --config extensions.debugenv=$TESTTMP/debugenv.py debugenv
+  LC_ALL=
+  LC_CTYPE=C.UTF-8 (py37 !)
+  LC_CTYPE= (no-py37 !)
+  LANG=