comparison mercurial/chgserver.py @ 44261:04a3ae7aba14

chg: force-set LC_CTYPE on server start to actual value from the environment

Python 3.7+ will "coerce" the LC_CTYPE variable in many instances, and this can cause issues with chg being able to start up. D7550 attempted to fix this, but a combination of a misreading of the way that python3.7 does the coercion and an untested state (LC_CTYPE being set to an invalid value) meant that this was still not quite working.

This change will cause differences between chg and hg: hg will have the LC_CTYPE environment variable coerced, while chg will not. This is unlikely to cause any detectable behavior differences in what Mercurial itself outputs, but it does have two known effects:

- When using hg, the coerced LC_CTYPE will be passed to subprocesses, even non-python ones. Using chg will remove the coercion, and this will not happen. This is arguably more correct behavior on chg's part.
- On macOS, if you set your region to Brazil but your language to English, this isn't representable in locale strings, so macOS sets LC_CTYPE=UTF-8. If this value is passed along when ssh'ing to a non-macOS machine, some functions (such as locale.setlocale()) may raise an exception due to an unsupported locale setting. This is most easily encountered when doing an interactive commit/split/etc. when using ui.interface=curses.

Differential Revision: https://phab.mercurial-scm.org/D8039
author Kyle Lippincott <spectral@google.com>
date Wed, 29 Jan 2020 13:39:50 -0800
parents a61287a95dc3
children a69c08cdb2a8
--- mercurial/chgserver.py @ 44260:216fc4633800
+++ mercurial/chgserver.py @ 44261:04a3ae7aba14
@@ -548,44 +548,10 @@
             newenv = dict(s.split(b'=', 1) for s in l)
         except ValueError:
             raise ValueError(b'unexpected value in setenv request')
         self.ui.log(b'chgserver', b'setenv: %r\n', sorted(newenv.keys()))
 
-        # Python3 has some logic to "coerce" the C locale to a UTF-8 capable
-        # one, and it sets LC_CTYPE in the environment to C.UTF-8 if none of
-        # 'LC_CTYPE', 'LC_ALL' or 'LANG' are set (to any value). This can be
-        # disabled with PYTHONCOERCECLOCALE=0 in the environment.
-        #
-        # When fromui is called via _inithashstate, python has already set
-        # this, so that's in the environment right when we start up the hg
-        # process. Then chg will call us and tell us to set the environment to
-        # the one it has; this might NOT have LC_CTYPE, so we'll need to
-        # carry-forward the LC_CTYPE that was coerced in these situations.
-        #
-        # If this is not handled, we will fail config+env validation and fail
-        # to start chg. If this is just ignored instead of carried forward, we
-        # may have different behavior between chg and non-chg.
-        if pycompat.ispy3:
-            # Rename for wordwrapping purposes
-            oldenv = encoding.environ
-            if not any(
-                e.get(b'PYTHONCOERCECLOCALE') == b'0' for e in [oldenv, newenv]
-            ):
-                keys = [b'LC_CTYPE', b'LC_ALL', b'LANG']
-                old_keys = [k for k, v in oldenv.items() if k in keys and v]
-                new_keys = [k for k, v in newenv.items() if k in keys and v]
-                # If the user's environment (from chg) doesn't have ANY of the
-                # keys that python looks for, and the environment (from
-                # initialization) has ONLY LC_CTYPE and it's set to C.UTF-8,
-                # carry it forward.
-                if (
-                    not new_keys
-                    and old_keys == [b'LC_CTYPE']
-                    and oldenv[b'LC_CTYPE'] == b'C.UTF-8'
-                ):
-                    newenv[b'LC_CTYPE'] = oldenv[b'LC_CTYPE']
-
         encoding.environ.clear()
         encoding.environ.update(newenv)
 
     capabilities = commandserver.server.capabilities.copy()
     capabilities.update(
@@ -728,10 +694,20 @@
     # demandimport or detecting chg client started by chg client. When executed
     # here, CHGINTERNALMARK is no longer useful and hence dropped to make
     # environ cleaner.
     if b'CHGINTERNALMARK' in encoding.environ:
         del encoding.environ[b'CHGINTERNALMARK']
+    # Python3.7+ "coerces" the LC_CTYPE environment variable to a UTF-8 one if
+    # it thinks the current value is "C". This breaks the hash computation and
+    # causes chg to restart loop.
+    if b'CHGORIG_LC_CTYPE' in encoding.environ:
+        encoding.environ[b'LC_CTYPE'] = encoding.environ[b'CHGORIG_LC_CTYPE']
+        del encoding.environ[b'CHGORIG_LC_CTYPE']
+    elif b'CHG_CLEAR_LC_CTYPE' in encoding.environ:
+        if b'LC_CTYPE' in encoding.environ:
+            del encoding.environ[b'LC_CTYPE']
+        del encoding.environ[b'CHG_CLEAR_LC_CTYPE']
 
     if repo:
         # one chgserver can serve multiple repos. drop repo information
         ui.setconfig(b'bundle', b'mainreporoot', b'', b'repo')
     h = chgunixservicehandler(ui)
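
The restore step inserted in the second hunk can be tried in isolation. The following is a minimal sketch, not the Mercurial implementation: it uses a plain dict in place of encoding.environ, and the function name restore_lc_ctype is hypothetical. It also assumes the chg client is what sets CHGORIG_LC_CTYPE (carrying the user's real value) or CHG_CLEAR_LC_CTYPE (meaning LC_CTYPE was unset) before the server starts; the client side is not part of this diff.

```python
def restore_lc_ctype(environ):
    """Undo Python's LC_CTYPE coercion on a bytes-keyed environment dict.

    Hypothetical helper mirroring the added server-side logic. The two
    marker variables are assumed to be set by the chg client.
    """
    if b'CHGORIG_LC_CTYPE' in environ:
        # The client recorded the original value: put it back and drop
        # the marker so it does not leak into subprocesses.
        environ[b'LC_CTYPE'] = environ.pop(b'CHGORIG_LC_CTYPE')
    elif b'CHG_CLEAR_LC_CTYPE' in environ:
        # The client had no LC_CTYPE at all: remove whatever Python
        # coerced it to, then drop the marker.
        environ.pop(b'LC_CTYPE', None)
        del environ[b'CHG_CLEAR_LC_CTYPE']
    return environ


# Coerced value replaced by the original one (here the macOS "UTF-8" case).
env = {b'LC_CTYPE': b'C.UTF-8', b'CHGORIG_LC_CTYPE': b'UTF-8'}
print(restore_lc_ctype(env))
```

Because both branches delete their marker, the environment seen by the rest of the server contains neither CHGORIG_LC_CTYPE nor CHG_CLEAR_LC_CTYPE, which matches how CHGINTERNALMARK is dropped just above.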