encoding: improve handling of buggy getpreferredencoding() on Mac OS X

Prior to Python 2.7, calling locale.getpreferredencoding() would
always return 'mac-roman' on Mac OS X. Previously, this was handled by
a call to locale.setlocale(). Unfortunately, Python 2.6.5 and older
have a bug where isspace() would incorrectly report True for 0x85 and
0xa0 after such a call.

In order to fix this, we replace the previous _encodingfixup mapping
with an _encodingfixers mapping. Rather than mapping encodings to their
replacement, it maps them to a function returning the
replacement. This allows us to provide a simplified implementation of
getpreferredencoding() which extracts the expected encoding and
restores the locale.

This fix is based on a patch originally submitted by Martijn Pieters
as well as feedback from Brodie Rao.
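
The mapping-of-functions idea described above can be sketched roughly
as follows. This is an illustration only, not the code from the patch:
the helper names _preferred_encoding_sketch and fix_encoding are
hypothetical, the two ASCII aliases are assumed example entries, and
the fallback to 'mac-roman' when the locale cannot be queried is an
assumption made to keep the sketch self-contained.

```python
import locale


def _preferred_encoding_sketch():
    """Hypothetical fixer: read the encoding from the user's locale,
    then restore the previous LC_CTYPE setting."""
    if not hasattr(locale, "nl_langinfo"):
        # No CODESET support on this platform: keep the reported name.
        return "mac-roman"
    oldloc = locale.setlocale(locale.LC_CTYPE)  # query only, no change
    try:
        locale.setlocale(locale.LC_CTYPE, "")  # adopt the environment's locale
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        # Assumed fallback if the environment names an unsupported locale.
        return "mac-roman"
    finally:
        locale.setlocale(locale.LC_CTYPE, oldloc)  # undo the temporary change


# Each value is a callable returning the replacement encoding, so the
# expensive or side-effecting lookup only runs when that key is hit.
_encodingfixers = {
    "646": lambda: "ascii",             # assumed example entry
    "ANSI_X3.4-1968": lambda: "ascii",  # assumed example entry
    "mac-roman": _preferred_encoding_sketch,
}


def fix_encoding(name):
    """Return the replacement for `name`, or `name` itself if no fixer applies."""
    fixer = _encodingfixers.get(name, lambda: name)
    return fixer()
```

Because the values are functions rather than strings, the locale is
only touched when the reported encoding is actually 'mac-roman', and
it is restored before the fixer returns.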
#!/bin/sh
#
# Corrupt an hg repo with two pulls.
#
# create one repo with a long history
hg init source1
cd source1
touch foo
hg add foo
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "$i" >> foo
    hg ci -m "$i"
done
cd ..
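# pull source1 into a second repo in the background, then clone that
# repo while the pull may still be running
hg init version2
hg -R version2 pull source1 &
# give the pull a moment to start before cloning (timing-dependent)
sleep 1
hg clone --pull -U version2 corrupted
# wait for the background pull to finish
wait
# check both repos for corruption
hg -R corrupted verify
hg -R version2 verify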