tests: escape bytes with MSB set in grep input for portability
GNU grep (2.21-2 or later) assumes that its input is encoded according
to LC_CTYPE, and treats the input as binary if it contains a byte
sequence that is not valid in that encoding.
For example, if the locale is configured as C, a byte with the most
significant bit (MSB) set makes such a GNU grep unintentionally show a
"Binary file <FILENAME> matches" message instead of the matched lines.
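As a rough illustration (Python 3, not part of this patch), the sketch
below shows why CP932 text looks invalid under the C locale. It
approximates "valid in the C locale" as "decodable as ASCII", which is
an assumption about grep's check rather than its actual implementation;
the helper names are hypothetical.

```python
def has_msb_byte(data: bytes) -> bool:
    """Return True if any byte in data has its most significant bit set."""
    return any(b >= 0x80 for b in data)


def looks_valid_in_c_locale(data: bytes) -> bool:
    """Approximate the C locale check as plain ASCII decodability (assumption)."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


# HIRAGANA LETTER A (U+3042) is encoded as the two bytes 0x82 0xA0 in CP932.
cp932_line = "\u3042".encode("cp932") + b"\n"

print(has_msb_byte(cp932_line))             # True
print(looks_valid_in_c_locale(cp932_line))  # False
print(looks_valid_in_c_locale(b"plain\n"))  # True
```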
This behavior is recognized as a bug, and was fixed in GNU grep 2.25-1
or later. But some distributions still ship such a buggy version
(e.g. Ubuntu xenial, which is used by the launchpad buildbot).
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19230
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=800670
http://packages.ubuntu.com/xenial/grep
This causes test-commit-interactive.t to fail, because it has applied
grep to a CP932 byte sequence since 1111e84de635.
However, explicitly setting LC_CTYPE for CP932 might cause another
problem, because it can't be assumed that every environment running
Mercurial tests allows arbitrary locale settings.
To resolve this issue, this patch escapes bytes with the MSB set in the
input of grep.
For this purpose:
- str.encode('string-escape') isn't useful, because it also escapes
  control codes (less than 0x20), which complicates EOL handling
- "f --hexdump" isn't useful, because it isn't line-oriented
- "sed -n" seems reasonable, but "sed" itself sometimes causes
  portability issues, too (e.g. 900767dfa80d or afb86ee925bf)
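The constraints above (escape only bytes with the MSB set, leave
control codes and EOL alone, stay line-oriented) can be sketched as
follows. This is a minimal Python 3 illustration under those
assumptions, not necessarily the code used by this patch; the name
escape_msb is hypothetical.

```python
def escape_msb(data: bytes) -> bytes:
    """Replace each byte >= 0x80 with a literal \\xNN escape.

    Bytes below 0x80, including control codes and newlines, pass
    through unchanged, so line boundaries are preserved for grep.
    """
    out = bytearray()
    for b in data:
        if b >= 0x80:
            # Emit the four ASCII characters backslash, 'x', and two hex digits.
            out += ("\\x%02x" % b).encode("ascii")
        else:
            out.append(b)
    return bytes(out)


# A CP932 byte pair followed by a tab and a newline: only the two
# MSB-set bytes are rewritten.
print(escape_msb(b"\x82\xa0\tfoo\n"))  # b'\\x82\\xa0\tfoo\n'
```

Because the result is pure ASCII, grep has no reason to classify it as
binary regardless of LC_CTYPE.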
This patch is posted with the "stable" flag, because 1111e84de635 is on
the stable branch.
$ hg init repo
$ cd repo
$ i=0; while [ "$i" -lt 213 ]; do echo a >> a; i=`expr $i + 1`; done
$ hg add a
$ cp a b
$ hg add b
Wide diffstat:
$ hg diff --stat
a | 213 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
b | 213 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 426 insertions(+), 0 deletions(-)
diffstat width:
$ COLUMNS=24 hg diff --config ui.interactive=true --stat
a | 213 ++++++++++++++
b | 213 ++++++++++++++
2 files changed, 426 insertions(+), 0 deletions(-)
$ hg ci -m adda
$ cat >> a <<EOF
> a
> a
> a
> EOF
Narrow diffstat:
$ hg diff --stat
a | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
$ hg ci -m appenda
>>> open("c", "wb").write("\0")
$ touch d
$ hg add c d
Binary diffstat:
$ hg diff --stat
c | Bin
1 files changed, 0 insertions(+), 0 deletions(-)
Binary git diffstat:
$ hg diff --stat --git
c | Bin
d | 0
2 files changed, 0 insertions(+), 0 deletions(-)
$ hg ci -m createb
>>> open("file with spaces", "wb").write("\0")
$ hg add "file with spaces"
Filename with spaces diffstat:
$ hg diff --stat
file with spaces | Bin
1 files changed, 0 insertions(+), 0 deletions(-)
Filename with spaces git diffstat:
$ hg diff --stat --git
file with spaces | Bin
1 files changed, 0 insertions(+), 0 deletions(-)
diffstat within directories:
$ hg rm -f 'file with spaces'
$ mkdir dir1 dir2
$ echo new1 > dir1/new
$ echo new2 > dir2/new
$ hg add dir1/new dir2/new
$ hg diff --stat
dir1/new | 1 +
dir2/new | 1 +
2 files changed, 2 insertions(+), 0 deletions(-)
$ hg diff --stat --root dir1
new | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
$ hg diff --stat --root dir1 dir2
warning: dir2 not inside relative root dir1
$ hg diff --stat --root dir1 -I dir1/old
$ cd dir1
$ hg diff --stat .
dir1/new | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
$ hg diff --stat --root .
new | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
$ hg diff --stat --root ../dir1 ../dir2
warning: ../dir2 not inside relative root . (glob)
$ hg diff --stat --root . -I old
$ cd ..