contrib/dumprevlog
author FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
Sat, 21 May 2016 02:48:51 +0900
branch stable
changeset 29180 8c5e880c7e25
parent 14233 659f34b833b9
child 29165 a212ca70205c
permissions -rwxr-xr-x
tests: escape bytes setting MSB in input of grep for portability

GNU grep (2.21-2 or later) assumes that input is encoded in LC_CTYPE, and input is binary if it contains byte sequence not valid for that encoding.

For example, if locale is configured as C, a byte setting most significant bit (MSB) makes such GNU grep show "Binary file <FILENAME> matches" message instead of matched lines unintentionally.

This behavior is recognized as a bug, and fixed in GNU grep 2.25-1 or later. But some distributions are shipped with such buggy version (e.g. Ubuntu xenial, which is used by launchpad buildbot).

    http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19230
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=800670
    http://packages.ubuntu.com/xenial/grep

This causes failure of test-commit-interactive.t, which applies grep on CP932 byte sequence since 1111e84de635.

But, explicit setting LC_CTYPE for CP932 might cause another problem, because it can't be assumed that all environment running Mercurial tests allows arbitrary locale setting.

To resolve this issue, this patch escapes bytes setting MSB in input of grep. For this purpose:

- str.encode('string-escape') isn't useful, because it escapes also control code (less than 0x20), and makes EOL handling complicated

- "f --hexdump" isn't useful, because it isn't line-oriented

- "sed -n" seems reasonable, but "sed" itself sometimes causes portability issue, too (e.g. 900767dfa80d or afb86ee925bf)

This patch is posted with "stable" flag, because 1111e84de635 is on stable branch.
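The escaping described in the message above can be illustrated with a minimal sketch. It is not the code from this changeset: the function name escape_msb and the choice of a regular expression are assumptions made for illustration. It only shows the stated idea: rewrite every byte that has the MSB set as a printable \xNN sequence, and leave control codes and line endings alone so the output stays line-oriented for grep.

```python
import re


def escape_msb(data: bytes) -> bytes:
    """Replace each byte >= 0x80 with a 4-character \\xNN escape.

    Bytes below 0x80, including control codes and newlines, are passed
    through unchanged, so the result can still be filtered line by line.
    (Hypothetical helper, not taken from the Mercurial test suite.)
    """
    return re.sub(
        rb'[\x80-\xff]',
        lambda m: b'\\x%02x' % m.group(0)[0],
        data,
    )
```

For instance, the CP932 bytes 0x82 0xa0 followed by " ok" and a newline become the ASCII text \x82\xa0 ok with the newline preserved, which a grep running in the C locale treats as ordinary text.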
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
#!/usr/bin/env python
# Dump revlogs as raw data stream
# $ find .hg/store/ -name "*.i" | xargs dumprevlog > repo.dump
import sys
6466
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
from mercurial import revlog, node, util
for fp in (sys.stdin, sys.stdout, sys.stderr):
14233
659f34b833b9 rename util.set_binary to setbinary
Adrian Buehlmann <adrian@cadifra.com>
parents: 7361
    util.setbinary(fp)
6433
for f in sys.argv[1:]:
6466
    binopen = lambda fn: open(fn, 'rb')
    r = revlog.revlog(binopen, f)
6433
    print "file:", f
6750
fb42030d79d6 add __len__ and __iter__ methods to repo and revlog
Matt Mackall <mpm@selenic.com>
parents: 6466
    for i in r:
6433
        n = r.node(i)
        p = r.parents(n)
        d = r.revision(n)
        print "node:", node.hex(n)
7361
9fe97eea5510 linkrev: take a revision number rather than a hash
Matt Mackall <mpm@selenic.com>
parents: 6750
        print "linkrev:", r.linkrev(i)
6433
        print "parents:", node.hex(p[0]), node.hex(p[1])
        print "length:", len(d)
        print "-start-"
        print d
        print "-end-"
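The script above writes one "file:" line per revlog, then for each revision the fields "node:", "linkrev:", "parents:" and "length:", followed by the raw data between "-start-" and "-end-". As a rough sketch of how such a stream might be read back, the following Python 3 generator parses it. The name read_dump and the tuple layout are hypothetical, and this is not the companion undump script from contrib/. It assumes the stream was produced exactly as printed above, with a single newline appended after the data.

```python
import io


def read_dump(stream):
    """Yield (filename, node, linkrev, p1, p2, data) from a dump stream.

    `stream` must be a binary file object. Hypothetical reader that
    mirrors the print statements of the dump script.
    """
    filename = None
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line.startswith(b"file: "):
            filename = line[6:-1]
        elif line.startswith(b"node: "):
            node = line[6:-1]
            linkrev = int(stream.readline()[9:-1])   # after "linkrev: "
            p1, p2 = stream.readline()[9:-1].split(b" ")  # after "parents: "
            length = int(stream.readline()[8:-1])    # after "length: "
            stream.readline()                        # skip "-start-"
            data = stream.read(length)               # exactly `length` bytes
            stream.readline()                        # newline added by print
            stream.readline()                        # skip "-end-"
            yield filename, node, linkrev, p1, p2, data
```

Reading exactly "length" bytes, rather than scanning for the "-end-" marker, keeps the parse correct even when revision data itself contains a line reading "-end-".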