view tests/test-parseindex @ 12717:89df79b3c011 stable

convert/darcs: support changelogs with bytes 0x7F-0xFF (issue2411) This is a followup to 4481f8a93c7a, which only fixed the conversion of patches with UTF-8 metadata. This patch allows a changelog to have any bytes with values 0x7F-0xFF. It parses the XML changelog as Latin-1 and uses converter_source.recode() to decode the data as UTF-8/Latin-1. Caveats: - Since the convert extension doesn't provide any way to specify the source encoding, users are still limited to UTF-8 and Latin-1. - etree will still complain if the changelog has bytes with values 0x00-0x19. XML only allows printable characters.
author Brodie Rao <brodie@bitheap.org>
date Fri, 01 Oct 2010 10:15:04 -0500
parents fb42030d79d6
children 4c94b6d0fb1c
line wrap: on
line source

#!/bin/sh
#
# revlog.parseindex must be able to parse the index file even if
# an index entry is split between two 64k blocks.  The ideal test
# would be to create an index file with inline data where
# 64k < size < 64k + 64 (64k is the size of the read buffer, 64 is
# the size of an index entry) and with an index entry starting right
# before the 64k block boundary, and try to read it.
#
# We approximate that by reducing the read buffer to 1 byte.
#

hg init a
cd a
echo abc > foo
hg add foo
hg commit -m 'add foo' -d '1000000 0'

echo >> foo
hg commit -m 'change foo' -d '1000001 0'
hg log -r 0:

cat >> test.py << EOF
from mercurial import changelog, util
from mercurial.node import *

class singlebyteread(object):
    def __init__(self, real):
        self.real = real

    def read(self, size=-1):
        if size == 65536:
            size = 1
        return self.real.read(size)

    def __getattr__(self, key):
        return getattr(self.real, key)

def opener(*args):
    o = util.opener(*args)
    def wrapper(*a):
        f = o(*a)
        return singlebyteread(f)
    return wrapper

cl = changelog.changelog(opener('.hg/store'))
print len(cl), 'revisions:'
for r in cl:
    print short(cl.node(r))
EOF

python test.py