view tests/test-minirst.py @ 42377:0546ead39a7e stable
manifest: avoid corruption by dropping removed files with pure (issue5801)
Previously, removed files were simply marked by overwriting the first byte
with NUL and dropping their entry in `self.positions`. But no effort was made to
ignore them when compacting the dictionary into text form. This allowed them to
slip into the manifest revision, since the code seems to be trying to minimize
string operations by copying as large a chunk as possible. As part of this,
`_compact()` walks the existing text based on entries in the `positions` list,
and consumes everything up to the next position entry. This typically resulted
in a ValueError complaining about unsorted manifest entries.
Sometimes files do seem to get dropped in large repos; this appears to
correspond to there being a new entry that would take the same slot. A much
more trivial problem is that if the only changes were removals, `_compact()`
didn't even run because `__delitem__` doesn't add anything to `self.extradata`.
Now there's an explicit variable to flag this, both to allow `_compact()` to
run, and to avoid searching the manifest in cases where there are no removals.
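
The failure mode and the fix described above can be illustrated with a toy
model. This is only a sketch under stated assumptions, not the code in
mercurial/manifest.py: the class `ToyManifest`, the methods `compact_buggy()`
and `compact_fixed()`, and the flag name `hasremovals` are hypothetical names
chosen for illustration, and the entry layout is simplified to
`path NUL node LF`.

```python
class ToyManifest:
    """Toy stand-in for a lazily edited manifest (hypothetical, simplified)."""

    def __init__(self, text):
        self.data = bytearray(text)
        # Start offset of every live entry, in path order.
        self.positions = []
        offset = 0
        for line in text.splitlines(keepends=True):
            self.positions.append(offset)
            offset += len(line)
        # Hypothetical name for the explicit "something was removed" flag.
        self.hasremovals = False

    def _find(self, path):
        for i, pos in enumerate(self.positions):
            end = self.data.index(b"\0", pos)
            if bytes(self.data[pos:end]) == path:
                return i
        raise KeyError(path)

    def __delitem__(self, path):
        i = self._find(path)
        pos = self.positions.pop(i)
        self.data[pos] = 0  # tombstone: overwrite the first byte with NUL
        self.hasremovals = True

    def compact_buggy(self):
        # Copies from each live position up to the NEXT live position, so a
        # tombstoned entry sitting between two live ones is copied along.
        out = bytearray()
        for i, pos in enumerate(self.positions):
            if i + 1 < len(self.positions):
                end = self.positions[i + 1]
            else:
                end = len(self.data)
            out += self.data[pos:end]
        return bytes(out)

    def compact_fixed(self):
        # Without removals the text has no tombstones, so nothing to search.
        if not self.hasremovals:
            return bytes(self.data)
        out = bytearray()
        for pos in self.positions:
            end = self.data.index(b"\n", pos) + 1  # stop at this entry's end
            out += self.data[pos:end]
        return bytes(out)


text = b"a\x001111\n" + b"b\x002222\n" + b"c\x003333\n"
m = ToyManifest(text)
del m[b"b"]
buggy = m.compact_buggy()
fixed = m.compact_fixed()
```

In this sketch `buggy` still carries the tombstoned `b` entry (now starting
with NUL), while `fixed` contains only the `a` and `c` entries. The flag serves
both purposes mentioned above: it lets compaction run when removals are the
only change, and it skips the per-entry scan when nothing was removed.
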
In practice, this behavior was mostly obscured by the check in fastdelta() which
takes a different path that explicitly drops removed files if there are fewer
than 1000 changes. However, timeless has a repo where after rebasing tens of
commits, a totally different path[1] is taken that bypasses the change count
check and hits this problem.
[1] https://www.mercurial-scm.org/repo/hg/file/2338bdea4474/mercurial/manifest.py#l1511
author | Matt Harbison <matt_harbison@yahoo.com> |
---|---|
date | Thu, 23 May 2019 21:54:24 -0400 |
parents | a2a5d4ad5276 |
children | 2372284d9457 |
from __future__ import absolute_import, print_function
from mercurial import (
    minirst,
)
from mercurial.utils import (
    stringutil,
)

def debugformat(text, form, **kwargs):
    blocks, pruned = minirst.parse(text, **kwargs)
    if form == b'html':
        print("html format:")
        out = minirst.format(text, style=form, **kwargs)
    else:
        print("%d column format:" % form)
        out = minirst.format(text, width=form, **kwargs)

    print("-" * 70)
    print(out[:-1].decode('utf8'))
    if kwargs.get('keep'):
        print("-" * 70)
        print(stringutil.pprint(pruned).decode('utf8'))
    print("-" * 70)
    print()

def debugformats(title, text, **kwargs):
    print("== %s ==" % title)
    debugformat(text, 60, **kwargs)
    debugformat(text, 30, **kwargs)
    debugformat(text, b'html', **kwargs)

paragraphs = b"""
This is some text in the first paragraph.

  A small indented paragraph.
  It is followed by some lines
  containing random whitespace.
 \n  \n   \nThe third and final paragraph.
"""

debugformats('paragraphs', paragraphs)

definitions = b"""
A Term
  Definition. The indented
  lines make up the definition.
Another Term
  Another definition. The final line in the
   definition determines the indentation, so
    this will be indented with four spaces.

  A Nested/Indented Term
    Definition.
"""

debugformats('definitions', definitions)

literals = br"""
The fully minimized form is the most
convenient form::

  Hello
    literal
      world

In the partially minimized form a paragraph
simply ends with space-double-colon. ::

  ////////////////////////////////////////
  long un-wrapped line in a literal block
  \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

::

  This literal block is started with '::',
    the so-called expanded form. The paragraph
      with '::' disappears in the final output.
"""

debugformats('literals', literals)

lists = b"""
- This is the first list item.

  Second paragraph in the first list item.

- List items need not be separated
  by a blank line.
- And will be rendered without
  one in any case.

We can have indented lists:

  - This is an indented list item

  - Another indented list item::

      - A literal block in the middle
            of an indented list.

      (The above is not a list item since we are in the literal block.)

::

  Literal block with no indentation (apart from
  the two spaces added to all literal blocks).

1. This is an enumerated list (first item).
2. Continuing with the second item.

(1) foo
(2) bar

1) Another
2) List

Line blocks are also a form of list:

| This is the first line.
  The line continues here.
| This is the second line.

Bullet lists are also detected:

* This is the first bullet
* This is the second bullet
  It has 2 lines
* This is the third bullet
"""

debugformats('lists', lists)

options = b"""
There is support for simple option lists,
but only with long options:

-X, --exclude  filter  an option with a short and long option with an argument
-I, --include          an option with both a short option and a long option
   --all               Output all.
   --both              Output both (this description is
                       quite long).
   --long              Output all day long.

   --par               This option has two paragraphs in its description.
                       This is the first.

                       This is the second. Blank lines may be omitted between
                       options (as above) or left in (as here).


The next paragraph looks like an option list, but lacks the two-space
marker after the option. It is treated as a normal paragraph:

--foo bar baz
"""

debugformats('options', options)

fields = b"""
:a: First item.
:ab: Second item. Indentation and wrapping
     is handled automatically.

Next list:

:small: The larger key below triggers full indentation here.
:much too large: This key is big enough to get its own line.
"""

debugformats('fields', fields)

containers = b"""
Normal output.

.. container:: debug

   Initial debug output.

.. container:: verbose

   Verbose output.

   .. container:: debug

      Debug output.
"""

debugformats('containers (normal)', containers)
debugformats('containers (verbose)', containers, keep=[b'verbose'])
debugformats('containers (debug)', containers, keep=[b'debug'])
debugformats('containers (verbose debug)', containers,
             keep=[b'verbose', b'debug'])

roles = b"""Please see :hg:`add`."""
debugformats('roles', roles)

sections = b"""
Title
=====

Section
-------

Subsection
''''''''''

Markup: ``foo`` and :hg:`help`
------------------------------
"""
debugformats('sections', sections)

admonitions = b"""
.. note::

   This is a note

   - Bullet 1
   - Bullet 2

   .. warning:: This is a warning Second
      input line of warning

.. danger::
   This is danger
"""

debugformats('admonitions', admonitions)

comments = b"""
Some text.

.. A comment

   .. An indented comment

   Some indented text.

..

Empty comment above
"""

debugformats('comments', comments)

data = [[b'a', b'b', b'c'],
        [b'1', b'2', b'3'],
        [b'foo', b'bar', b'baz this list is very very very long man']]

rst = minirst.maketable(data, 2, True)
table = b''.join(rst)

print(table.decode('utf8'))

debugformats('table', table)

data = [[b's', b'long', b'line\ngoes on here'],
        [b'', b'xy', b'tried to fix here\n by indenting']]

rst = minirst.maketable(data, 1, False)
table = b''.join(rst)

print(table.decode('utf8'))

debugformats('table+nl', table)