view mercurial/diffhelper.py @ 45637:ad6ebb6f0dfe
copies: make two versions of the changeset-centric algorithm
There are two main ways to run the changeset-centric copy-tracing algorithm: one
fed from data stored in side-data, which is still in development, and one based
on data stored in extra (with a "compatibility" mode).
The `extra`-based one is used in production at Google, but is still marked
experimental in the code. It is mostly unsuitable for other users because it
affects the changeset hash.
The side-data based storage and algorithm have been evolving to store more
data, cover more cases (mostly around merges, which Google does not really care
about) and use lower-level storage for efficiency.
All these changes make it increasingly hard to maintain the common code base
without impacting code complexity and performance. For example, the
compatibility mode requires keeping things at a different level than what we
need for side-data.
So, I am duplicating the functions involved. The newly added `_extra` variants
will be kept as they are today, while I will do some deeper rework of the
side-data versions.
Long term, the side-data version should be more featureful and performant than
the extra-based version, so I expect the duplicated `_extra` functions to
eventually get dropped.
Differential Revision: https://phab.mercurial-scm.org/D9114
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Fri, 25 Sep 2020 14:39:04 +0200 |
parents | 10f48720ef95 |
children | d4ba4d51f85f |
# diffhelper.py - helper routines for patch
#
# Copyright 2009 Matt Mackall <mpm@selenic.com> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

from .i18n import _

from . import (
    error,
    pycompat,
)

MISSING_NEWLINE_MARKER = b'\\ No newline at end of file\n'


def addlines(fp, hunk, lena, lenb, a, b):
    """Read lines from fp into the hunk

    The hunk is parsed into two arrays, a and b.  a gets the old state of
    the text, b gets the new state.  The control char from the hunk is saved
    when inserting into a, but not b (for performance while deleting files.)
    """
    while True:
        todoa = lena - len(a)
        todob = lenb - len(b)
        num = max(todoa, todob)
        if num == 0:
            break
        for i in pycompat.xrange(num):
            s = fp.readline()
            if not s:
                raise error.ParseError(_(b'incomplete hunk'))
            if s == MISSING_NEWLINE_MARKER:
                fixnewline(hunk, a, b)
                continue
            if s == b'\n' or s == b'\r\n':
                # Some patches may be missing the control char
                # on empty lines. Supply a leading space.
                s = b' ' + s
            hunk.append(s)
            if s.startswith(b'+'):
                b.append(s[1:])
            elif s.startswith(b'-'):
                a.append(s)
            else:
                b.append(s[1:])
                a.append(s)


def fixnewline(hunk, a, b):
    """Fix up the last lines of a and b when the patch has no newline at EOF"""
    l = hunk[-1]
    # tolerate CRLF in last line
    if l.endswith(b'\r\n'):
        hline = l[:-2]
    else:
        hline = l[:-1]

    if hline.startswith((b' ', b'+')):
        b[-1] = hline[1:]
    if hline.startswith((b' ', b'-')):
        a[-1] = hline
    hunk[-1] = hline


def testhunk(a, b, bstart):
    """Compare the lines in a with the lines in b

    a is assumed to have a control char at the start of each line, this char
    is ignored in the compare.
    """
    alen = len(a)
    blen = len(b)
    if alen > blen - bstart or bstart < 0:
        return False
    for i in pycompat.xrange(alen):
        if a[i][1:] != b[i + bstart]:
            return False
    return True
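The following is a minimal standalone sketch of how the `a`/`b` split in `addlines` and the comparison in `testhunk` fit together. It is not part of `diffhelper.py`: the names `split_hunk` and `matches_at` are hypothetical, `ValueError` stands in for `error.ParseError`, plain `range` replaces `pycompat.xrange`, lines are read one at a time rather than in batches, and the "no newline at end of file" handling is omitted for brevity.

```python
import io


def split_hunk(fp, lena, lenb):
    """Simplified, hypothetical stand-in for addlines(); returns (hunk, a, b)."""
    hunk, a, b = [], [], []
    while len(a) < lena or len(b) < lenb:
        s = fp.readline()
        if not s:
            # the original raises error.ParseError here
            raise ValueError("incomplete hunk")
        if s in (b'\n', b'\r\n'):
            # empty line missing its control char: supply a leading space
            s = b' ' + s
        hunk.append(s)
        if s.startswith(b'+'):
            b.append(s[1:])      # new side: control char stripped
        elif s.startswith(b'-'):
            a.append(s)          # old side: control char kept
        else:
            b.append(s[1:])      # context goes to both sides
            a.append(s)
    return hunk, a, b


def matches_at(a, lines, start):
    """Same comparison as testhunk(): ignore the control char in a."""
    if start < 0 or len(a) > len(lines) - start:
        return False
    return all(a[i][1:] == lines[i + start] for i in range(len(a)))


# A hunk with 3 old lines and 3 new lines.
body = io.BytesIO(b' context\n-old line\n+new line\n tail\n')
hunk, a, b = split_hunk(body, 3, 3)

# Lines of the file the hunk would be applied to.
old_file = [b'header\n', b'context\n', b'old line\n', b'tail\n']
print(matches_at(a, old_file, 1))  # hunk lines up at index 1
print(matches_at(a, old_file, 0))  # content differs at index 0
```

Keeping the control character in `a` but not in `b` is why the comparison slices with `[1:]` on the `a` side only.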