view mercurial/pushkey.py @ 29322:66dbdd3cc2b9 stable

bdiff: extend matches across popular lines For very large diffs that have large numbers of identical lines (JSON dumps) that also have large blocks of identical text, bdiff could become confused about which block matches which because it can only match very limited regions. The result is very large diffs for small sets of edits. The earlier recursion rebalancing fix made this behavior more frequent because it's now more prone to match block 1 to block 2. One frequent user of large JSON files reported being unable to pass the resulting diffs through their code review system. Prior to this change, bdiff would calculate the length of a match at (i, j) as 1 + length found at (i-1, j-1). With large number of popular (ignored) lines, this often meant matches couldn't be extended backwards at all and thus all matching regions were very small. Disabling the popularity threshold is not an option because it brings back quadratic behavior. Instead, we extend a match backwards until we either found a previously discovered match or we find a mismatching line. This thus successfully bridges over any popular lines inside and before a matching region. The larger regions then significant reduce the probability of confusion.
author Matt Mackall <mpm@selenic.com>
date Thu, 02 Jun 2016 17:09:06 -0500
parents 7b200566e474
children 57875cf423c9
line wrap: on
line source

# pushkey.py - dispatching for pushing and pulling keys
#
# Copyright 2010 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

from . import (
    bookmarks,
    encoding,
    obsolete,
    phases,
)

def _nslist(repo):
    n = {}
    for k in _namespaces:
        n[k] = ""
    if not obsolete.isenabled(repo, obsolete.exchangeopt):
        n.pop('obsolete')
    return n

_namespaces = {"namespaces": (lambda *x: False, _nslist),
               "bookmarks": (bookmarks.pushbookmark, bookmarks.listbookmarks),
               "phases": (phases.pushphase, phases.listphases),
               "obsolete": (obsolete.pushmarker, obsolete.listmarkers),
              }

def register(namespace, pushkey, listkeys):
    _namespaces[namespace] = (pushkey, listkeys)

def _get(namespace):
    return _namespaces.get(namespace, (lambda *x: False, lambda *x: {}))

def push(repo, namespace, key, old, new):
    '''should succeed iff value was old'''
    pk = _get(namespace)[0]
    return pk(repo, key, old, new)

def list(repo, namespace):
    '''return a dict'''
    lk = _get(namespace)[1]
    return lk(repo)

encode = encoding.fromlocal

decode = encoding.tolocal

def encodekeys(keys):
    """encode the content of a pushkey namespace for exchange over the wire"""
    return '\n'.join(['%s\t%s' % (encode(k), encode(v)) for k, v in keys])

def decodekeys(data):
    """decode the content of a pushkey namespace from exchange over the wire"""
    result = {}
    for l in data.splitlines():
        k, v = l.split('\t')
        result[decode(k)] = decode(v)
    return result