mercurial/similar.py
author Pierre-Yves David <pierre-yves.david@octobus.net>
Mon, 03 May 2021 12:35:25 +0200
changeset 47252 2219853a1503
parent 46819 d4ba4d51f85f
child 48966 6000f5b25c9b
permissions -rw-r--r--
revlogv2: track pending write in the docket and expose it to hooks

The docket is now able to write pending data. We could have used a distinct intermediate file; however, keeping everything in the same file will make it simpler to keep track of the various involved files if necessary.

However, it might prove more complicated for streaming clone. This will be dealt with later.

Note that we lifted the stderr redirection in the test since we no longer suffer from the "unknown working directory parent" message.

Differential Revision: https://phab.mercurial-scm.org/D10631
11059
ef4aa90b1e58 Move 'findrenames' code into its own file.
David Greenaway <hg-dev@davidgreenaway.com>
parents:
# similar.py - mechanisms for finding similar files
#
46819
d4ba4d51f85f contributor: change mentions of mpm to olivia
Raphaël Gomès <rgomes@octobus.net>
parents: 45957
# Copyright 2005-2007 Olivia Mackall <olivia@selenic.com>
11059
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

27359
a56c47ed3885 similar: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 16683
from __future__ import absolute_import

from .i18n import _
43106
d783f945a701 py3: finish porting iteritems() to pycompat and remove source transformer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43077
from . import (
    mdiff,
    pycompat,
)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38401

11059

11060
e6df01776e08 findrenames: Optimise "addremove -s100" by matching files by their SHA1 hashes.
David Greenaway <hg-dev@davidgreenaway.com>
parents: 11059
def _findexactmatches(repo, added, removed):
45957
89a2afe31e82 formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents: 43106
    """find renamed files that have no changes
11060

    Takes a list of new filectxs and a list of removed filectxs, and yields
    (before, after) tuples of exact matches.
45957
    """
31590
985a98c6bad0 similar: use cheaper hash() function to test exact matches
Yuya Nishihara <yuya@tcha.org>
parents: 31589
    # Build table of removed files: {hash(fctx.data()): [fctx, ...]}.
    # We use hash() to discard fctx.data() from memory.
11060
    hashes = {}
43076
    progress = repo.ui.makeprogress(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
        _(b'searching for exact renames'),
43076
        total=(len(added) + len(removed)),
43077
        unit=_(b'files'),
43076
    )
38354
cd196be26cb7 similar: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents: 32246
    for fctx in removed:
        progress.increment()
31590
        h = hash(fctx.data())
        if h not in hashes:
            hashes[h] = [fctx]
        else:
            hashes[h].append(fctx)
11060

    # For each added file, see if it corresponds to a removed file.
38354
    for fctx in added:
        progress.increment()
31220
e1d035905b2e similar: compare between actual file contents for exact identity
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 30809
        adata = fctx.data()
31590
        h = hash(adata)
        for rfctx in hashes.get(h, []):
31220
            # compare between actual file contents for exact identity
            if adata == rfctx.data():
                yield (rfctx, fctx)
31590
                break
11060

    # Done
38379
ef692614e601 progress: hide update(None) in a new complete() method
Martin von Zweigbergk <martinvonz@google.com>
parents: 38354
    progress.complete()
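
The bucketing idea used by `_findexactmatches` can be tried outside Mercurial. The following is a minimal sketch, not the module's own code: it assumes plain dicts mapping file names to `bytes` in place of filectx objects, and `find_exact_matches` is a hypothetical name. As in the function above, `hash()` only narrows the candidates and a full content comparison confirms each match.

```python
def find_exact_matches(added, removed):
    """Yield (removed_name, added_name) pairs whose contents are identical.

    `added` and `removed` are dicts of {name: bytes}; this is an assumed
    stand-in for the lists of filectx objects used in similar.py.
    """
    # Bucket removed files by the hash of their content. Only names are
    # stored in the table, so the table itself does not hold file data.
    buckets = {}
    for name in sorted(removed):
        buckets.setdefault(hash(removed[name]), []).append(name)

    # For each added file, look only at removed files in the same bucket.
    for name in sorted(added):
        data = added[name]
        for rname in buckets.get(hash(data), []):
            # hash() can collide, so confirm with a byte-for-byte comparison.
            if data == removed[rname]:
                yield rname, name
                break  # take the first confirmed match only


removed_files = {"a.txt": b"hello\n", "b.txt": b"world\n"}
added_files = {"c.txt": b"hello\n", "d.txt": b"new\n"}
matches = list(find_exact_matches(added_files, removed_files))
```

With these inputs only `c.txt` has a removed counterpart, so `matches` holds the single pair `("a.txt", "c.txt")`.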
11060

43076

30805
0ae287eb6a4f similar: move score function to module level
Sean Farley <sean@farley.io>
parents: 30791
def _ctxdata(fctx):
    # lazily load text
    orig = fctx.data()
    return orig, mdiff.splitnewlines(orig)

43076

30809
8614546154cb similar: remove caching from the module level
Pierre-Yves David <pierre-yves.david@ens-lyon.org>
parents: 30805
def _score(fctx, otherdata):
    orig, lines = otherdata
    text = fctx.data()
32246
ded48ad55146 bdiff: proxy through mdiff module
Yuya Nishihara <yuya@tcha.org>
parents: 31590
    # mdiff.blocks() returns blocks of matching lines
30805
    # count the number of bytes in each
    equal = 0
32246
    matches = mdiff.blocks(text, orig)
30805
    for x1, x2, y1, y2 in matches:
        for line in lines[y1:y2]:
            equal += len(line)

    lengths = len(text) + len(orig)
    return equal * 2.0 / lengths

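
The ratio computed by `_score` is twice the number of bytes in matching lines, divided by the combined length of both texts. The sketch below reproduces that arithmetic under stated assumptions: `difflib.SequenceMatcher` from the standard library is swapped in for `mdiff.blocks()`, `bytes.splitlines(keepends=True)` stands in for `mdiff.splitnewlines()`, and `similarity` is a hypothetical name. The two diff algorithms can pick different matching blocks, so scores are not guaranteed to agree with Mercurial's.

```python
import difflib


def similarity(text, orig):
    """Return 2 * matching_bytes / (len(text) + len(orig)) for two byte strings."""
    a = text.splitlines(keepends=True)
    b = orig.splitlines(keepends=True)
    # autojunk=False disables difflib's popularity heuristic, which would
    # otherwise ignore very frequent lines in long inputs.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)

    equal = 0
    for block in matcher.get_matching_blocks():
        # Count the bytes of the matching lines on the `orig` side.
        for line in b[block.b:block.b + block.size]:
            equal += len(line)

    lengths = len(text) + len(orig)
    # Both inputs are assumed non-empty, as findrenames drops empty files.
    return equal * 2.0 / lengths


identical = similarity(b"a\nb\n", b"a\nb\n")
half = similarity(b"a\nb\n", b"a\nc\n")
```

Here `identical` is 1.0 because all 4 bytes match on both sides (2 * 4 / 8), and `half` is 0.5 because only the line `a\n` matches (2 * 2 / 8).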
43076

30809
def score(fctx1, fctx2):
    return _score(fctx1, _ctxdata(fctx2))

43076

11060
def _findsimilarmatches(repo, added, removed, threshold):
45957
    """find potentially renamed files based on similar file content
11060

    Takes a list of new filectxs and a list of removed filectxs, and yields
    (before, after, score) tuples of partial matches.
45957
    """
11059
    copies = {}
43076
    progress = repo.ui.makeprogress(
43077
        _(b'searching for similar files'), unit=_(b'files'), total=len(removed)
43076
    )
38401
59c9d3cc810f similar: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents: 38379
    for r in removed:
        progress.increment()
30809
        data = None
11059
        for a in added:
            bestscore = copies.get(a, (None, threshold))[1]
30809
            if data is None:
                data = _ctxdata(r)
            myscore = _score(a, data)
31589
2efd9771323e similar: take the first match instead of the last
Yuya Nishihara <yuya@tcha.org>
parents: 31588
            if myscore > bestscore:
11059
                copies[a] = (r, myscore)
38401
    progress.complete()
11059

43106
    for dest, v in pycompat.iteritems(copies):
30791
ada160a8cfd8 similar: rename local variable to not collide with previous
Sean Farley <sean@farley.io>
parents: 29341
        source, bscore = v
        yield source, dest, bscore
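
The selection loop in `_findsimilarmatches` keeps, for each added file, the removed file with the highest score strictly above the threshold. Here is a minimal sketch of that selection alone, assuming plain strings for files and a caller-supplied `score` callable; `best_matches` and the score table are hypothetical and exist only for illustration.

```python
def best_matches(added, removed, threshold, score):
    """Return (source, dest, score) for the best source of each added file."""
    copies = {}
    for r in removed:
        for a in added:
            # The score to beat is the current best, or the threshold if
            # nothing has been recorded for this added file yet.
            best = copies.get(a, (None, threshold))[1]
            s = score(a, r)
            # Strict comparison: a score equal to the threshold is rejected,
            # and on a tie the earlier removed file is kept.
            if s > best:
                copies[a] = (r, s)
    return [(src, dest, s) for dest, (src, s) in copies.items()]


# Hypothetical similarity table used only to drive the example.
table = {
    ("new", "old1"): 0.6,
    ("new", "old2"): 0.9,
    ("other", "old1"): 0.3,
    ("other", "old2"): 0.2,
}
result = best_matches(
    ["new", "other"], ["old1", "old2"], 0.5, lambda a, r: table[(a, r)]
)
```

In this example `new` is paired with `old2` at 0.9, while `other` never exceeds the 0.5 threshold and is left unmatched.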
11059

43076

31588
2e254165a37c similar: do not look up and create filectx more than once
Yuya Nishihara <yuya@tcha.org>
parents: 31587
def _dropempty(fctxs):
    return [x for x in fctxs if x.size() > 0]

43076

11060
def findrenames(repo, added, removed, threshold):
    '''find renamed files -- yields (before, after, score) tuples'''
31587
b1528d195a13 similar: use common names for changectx variables
Yuya Nishihara <yuya@tcha.org>
parents: 31586
    wctx = repo[None]
    pctx = wctx.p1()
11059

11060
    # Zero length files will be frequently unrelated to each other, and
    # tracking the deletion/addition of such a file will probably cause more
    # harm than good. We strip them out here to avoid matching them later on.
31588
    addedfiles = _dropempty(wctx[fp] for fp in sorted(added))
    removedfiles = _dropempty(pctx[fp] for fp in sorted(removed) if fp in pctx)
11060

    # Find exact matches.
31586
d3e2af4e0128 similar: get rid of quadratic addedfiles.remove()
Yuya Nishihara <yuya@tcha.org>
parents: 31585
    matchedfiles = set()
    for (a, b) in _findexactmatches(repo, addedfiles, removedfiles):
        matchedfiles.add(b)
11060
        yield (a.path(), b.path(), 1.0)

    # If the user requested similar files to be matched, search for them also.
    if threshold < 1.0:
31586
        addedfiles = [x for x in addedfiles if x not in matchedfiles]
43076
        for (a, b, score) in _findsimilarmatches(
            repo, addedfiles, removedfiles, threshold
        ):
11060
            yield (a.path(), b.path(), score)