hgext/automv.py
author Arseniy Alekseyev <aalekseyev@janestreet.com>
Thu, 18 May 2023 19:23:59 +0100
changeset 50673 5d84b1385f7f
parent 50124 18149ecb5122
child 50870 47af00b2217f
permissions -rw-r--r--
treemanifest: make `updatecaches` update the nodemaps for all directories
Without this, if the cache for a nested directory is in a bad state, it's very hard to repair it.
# automv.py
#
# Copyright 2013-2016 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
"""check for unrecorded moves at commit time (EXPERIMENTAL)

This extension checks at commit/amend time if any of the committed files
come from an unrecorded mv.

The threshold at which a file is considered a move can be set with the
``automv.similarity`` config option. This option takes a percentage between 0
(disabled) and 100 (files must be identical). The default is 95.

"""

# Using 95 as the default similarity is based on an analysis of the cpython,
# mozilla-central and mercurial repositories, as well as 2 very large
# Facebook repositories. At 95, 50% of all potential missed moves would be
# caught, and the threshold corresponds with 87% of all explicitly marked
# moves. Together, 80% of moved files are 95% similar or more.
#
# See http://markmail.org/thread/5pxnljesvufvom57 for context.


from mercurial.i18n import _
from mercurial import (
    commands,
    copies,
    error,
    extensions,
    pycompat,
    registrar,
    scmutil,
    similar,
)

configtable = {}
configitem = registrar.configitem(configtable)

configitem(
    b'automv',
    b'similarity',
    default=95,
)


def extsetup(ui):
    entry = extensions.wrapcommand(commands.table, b'commit', mvcheck)
    entry[1].append(
        (b'', b'no-automv', None, _(b'disable automatic file move detection'))
    )


def mvcheck(orig, ui, repo, *pats, **opts):
    """Hook to check for moves at commit time"""
    opts = pycompat.byteskwargs(opts)
    renames = None
    disabled = opts.pop(b'no_automv', False)
    # Hold the working-copy lock from rename detection through the commit.
    with repo.wlock():
        if not disabled:
            threshold = ui.configint(b'automv', b'similarity')
            if not 0 <= threshold <= 100:
                raise error.Abort(
                    _(b'automv.similarity must be between 0 and 100')
                )
            # A threshold of 0 disables move detection.
            if threshold > 0:
                match = scmutil.match(repo[None], pats, opts)
                added, removed = _interestingfiles(repo, match)
                uipathfn = scmutil.getuipathfn(repo, legacyrelativevalue=True)
                # Convert the percentage to a fraction between 0 and 1.
                renames = _findrenames(
                    repo, uipathfn, added, removed, threshold / 100.0
                )

        if renames is not None:
            with repo.dirstate.changing_files(repo):
                # XXX this should be wider and integrated with the commit
                # transaction, at the same time as we do the `addremove`
                # logic for commit. However, we can't really do better with
                # the current extension structure, and this is not worse
                # than what happened before.
                scmutil._markchanges(repo, (), (), renames)
        return orig(ui, repo, *pats, **pycompat.strkwargs(opts))


def _interestingfiles(repo, matcher):
    """Find what files were added or removed in this commit.

    Returns a tuple of two lists: (added, removed). Only files not *already*
    marked as moved are included in the added list.

    """
    stat = repo.status(match=matcher)
    added = stat.added
    removed = stat.removed

    copy = copies.pathcopies(repo[b'.'], repo[None], matcher)
    # drop the added files for which we already have copy info
    added = [f for f in added if f not in copy]

    return added, removed


def _findrenames(repo, uipathfn, added, removed, similarity):
    """Find what files in added are really moved files.

    Any file named in removed whose similarity to a file in added is at least
    the given threshold (a fraction between 0 and 1) is seen as a rename.

    """
    renames = {}
    if similarity > 0:
        for src, dst, score in similar.findrenames(
            repo, added, removed, similarity
        ):
            if repo.ui.verbose:
                repo.ui.status(
                    _(b'detected move of %s as %s (%d%% similar)\n')
                    % (uipathfn(src), uipathfn(dst), score * 100)
                )
            renames[dst] = src
    if renames:
        repo.ui.status(_(b'detected move of %d files\n') % len(renames))
    return renames
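
The threshold handling above can be tried outside Mercurial with a small standalone sketch. It is only an illustration: `find_renames` is a hypothetical helper, and `difflib.SequenceMatcher` stands in for the scoring done by `similar.findrenames`, which is an assumption and not Mercurial's actual algorithm. What it keeps from the extension is the 0-100 range check, the meaning of 0 as "disabled", the conversion of the percentage to a fraction, and the `renames[dst] = src` mapping.

```python
from difflib import SequenceMatcher


def find_renames(added, removed, threshold_percent=95):
    """Map each added path to its most similar removed path.

    `added` and `removed` are dicts of path -> file content (hypothetical
    inputs; the extension gets its file lists from the repository status).
    """
    if not 0 <= threshold_percent <= 100:
        raise ValueError("similarity must be between 0 and 100")
    renames = {}
    if threshold_percent == 0:
        # A threshold of 0 disables detection, as in the extension.
        return renames
    # Convert the percentage to a fraction between 0 and 1.
    threshold = threshold_percent / 100.0
    for dst, new_text in added.items():
        best_src, best_score = None, threshold
        for src, old_text in removed.items():
            # Stand-in similarity score; not Mercurial's scoring.
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= best_score:
                best_src, best_score = src, score
        if best_src is not None:
            renames[dst] = best_src
    return renames


if __name__ == "__main__":
    added = {"b.txt": "hello world\n"}
    removed = {"a.txt": "hello world\n", "c.txt": "something else entirely\n"}
    print(find_renames(added, removed))  # prints {'b.txt': 'a.txt'}
```

With the default threshold only near-identical files are paired. A file that was edited heavily while being moved is left as a plain add and remove.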