annotate hgext/automv.py @ 51907:bd7359c18d69

rev-branch-cache: fallback on "v1" data if no v2 is found This will help smooth the transition to the v2 format for existing large repository.
author Pierre-Yves David <pierre-yves.david@octobus.net>
date Tue, 24 Sep 2024 15:44:10 +0200
parents f4733654f144
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
1 # automv.py
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
2 #
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
3 # Copyright 2013-2016 Facebook, Inc.
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
4 #
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
5 # This software may be used and distributed according to the terms of the
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
6 # GNU General Public License version 2 or any later version.
31599
e4aefdb58ebe automv: use lowercase for docstring title
Jun Wu <quark@fb.com>
parents: 29205
diff changeset
7 """check for unrecorded moves at commit time (EXPERIMENTAL)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
8
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
9 This extension checks at commit/amend time if any of the committed files
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
10 comes from an unrecorded mv.
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
11
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
12 The threshold at which a file is considered a move can be set with the
28152
5ec1ce8fdf0a automv: switch to specifying the similarity as an integer (0-100)
Martijn Pieters <mjpieters@fb.com>
parents: 28151
diff changeset
13 ``automv.similarity`` config option. This option takes a percentage between 0
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
14 (disabled) and 100 (files must be identical), the default is 95.
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
15
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
16 """
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
17
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
18 # Using 95 as a default similarity is based on an analysis of the mercurial
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
19 # repositories of the cpython, mozilla-central & mercurial repositories, as
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
20 # well as 2 very large facebook repositories. At 95 50% of all potential
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
21 # missed moves would be caught, as well as correspond with 87% of all
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
22 # explicitly marked moves. Together, 80% of moved files are 95% similar or
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
23 # more.
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
24 #
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
25 # See http://markmail.org/thread/5pxnljesvufvom57 for context.
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
26
51863
f4733654f144 typing: add `from __future__ import annotations` to most files
Matt Harbison <matt_harbison@yahoo.com>
parents: 50870
diff changeset
27 from __future__ import annotations
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
28
29205
a0939666b836 py3: move up symbol imports to enforce import-checker rules
Yuya Nishihara <yuya@tcha.org>
parents: 28183
diff changeset
29 from mercurial.i18n import _
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
30 from mercurial import (
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
31 commands,
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
32 copies,
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
33 error,
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
34 extensions,
34971
38637dd39cfd py3: handle keyword arguments in hgext/automv.py
Pulkit Goyal <7895pulkit@gmail.com>
parents: 33188
diff changeset
35 pycompat,
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
36 registrar,
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
37 scmutil,
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
38 similar,
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
39 )
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
40
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
41 configtable = {}
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
42 configitem = registrar.configitem(configtable)
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
43
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
44 configitem(
45942
89a2afe31e82 formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents: 43077
diff changeset
45 b'automv',
89a2afe31e82 formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents: 43077
diff changeset
46 b'similarity',
89a2afe31e82 formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents: 43077
diff changeset
47 default=95,
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
48 )
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
49
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
50
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
51 def extsetup(ui):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
52 entry = extensions.wrapcommand(commands.table, b'commit', mvcheck)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
53 entry[1].append(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
54 (b'', b'no-automv', None, _(b'disable automatic file move detection'))
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
55 )
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
56
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
57
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
58 def mvcheck(orig, ui, repo, *pats, **opts):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
59 """Hook to check for moves at commit time"""
28151
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
60 renames = None
50870
47af00b2217f automv: migrate `opts` to native kwargs
Matt Harbison <matt_harbison@yahoo.com>
parents: 50124
diff changeset
61 disabled = opts.pop('no_automv', False)
50124
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
62 with repo.wlock():
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
63 if not disabled:
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
64 threshold = ui.configint(b'automv', b'similarity')
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
65 if not 0 <= threshold <= 100:
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
66 raise error.Abort(
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
67 _(b'automv.similarity must be between 0 and 100')
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
68 )
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
69 if threshold > 0:
50870
47af00b2217f automv: migrate `opts` to native kwargs
Matt Harbison <matt_harbison@yahoo.com>
parents: 50124
diff changeset
70 match = scmutil.match(
47af00b2217f automv: migrate `opts` to native kwargs
Matt Harbison <matt_harbison@yahoo.com>
parents: 50124
diff changeset
71 repo[None], pats, pycompat.byteskwargs(opts)
47af00b2217f automv: migrate `opts` to native kwargs
Matt Harbison <matt_harbison@yahoo.com>
parents: 50124
diff changeset
72 )
50124
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
73 added, removed = _interestingfiles(repo, match)
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
74 uipathfn = scmutil.getuipathfn(repo, legacyrelativevalue=True)
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
75 renames = _findrenames(
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
76 repo, uipathfn, added, removed, threshold / 100.0
18149ecb5122 automv: lock the repository before searching for renames
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 50043
diff changeset
77 )
28151
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
78
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
79 if renames is not None:
50043
5cfc48354d0f dirstate: use `dirstate.change_files` to scope the change in `automv`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 48875
diff changeset
80 with repo.dirstate.changing_files(repo):
5cfc48354d0f dirstate: use `dirstate.change_files` to scope the change in `automv`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 48875
diff changeset
81 # XXX this should be wider and integrated with the commit
5cfc48354d0f dirstate: use `dirstate.change_files` to scope the change in `automv`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 48875
diff changeset
82 # transaction. At the same time as we do the `addremove` logic
5cfc48354d0f dirstate: use `dirstate.change_files` to scope the change in `automv`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 48875
diff changeset
83 # for commit. However we can't really do better with the
5cfc48354d0f dirstate: use `dirstate.change_files` to scope the change in `automv`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 48875
diff changeset
84 # current extension structure, and this is not worse than what
5cfc48354d0f dirstate: use `dirstate.change_files` to scope the change in `automv`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 48875
diff changeset
85 # happened before.
5cfc48354d0f dirstate: use `dirstate.change_files` to scope the change in `automv`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 48875
diff changeset
86 scmutil._markchanges(repo, (), (), renames)
50870
47af00b2217f automv: migrate `opts` to native kwargs
Matt Harbison <matt_harbison@yahoo.com>
parents: 50124
diff changeset
87 return orig(ui, repo, *pats, **opts)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
88
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
89
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
90 def _interestingfiles(repo, matcher):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
91 """Find what files were added or removed in this commit.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
92
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
93 Returns a tuple of two lists: (added, removed). Only files not *already*
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
94 marked as moved are included in the added list.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
95
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
96 """
28146
28024d0d42dc automv: simplify retrieving the status
Martijn Pieters <mjpieters@fb.com>
parents: 28129
diff changeset
97 stat = repo.status(match=matcher)
42544
2702dfc7e029 automv: access status fields by name, not index
Martin von Zweigbergk <martinvonz@google.com>
parents: 42543
diff changeset
98 added = stat.added
2702dfc7e029 automv: access status fields by name, not index
Martin von Zweigbergk <martinvonz@google.com>
parents: 42543
diff changeset
99 removed = stat.removed
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
100
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
101 copy = copies.pathcopies(repo[b'.'], repo[None], matcher)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
102 # remove the copy files for which we already have copy info
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
103 added = [f for f in added if f not in copy]
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
104
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
105 return added, removed
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
106
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
107
41660
f89aad980025 automv: respect ui.relative-paths
Martin von Zweigbergk <martinvonz@google.com>
parents: 34971
diff changeset
108 def _findrenames(repo, uipathfn, added, removed, similarity):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
109 """Find what files in added are really moved files.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
110
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
111 Any file named in removed that is at least similarity% similar to a file
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
112 in added is seen as a rename.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
113
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
114 """
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
115 renames = {}
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
116 if similarity > 0:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
117 for src, dst, score in similar.findrenames(
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
118 repo, added, removed, similarity
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
119 ):
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
120 if repo.ui.verbose:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
121 repo.ui.status(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
122 _(b'detected move of %s as %s (%d%% similar)\n')
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
123 % (uipathfn(src), uipathfn(dst), score * 100)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42544
diff changeset
124 )
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
125 renames[dst] = src
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
126 if renames:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
127 repo.ui.status(_(b'detected move of %d files\n') % len(renames))
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
128 return renames