annotate hgext/automv.py @ 42353:f22131315791

tests: make the grep pattern in remotefilelog-gcrepack portable (issue6122) test-remotefilelog-gcrepack was using "\" to escape "|" in the grep pattern. The most of implementations ignore "\" when it is followed by "|", so the regex works. However, OpenBSD doesn't ignore "\" and considers "|" part of the text instead of create two branches. Neither of both behaviors violate POSIX. This change removes the unnecessary escape character and changes grep to egrep, so the extended regular expression works on every unix. This is part of the bug 6122. Tested on OpenBSD, GNU, FreeBSD, NetBSD, Solaris 11 and BusyBox. Credits to Todd C. Miller, Paul de Weerd and Ingo Schwarze for helping me with it.
author Juan Francisco Cantero Hurtado <iam@juanfra.info>
date Tue, 21 May 2019 19:23:14 +0200
parents f89aad980025
children abd902a85040
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
1 # automv.py
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
2 #
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
3 # Copyright 2013-2016 Facebook, Inc.
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
4 #
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
5 # This software may be used and distributed according to the terms of the
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
6 # GNU General Public License version 2 or any later version.
31599
e4aefdb58ebe automv: use lowercase for docstring title
Jun Wu <quark@fb.com>
parents: 29205
diff changeset
7 """check for unrecorded moves at commit time (EXPERIMENTAL)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
8
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
9 This extension checks at commit/amend time if any of the committed files
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
10 comes from an unrecorded mv.
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
11
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
12 The threshold at which a file is considered a move can be set with the
28152
5ec1ce8fdf0a automv: switch to specifying the similarity as an integer (0-100)
Martijn Pieters <mjpieters@fb.com>
parents: 28151
diff changeset
13 ``automv.similarity`` config option. This option takes a percentage between 0
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
14 (disabled) and 100 (files must be identical), the default is 95.
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
15
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
16 """
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
17
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
18 # Using 95 as a default similarity is based on an analysis of the mercurial
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
19 # repositories of the cpython, mozilla-central & mercurial repositories, as
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
20 # well as 2 very large facebook repositories. At 95 50% of all potential
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
21 # missed moves would be caught, as well as correspond with 87% of all
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
22 # explicitly marked moves. Together, 80% of moved files are 95% similar or
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
23 # more.
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
24 #
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
25 # See http://markmail.org/thread/5pxnljesvufvom57 for context.
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
26
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
27 from __future__ import absolute_import
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
28
29205
a0939666b836 py3: move up symbol imports to enforce import-checker rules
Yuya Nishihara <yuya@tcha.org>
parents: 28183
diff changeset
29 from mercurial.i18n import _
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
30 from mercurial import (
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
31 commands,
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
32 copies,
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
33 error,
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
34 extensions,
34971
38637dd39cfd py3: handle keyword arguments in hgext/automv.py
Pulkit Goyal <7895pulkit@gmail.com>
parents: 33188
diff changeset
35 pycompat,
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
36 registrar,
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
37 scmutil,
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
38 similar
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
39 )
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
40
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
41 configtable = {}
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
42 configitem = registrar.configitem(configtable)
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
43
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
44 configitem('automv', 'similarity',
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
45 default=95,
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
46 )
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
47
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
48 def extsetup(ui):
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
49 entry = extensions.wrapcommand(
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
50 commands.table, 'commit', mvcheck)
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
51 entry[1].append(
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
52 ('', 'no-automv', None,
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
53 _('disable automatic file move detection')))
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
54
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
55 def mvcheck(orig, ui, repo, *pats, **opts):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
56 """Hook to check for moves at commit time"""
34971
38637dd39cfd py3: handle keyword arguments in hgext/automv.py
Pulkit Goyal <7895pulkit@gmail.com>
parents: 33188
diff changeset
57 opts = pycompat.byteskwargs(opts)
28151
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
58 renames = None
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
59 disabled = opts.pop('no_automv', False)
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
60 if not disabled:
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
61 threshold = ui.configint('automv', 'similarity')
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
62 if not 0 <= threshold <= 100:
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
63 raise error.Abort(_('automv.similarity must be between 0 and 100'))
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
64 if threshold > 0:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
65 match = scmutil.match(repo[None], pats, opts)
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
66 added, removed = _interestingfiles(repo, match)
41660
f89aad980025 automv: respect ui.relative-paths
Martin von Zweigbergk <martinvonz@google.com>
parents: 34971
diff changeset
67 uipathfn = scmutil.getuipathfn(repo, legacyrelativevalue=True)
f89aad980025 automv: respect ui.relative-paths
Martin von Zweigbergk <martinvonz@google.com>
parents: 34971
diff changeset
68 renames = _findrenames(repo, uipathfn, added, removed,
28152
5ec1ce8fdf0a automv: switch to specifying the similarity as an integer (0-100)
Martijn Pieters <mjpieters@fb.com>
parents: 28151
diff changeset
69 threshold / 100.0)
28151
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
70
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
71 with repo.wlock():
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
72 if renames is not None:
28150
7a984cece04a automv: reuse existing scutil._markchanges() function
Martijn Pieters <mjpieters@fb.com>
parents: 28149
diff changeset
73 scmutil._markchanges(repo, (), (), renames)
34971
38637dd39cfd py3: handle keyword arguments in hgext/automv.py
Pulkit Goyal <7895pulkit@gmail.com>
parents: 33188
diff changeset
74 return orig(ui, repo, *pats, **pycompat.strkwargs(opts))
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
75
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
76 def _interestingfiles(repo, matcher):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
77 """Find what files were added or removed in this commit.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
78
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
79 Returns a tuple of two lists: (added, removed). Only files not *already*
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
80 marked as moved are included in the added list.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
81
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
82 """
28146
28024d0d42dc automv: simplify retrieving the status
Martijn Pieters <mjpieters@fb.com>
parents: 28129
diff changeset
83 stat = repo.status(match=matcher)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
84 added = stat[1]
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
85 removed = stat[2]
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
86
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
87 copy = copies._forwardcopies(repo['.'], repo[None], matcher)
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
88 # remove the copy files for which we already have copy info
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
89 added = [f for f in added if f not in copy]
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
90
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
91 return added, removed
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
92
41660
f89aad980025 automv: respect ui.relative-paths
Martin von Zweigbergk <martinvonz@google.com>
parents: 34971
diff changeset
93 def _findrenames(repo, uipathfn, added, removed, similarity):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
94 """Find what files in added are really moved files.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
95
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
96 Any file named in removed that is at least similarity% similar to a file
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
97 in added is seen as a rename.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
98
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
99 """
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
100 renames = {}
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
101 if similarity > 0:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
102 for src, dst, score in similar.findrenames(
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
103 repo, added, removed, similarity):
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
104 if repo.ui.verbose:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
105 repo.ui.status(
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
106 _('detected move of %s as %s (%d%% similar)\n') % (
41660
f89aad980025 automv: respect ui.relative-paths
Martin von Zweigbergk <martinvonz@google.com>
parents: 34971
diff changeset
107 uipathfn(src), uipathfn(dst), score * 100))
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
108 renames[dst] = src
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
109 if renames:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
110 repo.ui.status(_('detected move of %d files\n') % len(renames))
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
111 return renames