annotate hgext/automv.py @ 33649:377e8ddaebef stable

pathauditor: disable cache of audited paths by default (issue5628) The initial attempt was to discard cache when appropriate, but it appears to be error prone. We had to carefully inspect all places where audit() is called e.g. without actually updating filesystem, before removing files and directories, etc. So, this patch disables the cache of audited paths by default, and enables it only for the following cases: - short-lived auditor objects - repo.vfs, repo.svfs, and repo.cachevfs, which are managed directories and considered sort of append-only (a file/directory would never be replaced with a symlink) There would be more cacheable vfs objects (e.g. mq.queue.opener), but I decided not to inspect all of them in this patch. We can make them cached later. Benchmark result: - using old clone of http://selenic.com/repo/linux-2.6/ (38319 files) - on tmpfs - run HGRCPATH=/dev/null hg up -q --time tip && hg up -q null - try 4 times and take the last three results original: real 7.480 secs (user 1.140+22.760 sys 0.150+1.690) real 8.010 secs (user 1.070+22.280 sys 0.170+2.120) real 7.470 secs (user 1.120+22.390 sys 0.120+1.910) clearcache (the other series): real 7.680 secs (user 1.120+23.420 sys 0.140+1.970) real 7.670 secs (user 1.110+23.620 sys 0.130+1.810) real 7.740 secs (user 1.090+23.510 sys 0.160+1.940) enable cache only for vfs and svfs (this series): real 8.730 secs (user 1.500+25.190 sys 0.260+2.260) real 8.750 secs (user 1.490+25.170 sys 0.250+2.340) real 9.010 secs (user 1.680+25.340 sys 0.280+2.540) remove cache function at all (for reference): real 9.620 secs (user 1.440+27.120 sys 0.250+2.980) real 9.420 secs (user 1.400+26.940 sys 0.320+3.130) real 9.760 secs (user 1.530+27.270 sys 0.250+2.970)
author Yuya Nishihara <yuya@tcha.org>
date Wed, 26 Jul 2017 22:10:15 +0900
parents 54bc88c56ec8
children 38637dd39cfd
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
1 # automv.py
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
2 #
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
3 # Copyright 2013-2016 Facebook, Inc.
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
4 #
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
5 # This software may be used and distributed according to the terms of the
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
6 # GNU General Public License version 2 or any later version.
31599
e4aefdb58ebe automv: use lowercase for docstring title
Jun Wu <quark@fb.com>
parents: 29205
diff changeset
7 """check for unrecorded moves at commit time (EXPERIMENTAL)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
8
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
9 This extension checks at commit/amend time if any of the committed files
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
10 comes from an unrecorded mv.
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
11
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
12 The threshold at which a file is considered a move can be set with the
28152
5ec1ce8fdf0a automv: switch to specifying the similarity as an integer (0-100)
Martijn Pieters <mjpieters@fb.com>
parents: 28151
diff changeset
13 ``automv.similarity`` config option. This option takes a percentage between 0
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
14 (disabled) and 100 (files must be identical), the default is 95.
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
15
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
16 """
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
17
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
18 # Using 95 as a default similarity is based on an analysis of the mercurial
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
19 # repositories of the cpython, mozilla-central & mercurial repositories, as
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
20 # well as 2 very large facebook repositories. At 95 50% of all potential
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
21 # missed moves would be caught, as well as correspond with 87% of all
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
22 # explicitly marked moves. Together, 80% of moved files are 95% similar or
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
23 # more.
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
24 #
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
25 # See http://markmail.org/thread/5pxnljesvufvom57 for context.
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
26
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
27 from __future__ import absolute_import
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
28
29205
a0939666b836 py3: move up symbol imports to enforce import-checker rules
Yuya Nishihara <yuya@tcha.org>
parents: 28183
diff changeset
29 from mercurial.i18n import _
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
30 from mercurial import (
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
31 commands,
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
32 copies,
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
33 error,
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
34 extensions,
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
35 registrar,
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
36 scmutil,
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
37 similar
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
38 )
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
39
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
40 configtable = {}
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
41 configitem = registrar.configitem(configtable)
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
42
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
43 configitem('automv', 'similarity',
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
44 default=95,
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
45 )
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
46
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
47 def extsetup(ui):
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
48 entry = extensions.wrapcommand(
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
49 commands.table, 'commit', mvcheck)
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
50 entry[1].append(
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
51 ('', 'no-automv', None,
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
52 _('disable automatic file move detection')))
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
53
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
54 def mvcheck(orig, ui, repo, *pats, **opts):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
55 """Hook to check for moves at commit time"""
28151
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
56 renames = None
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
57 disabled = opts.pop('no_automv', False)
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
58 if not disabled:
33188
54bc88c56ec8 configitems: register the 'automv.similarity' config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 31599
diff changeset
59 threshold = ui.configint('automv', 'similarity')
28183
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
60 if not 0 <= threshold <= 100:
e07daee83029 automv: use 95 as the default similarity threshold
Martijn Pieters <mjpieters@fb.com>
parents: 28152
diff changeset
61 raise error.Abort(_('automv.similarity must be between 0 and 100'))
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
62 if threshold > 0:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
63 match = scmutil.match(repo[None], pats, opts)
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
64 added, removed = _interestingfiles(repo, match)
28152
5ec1ce8fdf0a automv: switch to specifying the similarity as an integer (0-100)
Martijn Pieters <mjpieters@fb.com>
parents: 28151
diff changeset
65 renames = _findrenames(repo, match, added, removed,
5ec1ce8fdf0a automv: switch to specifying the similarity as an integer (0-100)
Martijn Pieters <mjpieters@fb.com>
parents: 28151
diff changeset
66 threshold / 100.0)
28151
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
67
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
68 with repo.wlock():
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
69 if renames is not None:
28150
7a984cece04a automv: reuse existing scutil._markchanges() function
Martijn Pieters <mjpieters@fb.com>
parents: 28149
diff changeset
70 scmutil._markchanges(repo, (), (), renames)
28151
74e3d634a30e automv: do not release lock between marking files and the actual commit
Martijn Pieters <mjpieters@fb.com>
parents: 28150
diff changeset
71 return orig(ui, repo, *pats, **opts)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
72
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
73 def _interestingfiles(repo, matcher):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
74 """Find what files were added or removed in this commit.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
75
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
76 Returns a tuple of two lists: (added, removed). Only files not *already*
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
77 marked as moved are included in the added list.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
78
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
79 """
28146
28024d0d42dc automv: simplify retrieving the status
Martijn Pieters <mjpieters@fb.com>
parents: 28129
diff changeset
80 stat = repo.status(match=matcher)
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
81 added = stat[1]
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
82 removed = stat[2]
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
83
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
84 copy = copies._forwardcopies(repo['.'], repo[None], matcher)
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
85 # remove the copy files for which we already have copy info
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
86 added = [f for f in added if f not in copy]
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
87
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
88 return added, removed
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
89
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
90 def _findrenames(repo, matcher, added, removed, similarity):
28149
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
91 """Find what files in added are really moved files.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
92
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
93 Any file named in removed that is at least similarity% similar to a file
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
94 in added is seen as a rename.
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
95
d356d5250ab2 automv: improve function docstrings
Martijn Pieters <mjpieters@fb.com>
parents: 28148
diff changeset
96 """
28129
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
97 renames = {}
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
98 if similarity > 0:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
99 for src, dst, score in similar.findrenames(
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
100 repo, added, removed, similarity):
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
101 if repo.ui.verbose:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
102 repo.ui.status(
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
103 _('detected move of %s as %s (%d%% similar)\n') % (
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
104 matcher.rel(src), matcher.rel(dst), score * 100))
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
105 renames[dst] = src
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
106 if renames:
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
107 repo.ui.status(_('detected move of %d files\n') % len(renames))
7c40b4b7f8f1 automv: new experimental extension
Martijn Pieters <mjpieters@fb.com>
parents:
diff changeset
108 return renames