automv: use 95 as the default similarity threshold
The motivation for the change from 100 to 95 is included in a comment.
* Updated the tests to include a modification to a moved file that should
still be detected as a move.
* Use ui.configint() to handle non-integer configuration entries more
gracefully. Also abort if a similarity outside the acceptable range (0-100)
is set.
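
The validation described above can be sketched outside of Mercurial as
follows. This is only an illustration under stated assumptions: the names
parse_similarity, ConfigError and DEFAULT_SIMILARITY are hypothetical and do
not exist in Mercurial; the patch itself relies on ui.configint() and
error.Abort.

```python
# Hypothetical stand-in for the patch's validation logic; none of these
# names are part of Mercurial's API.
DEFAULT_SIMILARITY = 95


class ConfigError(Exception):
    """Raised when the similarity setting is malformed or out of range."""


def parse_similarity(raw=None, default=DEFAULT_SIMILARITY):
    """Return the similarity threshold as an int in the range 0..100.

    `raw` is the string read from the configuration, or None if unset.
    """
    if raw is None:
        # Option not set: fall back to the default.
        return default
    try:
        threshold = int(raw)
    except ValueError:
        # Non-integer entries produce a readable error, not a traceback.
        raise ConfigError('automv.similarity is not a valid integer: %r' % raw)
    if not 0 <= threshold <= 100:
        raise ConfigError('automv.similarity must be between 0 and 100')
    return threshold
```

With this sketch, parse_similarity() falls back to 95, parse_similarity('0')
yields 0 (which the extension treats as disabled), and both '110' and 'abc'
raise ConfigError.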
--- a/hgext/automv.py Fri Feb 19 22:28:09 2016 +0100
+++ b/hgext/automv.py Tue Feb 16 15:58:32 2016 +0000
@@ -11,14 +11,25 @@
The threshold at which a file is considered a move can be set with the
``automv.similarity`` config option. This option takes a percentage between 0
-(disabled) and 100 (files must be identical), the default is 100.
+(disabled) and 100 (files must be identical); the default is 95.
"""
+
+# The default similarity of 95 is based on an analysis of the Mercurial
+# repositories of cpython, mozilla-central and mercurial itself, as well
+# as 2 very large Facebook repositories. At a threshold of 95, 50% of all
+# potential missed moves would be caught, and the threshold is met by 87%
+# of all explicitly marked moves. Together, 80% of moved files are 95%
+# similar or more.
+#
+# See http://markmail.org/thread/5pxnljesvufvom57 for context.
+
from __future__ import absolute_import
from mercurial import (
commands,
copies,
+ error,
extensions,
scmutil,
similar
@@ -37,7 +48,9 @@
renames = None
disabled = opts.pop('no_automv', False)
if not disabled:
- threshold = float(ui.config('automv', 'similarity', '100'))
+ threshold = ui.configint('automv', 'similarity', 95)
+ if not 0 <= threshold <= 100:
+ raise error.Abort(_('automv.similarity must be between 0 and 100'))
if threshold > 0:
match = scmutil.match(repo[None], pats, opts)
added, removed = _interestingfiles(repo, match)
--- a/tests/test-automv.t Fri Feb 19 22:28:09 2016 +0100
+++ b/tests/test-automv.t Tue Feb 16 15:58:32 2016 +0000
@@ -13,7 +13,7 @@
Test automv command for commit
- $ echo 'foo' > a.txt
+ $ printf 'foo\nbar\nbaz\n' > a.txt
$ hg add a.txt
$ hg commit -m 'init repo with a'
@@ -37,6 +37,24 @@
$ mv a.txt b.txt
$ hg rm a.txt
$ hg add b.txt
+ $ printf '\n' >> b.txt
+ $ hg status -C
+ A b.txt
+ R a.txt
+ $ hg commit -m 'msg'
+ detected move of 1 files
+ created new head
+ $ hg status --change . -C
+ A b.txt
+ a.txt
+ R a.txt
+ $ hg up -r 0
+ 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
+
+mv/rm/add/modif
+ $ mv a.txt b.txt
+ $ hg rm a.txt
+ $ hg add b.txt
$ printf '\nfoo\n' >> b.txt
$ hg status -C
A b.txt
@@ -161,6 +179,29 @@
$ mv a.txt b.txt
$ hg rm a.txt
$ hg add b.txt
+ $ printf '\n' >> b.txt
+ $ hg status -C
+ A b.txt
+ R a.txt
+ $ hg commit --amend -m 'amended'
+ detected move of 1 files
+ saved backup bundle to $TESTTMP/repo/.hg/strip-backup/*-amend-backup.hg (glob)
+ $ hg status --change . -C
+ A b.txt
+ a.txt
+ A c.txt
+ R a.txt
+ $ hg up -r 0
+ 1 files updated, 0 files merged, 2 files removed, 0 files unresolved
+
+mv/rm/add/modif
+ $ echo 'c' > c.txt
+ $ hg add c.txt
+ $ hg commit -m 'revision to amend to'
+ created new head
+ $ mv a.txt b.txt
+ $ hg rm a.txt
+ $ hg add b.txt
$ printf '\nfoo\n' >> b.txt
$ hg status -C
A b.txt
@@ -285,3 +326,13 @@
$ hg status --change . -C
A b.txt
R a.txt
+
+error conditions
+
+ $ cat >> $HGRCPATH << EOF
+ > [automv]
+ > similarity=110
+ > EOF
+ $ hg commit -m 'revision to amend to'
+ abort: automv.similarity must be between 0 and 100
+ [255]