Mercurial > hg-stable
annotate tests/test-addremove-similar @ 4135:6cb6cfe43c5d
Avoid some false positives for addremove -s
The original code uses the similarity score
1 - len(diff(after, before)) / len(after)
The diff can at most be the size of the 'before' file, so any small
'before' file would be considered very similar. Removing an empty file
would cause all files added in the same revision to be considered
copies of the removed file.
This changes the metric to
bytes_overlap(before, after) / len(before + after)
i.e. the actual percentage of bytes shared between the two files.
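To make the difference between the two metrics concrete, here is a minimal Python 3 sketch. It is not the Mercurial implementation: `matched_bytes`, `old_score` and `new_score` are hypothetical names, `difflib` stands in for Mercurial's own diff code, the diff length is approximated as the bytes of 'before' that are not shared with 'after', and the factor of 2 in the new score is an assumption so that identical files score 1.0. Inputs are ASCII strings, so characters and bytes coincide.

```python
from difflib import SequenceMatcher


def matched_bytes(before: str, after: str) -> int:
    """Bytes of `before` lying in lines that also appear, in order, in `after`."""
    a = before.splitlines(keepends=True)
    b = after.splitlines(keepends=True)
    blocks = SequenceMatcher(None, a, b, autojunk=False).get_matching_blocks()
    return sum(len(line) for m in blocks for line in a[m.a:m.a + m.size])


def old_score(before: str, after: str) -> float:
    # Assumption: the diff carries roughly the bytes of `before` that are
    # not shared with `after`, so it can never be larger than `before`.
    if not after:
        return 0.0
    diff_len = len(before) - matched_bytes(before, after)
    return 1.0 - diff_len / len(after)


def new_score(before: str, after: str) -> float:
    # Assumption: shared bytes are counted once per file (the factor 2),
    # so two identical files score 1.0.
    total = len(before) + len(after)
    if total == 0:
        return 0.0
    return 2.0 * matched_bytes(before, after) / total


numbers = "".join(f"{x}\n" for x in range(1000))
print(old_score("", numbers))  # an empty 'before' looks like a perfect copy
print(new_score("", numbers))  # no shared bytes, so no similarity
```

Under these assumptions the old score rewards any tiny 'before' file, because the numerator is bounded by its size, while the new score only grows with the bytes the two files actually share.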
author | Erling Ellingsen <erlingalf@gmail.com> |
---|---|
date | Sun, 18 Feb 2007 20:39:25 +0100 |
parents | |
children | 736e49292809 |
rev | line source |
---|---|
4135 | 1 #!/bin/sh |
4135 | 2 |
4135 | 3 hg init rep; cd rep |
4135 | 4 |
4135 | 5 touch empty-file |
4135 | 6 python -c 'for x in range(10000): print x' > large-file |
4135 | 7 |
4135 | 8 hg addremove |
4135 | 9 |
4135 | 10 hg commit -m A |
4135 | 11 |
4135 | 12 rm large-file empty-file |
4135 | 13 python -c 'for x in range(10,10000): print x' > another-file |
4135 | 14 |
4135 | 15 hg addremove -s50 |
4135 | 16 |
4135 | 17 hg commit -m B |
4135 | 18 |
4135 | 19 cd .. |
4135 | 20 |
4135 | 21 hg init rep2; cd rep2 |
4135 | 22 |
4135 | 23 python -c 'for x in range(10000): print x' > large-file |
4135 | 24 python -c 'for x in range(50): print x' > tiny-file |
4135 | 25 |
4135 | 26 hg addremove |
4135 | 27 |
4135 | 28 hg commit -m A |
4135 | 29 |
4135 | 30 python -c 'for x in range(70): print x' > small-file |
4135 | 31 rm tiny-file |
4135 | 32 rm large-file |
4135 | 33 |
4135 | 34 hg addremove -s50 |
4135 | 35 |
4135 | 36 hg commit -m B |
4135 | 37 |
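The two scenarios in this test can be walked through offline with a short Python 3 sketch. It does not call Mercurial: `numbered_lines` and `find_renames` are hypothetical helpers, and `difflib.SequenceMatcher.ratio()` over lines is used as a stand-in for a byte-overlap score, so the numbers only approximate what `hg addremove -s50` would compute.

```python
from difflib import SequenceMatcher


def numbered_lines(start, stop):
    """Mirror the test's `python -c 'for x in range(...): print x'` output."""
    return [f"{x}\n" for x in range(start, stop)]


def find_renames(removed, added, threshold):
    """Hypothetical helper: pair each added file with its best removed match.

    `removed` and `added` map file names to lists of lines; `threshold` is a
    percentage, as in `-s50`.
    """
    renames = {}
    for new_name, new_lines in added.items():
        best_name, best_score = None, 0.0
        for old_name, old_lines in removed.items():
            if not old_lines:
                continue  # an empty file shares nothing with anything
            matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
            score = matcher.ratio()
            if score * 100 >= threshold and score > best_score:
                best_name, best_score = old_name, score
        if best_name is not None:
            renames[new_name] = best_name
    return renames


# First repository: empty-file and large-file removed, another-file added.
rep = find_renames(
    {"empty-file": [], "large-file": numbered_lines(0, 10000)},
    {"another-file": numbered_lines(10, 10000)},
    50,
)

# Second repository: tiny-file and large-file removed, small-file added.
rep2 = find_renames(
    {"tiny-file": numbered_lines(0, 50), "large-file": numbered_lines(0, 10000)},
    {"small-file": numbered_lines(0, 70)},
    50,
)

print(rep)
print(rep2)
```

With an overlap-style score, the removed empty file is never picked as the source of `another-file`, and `small-file` pairs with `tiny-file` rather than with the much larger `large-file`.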