tests/test-addremove-similar
author David Greenaway <hg-dev@davidgreenaway.com>
Sat, 03 Apr 2010 11:58:16 +1100
changeset 11060 e6df01776e08
parent 8489 1a96f1d9599b
permissions -rwxr-xr-x
findrenames: Optimise "addremove -s100" by matching files by their SHA1 hashes. We speed up 'findrenames' for the usecase when a user specifies they want a similarity of 100% by matching files by their exact SHA1 hash value. This reduces the number of comparisons required to find exact matches from O(n^2) to O(n). While it would be nice if we could just use mercurial's pre-calculated SHA1 hash for existing files, this hash includes the file's ancestor information making it unsuitable for our purposes. Instead, we calculate the hash of old content from scratch. The following benchmarks were taken on the current head of crew: addremove 100% similarity: rm -rf *; hg up -C; mv tests tests.new hg --time addremove -s100 --dry-run before: real 176.350 secs (user 128.890+0.000 sys 47.430+0.000) after: real 2.130 secs (user 1.890+0.000 sys 0.240+0.000) addremove 75% similarity: rm -rf *; hg up -C; mv tests tests.new; \ for i in tests.new/*; do echo x >> $i; done hg --time addremove -s75 --dry-run before: real 264.560 secs (user 215.130+0.000 sys 49.410+0.000) after: real 218.710 secs (user 172.790+0.000 sys 45.870+0.000)
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
4135
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     1
#!/bin/sh
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     2
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     3
hg init rep; cd rep
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     4
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     5
touch empty-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     6
python -c 'for x in range(10000): print x' > large-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     7
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     8
hg addremove
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
     9
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    10
hg commit -m A
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    11
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    12
rm large-file empty-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    13
python -c 'for x in range(10,10000): print x' > another-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    14
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    15
hg addremove -s50
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    16
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    17
hg commit -m B
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    18
4472
736e49292809 addremove: comparing two empty files caused ZeroDivisionError
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4135
diff changeset
    19
echo % comparing two empty files caused ZeroDivisionError in the past
736e49292809 addremove: comparing two empty files caused ZeroDivisionError
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4135
diff changeset
    20
hg update -C 0
736e49292809 addremove: comparing two empty files caused ZeroDivisionError
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4135
diff changeset
    21
rm empty-file
736e49292809 addremove: comparing two empty files caused ZeroDivisionError
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4135
diff changeset
    22
touch another-empty-file
736e49292809 addremove: comparing two empty files caused ZeroDivisionError
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4135
diff changeset
    23
hg addremove -s50
736e49292809 addremove: comparing two empty files caused ZeroDivisionError
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4135
diff changeset
    24
4135
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    25
cd ..
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    26
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    27
hg init rep2; cd rep2
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    28
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    29
python -c 'for x in range(10000): print x' > large-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    30
python -c 'for x in range(50): print x' > tiny-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    31
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    32
hg addremove
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    33
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    34
hg commit -m A
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    35
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    36
python -c 'for x in range(70): print x' > small-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    37
rm tiny-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    38
rm large-file
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    39
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    40
hg addremove -s50
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    41
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    42
hg commit -m B
6cb6cfe43c5d Avoid some false positives for addremove -s
Erling Ellingsen <erlingalf@gmail.com>
parents:
diff changeset
    43
4966
8d982aef0be1 addremove: print meaningful error message if --similar not numeric
Bryan O'Sullivan <bos@serpentine.com>
parents: 4472
diff changeset
    44
echo % should all fail
8d982aef0be1 addremove: print meaningful error message if --similar not numeric
Bryan O'Sullivan <bos@serpentine.com>
parents: 4472
diff changeset
    45
hg addremove -s foo
8d982aef0be1 addremove: print meaningful error message if --similar not numeric
Bryan O'Sullivan <bos@serpentine.com>
parents: 4472
diff changeset
    46
hg addremove -s -1
8d982aef0be1 addremove: print meaningful error message if --similar not numeric
Bryan O'Sullivan <bos@serpentine.com>
parents: 4472
diff changeset
    47
hg addremove -s 1e6
8d982aef0be1 addremove: print meaningful error message if --similar not numeric
Bryan O'Sullivan <bos@serpentine.com>
parents: 4472
diff changeset
    48
8489
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    49
cd ..
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    50
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    51
echo '% issue 1527'
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    52
hg init rep3; cd rep3
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    53
mkdir d
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    54
echo a > d/a
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    55
hg add d/a
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    56
hg commit -m 1
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    57
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    58
mv d/a d/b
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    59
hg addremove -s80
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    60
hg debugstate
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    61
mv d/b c
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    62
echo "% no copies found here (since the target isn't in d"
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    63
hg addremove -s80 d
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    64
echo "% copies here"
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    65
hg addremove -s80
1a96f1d9599b addremove/findrenames: find renames according to the match object (issue1527)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 4966
diff changeset
    66
4966
8d982aef0be1 addremove: print meaningful error message if --similar not numeric
Bryan O'Sullivan <bos@serpentine.com>
parents: 4472
diff changeset
    67
true