tests/md5sum.py
author Kyle Lippincott <spectral@google.com>
Wed, 31 Mar 2021 12:46:54 -0700
changeset 46872 8bca353b1ebc
parent 45830 c102b704edb5
child 48875 6000f5b25c9b
permissions -rwxr-xr-x
match: convert O(n) to O(log n) in exactmatcher.visitchildrenset When using narrow, during rebase this is called (at least) once per directory in the set of files in the commit being rebased. Every time it's called, we did the set arithmetic (now extracted and cached), which was probably pretty cheap but not necessary to repeat each time, looped over every item in the matcher and kept things that started with the directory we were querying. With very large narrowspecs, and a commit that touched a file in a large number of directories, this was slow. In a pathological repo, the rebase of a single commit (that touched over 17k files, I believe in approximately as many directories) with a narrowspec that had >32k entries took 8,246s of profiled time, with 5,007s of that spent in visitchildrenset (transitively). With this change, the time spent in visitchildrenset is less than 34s (which is where my profile cut off). Most of the remaining time was network access due to our custom remotefilelog-based setup not properly prefetching. Differential Revision: https://phab.mercurial-scm.org/D10294
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
45830
c102b704edb5 global: use python3 in shebangs
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43076
diff changeset
     1
#!/usr/bin/env python3
1928
50e1c90b0fcf clarify license on md5sum.py
Peter van Dijk <peter@dataloss.nl>
parents: 1924
diff changeset
     2
#
50e1c90b0fcf clarify license on md5sum.py
Peter van Dijk <peter@dataloss.nl>
parents: 1924
diff changeset
     3
# Based on python's Tools/scripts/md5sum.py
50e1c90b0fcf clarify license on md5sum.py
Peter van Dijk <peter@dataloss.nl>
parents: 1924
diff changeset
     4
#
50e1c90b0fcf clarify license on md5sum.py
Peter van Dijk <peter@dataloss.nl>
parents: 1924
diff changeset
     5
# This software may be used and distributed according to the terms
50e1c90b0fcf clarify license on md5sum.py
Peter van Dijk <peter@dataloss.nl>
parents: 1924
diff changeset
     6
# of the PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2, which is
50e1c90b0fcf clarify license on md5sum.py
Peter van Dijk <peter@dataloss.nl>
parents: 1924
diff changeset
     7
# GPL-compatible.
50e1c90b0fcf clarify license on md5sum.py
Peter van Dijk <peter@dataloss.nl>
parents: 1924
diff changeset
     8
29485
6a98f9408a50 py3: make files use absolute_import and print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 25660
diff changeset
     9
from __future__ import absolute_import
6a98f9408a50 py3: make files use absolute_import and print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 25660
diff changeset
    10
33873
904bc1dc2694 md5sum: assume hashlib exists now that we're 2.7 only
Augie Fackler <raf@durin42.com>
parents: 32852
diff changeset
    11
import hashlib
29485
6a98f9408a50 py3: make files use absolute_import and print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 25660
diff changeset
    12
import os
6a98f9408a50 py3: make files use absolute_import and print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 25660
diff changeset
    13
import sys
6470
ac0bcd951c2c python 2.6 compatibility: compatibility wrappers for hash functions
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
diff changeset
    14
ac0bcd951c2c python 2.6 compatibility: compatibility wrappers for hash functions
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6212
diff changeset
    15
try:
7080
a6477aa893b8 tests: Windows compatibility fixes
Patrick Mezard <pmezard@gmail.com>
parents: 6470
diff changeset
    16
    import msvcrt
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 33873
diff changeset
    17
7080
a6477aa893b8 tests: Windows compatibility fixes
Patrick Mezard <pmezard@gmail.com>
parents: 6470
diff changeset
    18
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
a6477aa893b8 tests: Windows compatibility fixes
Patrick Mezard <pmezard@gmail.com>
parents: 6470
diff changeset
    19
    msvcrt.setmode(sys.stderr.fileno(), os.O_BINARY)
a6477aa893b8 tests: Windows compatibility fixes
Patrick Mezard <pmezard@gmail.com>
parents: 6470
diff changeset
    20
except ImportError:
a6477aa893b8 tests: Windows compatibility fixes
Patrick Mezard <pmezard@gmail.com>
parents: 6470
diff changeset
    21
    pass
a6477aa893b8 tests: Windows compatibility fixes
Patrick Mezard <pmezard@gmail.com>
parents: 6470
diff changeset
    22
1924
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    23
for filename in sys.argv[1:]:
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    24
    try:
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    25
        fp = open(filename, 'rb')
25660
328739ea70c3 global: mass rewrite to use modern exception syntax
Gregory Szorc <gregory.szorc@gmail.com>
parents: 14494
diff changeset
    26
    except IOError as msg:
1924
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    27
        sys.stderr.write('%s: Can\'t open: %s\n' % (filename, msg))
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    28
        sys.exit(1)
3223
53e843840349 Whitespace/Tab cleanup
Thomas Arendsen Hein <thomas@intevation.de>
parents: 1928
diff changeset
    29
33873
904bc1dc2694 md5sum: assume hashlib exists now that we're 2.7 only
Augie Fackler <raf@durin42.com>
parents: 32852
diff changeset
    30
    m = hashlib.md5()
1924
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    31
    try:
32852
3a64ac39b893 md5sum: adapt for python 3 support
Augie Fackler <augie@google.com>
parents: 29731
diff changeset
    32
        for data in iter(lambda: fp.read(8192), b''):
1924
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    33
            m.update(data)
25660
328739ea70c3 global: mass rewrite to use modern exception syntax
Gregory Szorc <gregory.szorc@gmail.com>
parents: 14494
diff changeset
    34
    except IOError as msg:
1924
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    35
        sys.stderr.write('%s: I/O error: %s\n' % (filename, msg))
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    36
        sys.exit(1)
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    37
    sys.stdout.write('%s  %s\n' % (m.hexdigest(), filename))
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    38
46fb38ef9a91 add md5sum.py required by fix in previous changeset
Peter van Dijk <peter@dataloss.nl>
parents:
diff changeset
    39
sys.exit(0)