contrib/dumprevlog
author Mads Kiilerich <madski@unity3d.com>
Mon, 28 Jan 2013 15:19:44 +0100
branchstable
changeset 18488 a977b42df8b3
parent 14233 659f34b833b9
child 29165 a212ca70205c
permissions -rwxr-xr-x
largefiles: don't verify largefile hashes on servers when processing statlfile When changesets referencing largefiles are pushed then the corresponding largefiles will be pushed too - unless the target already has them. The client will use statlfile to make sure it only sends largefiles that the target doesn't have. The server would however on every statlfile check that the content of the largefile had the expected hash. What should be cheap thus became an expensive operation that trashed the disk and the cache. Largefile hashes are already checked by putlfile before being stored on the server. A server should thus be able to keep its largefile store free of errors - even more than it can keep revlogs free of errors. Verification should happen when running 'hg verify' locally on the server. Rehashing every largefile on every remote stat is too expensive. Clients will also stat lfiles before downloading them. When the server verified the hash in stat it meant that it had to read the file twice to serve it. With this change the server will assume its own hashes are ok without checking them on every statlfile. Some consequences of this change: - in case of server side corruption the problem will be detected by the existing check on the client side - not on server side - clients that could upload an uncorrupted largefile when pushing will no longer magically heal the server (and break hardlinks) - a client will now only upload its uncorrupted files after the corrupted file has been removed on the server side - client side verify will no longer report corruption in files it doesn't have (Issue3123 discussed related problems - and how they have been fixed.)
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     1
#!/usr/bin/env python
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     2
# Dump revlogs as raw data stream
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     3
# $ find .hg/store/ -name "*.i" | xargs dumprevlog > repo.dump
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     4
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     5
import sys
6466
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
     6
from mercurial import revlog, node, util
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
     7
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
     8
for fp in (sys.stdin, sys.stdout, sys.stderr):
14233
659f34b833b9 rename util.set_binary to setbinary
Adrian Buehlmann <adrian@cadifra.com>
parents: 7361
diff changeset
     9
    util.setbinary(fp)
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    10
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    11
for f in sys.argv[1:]:
6466
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
    12
    binopen = lambda fn: open(fn, 'rb')
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
    13
    r = revlog.revlog(binopen, f)
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    14
    print "file:", f
6750
fb42030d79d6 add __len__ and __iter__ methods to repo and revlog
Matt Mackall <mpm@selenic.com>
parents: 6466
diff changeset
    15
    for i in r:
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    16
        n = r.node(i)
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    17
        p = r.parents(n)
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    18
        d = r.revision(n)
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    19
        print "node:", node.hex(n)
7361
9fe97eea5510 linkrev: take a revision number rather than a hash
Matt Mackall <mpm@selenic.com>
parents: 6750
diff changeset
    20
        print "linkrev:", r.linkrev(i)
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    21
        print "parents:", node.hex(p[0]), node.hex(p[1])
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    22
        print "length:", len(d)
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    23
        print "-start-"
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    24
        print d
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    25
        print "-end-"