contrib/dumprevlog
author Nicolas Dumazet <nicdumz.commits@gmail.com>
Tue, 26 May 2009 23:00:35 +0900
changeset 9115 b55d44719b47
parent 7361 9fe97eea5510
child 14233 659f34b833b9
permissions -rw-r--r--
inotify: server: new data structure to keep track of changes. == Rationale for the new structure == Current structure was a dictionary tree. One directory was tracked as a dictionary: - keys: file/subdir name - values: - for a file, the status (a/r/m/...) - for a subdir, the directory representing the subdir It allowed efficient lookups, no matter of the type of the terminal leaf: for part in path.split('/'): tree = tree[part] However, there is no way to represent a directory and a file with the same name because keys are conflicting in the dictionary. Concrete example: Initial state: root dir |- foo (file) |- bar (file) # data state is: {'foo': 'n', 'bar': 'n'} Remove foo: root dir |- bar (file) # Data becomes {'foo': 'r'} until next commit. Add foo, as a directory, and foo/barbar file: root dir |- bar (file) |-> foo (dir) |- barbar (file) # New state should be represented as: {'foo': {'barbar': 'a'}, 'bar': 'n'} however, the key "foo" is already used and represents the old file. The dirstate: D foo A foo/barbar cannot be represented, hence the need for a new structure. == The new structure == 'directory' class. Represents one directory level. * Notable attributes: Two dictionaries: - 'files' Maps filename -> status for the current dir. - 'dirs' Maps subdir's name -> directory object representing the subdir * methods - walk(), formerly server.walk - lookup(), old server.lookup - dir(), old server.dir This new class allows embedding all the tree walks/lookups in its own class, instead of having everything mixed together in server. Incidently, since files and directories are not stored in the same dictionaries, we are solving the previous key conflict problem. The small drawback is that lookup operation is a bit more complex: for a path a/b/c/d/e we have to check twice the leaf, if e is a directory or a file.

#!/usr/bin/env python
# Dump revlogs as raw data stream
# $ find .hg/store/ -name "*.i" | xargs dumprevlog > repo.dump

import sys
from mercurial import revlog, node, util

for fp in (sys.stdin, sys.stdout, sys.stderr):
    util.set_binary(fp)

for f in sys.argv[1:]:
    binopen = lambda fn: open(fn, 'rb')
    r = revlog.revlog(binopen, f)
    print "file:", f
    for i in r:
        n = r.node(i)
        p = r.parents(n)
        d = r.revision(n)
        print "node:", node.hex(n)
        print "linkrev:", r.linkrev(i)
        print "parents:", node.hex(p[0]), node.hex(p[1])
        print "length:", len(d)
        print "-start-"
        print d
        print "-end-"