hgext/censor.py
author Gregory Szorc <gregory.szorc@gmail.com>
Sat, 26 Sep 2015 21:43:13 -0700
changeset 26380 56a640b0f656
parent 25806 5e18f6e39006
child 26587 56b2bcea2529
permissions -rw-r--r--
revlog: don't flush data file after every added revision The current behavior of revlogs is to flush the data file when writing data to it. Tracing system calls revealed that changegroup processing incurred numerous write(2) calls for values much smaller than the default buffer size (Python defaults to 4096, but it can be adjusted based on detected block size at run time by CPython). The reason we flush revlogs is so readers have all data available. For example, the current code in revlog.py will re-open the revlog file (instead of seeking an existing file handle) to read the text of a revision. This happens when starting a new delta chain when adding several revisions from changegroups, for example. Yes, this is likely sub-optimal (we should probably be sharing file descriptors between readers and writers to avoid the flushing and associated overhead of re-opening files). While flushing revlogs is necessary, it appears all callers are diligent about flushing files before a read is performed (see buildtext() in _addrevision()), making the flush in _writeentry() redundant and unncessary. So, we remove it. In practice, this means we incur a write(2) a) when the buffer is full (typically 4096 bytes) b) when a new delta chain is created rather than after every added revision. This applies to every revlog, but by volume it mostly impacts filelogs. Removing the redundant flush from _writeentry() significantly reduces the number of write(2) calls during changegroup processing on my Linux machine. When applying a changegroup of the hg repo based on my local repo, the total number of write(2) calls during application of the mercurial/localrepo.py revlogs dropped from 1,320 to 217 with this patch applied. Total I/O related system calls dropped from 1,577 to 474. When unbundling a mozilla-central gzipped bundle (264,403 changesets with 1,492,215 changes to 222,507 files), total write(2) calls dropped from 1,252,881 to 827,106 and total system calls dropped from 3,601,259 to 3,178,636 - a reduction of 425,775! While the system call reduction is significant, it appears to have no impact on wall time on my Linux and Windows machines. Still, fewer syscalls is fewer syscalls. Surely this can't hurt. If nothing else, it makes examining remaining system call usage simpler and opens the door to experimenting with the performance impact of different buffer sizes.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     1
# Copyright (C) 2015 - Mike Edgar <adgar@google.com>
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     2
#
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     3
# This extension enables removal of file content at a given revision,
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     4
# rewriting the data/metadata of successive revisions to preserve revision log
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     5
# integrity.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     6
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     7
"""erase file content at a given revision
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     8
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     9
The censor command instructs Mercurial to erase all content of a file at a given
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    10
revision *without updating the changeset hash.* This allows existing history to
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    11
remain valid while preventing future clones/pulls from receiving the erased
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    12
data.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    13
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    14
Typical uses for censor are due to security or legal requirements, including::
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    15
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    16
 * Passwords, private keys, crytographic material
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    17
 * Licensed data/code/libraries for which the license has expired
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    18
 * Personally Identifiable Information or other private data
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    19
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    20
Censored nodes can interrupt mercurial's typical operation whenever the excised
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    21
data needs to be materialized. Some commands, like ``hg cat``/``hg revert``,
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    22
simply fail when asked to produce censored data. Others, like ``hg verify`` and
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    23
``hg update``, must be capable of tolerating censored data to continue to
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    24
function in a meaningful way. Such commands only tolerate censored file
24890
cba84b06b702 censor: fix incorrect configuration name for ignoring error at censored file
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 24880
diff changeset
    25
revisions if they are allowed by the "censor.policy=ignore" config option.
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    26
"""
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    27
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    28
from mercurial.node import short
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    29
from mercurial import cmdutil, error, filelog, revlog, scmutil, util
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    30
from mercurial.i18n import _
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    31
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    32
cmdtable = {}
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    33
command = cmdutil.command(cmdtable)
25186
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    34
# Note for extension authors: ONLY specify testedwith = 'internal' for
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    35
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    36
# be specifying the version(s) of Mercurial they are tested with, or
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    37
# leave the attribute unspecified.
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    38
testedwith = 'internal'
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    39
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    40
@command('censor',
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    41
    [('r', 'rev', '', _('censor file from specified revision'), _('REV')),
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    42
     ('t', 'tombstone', '', _('replacement tombstone data'), _('TEXT'))],
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    43
    _('-r REV [-t TEXT] [FILE]'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    44
def censor(ui, repo, path, rev='', tombstone='', **opts):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    45
    if not path:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    46
        raise util.Abort(_('must specify file path to censor'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    47
    if not rev:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    48
        raise util.Abort(_('must specify revision to censor'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    49
25806
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    50
    wctx = repo[None]
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    51
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    52
    m = scmutil.match(wctx, (path,))
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    53
    if m.anypats() or len(m.files()) != 1:
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    54
        raise util.Abort(_('can only specify an explicit filename'))
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    55
    path = m.files()[0]
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    56
    flog = repo.file(path)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    57
    if not len(flog):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    58
        raise util.Abort(_('cannot censor file with no history'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    59
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    60
    rev = scmutil.revsingle(repo, rev, rev).rev()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    61
    try:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    62
        ctx = repo[rev]
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    63
    except KeyError:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    64
        raise util.Abort(_('invalid revision identifier %s') % rev)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    65
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    66
    try:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    67
        fctx = ctx.filectx(path)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    68
    except error.LookupError:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    69
        raise util.Abort(_('file does not exist at revision %s') % rev)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    70
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    71
    fnode = fctx.filenode()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    72
    headctxs = [repo[c] for c in repo.heads()]
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    73
    heads = [c for c in headctxs if path in c and c.filenode(path) == fnode]
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    74
    if heads:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    75
        headlist = ', '.join([short(c.node()) for c in heads])
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    76
        raise util.Abort(_('cannot censor file in heads (%s)') % headlist,
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    77
            hint=_('clean/delete and commit first'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    78
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    79
    wp = wctx.parents()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    80
    if ctx.node() in [p.node() for p in wp]:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    81
        raise util.Abort(_('cannot censor working directory'),
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    82
            hint=_('clean/delete/update first'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    83
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    84
    flogv = flog.version & 0xFFFF
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    85
    if flogv != revlog.REVLOGNG:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    86
        raise util.Abort(
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    87
            _('censor does not support revlog version %d') % (flogv,))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    88
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    89
    tombstone = filelog.packmeta({"censored": tombstone}, "")
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    90
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    91
    crev = fctx.filerev()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    92
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    93
    if len(tombstone) > flog.rawsize(crev):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    94
        raise util.Abort(_(
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    95
            'censor tombstone must be no longer than censored data'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    96
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    97
    # Using two files instead of one makes it easy to rewrite entry-by-entry
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    98
    idxread = repo.svfs(flog.indexfile, 'r')
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    99
    idxwrite = repo.svfs(flog.indexfile, 'wb', atomictemp=True)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   100
    if flog.version & revlog.REVLOGNGINLINEDATA:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   101
        dataread, datawrite = idxread, idxwrite
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   102
    else:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   103
        dataread = repo.svfs(flog.datafile, 'r')
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   104
        datawrite = repo.svfs(flog.datafile, 'wb', atomictemp=True)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   105
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   106
    # Copy all revlog data up to the entry to be censored.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   107
    rio = revlog.revlogio()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   108
    offset = flog.start(crev)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   109
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   110
    for chunk in util.filechunkiter(idxread, limit=crev * rio.size):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   111
        idxwrite.write(chunk)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   112
    for chunk in util.filechunkiter(dataread, limit=offset):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   113
        datawrite.write(chunk)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   114
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   115
    def rewriteindex(r, newoffs, newdata=None):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   116
        """Rewrite the index entry with a new data offset and optional new data.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   117
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   118
        The newdata argument, if given, is a tuple of three positive integers:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   119
        (new compressed, new uncompressed, added flag bits).
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   120
        """
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   121
        offlags, comp, uncomp, base, link, p1, p2, nodeid = flog.index[r]
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   122
        flags = revlog.gettype(offlags)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   123
        if newdata:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   124
            comp, uncomp, nflags = newdata
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   125
            flags |= nflags
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   126
        offlags = revlog.offset_type(newoffs, flags)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   127
        e = (offlags, comp, uncomp, r, link, p1, p2, nodeid)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   128
        idxwrite.write(rio.packentry(e, None, flog.version, r))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   129
        idxread.seek(rio.size, 1)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   130
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   131
    def rewrite(r, offs, data, nflags=revlog.REVIDX_DEFAULT_FLAGS):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   132
        """Write the given full text to the filelog with the given data offset.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   133
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   134
        Returns:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   135
            The integer number of data bytes written, for tracking data offsets.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   136
        """
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   137
        flag, compdata = flog.compress(data)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   138
        newcomp = len(flag) + len(compdata)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   139
        rewriteindex(r, offs, (newcomp, len(data), nflags))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   140
        datawrite.write(flag)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   141
        datawrite.write(compdata)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   142
        dataread.seek(flog.length(r), 1)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   143
        return newcomp
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   144
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   145
    # Rewrite censored revlog entry with (padded) tombstone data.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   146
    pad = ' ' * (flog.rawsize(crev) - len(tombstone))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   147
    offset += rewrite(crev, offset, tombstone + pad, revlog.REVIDX_ISCENSORED)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   148
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   149
    # Rewrite all following filelog revisions fixing up offsets and deltas.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   150
    for srev in xrange(crev + 1, len(flog)):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   151
        if crev in flog.parentrevs(srev):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   152
            # Immediate children of censored node must be re-added as fulltext.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   153
            try:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   154
                revdata = flog.revision(srev)
25660
328739ea70c3 global: mass rewrite to use modern exception syntax
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25186
diff changeset
   155
            except error.CensoredNodeError as e:
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   156
                revdata = e.tombstone
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   157
            dlen = rewrite(srev, offset, revdata)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   158
        else:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   159
            # Copy any other revision data verbatim after fixing up the offset.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   160
            rewriteindex(srev, offset)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   161
            dlen = flog.length(srev)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   162
            for chunk in util.filechunkiter(dataread, limit=dlen):
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   163
                datawrite.write(chunk)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   164
        offset += dlen
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   165
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   166
    idxread.close()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   167
    idxwrite.close()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   168
    if dataread is not idxread:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   169
        dataread.close()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   170
        datawrite.close()