hgext/censor.py
author Simon Sapin <simon.sapin@octobus.net>
Mon, 12 Jul 2021 22:46:52 +0200
changeset 47675 48aec076b8fb
parent 43434 bec734015b70
child 48118 5105a9975407
permissions -rw-r--r--
dirstate-v2: Enforce data size read from the docket file The data file may not be shorter than its size given by the docket. It may be longer, but additional data is ignored. Differential Revision: https://phab.mercurial-scm.org/D11089
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     1
# Copyright (C) 2015 - Mike Edgar <adgar@google.com>
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     2
#
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     3
# This extension enables removal of file content at a given revision,
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     4
# rewriting the data/metadata of successive revisions to preserve revision log
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     5
# integrity.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     6
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     7
"""erase file content at a given revision
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     8
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     9
The censor command instructs Mercurial to erase all content of a file at a given
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    10
revision *without updating the changeset hash.* This allows existing history to
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    11
remain valid while preventing future clones/pulls from receiving the erased
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    12
data.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    13
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    14
Typical uses for censor are due to security or legal requirements, including::
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    15
26781
1aee2ab0f902 spelling: trivial spell checking
Mads Kiilerich <madski@unity3d.com>
parents: 26587
diff changeset
    16
 * Passwords, private keys, cryptographic material
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    17
 * Licensed data/code/libraries for which the license has expired
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    18
 * Personally Identifiable Information or other private data
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    19
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    20
Censored nodes can interrupt mercurial's typical operation whenever the excised
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    21
data needs to be materialized. Some commands, like ``hg cat``/``hg revert``,
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    22
simply fail when asked to produce censored data. Others, like ``hg verify`` and
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    23
``hg update``, must be capable of tolerating censored data to continue to
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    24
function in a meaningful way. Such commands only tolerate censored file
24890
cba84b06b702 censor: fix incorrect configuration name for ignoring error at censored file
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 24880
diff changeset
    25
revisions if they are allowed by the "censor.policy=ignore" config option.
43434
bec734015b70 censor: document that some commands simply ignore censored data
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 43077
diff changeset
    26
bec734015b70 censor: document that some commands simply ignore censored data
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 43077
diff changeset
    27
A few informative commands such as ``hg grep`` will unconditionally
bec734015b70 censor: document that some commands simply ignore censored data
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 43077
diff changeset
    28
ignore censored data and merely report that it was encountered.
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    29
"""
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    30
28092
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    31
from __future__ import absolute_import
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    32
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    33
from mercurial.i18n import _
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    34
from mercurial.node import short
28092
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    35
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    36
from mercurial import (
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    37
    error,
32337
46ba2cdda476 registrar: move cmdutil.command to registrar module (API)
Yuya Nishihara <yuya@tcha.org>
parents: 32315
diff changeset
    38
    registrar,
28092
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    39
    scmutil,
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    40
)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    41
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    42
cmdtable = {}
32337
46ba2cdda476 registrar: move cmdutil.command to registrar module (API)
Yuya Nishihara <yuya@tcha.org>
parents: 32315
diff changeset
    43
command = registrar.command(cmdtable)
29841
d5883fd055c6 extensions: change magic "shipped with hg" string
Augie Fackler <augie@google.com>
parents: 28092
diff changeset
    44
# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
25186
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    45
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    46
# be specifying the version(s) of Mercurial they are tested with, or
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    47
# leave the attribute unspecified.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    48
testedwith = b'ships-with-hg-core'
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    49
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
    50
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
    51
@command(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    52
    b'censor',
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
    53
    [
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    54
        (
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    55
            b'r',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    56
            b'rev',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    57
            b'',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    58
            _(b'censor file from specified revision'),
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    59
            _(b'REV'),
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    60
        ),
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    61
        (b't', b'tombstone', b'', _(b'replacement tombstone data'), _(b'TEXT')),
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
    62
    ],
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    63
    _(b'-r REV [-t TEXT] [FILE]'),
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
    64
    helpcategory=command.CATEGORY_MAINTENANCE,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
    65
)
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    66
def censor(ui, repo, path, rev=b'', tombstone=b'', **opts):
38441
e219e355e088 censor: use context manager for lock management
Matt Harbison <matt_harbison@yahoo.com>
parents: 37442
diff changeset
    67
    with repo.wlock(), repo.lock():
27290
525d9b3f0a31 censor: make censor acquire locks before processing
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 26781
diff changeset
    68
        return _docensor(ui, repo, path, rev, tombstone, **opts)
525d9b3f0a31 censor: make censor acquire locks before processing
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 26781
diff changeset
    69
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
    70
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    71
def _docensor(ui, repo, path, rev=b'', tombstone=b'', **opts):
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    72
    if not path:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    73
        raise error.Abort(_(b'must specify file path to censor'))
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    74
    if not rev:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    75
        raise error.Abort(_(b'must specify revision to censor'))
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    76
25806
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    77
    wctx = repo[None]
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    78
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    79
    m = scmutil.match(wctx, (path,))
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    80
    if m.anypats() or len(m.files()) != 1:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    81
        raise error.Abort(_(b'can only specify an explicit filename'))
25806
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    82
    path = m.files()[0]
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    83
    flog = repo.file(path)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    84
    if not len(flog):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    85
        raise error.Abort(_(b'cannot censor file with no history'))
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    86
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    87
    rev = scmutil.revsingle(repo, rev, rev).rev()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    88
    try:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    89
        ctx = repo[rev]
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    90
    except KeyError:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    91
        raise error.Abort(_(b'invalid revision identifier %s') % rev)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    92
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    93
    try:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    94
        fctx = ctx.filectx(path)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    95
    except error.LookupError:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    96
        raise error.Abort(_(b'file does not exist at revision %s') % rev)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    97
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    98
    fnode = fctx.filenode()
39615
a658f97c1ce4 censor: use a reasonable amount of memory
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 38783
diff changeset
    99
    heads = []
a658f97c1ce4 censor: use a reasonable amount of memory
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 38783
diff changeset
   100
    for headnode in repo.heads():
39661
8bfbb25859f1 censor: rename loop variable to silence pyflakes warning
Yuya Nishihara <yuya@tcha.org>
parents: 39615
diff changeset
   101
        hc = repo[headnode]
8bfbb25859f1 censor: rename loop variable to silence pyflakes warning
Yuya Nishihara <yuya@tcha.org>
parents: 39615
diff changeset
   102
        if path in hc and hc.filenode(path) == fnode:
8bfbb25859f1 censor: rename loop variable to silence pyflakes warning
Yuya Nishihara <yuya@tcha.org>
parents: 39615
diff changeset
   103
            heads.append(hc)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   104
    if heads:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   105
        headlist = b', '.join([short(c.node()) for c in heads])
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
   106
        raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   107
            _(b'cannot censor file in heads (%s)') % headlist,
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   108
            hint=_(b'clean/delete and commit first'),
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
   109
        )
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   110
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   111
    wp = wctx.parents()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   112
    if ctx.node() in [p.node() for p in wp]:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
   113
        raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   114
            _(b'cannot censor working directory'),
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   115
            hint=_(b'clean/delete/update first'),
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40293
diff changeset
   116
        )
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
   117
39778
a6b3c4c1019f revlog: move censor logic out of censor extension
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39661
diff changeset
   118
    with repo.transaction(b'censor') as tr:
a6b3c4c1019f revlog: move censor logic out of censor extension
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39661
diff changeset
   119
        flog.censorrevision(tr, fnode, tombstone=tombstone)