hgext/censor.py
author Martin von Zweigbergk <martinvonz@google.com>
Fri, 05 Oct 2018 11:07:34 -0700
changeset 40344 2c5835b4246b
parent 40293 c303d65d2e34
child 43076 2372284d9457
permissions -rw-r--r--
narrow: when widening, don't include manifests the client already has When widening, we already don't include the changelog (since f1844a10ee19) and files that the client already has (since c73c7653dfb9). However, we still include all manifests needed for the new narrowspec. When using flat manifests, that means we resend all the manifests even though the client necessarily has all of them. For tree manifests, we unnecessarily resend the root manifests and any subdirectory manifests that the client already has. This patch makes it so we no longer resend manifests that the client already has. It does so by passing an extra matcher to the changegroup packer and it uses that for filtering out directories matching the old matcher's visitdir(). For consistency between directories and files, it also makes the filtering of files look at both old and new matcher rather than passing in a diff matcher as we did before. Differential Revision: https://phab.mercurial-scm.org/D4895
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     1
# Copyright (C) 2015 - Mike Edgar <adgar@google.com>
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     2
#
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     3
# This extension enables removal of file content at a given revision,
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     4
# rewriting the data/metadata of successive revisions to preserve revision log
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     5
# integrity.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     6
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     7
"""erase file content at a given revision
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     8
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
     9
The censor command instructs Mercurial to erase all content of a file at a given
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    10
revision *without updating the changeset hash.* This allows existing history to
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    11
remain valid while preventing future clones/pulls from receiving the erased
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    12
data.
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    13
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    14
Typical uses for censor are due to security or legal requirements, including::
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    15
26781
1aee2ab0f902 spelling: trivial spell checking
Mads Kiilerich <madski@unity3d.com>
parents: 26587
diff changeset
    16
 * Passwords, private keys, cryptographic material
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    17
 * Licensed data/code/libraries for which the license has expired
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    18
 * Personally Identifiable Information or other private data
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    19
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    20
Censored nodes can interrupt mercurial's typical operation whenever the excised
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    21
data needs to be materialized. Some commands, like ``hg cat``/``hg revert``,
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    22
simply fail when asked to produce censored data. Others, like ``hg verify`` and
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    23
``hg update``, must be capable of tolerating censored data to continue to
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    24
function in a meaningful way. Such commands only tolerate censored file
24890
cba84b06b702 censor: fix incorrect configuration name for ignoring error at censored file
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 24880
diff changeset
    25
revisions if they are allowed by the "censor.policy=ignore" config option.
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    26
"""
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    27
28092
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    28
from __future__ import absolute_import
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    29
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    30
from mercurial.i18n import _
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    31
from mercurial.node import short
28092
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    32
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    33
from mercurial import (
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    34
    error,
32337
46ba2cdda476 registrar: move cmdutil.command to registrar module (API)
Yuya Nishihara <yuya@tcha.org>
parents: 32315
diff changeset
    35
    registrar,
28092
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    36
    scmutil,
5166b7a84b72 censor: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27290
diff changeset
    37
)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    38
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    39
cmdtable = {}
32337
46ba2cdda476 registrar: move cmdutil.command to registrar module (API)
Yuya Nishihara <yuya@tcha.org>
parents: 32315
diff changeset
    40
command = registrar.command(cmdtable)
29841
d5883fd055c6 extensions: change magic "shipped with hg" string
Augie Fackler <augie@google.com>
parents: 28092
diff changeset
    41
# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
25186
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    42
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    43
# be specifying the version(s) of Mercurial they are tested with, or
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24890
diff changeset
    44
# leave the attribute unspecified.
29841
d5883fd055c6 extensions: change magic "shipped with hg" string
Augie Fackler <augie@google.com>
parents: 28092
diff changeset
    45
testedwith = 'ships-with-hg-core'
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    46
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    47
@command('censor',
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    48
    [('r', 'rev', '', _('censor file from specified revision'), _('REV')),
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    49
     ('t', 'tombstone', '', _('replacement tombstone data'), _('TEXT'))],
40293
c303d65d2e34 help: assigning categories to existing commands
rdamazio@google.com
parents: 39778
diff changeset
    50
    _('-r REV [-t TEXT] [FILE]'),
c303d65d2e34 help: assigning categories to existing commands
rdamazio@google.com
parents: 39778
diff changeset
    51
    helpcategory=command.CATEGORY_MAINTENANCE)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    52
def censor(ui, repo, path, rev='', tombstone='', **opts):
38441
e219e355e088 censor: use context manager for lock management
Matt Harbison <matt_harbison@yahoo.com>
parents: 37442
diff changeset
    53
    with repo.wlock(), repo.lock():
27290
525d9b3f0a31 censor: make censor acquire locks before processing
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 26781
diff changeset
    54
        return _docensor(ui, repo, path, rev, tombstone, **opts)
525d9b3f0a31 censor: make censor acquire locks before processing
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 26781
diff changeset
    55
525d9b3f0a31 censor: make censor acquire locks before processing
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 26781
diff changeset
    56
def _docensor(ui, repo, path, rev='', tombstone='', **opts):
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    57
    if not path:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25806
diff changeset
    58
        raise error.Abort(_('must specify file path to censor'))
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    59
    if not rev:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25806
diff changeset
    60
        raise error.Abort(_('must specify revision to censor'))
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    61
25806
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    62
    wctx = repo[None]
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    63
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    64
    m = scmutil.match(wctx, (path,))
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    65
    if m.anypats() or len(m.files()) != 1:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25806
diff changeset
    66
        raise error.Abort(_('can only specify an explicit filename'))
25806
5e18f6e39006 censor: make various path forms available like other Mercurial commands
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 25660
diff changeset
    67
    path = m.files()[0]
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    68
    flog = repo.file(path)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    69
    if not len(flog):
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25806
diff changeset
    70
        raise error.Abort(_('cannot censor file with no history'))
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    71
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    72
    rev = scmutil.revsingle(repo, rev, rev).rev()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    73
    try:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    74
        ctx = repo[rev]
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    75
    except KeyError:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25806
diff changeset
    76
        raise error.Abort(_('invalid revision identifier %s') % rev)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    77
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    78
    try:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    79
        fctx = ctx.filectx(path)
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    80
    except error.LookupError:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25806
diff changeset
    81
        raise error.Abort(_('file does not exist at revision %s') % rev)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    82
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    83
    fnode = fctx.filenode()
39615
a658f97c1ce4 censor: use a reasonable amount of memory
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 38783
diff changeset
    84
    heads = []
a658f97c1ce4 censor: use a reasonable amount of memory
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 38783
diff changeset
    85
    for headnode in repo.heads():
39661
8bfbb25859f1 censor: rename loop variable to silence pyflakes warning
Yuya Nishihara <yuya@tcha.org>
parents: 39615
diff changeset
    86
        hc = repo[headnode]
8bfbb25859f1 censor: rename loop variable to silence pyflakes warning
Yuya Nishihara <yuya@tcha.org>
parents: 39615
diff changeset
    87
        if path in hc and hc.filenode(path) == fnode:
8bfbb25859f1 censor: rename loop variable to silence pyflakes warning
Yuya Nishihara <yuya@tcha.org>
parents: 39615
diff changeset
    88
            heads.append(hc)
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    89
    if heads:
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    90
        headlist = ', '.join([short(c.node()) for c in heads])
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25806
diff changeset
    91
        raise error.Abort(_('cannot censor file in heads (%s)') % headlist,
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    92
            hint=_('clean/delete and commit first'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    93
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    94
    wp = wctx.parents()
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    95
    if ctx.node() in [p.node() for p in wp]:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25806
diff changeset
    96
        raise error.Abort(_('cannot censor working directory'),
24347
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    97
            hint=_('clean/delete/update first'))
1bcfecbbf569 censor: add censor command to hgext with basic client-side tests
Mike Edgar <adgar@google.com>
parents:
diff changeset
    98
39778
a6b3c4c1019f revlog: move censor logic out of censor extension
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39661
diff changeset
    99
    with repo.transaction(b'censor') as tr:
a6b3c4c1019f revlog: move censor logic out of censor extension
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39661
diff changeset
   100
        flog.censorrevision(tr, fnode, tombstone=tombstone)