view contrib/check-commit @ 22596:27e2317efe89

filelog: raise CensoredNodeError when hash checks fail with censor metadata With this change, when a revlog revision hash does not match its content, and the content is empty with a special metadata key, the integrity failure is assumed to be intentionally caused to remove sensitive content from repository history. To allow different Mercurial functionality to handle this scenario differently a more specific exception is raised than "ordinary" hash failures. Alternatives to this approach include, but are not limited to: - Calling a hook when hashes mismatch to allow arbitrary tombstone validation. Cons: Irresponsibly easy to disable integrity checking altogether. - Returning empty revision data eagerly instead of raising, masking the error. Cons: Push/pull won't roundtrip the tombstone, so client repos are unusable. - Doing nothing differently at this layer. Callers must do their own detection of tombstoned data if they want to handle some hash checks and not others. - Impacts dozens of callsites, many of which don't have the revision data - Would probably be missing one or two callsites at any given time - Currently we throw a RevlogError, as do 12 other places in revlog.py. Callers would need to parse the exception message and/or ensure RevlogError is not thrown from any other part of their call tree.
author Mike Edgar <adgar@google.com>
date Wed, 03 Sep 2014 22:14:20 -0400
parents 15d0390a27fe
children ba272156113f
line wrap: on
line source

#!/usr/bin/env python
#
# Copyright 2014 Matt Mackall <mpm@selenic.com>
#
# A tool/hook to run basic sanity checks on commits/patches for
# submission to Mercurial. Install by adding the following to your
# .hg/hgrc:
#
# [hooks]
# pretxncommit = contrib/check-commit
#
# The hook can be temporarily bypassed with:
#
# $ BYPASS= hg commit
#
# See also: http://mercurial.selenic.com/wiki/ContributingChanges

import re, sys, os

errors = [
    (r"[(]bc[)]", "(BC) needs to be uppercase"),
    (r"[(]issue \d\d\d", "no space allowed between issue and number"),
    (r"[(]bug", "use (issueDDDD) instead of bug"),
    (r"^# User [^@\n]+$", "username is not an email address"),
    (r"^# .*\n(?!merge with )[^#]\S+[^:] ",
     "summary line doesn't start with 'topic: '"),
    (r"^# .*\n[A-Z][a-z]\S+", "don't capitalize summary lines"),
    (r"^# .*\n.*\.\s+$", "don't add trailing period on summary line"),
    (r"^# .*\n.{78,}", "summary line too long"),
    (r"^\+\n \n", "adds double empty line"),
    (r"\+\s+def [a-z]+_[a-z]", "adds a function with foo_bar naming"),
]

node = os.environ.get("HG_NODE")

if node:
    commit = os.popen("hg export %s" % node).read()
else:
    commit = sys.stdin.read()

exitcode = 0
for exp, msg in errors:
    m = re.search(exp, commit, re.MULTILINE)
    if m:
        pos = 0
        for n, l in enumerate(commit.splitlines(True)):
            pos += len(l)
            if pos >= m.end():
                print "%d: %s" % (n, msg)
                print " %s" % l[:-1]
                if "BYPASS" not in os.environ:
                    exitcode = 1
                break

sys.exit(exitcode)