annotate contrib/check-commit @ 43271:99394e6c5d12

rust-dirstate-status: add first Rust implementation of `dirstate.status` Note: This patch also added the rayon crate as a Cargo dependency. It will help us immensely in making Rust code parallel and easy to maintain. It is a stable, well-known, and supported crate maintained by people on the Rust team. The current `dirstate.status` method has grown over the years through bug reports and new features to the point where it got too big and too complex. This series does not yet improve the logic, but adds a Rust fast-path to speed up certain cases. Tested on mozilla-try-2019-02-18 with zstd compression: - `hg diff` on an empty working copy: - c: 1.64(+-)0.04s - rust+c before this change: 2.84(+-)0.1s - rust+c: 849(+-)40ms - `hg commit` when creating a file: - c: 5.960s - rust+c before this change: 5.828s - rust+c: 4.668s - `hg commit` when updating a file: - c: 4.866s - rust+c before this change: 4.371s - rust+c: 3.855s - `hg status -mard` - c: 1.82(+-)0.04s - rust+c before this change: 2.64(+-)0.1s - rust+c: 896(+-)30ms The numbers are clear: the current Rust `dirstatemap` implementation is super slow, its performance needs to be addressed. This will be done in a future series, immediately after this one, with the goal of getting Rust to be at least to the speed of the Python + C implementation in all cases before the 5.2 freeze. At worse, we gate dirstatemap to only be used in those cases. Cases where the fast-path is not executed: - for commands that need ignore support (`status`, for example) - if subrepos are found (should not be hard to add, but winter is coming) - any other matcher than an `alwaysmatcher`, like patterns, etc. - with extensions like `sparse` and `fsmonitor` The next step after this is to rethink the logic to be closer to Jane Street's Valentin Gatien-Baron's Rust fast-path which does a lot less work when possible. Differential Revision: https://phab.mercurial-scm.org/D7058
author Raphaël Gomès <rgomes@octobus.net>
date Fri, 11 Oct 2019 13:39:57 +0200
parents 24a07347aa60
children 99e231afc29c
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
22043
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
1 #!/usr/bin/env python
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
2 #
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
3 # Copyright 2014 Matt Mackall <mpm@selenic.com>
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
4 #
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
5 # A tool/hook to run basic sanity checks on commits/patches for
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
6 # submission to Mercurial. Install by adding the following to your
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
7 # .hg/hgrc:
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
8 #
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
9 # [hooks]
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
10 # pretxncommit = contrib/check-commit
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
11 #
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
12 # The hook can be temporarily bypassed with:
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
13 #
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
14 # $ BYPASS= hg commit
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
15 #
26421
4b0fc75f9403 urls: bulk-change primary website URLs
Matt Mackall <mpm@selenic.com>
parents: 25643
diff changeset
16 # See also: https://mercurial-scm.org/wiki/ContributingChanges
22043
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
17
29164
91f35b1a34cf py3: make contrib/check-commit use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 29163
diff changeset
18 from __future__ import absolute_import, print_function
29163
bf7fd815b083 py3: make contrib/check-commit use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28043
diff changeset
19
bf7fd815b083 py3: make contrib/check-commit use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28043
diff changeset
20 import os
bf7fd815b083 py3: make contrib/check-commit use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28043
diff changeset
21 import re
bf7fd815b083 py3: make contrib/check-commit use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28043
diff changeset
22 import sys
22043
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
23
27782
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
24 commitheader = r"^(?:# [^\n]*\n)*"
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
25 afterheader = commitheader + r"(?!#)"
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
26 beforepatch = afterheader + r"(?!\n(?!@@))"
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
27
22043
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
28 errors = [
27782
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
29 (beforepatch + r".*[(]bc[)]", "(BC) needs to be uppercase"),
28042
08e0c4082903 check-commit: wrap too long line
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28013
diff changeset
30 (beforepatch + r".*[(]issue \d\d\d",
08e0c4082903 check-commit: wrap too long line
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28013
diff changeset
31 "no space allowed between issue and number"),
27782
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
32 (beforepatch + r".*[(]bug(\d|\s)", "use (issueDDDD) instead of bug"),
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
33 (commitheader + r"# User [^@\n]+\n", "username is not an email address"),
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
34 (commitheader + r"(?!merge with )[^#]\S+[^:] ",
22043
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
35 "summary line doesn't start with 'topic: '"),
27782
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
36 (afterheader + r"[A-Z][a-z]\S+", "don't capitalize summary lines"),
40952
811f772b44aa check-commit: disallow capitalization only right after topic
Martin von Zweigbergk <martinvonz@google.com>
parents: 40308
diff changeset
37 (afterheader + r"^\S+: *[A-Z][a-z]\S+", "don't capitalize summary lines"),
30061
8e805cf27caa check-commit: allow underscore as commit topic
Mathias De Maré <mathias.de_mare@nokia.com>
parents: 29716
diff changeset
38 (afterheader + r"\S*[^A-Za-z0-9-_]\S*: ",
27692
e0465035def9 check-commit: try to curb bad commit summary keywords
Matt Mackall <mpm@selenic.com>
parents: 27199
diff changeset
39 "summary keyword should be most user-relevant one-word command or topic"),
27782
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
40 (afterheader + r".*\.\s*\n", "don't add trailing period on summary line"),
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
41 (afterheader + r".{79,}", "summary line too long (limit is 78)"),
22043
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
42 ]
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
43
41539
45a4789d3ff2 check-commit: use raw string for regular expression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 40952
diff changeset
44 word = re.compile(r'\S')
27782
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
45 def nonempty(first, second):
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
46 if word.search(first):
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
47 return first
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
48 return second
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
49
28043
ac4684c21f73 check-commit: omit whitespace
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28042
diff changeset
50 def checkcommit(commit, node=None):
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
51 exitcode = 0
27781
2af351bd289c check-commit: support REVs as commandline arguments
timeless <timeless@mozdev.org>
parents: 27780
diff changeset
52 printed = node is None
27783
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
53 hits = []
30843
2fb3ae89e4e1 contrib: fix check-commit to not reject commits from `hg sign` and `hg tag`
Augie Fackler <augie@google.com>
parents: 30061
diff changeset
54 signtag = (afterheader +
2fb3ae89e4e1 contrib: fix check-commit to not reject commits from `hg sign` and `hg tag`
Augie Fackler <augie@google.com>
parents: 30061
diff changeset
55 r'Added (tag [^ ]+|signature) for changeset [a-f0-9]{12}')
2fb3ae89e4e1 contrib: fix check-commit to not reject commits from `hg sign` and `hg tag`
Augie Fackler <augie@google.com>
parents: 30061
diff changeset
56 if re.search(signtag, commit):
2fb3ae89e4e1 contrib: fix check-commit to not reject commits from `hg sign` and `hg tag`
Augie Fackler <augie@google.com>
parents: 30061
diff changeset
57 return 0
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
58 for exp, msg in errors:
28012
897b2fcf079f check-commit: scan for multiple instances of error patterns
Matt Mackall <mpm@selenic.com>
parents: 27783
diff changeset
59 for m in re.finditer(exp, commit):
27782
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
60 end = m.end()
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
61 trailing = re.search(r'(\\n)+$', exp)
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
62 if trailing:
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
63 end -= len(trailing.group()) / 2
27783
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
64 hits.append((end, exp, msg))
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
65 if hits:
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
66 hits.sort()
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
67 pos = 0
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
68 last = ''
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
69 for n, l in enumerate(commit.splitlines(True)):
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
70 pos += len(l)
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
71 while len(hits):
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
72 end, exp, msg = hits[0]
27782
7291c8165e33 check-commit: try to fix multiline handling
timeless <timeless@mozdev.org>
parents: 27781
diff changeset
73 if pos < end:
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
74 break
27783
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
75 if not printed:
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
76 printed = True
29164
91f35b1a34cf py3: make contrib/check-commit use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 29163
diff changeset
77 print("node: %s" % node)
91f35b1a34cf py3: make contrib/check-commit use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 29163
diff changeset
78 print("%d: %s" % (n, msg))
91f35b1a34cf py3: make contrib/check-commit use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 29163
diff changeset
79 print(" %s" % nonempty(l, last)[:-1])
27783
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
80 if "BYPASS" not in os.environ:
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
81 exitcode = 1
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
82 del hits[0]
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
83 last = nonempty(l, last)
1d095371de47 check-commit: sort errors by line number
timeless <timeless@mozdev.org>
parents: 27782
diff changeset
84
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
85 return exitcode
22043
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
86
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
87 def readcommit(node):
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
88 return os.popen("hg export %s" % node).read()
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
89
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
90 if __name__ == "__main__":
27781
2af351bd289c check-commit: support REVs as commandline arguments
timeless <timeless@mozdev.org>
parents: 27780
diff changeset
91 exitcode = 0
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
92 node = os.environ.get("HG_NODE")
22043
1274ff3f20a8 contrib: add check-commit hook script to sanity-check commits
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
93
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
94 if node:
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
95 commit = readcommit(node)
27781
2af351bd289c check-commit: support REVs as commandline arguments
timeless <timeless@mozdev.org>
parents: 27780
diff changeset
96 exitcode = checkcommit(commit)
2af351bd289c check-commit: support REVs as commandline arguments
timeless <timeless@mozdev.org>
parents: 27780
diff changeset
97 elif sys.argv[1:]:
2af351bd289c check-commit: support REVs as commandline arguments
timeless <timeless@mozdev.org>
parents: 27780
diff changeset
98 for node in sys.argv[1:]:
2af351bd289c check-commit: support REVs as commandline arguments
timeless <timeless@mozdev.org>
parents: 27780
diff changeset
99 exitcode |= checkcommit(readcommit(node), node)
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
100 else:
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
101 commit = sys.stdin.read()
27781
2af351bd289c check-commit: support REVs as commandline arguments
timeless <timeless@mozdev.org>
parents: 27780
diff changeset
102 exitcode = checkcommit(commit)
27780
f47185f09533 check-commit: modularize
timeless <timeless@mozdev.org>
parents: 27779
diff changeset
103 sys.exit(exitcode)