annotate contrib/check-commit @ 43271:99394e6c5d12
rust-dirstate-status: add first Rust implementation of `dirstate.status`
Note: This patch also added the rayon crate as a Cargo dependency. It will
help us immensely in making Rust code parallel and easy to maintain. It is
a stable, well-known, and supported crate maintained by people on the Rust
team.
The current `dirstate.status` method has grown over the years through bug
reports and new features to the point where it got too big and too complex.
This series does not yet improve the logic, but adds a Rust fast-path to speed
up certain cases.
Tested on mozilla-try-2019-02-18 with zstd compression:
- `hg diff` on an empty working copy:
  - c: 1.64(+-)0.04s
  - rust+c before this change: 2.84(+-)0.1s
  - rust+c: 849(+-)40ms
- `hg commit` when creating a file:
  - c: 5.960s
  - rust+c before this change: 5.828s
  - rust+c: 4.668s
- `hg commit` when updating a file:
  - c: 4.866s
  - rust+c before this change: 4.371s
  - rust+c: 3.855s
- `hg status -mard`:
  - c: 1.82(+-)0.04s
  - rust+c before this change: 2.64(+-)0.1s
  - rust+c: 896(+-)30ms
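
To make the timings above easier to compare, here is a small sketch that turns them into speedup ratios. The dictionary layout and the `speedup` helper are illustrative assumptions; only the timing values come from the measurements listed above (mean values, error margins ignored).

```python
# Mean timings in seconds, copied from the list above:
# (c, rust+c before this change, rust+c with this change)
timings = {
    "hg diff (empty working copy)": (1.64, 2.84, 0.849),
    "hg commit (creating a file)": (5.960, 5.828, 4.668),
    "hg commit (updating a file)": (4.866, 4.371, 3.855),
    "hg status -mard": (1.82, 2.64, 0.896),
}


def speedup(before, after):
    """Hypothetical helper: how many times faster `after` is than `before`."""
    return before / after


for command, (c_only, rust_before, rust_after) in timings.items():
    print(
        "%-30s vs c: %.2fx, vs previous rust+c: %.2fx"
        % (command, speedup(c_only, rust_after), speedup(rust_before, rust_after))
    )
```

A ratio above 1 means the new fast-path is quicker than the baseline it is compared against.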
The numbers are clear: the current Rust `dirstatemap` implementation is very
slow, and its performance needs to be addressed.
This will be done in a future series, immediately after this one, with the goal
of making Rust at least as fast as the Python + C implementation in all cases
before the 5.2 freeze. At worst, we gate `dirstatemap` so that it is only used
in the cases where it is at least as fast.
Cases where the fast-path is not executed:
- for commands that need ignore support (`status`, for example)
- if subrepos are found (should not be hard to add, but winter is coming)
- any other matcher than an `alwaysmatcher`, like patterns, etc.
- with extensions like `sparse` and `fsmonitor`
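
The exclusions above can be read as a single predicate. The following is a minimal sketch of that reading, not Mercurial's actual gating code: the function name, its parameters, and the string `"always"` standing in for an `alwaysmatcher` are all assumptions made for illustration, and only the two extensions named above are checked.

```python
# Extensions named in the list above; the real set may be larger.
UNSUPPORTED_EXTENSIONS = {"sparse", "fsmonitor"}


def can_use_fast_path(needs_ignore, has_subrepos, matcher_kind, enabled_extensions):
    """Hypothetical predicate mirroring the four exclusions listed above."""
    if needs_ignore:
        # Commands that need ignore support, such as `status`.
        return False
    if has_subrepos:
        return False
    if matcher_kind != "always":
        # Any matcher other than an `alwaysmatcher` (patterns, etc.).
        return False
    if UNSUPPORTED_EXTENSIONS & set(enabled_extensions):
        return False
    return True


print(can_use_fast_path(False, False, "always", []))
print(can_use_fast_path(False, False, "always", ["fsmonitor"]))
```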
The next step after this is to rethink the logic to be closer to the Rust
fast-path written by Valentin Gatien-Baron of Jane Street, which does a lot
less work when possible.
Differential Revision: https://phab.mercurial-scm.org/D7058
author | Raphaël Gomès <rgomes@octobus.net> |
---|---|
date | Fri, 11 Oct 2019 13:39:57 +0200 |
parents | 24a07347aa60 |
children | 99e231afc29c |
rev | line source |
---|---|
22043 | 1 #!/usr/bin/env python |
22043 | 2 # |
22043 | 3 # Copyright 2014 Matt Mackall <mpm@selenic.com> |
22043 | 4 # |
22043 | 5 # A tool/hook to run basic sanity checks on commits/patches for |
22043 | 6 # submission to Mercurial. Install by adding the following to your |
22043 | 7 # .hg/hgrc: |
22043 | 8 # |
22043 | 9 # [hooks] |
22043 | 10 # pretxncommit = contrib/check-commit |
22043 | 11 # |
22043 | 12 # The hook can be temporarily bypassed with: |
22043 | 13 # |
22043 | 14 # $ BYPASS= hg commit |
22043 | 15 # |
26421 | 16 # See also: https://mercurial-scm.org/wiki/ContributingChanges |
22043 | 17 |
29164 | 18 from __future__ import absolute_import, print_function |
29163 | 19 |
29163 | 20 import os |
29163 | 21 import re |
29163 | 22 import sys |
22043 | 23 |
27782 | 24 commitheader = r"^(?:# [^\n]*\n)*" |
27782 | 25 afterheader = commitheader + r"(?!#)" |
27782 | 26 beforepatch = afterheader + r"(?!\n(?!@@))" |
27782 | 27 |
22043 | 28 errors = [ |
27782 | 29     (beforepatch + r".*[(]bc[)]", "(BC) needs to be uppercase"), |
28042 | 30     (beforepatch + r".*[(]issue \d\d\d", |
28042 | 31      "no space allowed between issue and number"), |
27782 | 32     (beforepatch + r".*[(]bug(\d|\s)", "use (issueDDDD) instead of bug"), |
27782 | 33     (commitheader + r"# User [^@\n]+\n", "username is not an email address"), |
27782 | 34     (commitheader + r"(?!merge with )[^#]\S+[^:] ", |
22043 | 35      "summary line doesn't start with 'topic: '"), |
27782 | 36     (afterheader + r"[A-Z][a-z]\S+", "don't capitalize summary lines"), |
40952 | 37     (afterheader + r"^\S+: *[A-Z][a-z]\S+", "don't capitalize summary lines"), |
30061 | 38     (afterheader + r"\S*[^A-Za-z0-9-_]\S*: ", |
27692 | 39      "summary keyword should be most user-relevant one-word command or topic"), |
27782 | 40     (afterheader + r".*\.\s*\n", "don't add trailing period on summary line"), |
27782 | 41     (afterheader + r".{79,}", "summary line too long (limit is 78)"), |
22043 | 42 ] |
22043 | 43 |
41539 | 44 word = re.compile(r'\S') |
27782 | 45 def nonempty(first, second): |
27782 | 46     if word.search(first): |
27782 | 47         return first |
27782 | 48     return second |
27782 | 49 |
28043 | 50 def checkcommit(commit, node=None): |
27780 | 51     exitcode = 0 |
27781 | 52     printed = node is None |
27783 | 53     hits = [] |
30843 | 54     signtag = (afterheader + |
30843 | 55                r'Added (tag [^ ]+|signature) for changeset [a-f0-9]{12}') |
30843 | 56     if re.search(signtag, commit): |
30843 | 57         return 0 |
27780 | 58     for exp, msg in errors: |
28012 | 59         for m in re.finditer(exp, commit): |
27782 | 60             end = m.end() |
27782 | 61             trailing = re.search(r'(\\n)+$', exp) |
27782 | 62             if trailing: |
27782 | 63                 end -= len(trailing.group()) / 2 |
27783 | 64             hits.append((end, exp, msg)) |
27783 | 65     if hits: |
27783 | 66         hits.sort() |
27783 | 67         pos = 0 |
27783 | 68         last = '' |
27783 | 69         for n, l in enumerate(commit.splitlines(True)): |
27783 | 70             pos += len(l) |
27783 | 71             while len(hits): |
27783 | 72                 end, exp, msg = hits[0] |
27782 | 73                 if pos < end: |
27780 | 74                     break |
27783 | 75                 if not printed: |
27783 | 76                     printed = True |
29164 | 77                     print("node: %s" % node) |
29164 | 78                 print("%d: %s" % (n, msg)) |
29164 | 79                 print(" %s" % nonempty(l, last)[:-1]) |
27783 | 80                 if "BYPASS" not in os.environ: |
27783 | 81                     exitcode = 1 |
27783 | 82                 del hits[0] |
27783 | 83             last = nonempty(l, last) |
27783 | 84 |
27780 | 85     return exitcode |
22043 | 86 |
27780 | 87 def readcommit(node): |
27780 | 88     return os.popen("hg export %s" % node).read() |
27780 | 89 |
27780 | 90 if __name__ == "__main__": |
27781 | 91     exitcode = 0 |
27780 | 92     node = os.environ.get("HG_NODE") |
22043 | 93 |
27780 | 94     if node: |
27780 | 95         commit = readcommit(node) |
27781 | 96         exitcode = checkcommit(commit) |
27781 | 97     elif sys.argv[1:]: |
27781 | 98         for node in sys.argv[1:]: |
27781 | 99             exitcode |= checkcommit(readcommit(node), node) |
27780 | 100     else: |
27780 | 101         commit = sys.stdin.read() |
27781 | 102         exitcode = checkcommit(commit) |
27780 | 103     sys.exit(exitcode) |

rev | changeset | summary | author | parent |
---|---|---|---|---|
22043 | 1274ff3f20a8 | contrib: add check-commit hook script to sanity-check commits | Matt Mackall <mpm@selenic.com> | |
26421 | 4b0fc75f9403 | urls: bulk-change primary website URLs | Matt Mackall <mpm@selenic.com> | 25643 |
27692 | e0465035def9 | check-commit: try to curb bad commit summary keywords | Matt Mackall <mpm@selenic.com> | 27199 |
27781 | 2af351bd289c | check-commit: support REVs as commandline arguments | timeless <timeless@mozdev.org> | 27780 |
27782 | 7291c8165e33 | check-commit: try to fix multiline handling | timeless <timeless@mozdev.org> | 27781 |
27783 | 1d095371de47 | check-commit: sort errors by line number | timeless <timeless@mozdev.org> | 27782 |
28012 | 897b2fcf079f | check-commit: scan for multiple instances of error patterns | Matt Mackall <mpm@selenic.com> | 27783 |
28042 | 08e0c4082903 | check-commit: wrap too long line | FUJIWARA Katsunori <foozy@lares.dti.ne.jp> | 28013 |
28043 | ac4684c21f73 | check-commit: omit whitespace | FUJIWARA Katsunori <foozy@lares.dti.ne.jp> | 28042 |
29163 | bf7fd815b083 | py3: make contrib/check-commit use absolute_import | Pulkit Goyal <7895pulkit@gmail.com> | 28043 |
29164 | 91f35b1a34cf | py3: make contrib/check-commit use print_function | Pulkit Goyal <7895pulkit@gmail.com> | 29163 |
30061 | 8e805cf27caa | check-commit: allow underscore as commit topic | Mathias De Maré <mathias.de_mare@nokia.com> | 29716 |
30843 | 2fb3ae89e4e1 | contrib: fix check-commit to not reject commits from `hg sign` and `hg tag` | Augie Fackler <augie@google.com> | 30061 |
40952 | 811f772b44aa | check-commit: disallow capitalization only right after topic | Martin von Zweigbergk <martinvonz@google.com> | 40308 |
41539 | 45a4789d3ff2 | check-commit: use raw string for regular expression | Gregory Szorc <gregory.szorc@gmail.com> | 40952 |
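
As a rough illustration of how the script's pattern table is applied, the sketch below re-declares the two header prefixes and three of the error patterns, then runs them against made-up `hg export` output. The `find_problems` helper and both sample commits are assumptions for illustration; the real script also sorts hits by position and prints the offending line.

```python
import re

# Same header-skipping prefixes as in the script.
commitheader = r"^(?:# [^\n]*\n)*"
afterheader = commitheader + r"(?!#)"

# A subset of the script's (pattern, message) table.
errors = [
    (commitheader + r"(?!merge with )[^#]\S+[^:] ",
     "summary line doesn't start with 'topic: '"),
    (afterheader + r".*\.\s*\n", "don't add trailing period on summary line"),
    (afterheader + r".{79,}", "summary line too long (limit is 78)"),
]


def find_problems(commit):
    """Hypothetical helper: return the messages whose pattern matches."""
    return [msg for exp, msg in errors if re.search(exp, commit)]


# Made-up samples shaped like the start of `hg export` output.
header = "# HG changeset patch\n# User Jane <jane@example.com>\n"
good = header + "check-commit: add example\n"
bad = header + "Added an example.\n"

print(find_problems(good))
print(find_problems(bad))
```

Because `commitheader` consumes the leading `# ...` lines, each pattern is effectively anchored at the summary line rather than at the start of the export.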