Thu, 04 Oct 2018 10:30:05 -0700 context: drop incorrect and superfluous docstring
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 10:30:05 -0700] rev 40061
context: drop incorrect and superfluous docstring It's been incorrect at least since 8b86acc7aa64 (context: drop support for looking up context by ambiguous changeid (API), 2018-04-28). Differential Revision: https://phab.mercurial-scm.org/D4880
Thu, 04 Oct 2018 21:35:12 -0400 remotenames: follow-up on D3639 to make revset funcs take only one arg
Augie Fackler <raf@durin42.com> [Thu, 04 Oct 2018 21:35:12 -0400] rev 40060
remotenames: follow-up on D3639 to make revset funcs take only one arg Per the review discussion on D3639, we want this to just take one argument. That ended up simplifying the code, so I'm sharing this as a follow-up to that revision rather than editing in-flight.
Thu, 12 Jul 2018 03:12:09 +0530 remotenames: add names argument to remotenames revset
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 12 Jul 2018 03:12:09 +0530] rev 40059
remotenames: add names argument to remotenames revset This patch adds names argument to the revsets provided by the remotenames extension. The revsets are remotenames(), remotebranches() and remotebookmarks(). names can be a single names, list of names or can be empty too which means it's an optional argument. If names is/are passed, changesets which have those remotenames will be returned. If names are not passed, changesets from all the remotenames are shown. Passing an invalid remotename does not throw error. The name argument also supports pattern matching. Tests are added for the argument in tests/test-logexchange.t Differential Revision: https://phab.mercurial-scm.org/D3639
Fri, 07 Sep 2018 11:43:48 -0400 copies: add time information to the debug information
Boris Feld <boris.feld@octobus.net> [Fri, 07 Sep 2018 11:43:48 -0400] rev 40058
copies: add time information to the debug information
Fri, 07 Sep 2018 11:16:06 -0400 copies: add a devel debug mode to trace what copy tracing does
Boris Feld <boris.feld@octobus.net> [Fri, 07 Sep 2018 11:16:06 -0400] rev 40057
copies: add a devel debug mode to trace what copy tracing does Mercurial can spend a lot of time finding renames between two commits. Having more information about that process help to understand what makes it slow in an individual instance. (eg: many files vs 1 file, etc...)
Tue, 02 Oct 2018 17:34:34 -0700 revlog: rewrite censoring logic
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 02 Oct 2018 17:34:34 -0700] rev 40056
revlog: rewrite censoring logic I was able to corrupt a revlog relatively easily with the existing censoring code. The underlying problem is that the existing code doesn't fully take delta chains into account. When copying revisions that occur after the censored revision, the delta base can refer to a censored revision. Then at read time, things blow up due to the revision data not being a compressed delta. This commit rewrites the revlog censoring code to take a higher-level approach. We now create a new revlog instance pointing at temp files. We iterate through each revision in the source revlog and insert those revisions into the new revlog, replacing the censored revision's data along the way. The new implementation isn't as efficient as the old one. This is because it will fully engage delta computation on insertion. But I don't think it matters. The new implementation is a bit hacky because it attempts to reload the revlog instance with a new revlog index/data file. This is fragile. But this is needed because the index (which could be backed by C) would have a cached copy of the old, possibly changed data and that could lead to problems accessing index or revision data later. One benefit of the new approach is that we integrate with the transaction. The old revlog is backed up and if the transaction is rolled back, the original revlog is restored. As part of this, we had to teach the transaction about the store vfs. I'm not super keen about this. But this was the easiest way to hook things up to the transaction. We /could/ just ignore the transaction like we were doing before. But any file mutation should be governed by transaction semantics, including undo during rollback. Differential Revision: https://phab.mercurial-scm.org/D4869
Tue, 02 Oct 2018 17:28:54 -0700 revlog: move loading of index data into own method
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 02 Oct 2018 17:28:54 -0700] rev 40055
revlog: move loading of index data into own method This will allow us to "reload" a revlog instance from a rewritten index file, which will be used in a subsequent commit. Differential Revision: https://phab.mercurial-scm.org/D4868
Wed, 03 Oct 2018 10:57:35 -0700 revlog: clear revision cache on hash verification failure
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:57:35 -0700] rev 40054
revlog: clear revision cache on hash verification failure The revision cache is populated after raw revision fulltext is retrieved but before hash verification. If hash verification fails, the revision cache will be populated and subsequent operations to retrieve the invalid fulltext may return the cached fulltext instead of raising. This commit changes hash verification so it will invalidate the revision cache if the cached node fails hash verification. The side-effect is that subsequent operations to request the revision text - even the raw revision text - will always fail. The new behavior is consistent and is definitely less wrong. There is an open question of whether revision(raw=True) should validate hashes. But I'm going to punt on this problem. We can always change behavior later. And to be honest, I'm not sure we should expose raw=True on the storage interface at all. Another day... Differential Revision: https://phab.mercurial-scm.org/D4867
Thu, 06 Sep 2018 02:36:25 -0400 fuzz: new fuzzer for cext/manifest.c
Augie Fackler <augie@google.com> [Thu, 06 Sep 2018 02:36:25 -0400] rev 40053
fuzz: new fuzzer for cext/manifest.c This is a bit messy, because lazymanifest is tightly coupled to the cpython API for performance reasons. As a result, we have to build a whole Python without pymalloc (so ASAN can help us out) and link against that. Then we have to use an embedded Python interpreter. We could manually drive the lazymanifest in C from that point, but experimentally just using PyEval_EvalCode isn't really any slower so we may as well do that and write the innermost guts of the fuzzer in Python. Leak detection is currently disabled for this fuzzer because there are a few global-lifetime things in our extensions that we more or less intentionally leak and I didn't want to take the detour to work around that for now. This should not be pushed to our repo until https://github.com/google/oss-fuzz/pull/1853 is merged, as this depends on having the Python tarball around. Differential Revision: https://phab.mercurial-scm.org/D4879
Wed, 03 Oct 2018 10:32:21 -0700 revlog: rename _cache to _revisioncache
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:32:21 -0700] rev 40052
revlog: rename _cache to _revisioncache "cache" is generic and revlog instances have multiple caches. Let's be descriptive about what this is a cache for. Differential Revision: https://phab.mercurial-scm.org/D4866
Wed, 03 Oct 2018 10:56:48 -0700 testing: add file storage integration for bad hashes and censoring
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:56:48 -0700] rev 40051
testing: add file storage integration for bad hashes and censoring In order to implement these tests, we need a backdoor to write data into storage backends while bypassing normal checks. We invent a callable to do that. As part of writing the tests, I found a bug with censorrevision() pretty quickly! After calling censorrevision(), attempting to access revision data for an affected node raises a cryptic error related to malformed compression. This appears to be due to the revlog not adjusting delta chains as part of censoring. I also found a bug with regards to hash verification and revision fulltext caching. Essentially, we cache the fulltext before hash verification. If we look up the fulltext after a failed hash verification, we don't get a hash verification exception. Furthermore, the behavior of revision(raw=True) can be inconsistent depending on the order of operations. I'll be fixing both these bugs in subsequent commits. Differential Revision: https://phab.mercurial-scm.org/D4865
Wed, 03 Oct 2018 10:03:41 -0700 testing: add file storage tests for getstrippoint() and strip()
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:03:41 -0700] rev 40050
testing: add file storage tests for getstrippoint() and strip() Differential Revision: https://phab.mercurial-scm.org/D4864
Wed, 03 Oct 2018 10:04:04 -0700 wireprotov2: always advertise raw repo requirements
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:04:04 -0700] rev 40049
wireprotov2: always advertise raw repo requirements I'm pretty sure my original thinking behind making it conditional on stream clone support was that the behavior mirrored wire protocol version 1. I don't see a compelling reason for us to not advertise the server's storage requirements. The proper way to advertise stream clone support in wireprotov2 would be to not advertise the command(s) required to perform stream clone or to advertise a separate capability denoting stream clone support. Stream clone isn't yet implemented on wireprotov2, so we can cross this bridge later. Differential Revision: https://phab.mercurial-scm.org/D4863
Wed, 03 Oct 2018 09:48:22 -0700 tests: don't be as verbose in wireprotov2 tests
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 09:48:22 -0700] rev 40048
tests: don't be as verbose in wireprotov2 tests I don't think that printing low-level I/O and frames is beneficial to testing command-level functionality. Protocol-level testing, yes. But command-level functionality shouldn't care about low-level details in most cases. This output makes tests more verbose and harder to read. It also makes them harder to maintain, as you need to glob over various dynamic width fields. Let's remove these low-level details from many of the wireprotov2 tests. Differential Revision: https://phab.mercurial-scm.org/D4861
Wed, 03 Oct 2018 12:57:01 -0700 repository: define and use revision flag constants
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 12:57:01 -0700] rev 40047
repository: define and use revision flag constants Revlogs have a per-revision 2 byte field holding integer flags that define how revision data should be interpreted. For historical reasons, these integer values are sent verbatim on the wire protocol as part of changegroup data. From a semantic standpoint, the flags that go out over the wire are different from the flags stored internally by revlogs. Failure to establish this semantic distinction creates unwanted strong coupling between revlog's internals and the wire protocol. This commit establishes new constants on the repository module that define the revision flags used by the wire protocol (and by some internal storage APIs, sadly). The changegroups internals documentation has been updated to document them explicitly. Various references throughout the repo now use the repository constants instead of the revlog constants. This is done to make it clear that we're operating on generic revision data and this isn't tied to revlogs. Differential Revision: https://phab.mercurial-scm.org/D4860
Thu, 04 Oct 2018 01:22:25 +0200 context: reverse conditional branch order in introrev
Boris Feld <boris.feld@octobus.net> [Thu, 04 Oct 2018 01:22:25 +0200] rev 40046
context: reverse conditional branch order in introrev Positive logic will be simpler to follow. It will help to clarify coming refactoring.
Thu, 04 Oct 2018 08:40:01 +0200 context: drop a redundant fast path in introrev
Boris Feld <boris.feld@octobus.net> [Thu, 04 Oct 2018 08:40:01 +0200] rev 40045
context: drop a redundant fast path in introrev Now that _adjustlinkrev fast path this case itself, we no longer need an extra conditional. A nice side effect is that we are no longer calling `self.rev()`. In case where `_descendantrev` is set, calling `self.rev` will trigger a potentially expensive `_adjustlinkrev` call. So blindly calling `self.rev()` to avoid another `_adjustlinkrev` call can be counterproductive. Note that `_descendantrev` is currently never taken into account in `introrev` so far which is wrong. We'll fix that in changeset later in this series.
Thu, 04 Oct 2018 08:34:59 +0200 context: fast path linkrev adjustement in trivial case
Boris Feld <boris.feld@octobus.net> [Thu, 04 Oct 2018 08:34:59 +0200] rev 40044
context: fast path linkrev adjustement in trivial case If the search starts from the linkrev, there is nothing to adjust.
Thu, 04 Oct 2018 11:28:48 +0200 url: allow to configure timeout on http connection
Cédric Krier <ced@b2ck.com> [Thu, 04 Oct 2018 11:28:48 +0200] rev 40043
url: allow to configure timeout on http connection By default, httplib.HTTPConnection opens connection with no timeout. If the server is hanging, Mercurial will wait indefinitely. This may be an issue for automated scripts. Differential Revision: https://phab.mercurial-scm.org/D4878
Wed, 26 Sep 2018 23:50:14 +0200 obsolete: explicitly track folds inside the markers
Boris Feld <boris.feld@octobus.net> [Wed, 26 Sep 2018 23:50:14 +0200] rev 40042
obsolete: explicitly track folds inside the markers We now record information to be able to recognize "fold" event from obsolescence markers. To do so, we track the following pieces of information: a) a fold ID. Unique to that fold (per successor), b) the number of predecessors, c) the index of the predecessor in that fold. We will now be able to create an algorithm able to find "predecessorssets". We now store this data in the generic "metadata" field of the markers. Updating the format to have a more compact storage for this would be useful. This way of tracking a fold through multiple markers could be applied to split too. This would have two advantages: 1) We get a simpler format, since number of successors is limited to [0-1]. 2) We can better deal with situations where only some of the split successors are pushed to a remote repository. We should look into the relevance of such a change before updating the on-disk format. note: unlike splits, folds do not have to deal with cases where only some of the markers have been synchronized. As they all share the same successor changesets, they are all relevant to the same nodes.
Wed, 03 Oct 2018 11:59:47 +0200 cleanupnodes: update comment to drop mention of filtering
Boris Feld <boris.feld@octobus.net> [Wed, 03 Oct 2018 11:59:47 +0200] rev 40041
cleanupnodes: update comment to drop mention of filtering Since changeset 1857f50a9643 drop the filtering, we should not longer mention it in code comment.
Wed, 26 Sep 2018 18:04:46 -0700 treemanifests: remove _loadalllazy when doing copies
spectral <spectral@google.com> [Wed, 26 Sep 2018 18:04:46 -0700] rev 40040
treemanifests: remove _loadalllazy when doing copies 'before' here is https://phab.mercurial-scm.org/D4845 (not the committed/rebased version) diff --git: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 1.329 s +- 0.011 s | 1.320 s +- 0.010 s | 99.3% m-u | | x | 1.316 s +- 0.005 s | 1.334 s +- 0.018 s | 101.4% m-u | x | | 1.330 s +- 0.021 s | 1.322 s +- 0.005 s | 99.4% m-u | x | x | 87.2 ms +- 0.7 ms | 86.9 ms +- 1.5 ms | 99.7% l-d-r | | | 203.3 ms +- 7.8 ms | 199.4 ms +- 1.8 ms | 98.1% l-d-r | | x | 204.6 ms +- 2.8 ms | 201.7 ms +- 2.1 ms | 98.6% l-d-r | x | | 90.5 ms +- 11.0 ms | 86.2 ms +- 1.0 ms | 95.2% l-d-r | x | x | 66.3 ms +- 2.0 ms | 66.4 ms +- 0.9 ms | 100.2% diff -c . --git: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 239.4 ms +- 2.0 ms | 241.7 ms +- 4.6 ms | 101.0% m-u | | x | 128.9 ms +- 1.9 ms | 130.9 ms +- 7.7 ms | 101.6% m-u | x | | 241.1 ms +- 1.6 ms | 240.1 ms +- 1.4 ms | 99.6% m-u | x | x | 133.4 ms +- 1.5 ms | 133.4 ms +- 1.2 ms | 100.0% l-d-r | | | 84.3 ms +- 1.5 ms | 83.5 ms +- 1.0 ms | 99.1% l-d-r | | x | 200.9 ms +- 6.3 ms | 203.0 ms +- 4.4 ms | 101.0% l-d-r | x | | 108.1 ms +- 1.4 ms | 108.7 ms +- 2.1 ms | 100.6% l-d-r | x | x | 190.2 ms +- 4.8 ms | 191.6 ms +- 2.0 ms | 100.7% rebase -r . --keep -d .^^: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 5.655 s +- 0.029 s | 5.640 s +- 0.036 s | 99.7% m-u | | x | 5.813 s +- 0.038 s | 5.773 s +- 0.028 s | 99.3% m-u | x | | 5.593 s +- 0.043 s | 5.589 s +- 0.028 s | 99.9% m-u | x | x | 648.2 ms +- 19.2 ms | 637.3 ms +- 27.7 ms | 98.3% l-d-r | | | 673.3 ms +- 8.0 ms | 673.2 ms +- 6.8 ms | 100.0% l-d-r | | x | 6.583 s +- 0.030 s | 5.721 s +- 0.028 s | 86.9% <-- l-d-r | x | | 277.8 ms +- 6.7 ms | 276.0 ms +- 2.7 ms | 99.4% l-d-r | x | x | 1.692 s +- 0.013 s | 720.9 ms +- 13.3 ms | 42.6% <-- status --change . --copies: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 220.9 ms +- 1.6 ms | 219.9 ms +- 2.2 ms | 99.5% m-u | | x | 109.2 ms +- 1.0 ms | 109.4 ms +- 0.8 ms | 100.2% m-u | x | | 222.6 ms +- 1.7 ms | 221.4 ms +- 2.1 ms | 99.5% m-u | x | x | 113.4 ms +- 0.5 ms | 113.1 ms +- 1.1 ms | 99.7% l-d-r | | | 82.1 ms +- 1.7 ms | 82.1 ms +- 1.2 ms | 100.0% l-d-r | | x | 199.8 ms +- 4.0 ms | 200.7 ms +- 3.6 ms | 100.5% l-d-r | x | | 85.4 ms +- 1.5 ms | 85.2 ms +- 0.3 ms | 99.8% l-d-r | x | x | 202.6 ms +- 4.4 ms | 208.0 ms +- 4.0 ms | 102.7% status --copies: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 1.941 s +- 0.014 s | 1.930 s +- 0.009 s | 99.4% m-u | | x | 1.924 s +- 0.007 s | 1.950 s +- 0.010 s | 101.4% m-u | x | | 1.959 s +- 0.085 s | 1.926 s +- 0.009 s | 98.3% m-u | x | x | 96.2 ms +- 1.0 ms | 96.4 ms +- 0.7 ms | 100.2% l-d-r | | | 604.4 ms +- 10.6 ms | 602.6 ms +- 7.1 ms | 99.7% l-d-r | | x | 605.7 ms +- 4.1 ms | 607.4 ms +- 6.1 ms | 100.3% l-d-r | x | | 182.4 ms +- 1.2 ms | 183.4 ms +- 1.2 ms | 100.5% l-d-r | x | x | 150.8 ms +- 2.0 ms | 150.6 ms +- 1.0 ms | 99.9% update $rev^; ~/src/hg/hg{hg}/hg update $rev: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 3.185 s +- 0.027 s | 3.181 s +- 0.017 s | 99.9% m-u | | x | 3.028 s +- 0.021 s | 2.954 s +- 0.010 s | 97.6% m-u | x | | 3.168 s +- 0.010 s | 3.175 s +- 0.023 s | 100.2% m-u | x | x | 317.5 ms +- 3.5 ms | 313.2 ms +- 2.9 ms | 98.6% l-d-r | | | 456.2 ms +- 10.6 ms | 454.4 ms +- 5.8 ms | 99.6% l-d-r | | x | 9.236 s +- 0.063 s | 757.9 ms +- 9.2 ms | 8.2% <-- l-d-r | x | | 257.6 ms +- 2.3 ms | 261.2 ms +- 1.7 ms | 101.4% l-d-r | x | x | 1.614 s +- 0.013 s | 478.0 ms +- 14.3 ms | 29.6% <-- Differential Revision: https://phab.mercurial-scm.org/D4875
Tue, 25 Sep 2018 19:25:41 -0700 treemanifests: store whether a lazydirs entry needs copied after materializing
spectral <spectral@google.com> [Tue, 25 Sep 2018 19:25:41 -0700] rev 40039
treemanifests: store whether a lazydirs entry needs copied after materializing Due to the way that things like manifestlog.get caches its values, without making a copy (if necessary) after calling readsubtree(), we might end up adjusting the state of the same object on different contexts, breaking things like dirty state tracking (and probably other things). Differential Revision: https://phab.mercurial-scm.org/D4874
Tue, 02 Oct 2018 18:55:07 -0700 treemanifests: extract _loaddifflazy from _diff, use in _filesnotin
spectral <spectral@google.com> [Tue, 02 Oct 2018 18:55:07 -0700] rev 40038
treemanifests: extract _loaddifflazy from _diff, use in _filesnotin Differential Revision: https://phab.mercurial-scm.org/D4873
Wed, 03 Oct 2018 18:07:49 -0400 identify: show remote bookmarks in `hg id url -Tjson -B`
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Wed, 03 Oct 2018 18:07:49 -0400] rev 40037
identify: show remote bookmarks in `hg id url -Tjson -B` I didn't display bookmarks when `default and not ui.quiet`: it seems strange for templates to depend on --id or -q, and it would take more code for `hg id url -T {node}` to not request remote bookmarks. An alternative I thought of was providing lazy data to the formatter, `fm.data(bookmarks=lambda: fm.formatlist(getbms(), name='bookmark'))`. The plainformatter would naturally not compute it, the templateformatter would compute only what it needs, and the other ones would compute everything, but that's not supported (or I don't see how), so I abandoned this idea. Differential Revision: https://phab.mercurial-scm.org/D4872
Wed, 03 Oct 2018 16:03:16 -0400 showstack: also handle SIGALRM
Augie Fackler <augie@google.com> [Wed, 03 Oct 2018 16:03:16 -0400] rev 40036
showstack: also handle SIGALRM This is looking *very* handy when debugging mysterious hangs in a test: you can wrap a hanging invocation in `perl -e 'alarm shift @ARGV; exec @ARGV' 1` for example, a hanging `hg pull` becomes `perl -e 'alarm shift @ARGV; exec @ARGV' 1 hg pull` where the `1` is the timeout in seconds before the process will be hit with SIGALRM. After making that edit to the test file, you can then use --extra-config-opt on run-tests.py to globaly enable showstack during the test run, so you'll get full stack traces as you force your hg to exit. I wonder (but only a little, not enough to take action just yet) if we should wire up some scaffolding in run-tests itself to automatically wrap all commands in alarm(3) somehow to avoid hangs in the future? Differential Revision: https://phab.mercurial-scm.org/D4870
Wed, 03 Oct 2018 13:54:31 -0700 exchangev2: add progress bar around manifest scanning
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 13:54:31 -0700] rev 40035
exchangev2: add progress bar around manifest scanning This can take a long time on large repositories. Let's add a progress bar so we don't have long periods where it isn't obvious what is going on. Differential Revision: https://phab.mercurial-scm.org/D4859
Mon, 01 Oct 2018 13:17:38 -0700 httppeer: report http statistics
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 01 Oct 2018 13:17:38 -0700] rev 40034
httppeer: report http statistics Now that keepalive.py records HTTP request count and the number of bytes sent and received as part of performing those requests, we can easily print a report on the activity when closing a peer instance! Exact byte counts are globbed in tests because they are influenced by non-deterministic things, such as hostnames and port numbers. Plus, the exact byte count isn't too important anyway. I feel obliged to note that printing the byte count could have security implications. e.g. if sending a password via HTTP basic auth, the length of that password will influence the byte count and the reporting of the byte count could be a side-channel leak of the password length. I /think/ this is beyond our threshold for concern. But if we think it poses a problem, we can teach the byte count logging code to e.g. ignore sensitive HTTP request headers. We could also consider not reporting the byte count of request headers altogether. But since the wire protocol uses HTTP headers for sending command arguments, it is kind of important to report their size. Differential Revision: https://phab.mercurial-scm.org/D4858
(0) -30000 -10000 -3000 -1000 -300 -100 -50 -28 +28 +50 +100 +300 +1000 +3000 +10000 tip