Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 Jul 2019 17:25:16 +0200] rev 42841
upgrade: add an argument to control changelog upgrade
Same as for `--manifest` we can now select more selection.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 Jul 2019 00:35:52 +0200] rev 42840
upgrade: add an argument to control manifest upgrade
The argument can be used to only "clone" manifest revlog or clone all of them
but this one. The selection will make more sense once we have a `--changelog`
flag in the next changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 18:11:41 +0200] rev 42839
unionrepo: drop the custom `rawdata` implementation
We can rely on the main one now.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 18:10:43 +0200] rev 42838
unionrepo: drop `baserevdiff`
It has no caller anymore.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 18:10:00 +0200] rev 42837
unionrepo: use normal inheritance scheme to call revdiff
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 18:08:35 +0200] rev 42836
unionrepo: fix `revdiff` implementation to use `rawdata`
The parent code is using rawdata so we should use it here. Before this change,
union repo was probably broken with some flag processors.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 18:05:24 +0200] rev 42835
unionrepo: get rid of `baserevision`
The method is not called anywhere anymore, so we can safely drop it.
Some of the comment get moved to `baserevdiff` because we did not got rid of it
(yet).
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 17:45:38 +0200] rev 42834
unionrepo: use a lower level overide in unionrepo too
The unionrepo class also have a strange `baserevision` hack, let's try to get
ride of it too.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 18:12:16 +0200] rev 42833
bundlerepo: drop the custom `rawdata` implementation
We can rely on the main one now.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 17:46:47 +0200] rev 42832
bundlerepo: drop the `baserevision` hack
It is not used anywhere anymore, so we can safely drop it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 30 Aug 2019 15:04:54 +0200] rev 42831
bundlerepo: simplify code to take advantage of `_rawtext`
In the revlog code, the code getting the raw text is now isolated. We take
advantage of this to simplify the bundlerepo code.
Yuya Nishihara <yuya@tcha.org> [Sat, 31 Aug 2019 11:10:12 +0900] rev 42830
merge with stable
Raphaël Gomès <rgomes@octobus.net> [Thu, 29 Aug 2019 15:49:16 +0200] rev 42829
rust: run cargo fmt
Martin von Zweigbergk <martinvonz@google.com> [Wed, 28 Aug 2019 17:36:53 -0700] rev 42828
py3: use pycompat.maplist() in chgserver
test-chg.t almost passes on py3 after this patch.
Differential Revision: https://phab.mercurial-scm.org/D6771
Martin von Zweigbergk <martinvonz@google.com> [Fri, 23 Aug 2019 08:54:32 -0700] rev 42827
run-tests: handle --local before --with-hg
We no longer support them both together, so this is now safe to do. By
checking --local first, we avoid error out about an invalid --with-hg
script if --local was also given (we instead tell the user that the
options are mutually exclusive).
I also had to wrap the 'binpath' we pass to setattr() in _strpath() to
keep `python3 run-tests.py -l` working. That change also made `python3
run-tests.py -l --chg` work, which was the reason for this series.
Differential Revision: https://phab.mercurial-scm.org/D6760
Martin von Zweigbergk <martinvonz@google.com> [Fri, 23 Aug 2019 08:46:49 -0700] rev 42826
run-tests: error out on `--local --with-[c]hg`
I don't see much reason to allow these combinations. You could use
--local and override only one of --with-hg or --with-chg, but I don't
see much practical use for that. It would be easy to work around
anyway by passing both --with-hg and --with-chg. By erroring out, it
makes the code a bit easier to reason about to allow the next few
patches.
Differential Revision: https://phab.mercurial-scm.org/D6759
Matt Harbison <matt_harbison@yahoo.com> [Tue, 20 Aug 2019 18:05:07 -0400] rev 42825
contrib: simplify the genosxversion.py command to find the hg libraries
I forget what problem I ran into while trying to teach the makefile to use a
non-system python. (It might have ben missing hg-evolve and/or keyring, but
`check_output()` was raising an error.) This still isn't great because it will
return non zero for something like the username not being set, even though we
aren't asking for it. But I suppose it's still useful to simplify.
Differential Revision: https://phab.mercurial-scm.org/D6753
Pulkit Goyal <pulkit@yandex-team.ru> [Sun, 18 Aug 2019 02:28:42 +0300] rev 42824
interfaceutil: move to interfaces/
Now that we have a dedicated folder for interfaces, let's move interfaceutil
there.
Differential Revision: https://phab.mercurial-scm.org/D6742
Pulkit Goyal <pulkit@yandex-team.ru> [Sun, 18 Aug 2019 00:45:33 +0300] rev 42823
interfaces: create a new folder for interfaces and move repository.py in it
I was trying to understand current interfaces and write new ones and I realized
we need to improve how current interfaces are organised. This creates a
dedicated folder for defining interfaces and move `repository.py` which defines
all the current interfaces inside it.
Differential Revision: https://phab.mercurial-scm.org/D6741
Martin von Zweigbergk <martinvonz@google.com> [Thu, 22 Aug 2019 16:47:31 -0700] rev 42822
narrow: fix typo "respositories"
Differential Revision: https://phab.mercurial-scm.org/D6758
Augie Fackler <augie@google.com> [Fri, 23 Aug 2019 17:03:42 -0400] rev 42821
merge with stable
Martin von Zweigbergk <martinvonz@google.com> [Wed, 21 Aug 2019 13:14:39 -0700] rev 42820
merge: hint about using `hg resolve` for resolving conflicts
This was suggested by one of our users at Google. Makes sense to me.
Differential Revision: https://phab.mercurial-scm.org/D6755
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 18:28:55 +0900] rev 42819
rust-dirstate: remove test case for DirsMultiset::new(Manifest, Some)
It's no longer possible.
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 18:25:29 +0900] rev 42818
rust-dirstate: split DirsMultiset constructor per input type
Since skip_state only applies to dirstate, it doesn't make sense to unify
these constructors and dispatch by enum.
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 16:33:05 +0900] rev 42817
rust-dirstate: remove excessive clone() of parameter and return value
I think pass-by-ref is preferred in general.
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 18:06:08 +0900] rev 42816
rust-dirstate: handle invalid length of p1/p2 parameters
It uses match syntax since map_err() failed to deduce the argument type.
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 11:37:42 +0900] rev 42815
rust: simply use TryInto to convert slice to array
Since our rust module depends on TryInto, there's no point to avoid using it.
While rewriting copy_into_array(), I noticed CPython interface doesn't check
the length of the p1/p2 values, which is marked as TODO.
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 13:55:05 +0900] rev 42814
rust-dirstate: use PARENT_SIZE constant where appropriate
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 13:27:11 +0900] rev 42813
rust-dirstate: rename NULL_REVISION to NULL_ID which isn't a revision number
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 13:26:04 +0900] rev 42812
rust-dirstate: remove repetition in array literal
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 13:42:30 +0900] rev 42811
rust-dirstate: remove too abstracted way of getting &[u8]
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 11:43:05 +0900] rev 42810
rust-dirstate: remove unneeded "ref"
At 7cae6bc29ff9, .to_owned() was rewritten as .to_owned().to_vec(), which
is no longer needed since the filename is a single reference.
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 12:17:46 +0900] rev 42809
rust-parsers: fix unboxing of PyInt on Python 3
Broken at 7cae6bc29ff9.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Aug 2019 17:12:36 +0200] rev 42808
revlog: split `rawtext` retrieval out of _revisiondata
This part is reasonably independent. Having it on its own clarify the code flow
and will help code that inherit from revlog to overwrite specific area only.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 19 Aug 2019 16:29:43 +0200] rev 42807
revlog: avoid caching raw text too early in _revisiondata
Without this change, we could cache the rawtext without considering for it
validating the cache or not. If the exception raised by the invalid hash were to
be caught and the same revision accessed again, the invalid rawtext would be
returned.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 23:55:01 +0200] rev 42806
revlog: add some documentation to `_revisiondata` code
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 23:52:55 +0200] rev 42805
revlog: move `nullid` early return sooner in `_revisiondata`
Let us deal with the special case before we start dealing with more generic
case.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 23:48:54 +0200] rev 42804
revlog: stop calling `basetext` `rawtext` in _revisiondata
If the cache entry is used as a base test for delta, it is not the rawtext we
need. We update the variable name to clarify this.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 23:46:14 +0200] rev 42803
revlog: assign rawtext earlier in `_revisiondata`
Assigning the revision earlier make the code easier to read.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 19 Aug 2019 16:14:27 +0200] rev 42802
revlog: drop silly `raw` parameter to `rawdata` function
This is a leftover from `revision` and does not make sense for `rawdata`
Martin von Zweigbergk <martinvonz@google.com> [Mon, 19 Aug 2019 10:34:10 -0700] rev 42801
perf: don't depend on pycompat for older Mercurial versions
We already define local _sysstr() and _xrange() for compatibility, so
we should use those.
Also create a similar local _bytestr() corresponding to
pycompat.bytestr().
Differential Revision: https://phab.mercurial-scm.org/D6745
Martin von Zweigbergk <martinvonz@google.com> [Mon, 19 Aug 2019 10:39:13 -0700] rev 42800
perf: don't try to call `util.queue` on Mercurial version before it existed
Differential Revision: https://phab.mercurial-scm.org/D6744
Martin von Zweigbergk <martinvonz@google.com> [Mon, 19 Aug 2019 10:38:38 -0700] rev 42799
perf: handle NameError for `pycompat.foo` when pycompat wasn't imported
On old Mercurial versions, we won't have a pycompat variable defined,
and then `pycompat.foo` will raise a NameError.
Differential Revision: https://phab.mercurial-scm.org/D6743
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:12:07 +0200] rev 42798
rawdata: update callers in shallowbundle
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:11:50 +0200] rev 42797
rawdata: update callers in storageutils
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:11:35 +0200] rev 42796
rawdata: update callers in delta utils
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:11:22 +0200] rev 42795
rawdata: update callers in repository
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:11:12 +0200] rev 42794
rawdata: update callers in testing/storage.py
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 22:41:49 +0200] rev 42793
rawdata: update callers in test-revlog-raw
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:10:43 +0200] rev 42792
rawdata: update callers in lfs' tests
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:10:32 +0200] rev 42791
rawdata: update callers in lfs' wrapper
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:10:24 +0200] rev 42790
rawdata: update caller in wireprotov2server
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:10:08 +0200] rev 42789
rawdata: update callers in debugcommands
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:09:53 +0200] rev 42788
rawdata: update callers in sqlitestore
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 22:35:12 +0200] rev 42787
rawdata: update caller in remotefilelog
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:09:10 +0200] rev 42786
rawdata: update callers in bundlerepo
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:08:35 +0200] rev 42785
rawdata: update callers in context
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Aug 2019 20:08:26 +0200] rev 42784
rawdata: update caller in revlog
We update callers incrementally because this help bisecting failures. This was
useful during development, so we expect it might be useful again in the future.
Augie Fackler <augie@google.com> [Thu, 15 Aug 2019 14:54:39 -0400] rev 42783
setup: fix a sorting issue I noticed in package names
Differential Revision: https://phab.mercurial-scm.org/D6733
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 10:25:04 +0900] rev 42782
py3: do not convert rust module/attribute names to bytes
policy.import*() functions expect system strings.
Yuya Nishihara <yuya@tcha.org> [Sat, 17 Aug 2019 15:43:41 +0900] rev 42781
transplant: unnest --stop case
It should be aligned with --continue.
Yuya Nishihara <yuya@tcha.org> [Fri, 16 Aug 2019 18:34:05 +0900] rev 42780
rust-discovery: use while loop instead of match + break
This looks slightly nicer.
Yuya Nishihara <yuya@tcha.org> [Fri, 16 Aug 2019 18:31:17 +0900] rev 42779
rust-discovery: remove useless extern crate
Taapas Agrawal <taapas2897@gmail.com> [Fri, 26 Jul 2019 01:19:43 +0530] rev 42778
transplant: added support for --stop flag
This adds fuctionality for `--stop` flag to `transplant`.
A new method `stop` is added to `transplanter` class
containing logic to abort transplant.
Tests are updated as shown.
Differential Revision: https://phab.mercurial-scm.org/D6695
Navaneeth Suresh <navaneeths1998@gmail.com> [Thu, 15 Aug 2019 20:43:25 +0530] rev 42777
unshelve: abort on using --keep and --interactive together
I am working on making interactive mode support `--keep` flag. Until
we support the usage of `--interactive` and `--keep` together, let
us abort on it.
Differential Revision: https://phab.mercurial-scm.org/D6699
Navaneeth Suresh <navaneeths1998@gmail.com> [Tue, 20 Aug 2019 18:35:16 +0300] rev 42776
config: add experimental argument to the config registrar
Until now, there are almost 28 config items which are considered as
`experimental` but, not present in the `experimental` section of
the registrar. This patch adds an `experimental` argument to the
config registrar to mark such config items.
Differential Revision: https://phab.mercurial-scm.org/D6728
Differential Revision: https://phab.mercurial-scm.org/D6746
Augie Fackler <augie@google.com> [Wed, 14 Aug 2019 16:11:45 -0400] rev 42775
tests: split joint repo/changelog fake into one for each type
I'm just not comfortable with fakes that get overloaded for multiple
types: too often it gets out of hand and becomes difficult to trace.
Differential Revision: https://phab.mercurial-scm.org/D6725
Danny Hooper <hooper@google.com> [Tue, 13 Aug 2019 14:28:10 -0700] rev 42774
fix: pass line ranges as value instead of callback
The callback no longer takes any arguments from the inner function, so we might
as well call it sooner and pass the value instead. Note the value still needs
to be recomputed every iteration to account for the previous iteration's
changes to the file content.
Differential Revision: https://phab.mercurial-scm.org/D6727
Danny Hooper <hooper@google.com> [Tue, 13 Aug 2019 14:20:48 -0700] rev 42773
fix: correctly parse the :metadata subconfig
It's being handled as a string instead of a bool, though the thruthiness of the
string makes the feature still essentially work. Added a regression test.
Differential Revision: https://phab.mercurial-scm.org/D6726
Danny Hooper <hooper@google.com> [Mon, 12 Aug 2019 16:39:39 -0700] rev 42772
fix: allow tools to use :linerange, but also run if a file is unchanged
The definition of "unchanged" here is subtle, because pure deletion diff hunks
are ignored. That means this is different from using the --whole flag. This
change allows you to configure, for example, a code formatter that:
1. Formats specific line ranges if specified via flags
2. Does not format the entire file when there are no line ranges provided
3. Performs some other kind of formatting regardless of provided line ranges
This sounds a little far fetched, but it is meant to address a specific corner
case encountered in Google's use of the fix extension. The default behavior is
kept because it exists to prevent mistakes that could erase uncommitted
changes.
Differential Revision: https://phab.mercurial-scm.org/D6723
Raphaël Gomès <rgomes@octobus.net> [Wed, 10 Jul 2019 09:57:28 +0200] rev 42771
rust-dirstate: call rust dirstatemap from Python
Since Rust-backed Python classes cannot be used as baseclasses (for
rust-cpython anyway), we use composition rather than inheritance.
This also allows us to keep the IO operations in the Python side, removing
(for now) the need to rewrite VFS in Rust, which would be a heavy undertaking.
Differential Revision: https://phab.mercurial-scm.org/D6634
Raphaël Gomès <rgomes@octobus.net> [Wed, 10 Jul 2019 09:56:53 +0200] rev 42770
rust-dirstate: rust-cpython bridge for dirstatemap
This change also showcases the limitations of the `py_shared_ref!` macro.
See the previous commit 'rust-dirstate: rust implementation of dirstatemap`
for an explanation for the TODOs in the code.
Differential Revision: https://phab.mercurial-scm.org/D6633
Raphaël Gomès <rgomes@octobus.net> [Wed, 10 Jul 2019 09:56:23 +0200] rev 42769
rust-dirstate: rust implementation of dirstatemap
The `dirstatemap` is one of the last building blocks needed to get to a
`dirstate.walk` Rust implementation.
Disclaimer: This change is part of a big (10) series of patches, all of which
started as one big changeset that took a long time to write.
This `dirstatemap` implementation is a compromise in terms of complexity both
for me and for the reviewers. I chose to submit this patch right now because
while it is not perfect, it works and is simple enough (IMHO) to be reviewed.
The Python implementation uses a lot of lazy propertycaches, breaks
encapsulation and is used as an iterator in a lot of places, all of which
dictated the somewhat unidiomatic patterns in this change.
Like written in the comments, rewriting this struct to use the typestate
pattern might be a good idea, but this is a good first step.
Differential Revision: https://phab.mercurial-scm.org/D6632
Raphaël Gomès <rgomes@octobus.net> [Tue, 09 Jul 2019 15:15:54 +0200] rev 42768
rust-cpython: add macro for sharing references
Following an experiment done by Georges Racinet, we now have a working way of
sharing references between Python and Rust. This is needed in many points of
the codebase, for example every time we need to expose an iterator to a
Rust-backed Python class.
In a few words, references are (unsafely) marked as `'static` and coupled
with manual reference counting; we are doing manual borrow-checking.
This changes introduces two declarative macro to help reduce boilerplate.
While it is better than not using macros, they are not perfect. They need to:
- Integrate with the garbage collector for container types (not needed
as of yet), as stated in the docstring
- Allow for leaking multiple attributes at the same time
- Inject the `py_shared_state` data attribute in `py_class`-generated
structs
- Automatically namespace the functions and attributes they generate
For at least the last two points, we will need to write a procedural macro
instead of a declarative one.
While this reference-sharing mechanism is being ironed out I thought it best
not to implement it yet.
Lastly, and implementation detail renders our Rust-backed Python iterators too
strict to be proper drop-in replacements, as will be illustrated in a future
patch: if the data structure referenced by a non-depleted iterator is mutated,
an `AlreadyBorrowed` exception is raised, whereas Python would allow it, only
to raise a `RuntimeError` if `next` is called on said iterator. This will have
to be addressed at some point.
Differential Revision: https://phab.mercurial-scm.org/D6631
Raphaël Gomès <rgomes@octobus.net> [Tue, 09 Jul 2019 14:53:34 +0200] rev 42767
rust-docstrings: add missing module docstrings
Differential Revision: https://phab.mercurial-scm.org/D6630
Raphaël Gomès <rgomes@octobus.net> [Wed, 17 Jul 2019 11:37:43 +0200] rev 42766
rust-dirstate: improve API of `DirsMultiset`
- Use opaque `Iterator` type instead of implementation-specific one from
`HashMap`
- Make `DirsMultiset` behave like a set both in Rust and from Python
Differential Revision: https://phab.mercurial-scm.org/D6690
Raphaël Gomès <rgomes@octobus.net> [Tue, 09 Jul 2019 12:15:09 +0200] rev 42765
rust-dirstate: use EntryState enum instead of literals
This improves code readability quite a bit, while also adding a layer of
safety because we're checking the state byte against the enum.
Differential Revision: https://phab.mercurial-scm.org/D6629
Raphaël Gomès <rgomes@octobus.net> [Tue, 09 Jul 2019 11:49:49 +0200] rev 42764
rust-parsers: switch to parse/pack_dirstate to mutate-on-loop
Both `parse_dirstate` and `pack_dirstate` can operate directly on the data
they're passed, which prevents the creation of intermediate data structures,
simplifies the function signatures and reduces boilerplate. They are exposed
directly to the Python for now, but a later patch will make use of them inside
`hg-core`.
Differential Revision: https://phab.mercurial-scm.org/D6628
Raphaël Gomès <rgomes@octobus.net> [Wed, 10 Jul 2019 10:16:28 +0200] rev 42763
rust-parsers: move parser bindings to their own file and Python module
This tidies up the Rust side while simplifying the Python side.
Differential Revision: https://phab.mercurial-scm.org/D6627
Raphaël Gomès <rgomes@octobus.net> [Mon, 08 Jul 2019 18:01:39 +0200] rev 42762
rust-dirstate: create dirstate submodule in hg-cpython
This module will soon hold multiple files, this change is to make the
review process easier.
Differential Revision: https://phab.mercurial-scm.org/D6626
Georges Racinet <georges.racinet@octobus.net> [Wed, 20 Feb 2019 09:04:54 +0100] rev 42761
rust-discovery: using from Python code
As previously done in other topics, the Rust version is used if it's been
built.
The version fully in Rust of the partialdiscovery class has the performance
advantage over the Python version (actually using the Rust MissingAncestor) if
the undecided set is big enough. Otherwise no sampling occurs, and the
discovery is reasonably fast anyway.
Note: it's hard to predict the size of the initial undecided set, it can
depend on the kind of topological changes between the local and remote graphs.
The point of the Rust version is to make the bad cases acceptable.
More specifically, the performance advantages are:
- faster sampling, especially takefullsample()
- much faster addmissings() in almost all cases (see commit message in
grandparent of the present changeset)
- no conversion cost of the undecided set at the interface between Rust and
Python
== Measurements with big undecided sets
For an extreme example, discovery between mozilla-try and mozilla-unified
(over one million undecided revisions, same case as in dbd0fcca6dfc), we
get roughly a x2.5/x3 better performance:
Growing sample size (5% starting with 200): time goes down from
210 to 72 seconds.
Constant sample size of 200: time down from 1853 to 659 seconds.
With a sample size computed from number of roots and heads of the
undecided set (`respectsize` is `False`), here are perfdiscovery results:
Before ! wall 9.358729 comb 9.360000 user 9.310000 sys 0.050000 (median of 50)
After ! wall 3.793819 comb 3.790000 user 3.750000 sys 0.040000 (median of 50)
In that later case, the sample sizes are routinely in the hundreds of
thousands of revisions. While still faster, the Rust iteration in
addmissings has less of an advantage than with smaller sample sizes, but
one sees addcommons becoming faster, probably a consequence of not having
to copy big sets back and forth.
This example is not a goal in itself, but it showcases several different
areas in which the process can become slow, due to different factors, and
how this full Rust version can help.
== Measurements with small undecided sets
In cases the undecided set is small enough than no sampling occurs,
the Rust version has a disadvantage at init if `targetheads` is really big
(some time is lost in the translation to Rust data structures),
and that is compensated by the faster `addmissings()`.
On a private repository with over one million commits, we still get a minor
improvement, of 6.8%:
Before ! wall 0.593585 comb 0.590000 user 0.550000 sys 0.040000 (median of 50)
After ! wall 0.553035 comb 0.550000 user 0.520000 sys 0.030000 (median of 50)
What's interesting in that case is the first addinfo() at 180ms for Rust and
233ms for Python+C, mostly due to add_missings and the children cache
computation being done in less than 0.2ms on the Rust side vs over 40ms on the
Python side.
The worst case we have on hand is with mozilla-try, prepared with
discovery-helper.sh for 10 heads and depth 10, time goes up 2.2% on the median.
In this case `targetheads` is really huge with 165842 server heads.
Before ! wall 0.823884 comb 0.810000 user 0.790000 sys 0.020000 (median of 50)
After ! wall 0.842607 comb 0.840000 user 0.800000 sys 0.040000 (median of 50)
If that would be considered a problem, more adjustments can be made, which are
prematurate at this stage: cooking special variants of methods of the inner
MissingAncestors object, retrieving local heads directly from Rust to avoid
the cost of conversion. Effort would probably be better spent at this point
improving the surroundings if needed.
Here's another data point with a smaller repository, pypy, where performance
is almost identical
Before ! wall 0.015121 comb 0.030000 user 0.020000 sys 0.010000 (median of 186)
After ! wall 0.015009 comb 0.010000 user 0.010000 sys 0.000000 (median of 184)
Differential Revision: https://phab.mercurial-scm.org/D6430
Georges Racinet on percheron.racinet.fr <georges@racinet.fr> [Tue, 21 May 2019 12:46:38 +0200] rev 42760
rust-discovery: optimization of add commons/missings for empty arguments
These two cases have to be catched early for different reasons.
In the case of add_missing_revisions, we don't want to trigger
the computation of the undecided set (and the children cache)
too early: the later the better.
In the case of add_common_revisions, the inner `MissingAncestors`
object wouldn't know that all ancestors of its bases have already
been removed from the undecided.
In principle, that would in itself be a lead for further
improvement: this remove_ancestors_from could be more incremental,
but the current performance seems to be good enough.
Differential Revision: https://phab.mercurial-scm.org/D6429
Georges Racinet <georges.racinet@octobus.net> [Tue, 16 Apr 2019 01:16:39 +0200] rev 42759
rust-discovery: using the children cache in add_missing
The DAG range computation often needs to get back to very old
revisions, and turns out to be disproportionately long, given
that the end goal is to remove the descendents of the given
missing revisons from the undecided set.
The fast iteration capabilities available in the Rust case make
it possible to avoid the DAG range entirely, at the cost of
precomputing the children cache, and to simply iterate on
children of the given missing revisions.
This is a case where staying on the same side of the interface
between the two languages has clear benefits.
On discoveries with initial undecided sets
small enough to bypass sampling entirely, the total cost of
computing the children cache and the subsequent iteration
becomes better than the Python + C counterpart, which relies on
reachableroots2.
For example, on a repo with more than one million revisions with
an initial undecided set of 11 elements, we get these figures:
Rust version with simple iteration
addcommons: 57.287us
first undecided computation: 184.278334ms
first children cache computation: 131.056us
addmissings iteration: 42.766us
first addinfo total: 185.24 ms
Python + C version
first addcommons: 0.29 ms
addcommons 0.21 ms
first undecided computation 191.35 ms
addmissings 45.75 ms
first addinfo total: 237.77 ms
On discoveries with large undecided sets, the initial price paid
makes the first addinfo slower than the Python + C version,
but that's more than compensated by the gain in sampling and
subsequent iterations.
Here's an extreme example with an undecided set of a million revisions:
Rust version:
first undecided computation: 293.842629ms
first children cache computation: 407.911297ms
addmissings iteration: 34.312869ms
first addinfo total: 776.02 ms
taking initial sample
query 2: sampling time: 1318.38 ms
query 2; still undecided: 1005013, sample size is: 200
addmissings: 143.062us
Python + C version:
first undecided computation 298.13 ms
addmissings 80.13 ms
first addinfo total: 399.62 ms
taking initial sample
query 2: sampling time: 3957.23 ms
query 2; still undecided: 1005013, sample size is: 200
addmissings 52.88 ms
Differential Revision: https://phab.mercurial-scm.org/D6428
Georges Racinet <georges.racinet@octobus.net> [Tue, 21 May 2019 17:44:15 +0200] rev 42758
discovery: new devel.discovery.randomize option
By default, this is True, but setting it to False is a uniform
way to kill all randomness in integration tests such as test-setdiscovery.t
By "uniform" we mean that it can be passed to implementations in other
languages, for which the monkey-patching of random.sample would be
irrelevant.
In the above mentioned test file, we use it right away,
replacing the adhoc extension that had the same purpose, and to derandomize a
case with many round-trips, that we'll need to behave uniformly in the Rust
version.
Differential Revision: https://phab.mercurial-scm.org/D6427
Georges Racinet <georges.racinet@octobus.net> [Tue, 21 May 2019 17:43:55 +0200] rev 42757
rust-discovery: optionally don't randomize at all, for tests
As seen from Python, this is a new `randomize` kwarg in init of the
discovery object. It replaces random picking by some arbitrary yet
deterministic strategy.
This is the same as what test-setdiscovery.t does, with the added
benefit to be usable both in Python and Rust implementations.
Differential Revision: https://phab.mercurial-scm.org/D6426
Georges Racinet <georges.racinet@octobus.net> [Fri, 17 May 2019 01:56:57 +0200] rev 42756
rust-discovery: exposing sampling to python
Differential Revision: https://phab.mercurial-scm.org/D6425
Georges Racinet <georges.racinet@octobus.net> [Fri, 17 May 2019 01:56:57 +0200] rev 42755
rust-discovery: takefullsample() core implementation
take_full_sample() browses the undecided set in both directions: from
its roots as well as from its heads.
Following what's done on the Python side, we alter update_sample()
signature to take a closure returning an iterator: either ParentsIterator
or an iterator over the children found in `children_cache`. These constructs
should probably be split off in a separate module.
This is a first concrete example where a more abstract graph notion (probably
a trait) would be useful, as this is nothing but an operation on the reversed
DAG.
A similar motivation in the context of the discovery
process would be to replace the call to dagops::range in
`add_missing_revisions()` with a simple iteration over descendents, again an
operation on the reversed graph.
Differential Revision: https://phab.mercurial-scm.org/D6424
Georges Racinet <georges.racinet@octobus.net> [Fri, 17 May 2019 01:56:56 +0200] rev 42754
rust-discovery: core implementation for take_quick_sample()
This makes in particular `rand` no longer a testing dependency.
We keep a seedable random generator on the `PartialDiscovery` object
itself, to avoid lengthy initialization.
In take_quick_sample() itself, we had to avoid keeping the reference
to `self.undecided` to cope with the mutable reference introduced
by the the call to `limit_sample`, but it's still manageable without
resorting to inner mutability.
Sampling being prone to be improved in the mid-term future, testing
is minimal, amounting to checking which code path got executed.
Differential Revision: https://phab.mercurial-scm.org/D6423
Georges Racinet <georges.racinet@octobus.net> [Wed, 12 Jun 2019 14:31:41 +0100] rev 42753
rust-discovery: read the index from a repo passed at init
This makes the API of the Rust PartialDiscovery object now
the same (or rather a subset) of the Python object, hence
easier to control through module policy down the road.
Differential Revision: https://phab.mercurial-scm.org/D6517
Georges Racinet <georges.racinet@octobus.net> [Wed, 12 Jun 2019 14:18:12 +0100] rev 42752
rust-discovery: accept the new 'respectsize' init arg
At this stage, we don't do anything about it: it will be meaningful
in sampling methods that aren't implemented yet.
Differential Revision: https://phab.mercurial-scm.org/D6516
Yuya Nishihara <yuya@tcha.org> [Wed, 14 Aug 2019 09:22:54 +0900] rev 42751
merge with stable
Navaneeth Suresh <navaneeths1998@gmail.com> [Tue, 13 Aug 2019 22:48:05 +0530] rev 42750
unshelve: forget unknown files after a partial unshelve
This is a follow-up patch to 6957f7b93e03. This allows hg to forget
unknown files after a partial unshelve.
Differential Revision: https://phab.mercurial-scm.org/D6724
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Aug 2019 01:59:43 +0200] rev 42749
flagutil: move addflagprocessor to the new module (API)
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Aug 2019 01:25:37 +0200] rev 42748
flagutil: move insertflagprocessor to the new module (API)
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Aug 2019 01:28:34 +0200] rev 42747
flagutil: move REVIDX_KNOWN_FLAGS source of truth in flagutil (API)
Since REVIDX_KNOWN_FLAGS is "not really a constant" (extension can update it)
and python integer,... it needs to be the responsability of a single module and
always accessed through the module. We update all the user and move the source
of truth in flagutil.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Aug 2019 01:04:48 +0200] rev 42746
flagutil: move the `flagprocessors` mapping in the new module
This module is meant to host most of the flag processing logic. We start with
the mapping between flag and processors.