Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 16:18:16 +0200] rev 47346
dirstate-tree: Skip readdir() in `hg status -mard`
When running the status algorithm in a mode where we don’t list unknown
or ignored files, all we care about are files that are listed in the dirstate.
We can therefore skip making expensive calls to readdir() to list the contents
of filesystem directories, and instead only run stat() to get the filesystem
state of files listed in the dirstate. (This state may be an error for files
that don’t exist anymore on the filesystem.)
On 16 CPU threads, this reduces the time spent in the `status()` function for
`hg status -mard` on an old snapshot of mozilla-central from ~70ms to ~50ms.
Differential Revision: https://phab.mercurial-scm.org/D10752
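A minimal sketch of the idea in the changeset above, not the actual Mercurial code: when only tracked files matter, each tracked path can be passed to stat() directly, and no directory is ever listed. The helper name `stat_tracked` and its signature are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not a Mercurial API): stat only the tracked paths,
/// without calling readdir() on any directory.
///
/// Each result may be an error, for example when a tracked file no longer
/// exists on the filesystem.
fn stat_tracked(
    root: &Path,
    tracked: &[&str],
) -> Vec<(PathBuf, io::Result<fs::Metadata>)> {
    tracked
        .iter()
        .map(|relative| {
            let full = root.join(relative);
            // symlink_metadata() does not follow symlinks (lstat-like behavior).
            let metadata = fs::symlink_metadata(&full);
            (full, metadata)
        })
        .collect()
}
```

The cost is one stat() call per tracked file, independent of how many untracked or ignored files sit next to them.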
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 13:15:00 +0200] rev 47345
dirstate-v2: Parse the dirstate lazily, with copy-on-write nodes
TODO: more description
Differential Revision: https://phab.mercurial-scm.org/D10751
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 13:15:00 +0200] rev 47344
dirstate-v2: Make the dirstate bytes buffer available in more places
Differential Revision: https://phab.mercurial-scm.org/D10750
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 13:15:00 +0200] rev 47343
dirstate-v2: Make more APIs fallible, returning Result
When parsing becomes lazy, parse errors will potentially happen in more places.
This propagates such errors to callers.
Differential Revision: https://phab.mercurial-scm.org/D10749
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 13:15:00 +0200] rev 47342
dirstate-v2: Add a zero-size error type for dirstate v2 parse errors
This error should only happen if Mercurial is buggy or the file is corrupted.
It indicates for example that:
* A part of the file refers to another part, and the byte offset or item count
would cause reading out of bounds, beyond the end of the file.
* The byte for an entry state has an invalid value
When parsing becomes lazy, many more functions will return a `Result` with
this error. Making it zero-size reduces the work that the `?` operator needs
to do to pass around the error value.
Differential Revision: https://phab.mercurial-scm.org/D10748
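A small sketch of what a zero-size parse error can look like, assuming a unit struct. The names `ParseError`, `read_slice`, `parse_state` and `first_state` are hypothetical, and the set of accepted state bytes is an assumption made for illustration.

```rust
use std::mem::size_of;

/// Hypothetical zero-size error type: a unit struct carries no data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ParseError;

/// Bounds-checked read: fails instead of reading beyond the end of the buffer.
fn read_slice(buf: &[u8], offset: usize, len: usize) -> Result<&[u8], ParseError> {
    let end = offset.checked_add(len).ok_or(ParseError)?;
    buf.get(offset..end).ok_or(ParseError)
}

/// Validate an entry state byte (accepted values are assumed for this sketch).
fn parse_state(byte: u8) -> Result<char, ParseError> {
    match byte {
        b'n' | b'a' | b'r' | b'm' => Ok(byte as char),
        _ => Err(ParseError),
    }
}

/// Both failure modes propagate through `?` with nothing to copy around.
fn first_state(buf: &[u8], offset: usize) -> Result<char, ParseError> {
    let bytes = read_slice(buf, offset, 1)?;
    parse_state(bytes[0])
}

/// Size of the error type in bytes.
fn error_size() -> usize {
    size_of::<ParseError>()
}
```

Because the error has no payload, a `Result<T, ParseError>` only needs to record whether the call succeeded.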
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 13:15:00 +0200] rev 47341
dirstate-tree: Add `NodeRef` and `ChildNodesRef` enums
They are used instead of `&Node` and `&ChildNodes` respectively.
The `ChildNodes` type alias also becomes a similar enum.
For now they only have one variant each, to be extended later.
Adding enums now forces various use sites to go through new methods
instead of manipulating the underlying data structure directly.
Differential Revision: https://phab.mercurial-scm.org/D10747
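A simplified sketch of the pattern described in the changeset above, assuming single-variant enums wrapped around a `HashMap`. The field and method names here are illustrative and are not copied from the Mercurial source.

```rust
use std::collections::HashMap;

/// Simplified tree node (fields are assumptions for this sketch).
struct Node {
    entry: Option<u32>,
    children: ChildNodes,
}

/// Owned children: a single variant for now, more can be added later.
enum ChildNodes {
    InMemory(HashMap<String, Node>),
}

/// Borrowed view used instead of `&ChildNodes`.
#[derive(Clone, Copy)]
enum ChildNodesRef<'a> {
    InMemory(&'a HashMap<String, Node>),
}

/// Borrowed view used instead of `&Node`.
#[derive(Clone, Copy)]
enum NodeRef<'a> {
    InMemory(&'a str, &'a Node),
}

impl ChildNodes {
    fn as_ref(&self) -> ChildNodesRef<'_> {
        match self {
            ChildNodes::InMemory(map) => ChildNodesRef::InMemory(map),
        }
    }
}

impl<'a> ChildNodesRef<'a> {
    fn len(self) -> usize {
        match self {
            ChildNodesRef::InMemory(map) => map.len(),
        }
    }

    fn get(self, name: &str) -> Option<NodeRef<'a>> {
        match self {
            ChildNodesRef::InMemory(map) => map
                .get_key_value(name)
                .map(|(key, node)| NodeRef::InMemory(key.as_str(), node)),
        }
    }
}

impl<'a> NodeRef<'a> {
    fn name(self) -> &'a str {
        match self {
            NodeRef::InMemory(name, _) => name,
        }
    }

    fn has_entry(self) -> bool {
        match self {
            NodeRef::InMemory(_, node) => node.entry.is_some(),
        }
    }

    fn children(self) -> ChildNodesRef<'a> {
        match self {
            NodeRef::InMemory(_, node) => node.children.as_ref(),
        }
    }
}
```

Callers only see the methods, so a second variant can be added later by extending each `match`, without touching the use sites.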
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 13:15:00 +0200] rev 47340
rust: Return owned instead of borrowed DirstateEntry in DirstateMap APIs
This will enable the tree-based DirstateMap to not always have an actual
DirstateEntry in memory for all nodes, but construct it on demand.
Differential Revision: https://phab.mercurial-scm.org/D10746
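To illustrate why an owned return value helps, here is a hedged sketch in which a node either holds an entry in memory or builds one on demand from raw bytes. The two-field `DirstateEntry` and the 5-byte layout are invented for this example and do not reflect the real dirstate format.

```rust
/// Simplified entry; the real DirstateEntry has more fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DirstateEntry {
    state: u8,
    size: i32,
}

/// A node either owns an entry or points at raw bytes.
/// The layout (1 state byte + 4 big-endian size bytes) is hypothetical.
enum Node<'on_disk> {
    InMemory(DirstateEntry),
    OnDisk(&'on_disk [u8; 5]),
}

impl Node<'_> {
    /// Returning an owned value works for both variants. Returning
    /// `&DirstateEntry` would require an entry to already exist in memory.
    fn entry(&self) -> DirstateEntry {
        match self {
            Node::InMemory(entry) => *entry,
            Node::OnDisk(bytes) => DirstateEntry {
                state: bytes[0],
                size: i32::from_be_bytes([bytes[1], bytes[2], bytes[3], bytes[4]]),
            },
        }
    }
}
```

Since the entry is small and `Copy`, returning it by value costs little even in the in-memory case.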
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 13:15:00 +0200] rev 47339
dirstate-tree: Downgrade `&mut Node` to `&Node` in status and serialization
Mutable access is not used, and upcoming changes will make it more costly
(with copy-on-write nodes that can be read from disk representation)
Differential Revision: https://phab.mercurial-scm.org/D10745
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 13:15:00 +0200] rev 47338
dirstate-tree: Remove DirstateMap::iter_node_data_mut
In an upcoming changeset we want DirstateMap to be able to work directly
with nodes in their "on disk" representation, without always allocating
corresponding in-memory data structures. Nodes would have two possible
representations: one immutable "on disk" referring to the bytes buffer
of the contents of the .hg/dirstate file, and one mutable with HashMap
like the current data structure.
These nodes would have copy-on-write semantics: when an immutable node
would need to be mutated, instead we allocate a new mutable node for it and
its ancestors.
A mutable iterator of the entire tree would still be possible, but it would
become much more expensive since we’d need to allocate mutable nodes for
everything.
Instead, remove this iterator. It was only used to clear ambiguous mtimes
while serializing the `DirstateMap`. Instead clearing and serialization are
now two separate passes. Clearing first uses an immutable iterator to collect
the paths of nodes that need to be cleared, then accesses only those nodes
mutably.
Differential Revision: https://phab.mercurial-scm.org/D10744
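The two-pass clearing described above can be sketched as follows, using a flat `HashMap` in place of the tree to keep the example short. The `Entry` type, the `MTIME_UNSET` sentinel and the rule that an mtime equal to `now` counts as ambiguous are assumptions made for this illustration.

```rust
use std::collections::HashMap;

/// Simplified entry that only tracks an mtime.
#[derive(Debug, PartialEq)]
struct Entry {
    mtime: i64,
}

/// Sentinel meaning "mtime not recorded" (assumed value for this sketch).
const MTIME_UNSET: i64 = -1;

/// Clear ambiguous mtimes in two passes and return how many were cleared.
fn clear_ambiguous_mtimes(map: &mut HashMap<String, Entry>, now: i64) -> usize {
    // Pass 1: immutable iteration that only collects the affected paths.
    let ambiguous: Vec<String> = map
        .iter()
        .filter(|(_, entry)| entry.mtime == now)
        .map(|(path, _)| path.clone())
        .collect();

    // Pass 2: mutable access limited to the collected paths.
    for path in &ambiguous {
        if let Some(entry) = map.get_mut(path) {
            entry.mtime = MTIME_UNSET;
        }
    }
    ambiguous.len()
}
```

With copy-on-write nodes, the second pass would be the only step that allocates mutable nodes, and only for the paths that actually change.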
Matt Harbison <matt_harbison@yahoo.com> [Fri, 28 May 2021 17:33:20 -0400] rev 47337
merge with stable
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 26 May 2021 21:46:45 +0200] rev 47336
revlog: close the index file handle after the data one
This makes sure the data file is flushed before the index, preventing the index
from referencing unflushed data.
Differential Revision: https://phab.mercurial-scm.org/D10776
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 26 May 2021 21:35:51 +0200] rev 47335
revlog: simplify the try nesting in the `_writing` context
Let's use a single try, with conditional cleanup. This makes it easier to add a
file handle dedicated to sidedata.
Differential Revision: https://phab.mercurial-scm.org/D10775
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 20 May 2021 21:54:21 +0200] rev 47334
revlogv2: add a `get_data` helper to grab the next piece of docket
This makes the processing more compact by abstracting repetitive processing
away.
Differential Revision: https://phab.mercurial-scm.org/D10774
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 20 May 2021 21:48:53 +0200] rev 47333
revlogv2: simplify and clarify the processing of each entry
As we add more entries and some of them have non-trivial processing, it seems
useful to make the processing leaner and clearly separated to simplify future
patches.
Differential Revision: https://phab.mercurial-scm.org/D10773