Simon Sapin <simon.sapin@octobus.net> [Fri, 28 May 2021 11:48:59 +0200] rev 47349
dirstate-v2: Skip readdir in status based on directory mtime
When calling `read_dir` during `status` and the directory is found to be
eligible for caching (see code comments), write the directory’s mtime to the
dirstate. The presence of a directory mtime in the dirstate is meaningful
and indicates eligibility.
When an eligible directory mtime is found in the dirstate and `stat()` shows
that the mtime has not changed, `status` can skip calling `read_dir` again
and instead rely on the names of child nodes in the dirstate tree.
The `tempfile` crate is used to create a temporary file in order to use its
modification time as "current time" with the same truncation as other files
and directories would have in their own modification time.
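As a rough illustration only (Python rather than the Rust used by the actual change), the mtime check and the temporary-file "current time" trick can be sketched as below. The names `filesystem_now_ns` and `list_children` are hypothetical, and the eligibility rule itself, which lives in the code comments, is not modeled here.

```python
import os
import tempfile


def filesystem_now_ns(scratch_dir):
    """Return the mtime of a freshly created temporary file, in nanoseconds.

    The value goes through the same filesystem truncation as any other
    mtime, so it can be compared with file and directory mtimes directly.
    """
    with tempfile.NamedTemporaryFile(dir=scratch_dir) as tmp:
        return os.fstat(tmp.fileno()).st_mtime_ns


def list_children(path, cached_mtime_ns, cached_names):
    """Return (names, used_cache) for the directory at `path`.

    `cached_mtime_ns` is None when no mtime was recorded for the directory,
    which in this sketch stands for "not eligible for caching".
    """
    current_mtime_ns = os.stat(path).st_mtime_ns
    if cached_mtime_ns is not None and cached_mtime_ns == current_mtime_ns:
        # mtime unchanged: trust the recorded child names, skip the listing.
        return sorted(cached_names), True
    # No usable cache: fall back to actually reading the directory.
    return sorted(os.listdir(path)), False
```

In this sketch the scratch directory should be a different directory from the one being listed, since creating a file inside a directory updates that directory's own mtime.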
Differential Revision: https://phab.mercurial-scm.org/D10826
Simon Sapin <simon.sapin@octobus.net> [Thu, 27 May 2021 18:40:54 +0200] rev 47348
dirstate-v2: Allow tree nodes without an entry to store a timestamp
Timestamps are stored on 96 bits:
* 64 bits for the signed number of seconds since the Unix epoch
* 32 bits for the nanoseconds in the `0 <= ns < 1_000_000_000` range
For now, timestamps are neither set nor used.
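A minimal sketch of such a 96-bit value, written in Python for illustration. The byte order (big-endian) and the field order (seconds first) are assumptions of this example, not something stated above, and the helper names are hypothetical.

```python
import struct

# Assumed layout for illustration: big-endian, signed 64-bit seconds followed
# by unsigned 32-bit nanoseconds. 8 + 4 bytes = 96 bits.
TIMESTAMP = struct.Struct(">qI")
NANOS_PER_SECOND = 1_000_000_000


def pack_timestamp(seconds, nanoseconds):
    """Encode (seconds since the Unix epoch, nanoseconds) as 12 bytes."""
    if not 0 <= nanoseconds < NANOS_PER_SECOND:
        raise ValueError("nanoseconds must satisfy 0 <= ns < 1_000_000_000")
    return TIMESTAMP.pack(seconds, nanoseconds)


def unpack_timestamp(data):
    """Decode 12 bytes back into (seconds, nanoseconds)."""
    seconds, nanoseconds = TIMESTAMP.unpack(data)
    if nanoseconds >= NANOS_PER_SECOND:
        raise ValueError("stored nanoseconds are out of range")
    return seconds, nanoseconds
```

Because the seconds field is signed, times before the Unix epoch are representable with a negative seconds value and a non-negative nanoseconds value.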
Differential Revision: https://phab.mercurial-scm.org/D10825
Simon Sapin <simon.sapin@octobus.net> [Fri, 28 May 2021 20:07:27 +0200] rev 47347
dirstate-tree: Change status() results to not borrow DirstateMap
The `status` function takes a `&'tree mut DirstateMap<'on_disk>` parameter.
`'on_disk` borrows a read-only byte buffer with the contents of the
`.hg/dirstate` file. `DirstateMap` internally represents file paths as
`std::borrow::Cow<'on_disk, HgPath>`, which borrows the byte buffer when
possible and allocates an owned string if not, such as for files added to the
dirstate after it was loaded from disk.
Previously the return type of `status` had a `'tree` lifetime, meaning it
could borrow all paths from the `DirstateMap`. With this changeset, that
lifetime is changed to `'on_disk` meaning that only paths from the byte buffer
can be borrowed, and paths allocated by `DirstateMap` must be copied.
Usually most paths are in the byte buffer, and most paths are not part of the
return value of `status`, so the number of extra copies should be small.
This change will enable `status` to mutate the `DirstateMap` after it has
finished constructing its return value. Previously such mutation would be
prevented by possible on-going borrows.
Differential Revision: https://phab.mercurial-scm.org/D10824
Simon Sapin <simon.sapin@octobus.net> [Fri, 28 May 2021 12:16:14 +0200] rev 47346
dirstate-tree: Fix status algorithm with unreadable directory
When reading a directory fails, for example because of insufficient permissions,
it should be treated as empty by status instead of skipped entirely.
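The intended behavior can be sketched in Python as follows. This is an illustration of the idea, not the Rust implementation, and both function names are hypothetical: a failed listing yields no children, so entries tracked under that directory are still reported instead of being silently skipped.

```python
import os


def read_dir_or_empty(path):
    """List a directory, treating any failure to read it as "empty"."""
    try:
        return sorted(os.listdir(path))
    except OSError:
        # Covers insufficient permissions as well as other read failures.
        return []


def missing_tracked(path, tracked_names):
    """Return the tracked names that are not found in the directory."""
    present = set(read_dir_or_empty(path))
    return [name for name in tracked_names if name not in present]
```

With this shape, an unreadable directory that is expected to contain `a` and `b` reports both as missing, which is what treating it as empty means here.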
Differential Revision: https://phab.mercurial-scm.org/D10823
Martin von Zweigbergk <martinvonz@google.com> [Tue, 25 May 2021 16:46:32 -0700] rev 47345
docket: make compatible with py3.6, where Struct.format is bytes
Differential Revision: https://phab.mercurial-scm.org/D10770
Mathias De Mare <mathias.de_mare@nokia.com> [Tue, 15 Jun 2021 09:06:12 +0200] rev 47344
packaging: disable rust extensions again on CentOS
Backed out changeset eccbfa7e19c0
We're seeing (very rarely) crashes of 'hg purge' on some of our machines
(see https://bz.mercurial-scm.org/show_bug.cgi?id=6509 ).
Unfortunately, I haven't been able to find out much more about
what is going wrong.
To avoid further impact on our users and CI,
I would prefer to disable the rust extensions for now.
Differential Revision: https://phab.mercurial-scm.org/D10877
Georges Racinet <georges.racinet@octobus.net> [Sun, 06 Jun 2021 01:24:30 +0200] rev 47343
cext: fix memory leak in phases computation
Without this, a buffer whose size in bytes is the number of
changesets in the repository is leaked each time the repository is
opened and changeset phases are computed.
Impact: the current code in hgwebdir creates a new `localrepository`
instance for each HTTP request. Since any pull or push is made of several
requests, a team of 100 people can easily produce thousands of such
requests per day.
Being a low-level malloc, this leak can't be seen with the gc module and
tools relying on that, but was spotted by valgrind immediately.
Reproduction
------------
    for i in range(cl_args.iterations):
        repo = hg.repository(baseui, repo_path)
        rev = repo.revs(rev).first()
        ctx = repo[rev]
        del ctx
        del repo
        # avoid any pollution by other type of leak
        # (that should be fixed in 5.8)
        repoview._filteredrepotypes.clear()
        gc.collect()
Measurements
------------
Resident Set Size (RSS), taken on a clone of
mozilla-central for performance analysis (440 000
changesets).
before:
5.8+hg19.5ac0f2a8ba72 1000 iterations: 1606MB
5.8+hg19.5ac0f2a8ba72 10000 iterations: 5723MB
after:
5.8+hg20.e2084d39e145 1000 iterations: 555MB
5.8+hg20.e2084d39e145 10000 iterations: 555MB
(double checked, not a copy/paste error)
(e2084d39e14 is the present changeset, before amendment
of the message to add the measurements)
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 30 May 2021 22:12:48 +0200] rev 47342
revlogv2: make sure bundling picks a compatible bundle format
Before this change, revlog-v2 repositories were bundled using the incompatible
"v1" format.
Differential Revision: https://phab.mercurial-scm.org/D10802
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 30 May 2021 20:42:51 +0200] rev 47341
censor: do not process sidedata of censored revision while bundling
The revision is censored, so we should ignore it.
Differential Revision: https://phab.mercurial-scm.org/D10801
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 28 May 2021 20:00:27 +0200] rev 47340
changegroup: fix deltachunk API to be consistent from one class to another
Depending on the subclass, the 8th index of `chunkdata` items was either a
sidedata dict or a proto_flags integer. We have now fixed the inconsistency, and
`deltaiter` already returns fixed "delta" items.
Differential Revision: https://phab.mercurial-scm.org/D10778
Augie Fackler <augie@google.com> [Thu, 27 May 2021 12:10:59 -0400] rev 47339
fuzz: add hg to sys.path when constructing mpatch corpus
Differential Revision: https://phab.mercurial-scm.org/D10777
Simon Sapin <simon.sapin@octobus.net> [Wed, 19 May 2021 16:18:16 +0200] rev 47338
dirstate-tree: Skip readdir() in `hg status -mard`
When running the status algorithm in a mode where we don’t list unknown
or ignored files, all we care about are files that are listed in the dirstate.
We can therefore skip making expensive calls to readdir() to list the contents
of filesystem directories, and instead only run stat() to get the filesystem
state of files listed in the dirstate. (This state may be an error for files
that don’t exist anymore on the filesystem.)
On 16 CPU threads, this reduces the time spent in the `status()` function for
`hg status -mard` on an old snapshot of mozilla-central from ~70ms to ~50ms.
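The idea can be sketched in Python as below, for illustration only. The function name `stat_tracked` and the flat list of tracked paths are assumptions of this example, and the real implementation walks the dirstate tree in Rust. Each tracked path gets one stat call and no directory is listed.

```python
import os


def stat_tracked(root, tracked_paths):
    """Map each tracked path to its lstat result, or to the OSError raised.

    Only paths already known to the dirstate are examined, so unknown and
    ignored files are never discovered (and never need to be).
    """
    results = {}
    for rel_path in tracked_paths:
        try:
            results[rel_path] = os.lstat(os.path.join(root, rel_path))
        except OSError as err:
            # For example, the file no longer exists on the filesystem.
            results[rel_path] = err
    return results
```

A caller would then classify each entry from its stat result, with an error result typically meaning the file was deleted.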
Differential Revision: https://phab.mercurial-scm.org/D10752