Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 25 Sep 2020 15:05:08 +0200] rev 45640
copies: directly pass a changes object to the copy tracing code
The object contains all the data we need. For example, the `is_merged` callback
can now use the associated precomputed data.
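As a rough illustration of the idea above, the sketch below shows an `is_merged` callback that answers from data already carried by a per-revision changes object instead of recomputing it. The `Changes` class, its `merged` attribute and the `make_is_merged` helper are hypothetical names chosen for this example; they are not taken from the Mercurial code base.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Changes:
    """Hypothetical stand-in for a per-revision "changes" object."""

    # Files recorded as merged in this revision (assumed precomputed).
    merged: frozenset = field(default_factory=frozenset)


def make_is_merged(changes_by_rev):
    """Build an is_merged(rev, path) callback backed by precomputed data."""

    def is_merged(rev, path):
        changes = changes_by_rev.get(rev)
        # No lookup in the revision itself: the answer is already stored.
        return changes is not None and path in changes.merged

    return is_merged


# Illustrative data only: revision 2 merged a single file.
changes_by_rev = {2: Changes(merged=frozenset({b"a.txt"}))}
is_merged = make_is_merged(changes_by_rev)
```

Passing the object around keeps the callback a plain set-membership test, which is the kind of simplification the message describes.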
This will be useful again soon, when the `salvaged` set is introduced to
solve the issue with deleted files reverted during a merge. See
4b582a93316a and
14be07d5603c for details.
Differential Revision: https://phab.mercurial-scm.org/D9117
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 25 Sep 2020 14:54:43 +0200] rev 45639
copies: no longer change the sidedata flag
With the new sidedata storage that includes data about all file changes, every
revision has one, so the sidedata flag is no longer a good way to spot
changesets with copy information. So we drop this check to simplify the code.
The optimisation itself provided an interesting speedup, so we will likely
reintroduce something similar, with a dedicated flag, in the future.
Differential Revision: https://phab.mercurial-scm.org/D9116
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 25 Sep 2020 14:52:34 +0200] rev 45638
copies: use dedicated `_revinfo_getter` function and call
We want to return data in a different form, so we need a different revinfo
function. At that point it makes sense to have a different getter.
Differential Revision: https://phab.mercurial-scm.org/D9115
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 25 Sep 2020 14:39:04 +0200] rev 45637
copies: make two version of the changeset centric algorithm
There are two main ways to run the changeset-centric copy-tracing algorithm: one
fed from data stored in side-data, still in development, and one based on
data stored in extra (with a "compatibility" mode).
The `extra`-based one is used in production at Google, but is still experimental in
the code. It is mostly unsuitable for other users because it affects the hash.
The side-data based storage and algorithm have been evolving to store more data,
cover more cases (mostly around merges, which Google does not really care about)
and use lower level storage for efficiency.
All these changes make it increasingly hard to maintain a common code base
without impacting code complexity and performance. For example, the
compatibility mode requires keeping things at a different level than what we
need for side-data.
So, I am duplicating the involved functions. The newly added `_extra` variants
will be kept as today, while I will do some deeper rework of the side data
versions.
Long term, the side-data version should be more featureful and performant than
the extra based version, so I expect the duplicated `_extra` functions to
eventually get dropped.
Differential Revision: https://phab.mercurial-scm.org/D9114
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 15 Sep 2020 10:55:30 +0200] rev 45636
changing-files: retrieve changelogrevision.files from the sidedata block
The `files` field is known to have issues; using a list with a fixed, and fixable,
computation can only help. For example, using a fixed `files` field would be
enough to fix
issue6219 once this feature gets usable in production.
We focus on having things working for now; we will deal with performance later.
Right now we have an ironic situation where we parse sorted values from disk,
turn them into a set, and then have to sort them again.
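The round trip described above can be sketched as follows. The newline-separated encoding and the function names are assumptions made for this example only; the real side-data block uses its own binary layout.

```python
def parse_files(block):
    """Decode a newline-separated list of filenames (assumed already sorted)."""
    return block.split(b"\n") if block else []


def files_via_set(block):
    # The situation described above: sorted data goes into a set,
    # which loses the ordering, so it has to be sorted again.
    return sorted(set(parse_files(block)))


def files_direct(block):
    # A possible later optimisation: trust the on-disk ordering.
    return parse_files(block)


# Illustrative payload only.
block = b"a.txt\nb/c.txt\nz.txt"
```

Both paths give the same list for sorted, duplicate-free input; the second one simply skips the redundant work.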
Differential Revision: https://phab.mercurial-scm.org/D9092
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 15 Sep 2020 10:49:50 +0200] rev 45635
changing-files: drop the now useless changelogrevision argument
Since all filenames are now included in the sidedata block, we no longer need to decode the `files` from the revision.
Differential Revision: https://phab.mercurial-scm.org/D9091
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 15 Sep 2020 10:55:17 +0200] rev 45634
changing-files: rework the way we store changed files in side-data
We need to store new data so this is a good opportunity to rework this fully.
1) We directly store the list of affected files in the side data:
* This avoids having to fetch and parse the `files` list in the revision in
addition to the sidedata, making the data more self-sufficient.
* This works around situations where that `files` field contains wrong
information, and opens the way to other bug fixes (e.g.
issue6219).
* The format (fixed initial index, sorted files) allows for fast lookup of
filenames within the structure.
* This unifies the storage of affected files and copy sources and destinations,
limiting the number of filenames stored redundantly.
* This prepares for the fact that we should drop `files` as soon as we make any
change affecting the revision schema.
* This relies on compression to avoid a significant increase in the size of changelog.d.
More testing on this will be done before we freeze the final format.
2) We can store additional data:
* The new "merged" field,
* A future "salvaged" set recording files that might have been deleted but
were still present in the final result.
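To illustrate why a sorted file list allows fast lookup, here is a minimal sketch using binary search. The `SortedFileIndex` class, its in-memory layout and the string flags are hypothetical; they only mimic the "sorted files plus per-file information" idea and do not reproduce the actual side-data encoding.

```python
import bisect


class SortedFileIndex:
    """Hypothetical in-memory view of a sorted list of (filename, flag)."""

    def __init__(self, entries):
        # Keep names sorted so that lookups can use binary search.
        ordered = sorted(entries, key=lambda entry: entry[0])
        self._names = [name for name, _flag in ordered]
        self._flags = [flag for _name, flag in ordered]

    def flag_for(self, name):
        """Return the flag stored for `name`, or None if it is absent."""
        pos = bisect.bisect_left(self._names, name)
        if pos < len(self._names) and self._names[pos] == name:
            return self._flags[pos]
        return None


# Illustrative data only; the flag names are assumptions.
index = SortedFileIndex(
    [(b"z.txt", "removed"), (b"a.txt", "added"), (b"dir/b.txt", "merged")]
)
```

A lookup such as `index.flag_for(b"dir/b.txt")` costs O(log n) comparisons, whereas an unsorted list would need a linear scan.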
Differential Revision: https://phab.mercurial-scm.org/D9090
Joerg Sonnenberger <joerg@bec.de> [Mon, 05 Oct 2020 15:08:15 +0200] rev 45633
tests: skip doctests if not running from a hg repo
Differential Revision: https://phab.mercurial-scm.org/D9150
Raphaël Gomès <rgomes@octobus.net> [Mon, 05 Oct 2020 10:33:52 +0200] rev 45632
py3: use native string when comparing with a function's argspec
I only found two such bugs in `contrib/perf.py`
Differential Revision: https://phab.mercurial-scm.org/D9149
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 02 Oct 2020 10:29:22 +0200] rev 45631
test: try to unflaky test-profile.t
That test relies on timing measurements, because it is about timing measurement. We
try to filter out the most common source of flakiness (slow disk stating)