Mercurial > hg
view relnotes/5.9 @ 50337:47686726545d stable
match: sort patterns before compiling them into a regex
While investigating cripping performance for `hg cat` in some context, I
discovered that, for large inputs, building a regex from out of order patterns
result may result in a *much* slower regex and a much slower associated
matcher's performance.
So we are now sorting the patterns to help the regex engine.
There is more to the story as we rely on regexp more than we should. See the
next changeset for details.
Benchmarks
==========
In the following benchmark we are comparing the `hg cat` and `hg files` run
time when matching against the full list of files in the repository. They are
run:
- without the rust extensions
- with the standard python enfine (so without re2)
sort vs non-sorted - Before this changeset (3f5137543773)
---------------------------------------------------------
###### hg files ###############################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.230092 seconds
shuffled: 0.234235 seconds (+1.80%)
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 0.613567 seconds
shuffled: 0.801880 seconds (+30.69%)
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 62.474221 seconds
shuffled: 1364.180218 seconds (+2083.59%)
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 21.541828 seconds
shuffled: 172.759857 seconds (+701.97%)
###### hg cat #################################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.764407 seconds
shuffled: 0.768924 seconds
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 2.065220 seconds
shuffled: 2.276388 seconds (+10.22%)
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 40.967983 seconds
shuffled: 216.388709 seconds (+428.19%)
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 105.228510 seconds
shuffled: 1448.722784 seconds (+1276.74%)
sort vs non-sorted - With this changeset
----------------------------------------
###### hg files ###############################################################
### mercurial-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 0.230069
all-list-pattern-shuffled: 0.231165
### pypy-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 0.616799
all-list-pattern-shuffled: 0.616393
### netbeans-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 21.586773
all-list-pattern-shuffled: 21.908197
### mozilla-central-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 61.279490
all-list-pattern-shuffled: 62.473549
###### hg cat #################################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.763883 seconds
shuffled: 0.765848 seconds
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 2.070498 seconds
shuffled: 2.069197 seconds
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 41.392423 seconds
shuffled: 41.648689 seconds
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 103.315670 seconds
shuffled: 104.369358 seconds
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Sat, 01 Apr 2023 05:57:09 +0200 |
parents | 809e780c72e5 |
children |
line wrap: on
line source
== New Features == * `hg config` now has a `--source` option to show where each configuration value comes from. * Introduced a command (debug-repair-issue6528) to repair repositories affected by issue6528 where certain files would show up as modified even if they were clean due to an issue in the copy-tracing code. == Default Format Change == These changes affect newly created repositories (or new clone) done with Mercurial 5.9. == New Experimental Features == * A `changelogv2` format has been introduced. It is not ready for use yet, but will be used later to address some of the weaknesses of the current revlog format. * Initial experiment and support for `dirstatev2`, a new dirstate format that addresses some of the weaknesses of the current dirstate format. Python + C and Rust support are being implemented, but the Rust solution is the one currently getting the attention for performance. * Initial support for `rhg status`. `rhg` is the Rust wrapper executable for hg that shortcuts some commands for faster execution speed. == Bug Fixes == * Fixed committing empty files with `narrow` * Allow overriding `pip`'s pep517 compliance to build C or Rust extensions * Fixed regression on outgoing email when not specifying revisions * Fixed a regression causing bookmarks to disappear when using Rust persistent nodemap * Fixed a regression (in 5.9.1) introduced in 5.9 when cloning repos with deep filenames * Fixed detection of directories becoming symlinks, but only when using the Rust extensions. * Fixed ignore and include not composing in the Rust status * `hg commit --interactive` now handles deselecting edits of a rename * Fixed a case where `hg evolve` gives different results when interrupted * Fixed a memory leak in phases computation * `histedit` and `shelve` don't swallow errors when updating the working copy anymore * Improve error message when detecting content-divergence with a hidden common predecessor * No longer re-order parents in filelog, see issue6533 * Fix revisions affected by issue6533 on the fly during exchange * Many Windows fixes for stability and py3 compatibility improvements * Many other miscellaneous fixes == Backwards Compatibility Changes == == Internal API Changes == The Dirstate API have been updated as the previous function leaked some internal details and did not distinguish between two important cases: "We are changing parent and need to adjust the dirstate" and "some command is changing which file is tracked". To clarify the situation: * the following functions have been deprecated, - `dirstate.add`, - `dirstate.normal`, - `dirstate.normallookup`, - `dirstate.merge`, - `dirstate.otherparent`, - `dirstate.remove`, - `dirstate.drop`, - `dirstateitem.__getitem__`, * these new functions are added for the "adjusting parents" use-case: - `dirstate.update_file`, - `dirstate.update_file_p1`, * these new function are added for the "adjusting wc file" use-case": - `dirstate.set_tracked`, - `dirstate.set_untracked`, - `dirstate.set_clean`, - `dirstate.set_possibly_dirty`, See inline documentation of the new functions for details. * Additionally, the following have been deprecated: - `urlutil.getpath` function - `localrepository.updatecaches`' `full` argument * The following have been removed: - `revlog.revlogio` has been removed