view relnotes/5.8 @ 50337:47686726545d stable
match: sort patterns before compiling them into a regex
While investigating crippling performance for `hg cat` in some contexts, I
discovered that, for large inputs, building a regex from out-of-order patterns
may result in a *much* slower regex and, in turn, a much slower associated
matcher.
So we are now sorting the patterns to help the regex engine.
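
As a rough illustration of the idea (not Mercurial's actual matcher code), the sketch below builds one alternation regex from a list of exact file paths, optionally sorting the patterns first. The function `build_matcher` and the sample paths are hypothetical; the only claim is that both variants accept the same set of files, so sorting changes how the regex is laid out and not what it matches.

```python
import random
import re


def build_matcher(paths, sort_patterns=True):
    """Compile a single alternation regex matching any of the exact paths.

    Hypothetical helper for illustration; not part of Mercurial's API.
    """
    escaped = [re.escape(p) for p in paths]
    if sort_patterns:
        # Sorting places patterns that share a prefix next to each other.
        escaped.sort()
    return re.compile(r"(?:%s)\Z" % "|".join(escaped)).match


# A small synthetic file list, shuffled deterministically.
paths = ["dir%d/file%d.txt" % (d, f) for d in range(20) for f in range(20)]
shuffled = paths[:]
random.Random(0).shuffle(shuffled)

match_sorted = build_matcher(shuffled, sort_patterns=True)
match_unsorted = build_matcher(shuffled, sort_patterns=False)
```

To compare the two variants on a larger input, each call could be timed with `timeit`; the figures in the benchmark section come from real repositories and are not reproduced by this toy input.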
There is more to the story as we rely on regexp more than we should. See the
next changeset for details.
Benchmarks
==========
In the following benchmark we are comparing the `hg cat` and `hg files` run
time when matching against the full list of files in the repository. They are
run:
- without the rust extensions
- with the standard python engine (so without re2)
sort vs non-sorted - Before this changeset (3f5137543773)
---------------------------------------------------------
###### hg files ###############################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.230092 seconds
shuffled: 0.234235 seconds (+1.80%)
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 0.613567 seconds
shuffled: 0.801880 seconds (+30.69%)
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 62.474221 seconds
shuffled: 1364.180218 seconds (+2083.59%)
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 21.541828 seconds
shuffled: 172.759857 seconds (+701.97%)
###### hg cat #################################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.764407 seconds
shuffled: 0.768924 seconds
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 2.065220 seconds
shuffled: 2.276388 seconds (+10.22%)
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 40.967983 seconds
shuffled: 216.388709 seconds (+428.19%)
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 105.228510 seconds
shuffled: 1448.722784 seconds (+1276.74%)
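
The percentages in parentheses above are the slowdown of the shuffled run relative to the sorted one. A minimal sketch of that arithmetic, using a hypothetical helper name `slowdown_percent` and timings copied from the `hg files` results:

```python
def slowdown_percent(sorted_seconds, shuffled_seconds):
    """Relative slowdown of the shuffled run over the sorted run, in percent."""
    return (shuffled_seconds - sorted_seconds) / sorted_seconds * 100.0


# Timings taken from the "hg files" results above.
pypy = slowdown_percent(0.613567, 0.801880)
mozilla = slowdown_percent(62.474221, 1364.180218)
print("pypy: +%.2f%%, mozilla-central: +%.2f%%" % (pypy, mozilla))
```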
sort vs non-sorted - With this changeset
----------------------------------------
###### hg files ###############################################################
### mercurial-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 0.230069
all-list-pattern-shuffled: 0.231165
### pypy-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 0.616799
all-list-pattern-shuffled: 0.616393
### netbeans-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 21.586773
all-list-pattern-shuffled: 21.908197
### mozilla-central-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 61.279490
all-list-pattern-shuffled: 62.473549
###### hg cat #################################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.763883 seconds
shuffled: 0.765848 seconds
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 2.070498 seconds
shuffled: 2.069197 seconds
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 41.392423 seconds
shuffled: 41.648689 seconds
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 103.315670 seconds
shuffled: 104.369358 seconds
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Sat, 01 Apr 2023 05:57:09 +0200 |
parents | 32b527417ba3 |
children |
== New Features ==

 * `hg purge` is now a core command using `--confirm` by default.

 * The `rev-branch-cache` is now updated incrementally whenever changesets
   are added.

 * The new options `experimental.bundlecompthreads` and
   `experimental.bundlecompthreads.<engine>` can be used to instruct
   the compression engines for bundle operations to use multiple threads
   for compression. The default is single-threaded operation. Currently
   only supported for zstd.

== Default Format Change ==

These changes affect newly created repositories (or new clones) made with
Mercurial 5.8.

 * The `ZSTD` compression will now be used by default for new repositories
   when available. This compression format was introduced in Mercurial 5.0,
   released in May 2019. See `hg help config.format.revlog-compression` for
   details.

 * Mercurial installations built with the Rust parts will now use the
   "persistent nodemap" feature by default. This feature was introduced in
   Mercurial 5.4 (May 2020). However, Mercurial installations built without
   the fast Rust implementation will refuse to interact with such
   repositories by default. This restriction can be lifted through
   configuration.

   See `hg help config.format.use-persistent-nodemap` for details.

== New Experimental Features ==

 * There's a new `diff.merge` config option to show the changes
   relative to an automerge for merge changesets. This makes it
   easier to detect and review manual changes performed in merge
   changesets. It is supported by `hg diff --change`, `hg log -p`,
   `hg incoming -p`, and `hg outgoing -p` so far.

== Bug Fixes ==

 * Gracefully recover from inconsistent persistent-nodemap data from disk.

== Backwards Compatibility Changes ==

 * In normal repositories, the first parent of a changeset is not null,
   unless both parents are null (like the first changeset). Some legacy
   repositories violate this condition. The revlog code will now
   silently swap the parents if this violation is detected. This can
   change the output of `hg log` when explicitly asking for first or
   second parent. The changesets' "nodeid" values are not affected.

== Internal API Changes ==

 * `changelog.branchinfo` is deprecated and will be removed after 5.8.
   It is superseded by `changelogrevision.branchinfo`.

 * Callbacks for revlog.addgroup and the changelog._nodeduplicatecallback hook
   now get a revision number as argument instead of a node.

 * revlog.addrevision returns the revision number instead of the node.

 * `nodes.nullid` and related constants are being phased out as part of
   the deprecation of SHA1. Repository instances and related classes
   provide access via `nodeconstants` and in some cases `nullid` attributes.