view relnotes/5.5 @ 50337:47686726545d stable
match: sort patterns before compiling them into a regex
While investigating crippling performance for `hg cat` in some contexts, I
discovered that, for large inputs, building a regex from out-of-order patterns
may result in a *much* slower regex and, as a consequence, a much slower
associated matcher.
So we are now sorting the patterns to help the regex engine.
There is more to the story, as we rely on regexps more than we should. See the
next changeset for details.
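The idea of sorting patterns before compiling them can be sketched as follows. This is a minimal illustration only, not the code from `mercurial/match.py`: the helper name `build_matcher`, the synthetic path list, and the use of a plain alternation of escaped literals are all assumptions made for the example.

```python
import random
import re


def build_matcher(paths, sort=True):
    """Compile literal paths into one alternation regex.

    Hypothetical helper for illustration; Mercurial's real matcher
    builds its regexes differently.
    """
    if sort:
        # Sorting groups patterns that share a prefix next to each other,
        # which is the ordering the changeset relies on.
        paths = sorted(paths)
    pattern = "(?:" + "|".join(re.escape(p) for p in paths) + ")"
    return re.compile(pattern).fullmatch


# Synthetic input standing in for "the full list of files in the repository".
paths = ["dir%d/file%d.txt" % (d, f) for d in range(20) for f in range(20)]
shuffled = paths[:]
random.Random(0).shuffle(shuffled)

sorted_match = build_matcher(shuffled, sort=True)
unsorted_match = build_matcher(shuffled, sort=False)
```

Both matchers accept exactly the same set of paths; only the order of the alternatives inside the compiled regex differs, which is what the benchmarks below measure on real repositories.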
Benchmarks
==========
In the following benchmark we are comparing the `hg cat` and `hg files` run
time when matching against the full list of files in the repository. They are
run:
- without the rust extensions
- with the standard python engine (so without re2)
sort vs non-sorted - Before this changeset (3f5137543773)
---------------------------------------------------------
###### hg files ###############################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.230092 seconds
shuffled: 0.234235 seconds (+1.80%)
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 0.613567 seconds
shuffled: 0.801880 seconds (+30.69%)
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 62.474221 seconds
shuffled: 1364.180218 seconds (+2083.59%)
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 21.541828 seconds
shuffled: 172.759857 seconds (+701.97%)
###### hg cat #################################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.764407 seconds
shuffled: 0.768924 seconds
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 2.065220 seconds
shuffled: 2.276388 seconds (+10.22%)
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 40.967983 seconds
shuffled: 216.388709 seconds (+428.19%)
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 105.228510 seconds
shuffled: 1448.722784 seconds (+1276.74%)
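The percentages in parentheses above are consistent with the relative extra time of the shuffled run over the sorted run. The sketch below reproduces that arithmetic; the function names are hypothetical and the formula is inferred from the published numbers, not taken from the benchmark script.

```python
def slowdown_percent(sorted_seconds, shuffled_seconds):
    """Extra time of the shuffled run relative to the sorted run, in percent."""
    return (shuffled_seconds - sorted_seconds) / sorted_seconds * 100.0


def format_slowdown(sorted_seconds, shuffled_seconds):
    """Render the slowdown in the same style as the benchmark output."""
    return "(%+.2f%%)" % slowdown_percent(sorted_seconds, shuffled_seconds)


# Timings copied from the `hg files` results above.
pypy = format_slowdown(0.613567, 0.801880)
netbeans = format_slowdown(21.541828, 172.759857)
```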
sort vs non-sorted - With this changeset
----------------------------------------
###### hg files ###############################################################
### mercurial-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 0.230069
all-list-pattern-shuffled: 0.231165
### pypy-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 0.616799
all-list-pattern-shuffled: 0.616393
### netbeans-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 21.586773
all-list-pattern-shuffled: 21.908197
### mozilla-central-2018-08-01-zstd-sparse-revlog
all-list-pattern-sorted: 61.279490
all-list-pattern-shuffled: 62.473549
###### hg cat #################################################################
### mercurial-2018-08-01-zstd-sparse-revlog
sorted: 0.763883 seconds
shuffled: 0.765848 seconds
### pypy-2018-08-01-zstd-sparse-revlog
sorted: 2.070498 seconds
shuffled: 2.069197 seconds
### netbeans-2018-08-01-zstd-sparse-revlog
sorted: 41.392423 seconds
shuffled: 41.648689 seconds
### mozilla-central-2018-08-01-zstd-sparse-revlog
sorted: 103.315670 seconds
shuffled: 104.369358 seconds
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Sat, 01 Apr 2023 05:57:09 +0200 |
parents | 53a6febafc66 |
children |
== New Features ==

 * clonebundles can be annotated with the expected memory requirements using the `REQUIREDRAM` option. This allows clients to skip bundles created with large zstd windows and fall back to larger, but less demanding bundles.

 * The `phabricator` extension now provides more functionality of the arcanist CLI like changing the status of a differential.

 * Phases processing is much faster, especially for repositories with old non-public changesets.

== New Experimental Features ==

 * The core of some hg operations has been (and is being) implemented in rust, for speed. `hg status` on a repository with 300k tracked files goes from 1.8s to 0.6s for instance. This has currently been tested only on linux, and does not build on windows. See rust/README.rst in the mercurial repository for instructions to opt into this.

 * An experimental config `rewrite.empty-successor` was introduced to control what happens when rewrite operations result in empty changesets.

== Bug Fixes ==

 * For the case when connected to a TTY, stdout was fixed to be line-buffered on Python 3 (where it was block-buffered before, causing the process to seem hanging) and Windows on Python 2 (where it was unbuffered before).

 * Subversion sources of the convert extension were fixed to work on Python 3.

 * Subversion sources of the convert extension now interpret the encoding of URLs like Subversion. Previously, there were situations where the convert extension recognized a repository as present but Subversion did not, and vice versa.

 * The empty changeset check of in-memory rebases was fixed to match that of normal rebases (and that of the commit command).

 * The push command now checks the correct set of outgoing changesets for obsolete and unstable changesets. Previously, it could happen that the check prevented pushing changesets which were already on the server.

== Backwards Compatibility Changes ==

 * Mercurial now requires at least Python 2.7.9 or a Python version that backported modern SSL/TLS features (as defined in PEP 466), and that Python was compiled against an OpenSSL version supporting TLS 1.1 or TLS 1.2 (likely this requires the OpenSSL version to be at least 1.0.1).

 * The `hg perfwrite` command from contrib/perf.py was made more flexible and changed its default behavior. To get the previous behavior, run `hg perfwrite --nlines=100000 --nitems=1 --item='Testing write performance' --batch-line`.

 * The absorb extension now preserves changesets with no file changes that can be created by the commit command (those which change the branch name compared to the parent and those closing a branch head).

== Internal API Changes ==

 * logcmdutil.diffordiffstat() now takes contexts instead of nodes.

 * The `mergestate` class along with some related methods and constants have moved from `mercurial.merge` to a new `mercurial.mergestate` module.

 * The `phasecache` class now uses sparse dictionaries for the phase data. New accessors are provided to detect if any non-public changeset exists (`hasnonpublicphases`) and get the corresponding root set (`nonpublicphaseroots`).

 * The `stdin`, `stdout` and `stderr` attributes of the `mercurial.pycompat` module were removed. Instead, the attributes of the same name from the `mercurial.utils.procutil` module should be used, which provide more consistent behavior across Python versions and platforms.