sparse-read: move from a recursive-based approach to a heap-based one
The previous recursive approach tried to optimise each read slice to have a
good density. It tended to over-optimise smaller slices while leaving larger
holes in others.
The new approach focuses on improving the combined density of all the reads,
instead of the individual slices. It slices at the largest gaps first, since
skipping them reduces the total amount of data read most efficiently.
Another benefit of this approach is that we iterate over the delta chain only
once, reducing the overhead of slicing long delta chains.
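
The idea described above can be sketched as follows. This is a minimal
illustration, not the code from this changeset: the function name
`slice_by_largest_gaps`, the `(start, end)` byte-range representation, and the
`target_density` and `min_gap` parameters are all assumptions made for the
example. It walks the chain once to collect the gaps, then pops the largest
gaps from a heap until the combined density of all reads reaches the target.

```python
import heapq


def slice_by_largest_gaps(spans, target_density=0.5, min_gap=0):
    """Split sorted, non-overlapping (start, end) byte ranges into slices.

    Hypothetical helper: cuts at the largest gaps first until the combined
    density (payload bytes / bytes read) reaches ``target_density``.
    """
    if not spans:
        return []

    payload = sum(end - start for start, end in spans)
    # Reading everything in one go covers the first start to the last end.
    read_size = spans[-1][1] - spans[0][0]

    # Single pass over the chain: record every gap worth skipping.
    # heapq is a min-heap, so gap sizes are negated to pop the largest first.
    gaps = []
    for idx in range(1, len(spans)):
        gap = spans[idx][0] - spans[idx - 1][1]
        if gap > min_gap:
            heapq.heappush(gaps, (-gap, idx))

    # Skip the largest gaps until the combined density is good enough.
    cut_points = []
    while gaps and read_size > 0 and payload / read_size < target_density:
        neg_gap, idx = heapq.heappop(gaps)
        read_size += neg_gap  # neg_gap is negative: the gap is no longer read
        cut_points.append(idx)

    # Turn the selected cut points back into contiguous slices.
    cut_points.sort()
    slices = []
    prev = 0
    for idx in cut_points:
        slices.append(spans[prev:idx])
        prev = idx
    slices.append(spans[prev:])
    return slices


example = [(0, 10), (12, 20), (100, 110), (115, 120), (500, 510)]
result = slice_by_largest_gaps(example, target_density=0.5)
```

In this example the two largest gaps (380 and 80 bytes) are skipped, while the
small gaps of 2 and 5 bytes are read through, because the combined density is
already above the target once the two large holes are removed.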
On the repository we use for tests, the new approach shows similar or faster
performance than the current default linear full read.
The repository contains about 450,000 revisions with many concurrent
topological branches. Tests have been run on two versions of the repository:
one built with the current delta constraint, and the other with an unlimited
delta span (using 'experimental.maxdeltachainspan=0').
Below are timings for building 1% of all the revisions in the manifest log
using 'hg perfrevlogrevisions -m'. Times are given in seconds. They include
the couple of follow-up changesets in this series.
    delta-span     standard   unlimited
    linear-read        922s        632s
    sparse-read        814s        566s
Some commands allow the user to specify a date, e.g.:
- backout, commit, import, tag: Specify the commit date.
- log, revert, update: Select revision(s) by date.
Many date formats are valid. Here are some examples:
- ``Wed Dec 6 13:18:29 2006`` (local timezone assumed)
- ``Dec 6 13:18 -0600`` (year assumed, time offset provided)
- ``Dec 6 13:18 UTC`` (UTC and GMT are aliases for +0000)
- ``Dec 6`` (midnight)
- ``13:18`` (today assumed)
- ``3:39`` (3:39AM assumed)
- ``3:39pm`` (15:39)
- ``2006-12-06 13:18:29`` (ISO 8601 format)
- ``2006-12-6 13:18``
- ``2006-12-6``
- ``12-6``
- ``12/6``
- ``12/6/6`` (Dec 6 2006)
- ``today`` (midnight)
- ``yesterday`` (midnight)
- ``now`` - right now
Lastly, there is Mercurial's internal format:
- ``1165411109 0`` (Wed Dec 6 13:18:29 2006 UTC)
This is the internal representation format for dates. The first number
is the number of seconds since the epoch (1970-01-01 00:00 UTC). The
second is the offset of the local timezone, in seconds west of UTC
(negative if the timezone is east of UTC).
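
As an illustration of the internal format, the sketch below converts such a
string into a timezone-aware Python datetime. The function name
`internal_to_datetime` is hypothetical and is not part of Mercurial. The only
subtlety is the sign: the stored offset counts seconds west of UTC, so it must
be negated to get a conventional UTC offset.

```python
from datetime import datetime, timedelta, timezone


def internal_to_datetime(value):
    """Convert a 'SECONDS OFFSET' string into an aware datetime.

    Hypothetical helper for illustration only.
    """
    unixtime, offset = value.split()
    # The offset is in seconds west of UTC; Python expects seconds east.
    tz = timezone(timedelta(seconds=-int(offset)))
    return datetime.fromtimestamp(int(unixtime), tz)


utc_example = internal_to_datetime("1165411109 0")
# Same instant, recorded in a timezone six hours west of UTC (-0600).
west_example = internal_to_datetime("1165411109 21600")
```

Both values describe the same instant. The first renders as 13:18:29 at
+00:00 and the second as 07:18:29 at -06:00.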
The log command also accepts date ranges:
- ``<DATE`` - at or before a given date/time
- ``>DATE`` - on or after a given date/time
- ``DATE to DATE`` - a date range, inclusive
- ``-DAYS`` - within a given number of days of today
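
The four range forms above can be read as predicates over a date. The sketch
below is a simplified illustration, not Mercurial's parser: `match_date` and
`_parse` are hypothetical names, DATE is assumed to be a full
``YYYY-MM-DD HH:MM:SS`` value in UTC, and ``-DAYS`` is assumed to mean "no
earlier than DAYS days before now".

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"  # assumed input format for this sketch only


def _parse(text):
    """Parse a full date/time string as UTC (hypothetical helper)."""
    return datetime.strptime(text.strip(), FMT).replace(tzinfo=timezone.utc)


def match_date(spec, now=None):
    """Return a predicate implementing one of the four range forms."""
    spec = spec.strip()
    if now is None:
        now = datetime.now(timezone.utc)
    if spec.startswith("<"):
        upper = _parse(spec[1:])
        return lambda d: d <= upper  # at or before
    if spec.startswith(">"):
        lower = _parse(spec[1:])
        return lambda d: d >= lower  # on or after
    if spec.startswith("-"):
        cutoff = now - timedelta(days=int(spec[1:]))
        return lambda d: d >= cutoff  # within the last DAYS days (assumed)
    if " to " in spec:
        first, last = spec.split(" to ", 1)
        lower, upper = _parse(first), _parse(last)
        return lambda d: lower <= d <= upper  # inclusive range
    raise ValueError("unsupported range spec: %r" % spec)


now = datetime(2006, 12, 6, 13, 18, 29, tzinfo=timezone.utc)
commit_date = datetime(2006, 12, 5, 0, 0, 0, tzinfo=timezone.utc)
before = match_date("<2006-12-06 13:18:29")(commit_date)
after = match_date(">2006-12-06 13:18:29")(commit_date)
between = match_date("2006-12-01 00:00:00 to 2006-12-05 00:00:00")(commit_date)
last_two_days = match_date("-2", now=now)(commit_date)
last_day = match_date("-1", now=now)(commit_date)
```

The ``now`` parameter is there only so the ``-DAYS`` form can be evaluated
against a fixed reference time.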