mercurial/stabletailgraph/stabletailsort.py
author pacien <pacien.trangirard@pacien.net>
Fri, 21 Apr 2023 16:19:32 +0200
changeset 50561 dc372251d4dc
parent 50560 4fd2f7ab4177
child 50562 8fb3e942473a
permissions -rw-r--r--
stabletailgraph: extract _parents util func
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
50460
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     1
# stabletailsort.py - stable ordering of revisions
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     2
#
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     3
# Copyright 2021-2023 Pacien TRAN-GIRARD <pacien.trangirard@pacien.net>
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     4
#
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     5
# This software may be used and distributed according to the terms of the
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     6
# GNU General Public License version 2 or any later version.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     7
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     8
"""
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
     9
Stable-tail sort computation.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    10
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    11
The "stable-tail sort", or STS, is a reverse topological ordering of the
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    12
ancestors of a node, which tends to share large suffixes with the stable-tail
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    13
sort of ancestors and other nodes, giving it its name.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    14
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    15
Its properties should make it suitable for making chunks of ancestors with high
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    16
reuse and incrementality for example.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    17
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    18
This module and implementation are experimental. Most functions are not yet
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    19
optimised to operate on large production graphs.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    20
"""
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    21
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    22
import itertools
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    23
from ..node import nullrev
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    24
from .. import ancestor
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    25
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    26
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    27
def _sorted_parents(cl, p1, p2):
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    28
    """
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    29
    Chooses and returns the pair (px, pt) from (p1, p2).
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    30
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    31
    Where
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    32
    "px" denotes the parent starting the "exclusive" part, and
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    33
    "pt" denotes the parent starting the "Tail" part.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    34
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    35
    "px" is chosen as the parent with the lowest rank with the goal of
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    36
    minimising the size of the exclusive part and maximise the size of the
50493
e1496403b08c stabletailgraph: fix terminology in doc
pacien <pacien.trangirard@pacien.net>
parents: 50460
diff changeset
    37
    tail part, hopefully reducing the overall complexity of the stable-tail
e1496403b08c stabletailgraph: fix terminology in doc
pacien <pacien.trangirard@pacien.net>
parents: 50460
diff changeset
    38
    sort.
50460
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    39
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    40
    In case of equal ranks, the stable node ID is used as a tie-breaker.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    41
    """
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    42
    r1, r2 = cl.fast_rank(p1), cl.fast_rank(p2)
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    43
    if r1 < r2:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    44
        return (p1, p2)
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    45
    elif r1 > r2:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    46
        return (p2, p1)
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    47
    elif cl.node(p1) < cl.node(p2):
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    48
        return (p1, p2)
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    49
    else:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    50
        return (p2, p1)
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    51
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    52
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    53
def _nonoedipal_parent_revs(cl, rev):
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    54
    """
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    55
    Returns the non-œdipal parent pair of the given revision.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    56
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    57
    An œdipal merge is a merge with parents p1, p2 with either
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    58
    p1 in ancestors(p2) or p2 in ancestors(p1).
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    59
    In the first case, p1 is the œdipal parent.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    60
    In the second case, p2 is the œdipal parent.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    61
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    62
    Œdipal edges start empty exclusive parts. They do not bring new ancestors.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    63
    As such, they can be skipped when computing any topological sort or any
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    64
    iteration over the ancestors of a node.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    65
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    66
    The œdipal edges are eliminated here using the rank information.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    67
    """
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    68
    p1, p2 = cl.parentrevs(rev)
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    69
    if p1 == nullrev or cl.fast_rank(p2) == cl.fast_rank(rev) - 1:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    70
        return p2, nullrev
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    71
    elif p2 == nullrev or cl.fast_rank(p1) == cl.fast_rank(rev) - 1:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    72
        return p1, nullrev
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    73
    else:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    74
        return p1, p2
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    75
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    76
50561
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
    77
def _parents(cl, rev):
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
    78
    p1, p2 = _nonoedipal_parent_revs(cl, rev)
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
    79
    if p2 == nullrev:
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
    80
        return p1, p2
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
    81
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
    82
    return _sorted_parents(cl, p1, p2)
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
    83
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
    84
50559
1a4f54574e3d stabletailgraph: clarify naiveness of current implementation
pacien <pacien.trangirard@pacien.net>
parents: 50493
diff changeset
    85
def _stable_tail_sort_naive(cl, head_rev):
50460
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    86
    """
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    87
    Naive topological iterator of the ancestors given by the stable-tail sort.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    88
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    89
    The stable-tail sort of a node "h" is defined as the sequence:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    90
    sts(h) := [h] + excl(h) + sts(pt(h))
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    91
    where excl(h) := u for u in sts(px(h)) if u not in ancestors(pt(h))
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    92
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    93
    This implementation uses a call-stack whose size is
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    94
    O(number of open merges).
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    95
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    96
    As such, this implementation exists mainly as a defining reference.
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    97
    """
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    98
    cursor_rev = head_rev
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
    99
    while cursor_rev != nullrev:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   100
        yield cursor_rev
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   101
50561
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
   102
        px, pt = _parents(cl, cursor_rev)
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
   103
        if pt == nullrev:
dc372251d4dc stabletailgraph: extract _parents util func
pacien <pacien.trangirard@pacien.net>
parents: 50560
diff changeset
   104
            cursor_rev = px
50460
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   105
        else:
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   106
            tail_ancestors = ancestor.lazyancestors(
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   107
                cl.parentrevs, (pt,), inclusive=True
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   108
            )
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   109
            exclusive_ancestors = (
50559
1a4f54574e3d stabletailgraph: clarify naiveness of current implementation
pacien <pacien.trangirard@pacien.net>
parents: 50493
diff changeset
   110
                a
1a4f54574e3d stabletailgraph: clarify naiveness of current implementation
pacien <pacien.trangirard@pacien.net>
parents: 50493
diff changeset
   111
                for a in _stable_tail_sort_naive(cl, px)
1a4f54574e3d stabletailgraph: clarify naiveness of current implementation
pacien <pacien.trangirard@pacien.net>
parents: 50493
diff changeset
   112
                if a not in tail_ancestors
50460
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   113
            )
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   114
50560
4fd2f7ab4177 stabletailgraph: clarify excl part size computation
pacien <pacien.trangirard@pacien.net>
parents: 50559
diff changeset
   115
            # Notice that excl(cur) is disjoint from ancestors(pt),
4fd2f7ab4177 stabletailgraph: clarify excl part size computation
pacien <pacien.trangirard@pacien.net>
parents: 50559
diff changeset
   116
            # so there is no double-counting:
4fd2f7ab4177 stabletailgraph: clarify excl part size computation
pacien <pacien.trangirard@pacien.net>
parents: 50559
diff changeset
   117
            # rank(cur) = len([cur]) + len(excl(cur)) + rank(pt)
50460
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   118
            excl_part_size = cl.fast_rank(cursor_rev) - cl.fast_rank(pt) - 1
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   119
            yield from itertools.islice(exclusive_ancestors, excl_part_size)
f0d2b18f0274 stabletailgraph: implement stable-tail sort
pacien <pacien.trangirard@pacien.net>
parents:
diff changeset
   120
            cursor_rev = pt