view mercurial/node.py @ 39183:f39efa885a6d

revlog: also detect intermediate snapshots

Also detect intermediate snapshots made against another, previous snapshot.

Doing an intermediate snapshot instead of a full one can reduce the number of full snapshots we need. They are especially useful for content with a lot of churn on the same line (e.g. the manifest), where a delta over multiple revisions can end up being significantly smaller than the sum of the individual revision deltas.

A revlog built using intermediate snapshots can be a bit smaller and reuse snapshots much more efficiently. This last property is useful when combined with constraints on chain length: using intermediate snapshots can produce a repository with delta chains ten times shorter without impact on the storage size. Shorter chains are faster to restore, greatly improving read performance.

This changeset (and the following ones) focuses on getting the core principle of intermediate snapshots into Mercurial core. Later changesets will introduce the strategy to create them.
author Paul Morelle <paul.morelle@octobus.net>
date Fri, 20 Jul 2018 13:34:48 +0200
parents b623c7b23695
children 1e7a462cb946
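The changeset description claims that, for content with churn on the same line, one delta spanning several revisions can be much smaller than the sum of the per-revision deltas. The toy sketch below illustrates that claim only; it is not how Mercurial's revlog computes or stores deltas. `delta_size` is a hypothetical helper that counts changed lines with Python's `difflib`, and the `revisions` data is made up for the example.

```python
import difflib


def delta_size(old, new):
    """Count lines removed from `old` plus lines added in `new`.

    A crude stand-in for the size of a delta (assumption: one unit per
    changed line; a real revlog delta is a binary patch).
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return sum((i2 - i1) + (j2 - j1)
               for tag, i1, i2, j1, j2 in matcher.get_opcodes()
               if tag != 'equal')


# Five revisions of a small file whose first line changes every time.
revisions = [['entry %d' % i, 'stable a', 'stable b'] for i in range(5)]

# Sum of the deltas between each pair of consecutive revisions.
chained = sum(delta_size(revisions[i], revisions[i + 1])
              for i in range(len(revisions) - 1))

# A single delta from the first revision straight to the last one.
direct = delta_size(revisions[0], revisions[-1])

print(chained, direct)
```

In this made-up data the four consecutive deltas add up to 8 changed lines, while the single spanning delta has only 2, because the intermediate values of the churning line never need to be recorded.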

# node.py - basic nodeid manipulation for mercurial
#
# Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

import binascii

# This ugly style has a noticeable effect in manifest parsing
hex = binascii.hexlify
# Adapt to Python 3 API changes. If this ends up showing up in
# profiles, we can use this version only on Python 3, and forward
# binascii.unhexlify like we used to on Python 2.
def bin(s):
    try:
        return binascii.unhexlify(s)
    except binascii.Error as e:
        raise TypeError(e)

nullrev = -1
# In hex, this is '0000000000000000000000000000000000000000'
nullid = b"\0" * 20
nullhex = hex(nullid)

# Phony node value to stand-in for new files in some uses of
# manifests.
# In hex, this is '2121212121212121212121212121212121212121'
newnodeid = '!!!!!!!!!!!!!!!!!!!!'
# In hex, this is '3030303030303030303030303030306164646564'
addednodeid = '000000000000000added'
# In hex, this is '3030303030303030303030306d6f646966696564'
modifiednodeid = '000000000000modified'

wdirfilenodeids = {newnodeid, addednodeid, modifiednodeid}

# pseudo identifiers for working directory
# (they are experimental, so don't add too many dependencies on them)
wdirrev = 0x7fffffff
# In hex, this is 'ffffffffffffffffffffffffffffffffffffffff'
wdirid = b"\xff" * 20
wdirhex = hex(wdirid)

def short(node):
    # Abbreviate a binary node to its first 6 bytes (12 hex digits).
    return hex(node[:6])
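
A short usage sketch of the helpers above. To stay runnable without a Mercurial checkout, it re-declares `hex`, `bin`, `short`, `nullid` and `wdirid` exactly as in the listing rather than importing `mercurial.node`; the sample inputs are illustrative only.

```python
import binascii

# Stand-ins copied from the listing above, so this runs without Mercurial.
hex = binascii.hexlify


def bin(s):
    try:
        return binascii.unhexlify(s)
    except binascii.Error as e:
        raise TypeError(e)


def short(node):
    return hex(node[:6])


nullid = b"\0" * 20
wdirid = b"\xff" * 20

# hex() and bin() are inverses on 20-byte binary node ids.
roundtrip = bin(hex(wdirid))

# short() keeps the first 6 bytes, i.e. 12 hex digits.
abbreviated = short(wdirid)

# Malformed hex input (odd length here) surfaces as TypeError.
try:
    bin(b"abc")
    rejected = False
except TypeError:
    rejected = True

print(hex(nullid), abbreviated, rejected)
```

Wrapping `binascii.Error` in `TypeError` keeps the exception type that callers saw on Python 2, where `binascii.unhexlify` raised `TypeError` for malformed input.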