Mercurial > hg
view mercurial/ancestor.py @ 14365:a8e3931e3fb5
revlog: linearize created changegroups in generaldelta revlogs
This greatly improves the speed of the bundling process, and often reduces the
bundle size considerably. (Although if the repository is already ordered, this
has little effect on both time and bundle size.)
For non-generaldelta clients, the reduced bundle size translates to a reduced
repository size, similar to shrinking the revlogs (which uses the exact same
algorithm). For generaldelta clients the difference is minor.
When the new bundle format comes, reordering will not be necessary since we
can then store the deltaparent relationsships directly. The eventual default
behavior for clients and servers is presented in the table below, where "new"
implies support for GD as well as the new bundle format:
old client new client
old server old bundle, no reorder old bundle, no reorder
new server, non-GD old bundle, no reorder[1] old bundle, no reorder[2]
new server, GD old bundle, reorder[3] new bundle, no reorder[4]
[1] reordering is expensive on the server in this case, skip it
[2] client can choose to do its own redelta here
[3] reordering is needed because otherwise the pull does a lot of extra
work on the server
[4] reordering isn't needed because client can get deltabase in bundle
format
Currently, the default is to reorder on GD-servers, and not otherwise. A new
setting, bundle.reorder, has been added to override the default reordering
behavior. It can be set to either 'auto' (the default), or any true or false
value as a standard boolean setting, to either force the reordering on or off
regardless of generaldelta.
Some timing data from a relatively branch test repository follows. All
bundling is done with --all --type none options.
Non-generaldelta, non-shrunk repo:
-----------------------------------
Size: 276M
Without reorder (default):
Bundle time: 14.4 seconds
Bundle size: 939M
With reorder:
Bundle time: 1 minute, 29.3 seconds
Bundle size: 381M
Generaldelta, non-shrunk repo:
-----------------------------------
Size: 87M
Without reorder:
Bundle time: 2 minutes, 1.4 seconds
Bundle size: 939M
With reorder (default):
Bundle time: 25.5 seconds
Bundle size: 381M
author | Sune Foldager <cryo@cyanite.org> |
---|---|
date | Wed, 18 May 2011 23:26:26 +0200 |
parents | 22565ddb28e7 |
children | 1ffeeb91c55d |
line wrap: on
line source
# ancestor.py - generic DAG ancestor algorithm for mercurial # # Copyright 2006 Matt Mackall <mpm@selenic.com> # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. import heapq def ancestor(a, b, pfunc): """ Returns the common ancestor of a and b that is furthest from a root (as measured by longest path) or None if no ancestor is found. If there are multiple common ancestors at the same distance, the first one found is returned. pfunc must return a list of parent vertices for a given vertex """ if a == b: return a a, b = sorted([a, b]) # find depth from root of all ancestors # depth is stored as a negative for heapq parentcache = {} visit = [a, b] depth = {} while visit: vertex = visit[-1] pl = pfunc(vertex) parentcache[vertex] = pl if not pl: depth[vertex] = 0 visit.pop() else: for p in pl: if p == a or p == b: # did we find a or b as a parent? return p # we're done if p not in depth: visit.append(p) if visit[-1] == vertex: # -(maximum distance of parents + 1) depth[vertex] = min([depth[p] for p in pl]) - 1 visit.pop() # traverse ancestors in order of decreasing distance from root def ancestors(vertex): h = [(depth[vertex], vertex)] seen = set() while h: d, n = heapq.heappop(h) if n not in seen: seen.add(n) yield (d, n) for p in parentcache[n]: heapq.heappush(h, (depth[p], p)) def generations(vertex): sg, s = None, set() for g, v in ancestors(vertex): if g != sg: if sg: yield sg, s sg, s = g, set((v,)) else: s.add(v) yield sg, s x = generations(a) y = generations(b) gx = x.next() gy = y.next() # increment each ancestor list until it is closer to root than # the other, or they match try: while 1: if gx[0] == gy[0]: for v in gx[1]: if v in gy[1]: return v gy = y.next() gx = x.next() elif gx[0] > gy[0]: gy = y.next() else: gx = x.next() except StopIteration: return None