view contrib/dumprevlog @ 22022:2ea6d906cf9b

merge: use no-minimal for premerge too

ecc1387138ba disabled minimal for `internal:merge` but forgot to also disabled
it for premerge. This is now done.

This gives me an occasion to shamelessly includes my explanation of why this
minimisation feature must disappear:

[this is why it's pointless to reject patches with misspellings in the
description - mpm]

Detailled explanation
=====================

The ``simplemerge`` code use in ``internal:merge`` has a feature called
"minimization". It reprocess conflicting chunks to find common changes inside
them and excludes such common sections from the marker.

This approach seems a significant win at first glance but produces very
confusing results in some other cases.

Simple example
--------------

A simple example is enough to show the benefit of this feature. In this merge,
both sides change all numbers from letters to digits, but one side is also
changing some values.

  $ cat << EOF > base
  > Small Mathematical Series.
  > One
  > Two
  > Three
  > Four
  > Five
  > Hop we are done.
  > EOF

  $ cat << EOF > local
  > Small Mathematical Series.
  > 1
  > 2
  > 3
  > 4
  > 5
  > Hop we are done.
  > EOF

  $ cat << EOF > other
  > Small Mathematical Series.
  > 1
  > 2
  > 3
  > 6
  > 8
  > Hop we are done.
  > EOF

In the minimalists case, the markers focus on the disagreement between the two
sides.

  $ $TESTDIR/../contrib/simplemerge --print local base other
  Small Mathematical Series.
  1
  2
  3
  <<<<<<< local
  4
  5
  =======
  6
  8
  >>>>>>> other
  Hop we are done.
  warning: conflicts during merge.
  [1]

In the non minimalist case, the whole chunk is included in the conflict marker.
Making it harder spot actual differences.

  $ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
  Small Mathematical Series.
  <<<<<<< local
  1
  2
  3
  4
  5
  =======
  1
  2
  3
  6
  8
  >>>>>>> other
  Hop we are done.
  warning: conflicts during merge.
  [1]

Practical Advantages of minimalisation: merge of grafted change
---------------------------------------------------------------

This feature can be very useful when a change have been grafted in another
branch and then some change have been made to the grafted code.

  $ cat << EOF > base
  > # empty file
  > EOF

  $ cat << EOF > local
  > def somefunction(one, two):
  >     some = one
  >     stuff = two
  >     are(happening)
  >     here()
  > EOF

  $ cat << EOF > other
  > def somefunction(one, two):
  >     some = one
  >     change = two
  >     are(happening)
  >     here()
  > EOF

The minimalist case recognises the grafted content as similar and highlight the
actual change.

  $ $TESTDIR/../contrib/simplemerge --print local base other
  def somefunction(one, two):
      some = one
  <<<<<<< local
      stuff = two
  =======
      change = two
  >>>>>>> other
      are(happening)
      here()
  warning: conflicts during merge.
  [1]

Again, the non-minimalist case produces a larger conflict. Making it harder to
spot the actual conflict.

  $ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
  <<<<<<< local
  def somefunction(one, two):
      some = one
      stuff = two
      are(happening)
      here()
  =======
  def somefunction(one, two):
      some = one
      change = two
      are(happening)
      here()
  >>>>>>> other
  warning: conflicts during merge.
  [1]

Practical disadvantage: multiple functions on each side
---------------------------------------------------------------

So, if this "minimalist" help so much, why introduce a setting to disable it?

The issue is that this minimisation will grab any common lines for breaking
chunks. This may result in partial context when solving a merge. The most
simple example is a merge where both side added some (different) functions
separated by blank lines. The "minimalist" approach will recognise the blank
line as "common" and over slice the chunks, turning a simple conflict case into
multiple pairs of conflicting functions.

  $ cat << EOF > base
  > # empty file
  > EOF

  $ cat << EOF > local
  > def function1():
  >     bla()
  >     bla()
  >     bla()
  >
  > def function2():
  >     ble()
  >     ble()
  >     ble()
  > EOF

  $ cat << EOF > other
  > def function3():
  >     bli()
  >     bli()
  >     bli()
  >
  > def function4():
  >     blo()
  >     blo()
  >     blo()
  > EOF

The minimal case presents each function as a separated context.

  $ $TESTDIR/../contrib/simplemerge --print local base other
  <<<<<<< local
  def function1():
      bla()
      bla()
      bla()
  =======
  def function3():
      bli()
      bli()
      bli()
  >>>>>>> other

  <<<<<<< local
  def function2():
      ble()
      ble()
      ble()
  =======
  def function4():
      blo()
      blo()
      blo()
  >>>>>>> other
  warning: conflicts during merge.
  [1]

The non-minimalist approach produces a simpler version with more context in
each block. Solving such conflicts is usually as simple as dropping the 3 lines
dedicated to markers.

  $ $TESTDIR/../contrib/simplemerge --prin --no-minimal local base other
  <<<<<<< local
  def function1():
      bla()
      bla()
      bla()

  def function2():
      ble()
      ble()
      ble()
  =======
  def function3():
      bli()
      bli()
      bli()

  def function4():
      blo()
      blo()
      blo()
  >>>>>>> other
  warning: conflicts during merge.
  [1]

Practical disaster: programing language have a lot of common line
=================================================================

If only blank lines between function where the only frequent content of a code
file. But programming language tend to repeat them self much more often. In
that case, the minimalist approach turns a simple conflict into a massive mess.

Consider this example where two unrelated functions are added on each side.
Those function shares common programming constructs by chance.

  $ cat << EOF > base
  > # empty file
  > EOF

  $ cat << EOF > local
  > def longfunction():
  >     if bla:
  >         foo
  >     else:
  >         bar
  >     try:
  >         ret = some stuff
  >     except Exception:
  >         ret = None
  >     if ret is not None:
  >         return ret
  >     return 0
  >
  > def shortfunction(foo):
  >     goo()
  >     ret = foo + 5
  >     return ret
  > EOF

  $ cat << EOF > other
  > def otherlongfunction():
  >     for x in xxx:
  >         if coin:
  >             break
  >         tutu
  >     else:
  >         bar()
  >     baz()
  >     ret = week()
  >     try:
  >         groumpf = tutu
  >         fool()
  >     except Exception:
  >         zoo()
  >     pool()
  >     if cond:
  >         return ret
  >
  >     # some big block
  >     ret ** 6
  >     koin()
  >     return ret
  > EOF

The minimalist approach will hash the whole conflict into small chunks that
does not match any meaningful semantic and are impossible to solve.

  $ $TESTDIR/../contrib/simplemerge --print local base other
  <<<<<<< local
  def longfunction():
      if bla:
          foo
  =======
  def otherlongfunction():
      for x in xxx:
          if coin:
              break
          tutu
  >>>>>>> other
      else:
  <<<<<<< local
          bar
  =======
          bar()
      baz()
      ret = week()
  >>>>>>> other
      try:
  <<<<<<< local
          ret = some stuff
  =======
          groumpf = tutu
          fool()
  >>>>>>> other
      except Exception:
  <<<<<<< local
          ret = None
      if ret is not None:
  =======
          zoo()
      pool()
      if cond:
  >>>>>>> other
          return ret
  <<<<<<< local
      return 0
  =======
  >>>>>>> other

  <<<<<<< local
  def shortfunction(foo):
      goo()
      ret = foo + 5
  =======
      # some big block
      ret ** 6
      koin()
  >>>>>>> other
      return ret
  warning: conflicts during merge.
  [1]

The non minimalist approach will properly produce a single set of conflict
markers. Highlighting that the two chunk are unrelated. Such conflict from
unrelated content added at the same place is usually solved by dropping the
marker an keeping both content. Something impossible with minimised markers.

  $ $TESTDIR/../contrib/simplemerge --prin --no-minimal local base other
  <<<<<<< local
  def longfunction():
      if bla:
          foo
      else:
          bar
      try:
          ret = some stuff
      except Exception:
          ret = None
      if ret is not None:
          return ret
      return 0

  def shortfunction(foo):
      goo()
      ret = foo + 5
      return ret
  =======
  def otherlongfunction():
      for x in xxx:
          if coin:
              break
          tutu
      else:
          bar()
      baz()
      ret = week()
      try:
          groumpf = tutu
          fool()
      except Exception:
          zoo()
      pool()
      if cond:
          return ret

      # some big block
      ret ** 6
      koin()
      return ret
  >>>>>>> other
  warning: conflicts during merge.
  [1]
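
The over-slicing described in the changeset description can be reproduced in a few lines of Python. This is only a rough sketch of the idea, not the ``simplemerge`` implementation: the standard-library ``difflib.SequenceMatcher`` stands in for the real matcher, and the name ``split_conflict`` is made up for the illustration. It takes the two sides of one conflicting region and cuts it at every run of lines the sides share.

```python
from difflib import SequenceMatcher


def split_conflict(local, other):
    """Split one conflicting region into 'common' and 'conflict' pieces.

    Illustration only: difflib is an assumption standing in for the
    matcher used by the real merge code. Every run of lines shared by
    both sides, however trivial, closes the current conflict piece.
    """
    pieces = []
    matcher = SequenceMatcher(None, local, other, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        kind = "common" if tag == "equal" else "conflict"
        pieces.append((kind, local[i1:i2], other[j1:j2]))
    return pieces


if __name__ == "__main__":
    # The "multiple functions on each side" case from the description.
    local = [
        "def function1():", "    bla()", "    bla()", "    bla()",
        "",
        "def function2():", "    ble()", "    ble()", "    ble()",
    ]
    other = [
        "def function3():", "    bli()", "    bli()", "    bli()",
        "",
        "def function4():", "    blo()", "    blo()", "    blo()",
    ]
    for kind, ours, theirs in split_conflict(local, other):
        print(kind, ours, theirs)
```

With these inputs the only shared line is the blank one, so the region comes back as a conflict piece, a one-line common piece, and a second conflict piece. Emitting the whole region as a single conflict, which is what ``--no-minimal`` does in the transcripts, avoids the cut.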
author Pierre-Yves David <pierre-yves.david@fb.com>
date Tue, 29 Jul 2014 11:55:01 -0700
parents 659f34b833b9
children a212ca70205c

#!/usr/bin/env python
# Dump revlogs as raw data stream
# $ find .hg/store/ -name "*.i" | xargs dumprevlog > repo.dump

import sys
from mercurial import revlog, node, util

for fp in (sys.stdin, sys.stdout, sys.stderr):
    util.setbinary(fp)

for f in sys.argv[1:]:
    binopen = lambda fn: open(fn, 'rb')
    r = revlog.revlog(binopen, f)
    print "file:", f
    for i in r:
        n = r.node(i)
        p = r.parents(n)
        d = r.revision(n)
        print "node:", node.hex(n)
        print "linkrev:", r.linkrev(i)
        print "parents:", node.hex(p[0]), node.hex(p[1])
        print "length:", len(d)
        print "-start-"
        print d
        print "-end-"
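
The stream written by the script above is simple enough to read back: each revision is a block of ``key: value`` header lines followed by the raw revision data between ``-start-`` and ``-end-``. The sketch below parses that layout. It is written for Python 3 and works on bytes, whereas the script itself is Python 2; the function name ``parse_dump`` and the dictionary keys are made up for the illustration. It relies on the ``length:`` field rather than scanning for ``-end-``, since the revision data may itself contain that marker.

```python
import io


def parse_dump(stream):
    """Yield one dict per revision from a binary dumprevlog-style stream.

    Assumes the exact layout printed by the script above: header lines,
    "-start-", `length` bytes of data, the newline added by print, and
    a closing "-end-" line.
    """
    current_file = None
    rev = {}
    while True:
        line = stream.readline()
        if not line:
            break
        key, _, value = line.rstrip(b"\n").partition(b": ")
        if key == b"file":
            current_file = value
        elif key == b"node":
            rev = {"file": current_file, "node": value}
        elif key == b"linkrev":
            rev["linkrev"] = int(value)
        elif key == b"parents":
            rev["parents"] = tuple(value.split(b" "))
        elif key == b"length":
            rev["length"] = int(value)
        elif key == b"-start-":
            # Read by length so data containing "-end-" is not cut short.
            rev["data"] = stream.read(rev["length"])
            stream.readline()  # newline that print appends after the data
            if stream.readline().rstrip(b"\n") != b"-end-":
                raise ValueError("missing -end- marker")
            yield rev


if __name__ == "__main__":
    data = b"hello\n"
    sample = b"".join([
        b"file: a.i\n",
        b"node: " + b"1" * 40 + b"\n",
        b"linkrev: 0\n",
        b"parents: " + b"0" * 40 + b" " + b"0" * 40 + b"\n",
        b"length: %d\n" % len(data),
        b"-start-\n",
        data + b"\n",
        b"-end-\n",
    ])
    for rev in parse_dump(io.BytesIO(sample)):
        print(rev["file"], rev["node"], rev["linkrev"], len(rev["data"]))
```

Reading from a real dump would mean passing a file opened in binary mode instead of the in-memory sample.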