Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 05 Aug 2014 14:56:25 -0700] rev 22023
simplemerge: burn "minimal" feature to the ground
Matt Mackall said:
The goal of simplemerge should have always been to be a drop-in
replacement for RCS merge. Please nuke this minimization thing entirely.
This whole thing is now dead.
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 29 Jul 2014 11:55:01 -0700] rev 22022
merge: use no-minimal for premerge too
ecc1387138ba disabled minimal for `internal:merge` but forgot to also disable
it for premerge. This is now done.
This gives me an occasion to shamelessly include my explanation of why this
minimisation feature must disappear:
[this is why it's pointless to reject patches with misspellings in the
description - mpm]
Detailed explanation
====================
The ``simplemerge`` code used in ``internal:merge`` has a feature called
"minimization". It reprocesses conflicting chunks to find common changes
inside them and excludes such common sections from the marker.
This approach seems a significant win at first glance but produces very
confusing results in some other cases.
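As a rough illustration of the idea (not the actual ``simplemerge``
implementation), the sketch below splits one conflicting region on the lines
that both sides have in common. It uses Python's ``difflib`` as a stand-in
matcher, and the ``minimize`` helper name is hypothetical.

```python
import difflib


def minimize(local, other):
    """Split one conflicting region on the lines shared by both sides.

    Returns a list of (kind, local_lines, other_lines) tuples where kind
    is "same" or "conflict". Hypothetical helper, for illustration only.
    """
    matcher = difflib.SequenceMatcher(None, local, other, autojunk=False)
    pieces = []
    lpos = opos = 0
    for block in matcher.get_matching_blocks():
        # Anything skipped before this common block is a real disagreement.
        if block.a > lpos or block.b > opos:
            pieces.append(("conflict", local[lpos:block.a], other[opos:block.b]))
        if block.size:
            common = local[block.a:block.a + block.size]
            pieces.append(("same", common, common))
        lpos = block.a + block.size
        opos = block.b + block.size
    return pieces


for piece in minimize(["1", "2", "3", "4", "5"], ["1", "2", "3", "6", "8"]):
    print(piece)
```

With this matcher, any line present on both sides, including a lone blank
line, is enough to cut a region in two.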
Simple example
--------------
A simple example is enough to show the benefit of this feature. In this merge,
both sides change all numbers from letters to digits, but one side is also
changing some values.
$ cat << EOF > base
> Small Mathematical Series.
> One
> Two
> Three
> Four
> Five
> Hop we are done.
> EOF
$ cat << EOF > local
> Small Mathematical Series.
> 1
> 2
> 3
> 4
> 5
> Hop we are done.
> EOF
$ cat << EOF > other
> Small Mathematical Series.
> 1
> 2
> 3
> 6
> 8
> Hop we are done.
> EOF
In the minimalist case, the markers focus on the disagreement between the two
sides.
$ $TESTDIR/../contrib/simplemerge --print local base other
Small Mathematical Series.
1
2
3
<<<<<<< local
4
5
=======
6
8
>>>>>>> other
Hop we are done.
warning: conflicts during merge.
[1]
In the non-minimalist case, the whole chunk is included in the conflict marker,
making it harder to spot the actual differences.
$ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
Small Mathematical Series.
<<<<<<< local
1
2
3
4
5
=======
1
2
3
6
8
>>>>>>> other
Hop we are done.
warning: conflicts during merge.
[1]
Practical Advantages of minimalisation: merge of grafted change
---------------------------------------------------------------
This feature can be very useful when a change has been grafted into another
branch and then some changes have been made to the grafted code.
$ cat << EOF > base
> # empty file
> EOF
$ cat << EOF > local
> def somefunction(one, two):
> some = one
> stuff = two
> are(happening)
> here()
> EOF
$ cat << EOF > other
> def somefunction(one, two):
> some = one
> change = two
> are(happening)
> here()
> EOF
The minimalist case recognises the grafted content as similar and highlights the
actual change.
$ $TESTDIR/../contrib/simplemerge --print local base other
def somefunction(one, two):
some = one
<<<<<<< local
stuff = two
=======
change = two
>>>>>>> other
are(happening)
here()
warning: conflicts during merge.
[1]
Again, the non-minimalist case produces a larger conflict, making it harder to
spot the actual conflict.
$ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
<<<<<<< local
def somefunction(one, two):
some = one
stuff = two
are(happening)
here()
=======
def somefunction(one, two):
some = one
change = two
are(happening)
here()
>>>>>>> other
warning: conflicts during merge.
[1]
Practical disadvantage: multiple functions on each side
---------------------------------------------------------------
So, if this "minimalist" approach helps so much, why introduce a setting to
disable it? The issue is that this minimisation will use any common line to
break up chunks. This may result in partial context when solving a merge. The
simplest example is a merge where both sides added some (different) functions
separated by blank lines. The "minimalist" approach will recognise the blank
line as "common" and over-slice the chunks, turning a simple conflict case into
multiple pairs of conflicting functions.
$ cat << EOF > base
> # empty file
> EOF
$ cat << EOF > local
> def function1():
> bla()
> bla()
> bla()
>
> def function2():
> ble()
> ble()
> ble()
> EOF
$ cat << EOF > other
> def function3():
> bli()
> bli()
> bli()
>
> def function4():
> blo()
> blo()
> blo()
> EOF
The minimal case presents each function as a separate context.
$ $TESTDIR/../contrib/simplemerge --print local base other
<<<<<<< local
def function1():
bla()
bla()
bla()
=======
def function3():
bli()
bli()
bli()
>>>>>>> other
<<<<<<< local
def function2():
ble()
ble()
ble()
=======
def function4():
blo()
blo()
blo()
>>>>>>> other
warning: conflicts during merge.
[1]
The non-minimalist approach produces a simpler version with more context in
each block. Solving such conflicts is usually as simple as dropping the 3 lines
dedicated to markers.
$ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
<<<<<<< local
def function1():
bla()
bla()
bla()
def function2():
ble()
ble()
ble()
=======
def function3():
bli()
bli()
bli()
def function4():
blo()
blo()
blo()
>>>>>>> other
warning: conflicts during merge.
[1]
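Dropping the three marker lines, as described above, can be sketched in a few
lines of Python. This assumes the default marker style shown in the output, and
``drop_markers`` is a hypothetical helper, not part of Mercurial.

```python
def drop_markers(lines):
    """Resolve a conflict by keeping both sides and removing marker lines.

    Assumes the default marker style shown above; hypothetical helper.
    """
    def is_marker(line):
        return (line.startswith("<<<<<<< ")
                or line == "======="
                or line.startswith(">>>>>>> "))

    return [line for line in lines if not is_marker(line)]


merged = drop_markers([
    "<<<<<<< local",
    "def function1():",
    "    bla()",
    "=======",
    "def function3():",
    "    bli()",
    ">>>>>>> other",
])
print("\n".join(merged))
```

This only gives a sensible result when each side sits whole inside a single
pair of markers, which is what the non-minimalist output provides.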
Practical disaster: programming languages have a lot of common lines
====================================================================
If only blank lines between functions were the only frequently repeated content
of a code file. But programming languages tend to repeat themselves much more
often. In that case, the minimalist approach turns a simple conflict into a
massive mess.
Consider this example where two unrelated functions are added on each side.
Those functions share common programming constructs by chance.
$ cat << EOF > base
> # empty file
> EOF
$ cat << EOF > local
> def longfunction():
> if bla:
> foo
> else:
> bar
> try:
> ret = some stuff
> except Exception:
> ret = None
> if ret is not None:
> return ret
> return 0
>
> def shortfunction(foo):
> goo()
> ret = foo + 5
> return ret
> EOF
$ cat << EOF > other
> def otherlongfunction():
> for x in xxx:
> if coin:
> break
> tutu
> else:
> bar()
> baz()
> ret = week()
> try:
> groumpf = tutu
> fool()
> except Exception:
> zoo()
> pool()
> if cond:
> return ret
>
> # some big block
> ret ** 6
> koin()
> return ret
> EOF
The minimalist approach will chop the whole conflict into small chunks that
do not match any meaningful semantics and are impossible to resolve.
$ $TESTDIR/../contrib/simplemerge --print local base other
<<<<<<< local
def longfunction():
if bla:
foo
=======
def otherlongfunction():
for x in xxx:
if coin:
break
tutu
>>>>>>> other
else:
<<<<<<< local
bar
=======
bar()
baz()
ret = week()
>>>>>>> other
try:
<<<<<<< local
ret = some stuff
=======
groumpf = tutu
fool()
>>>>>>> other
except Exception:
<<<<<<< local
ret = None
if ret is not None:
=======
zoo()
pool()
if cond:
>>>>>>> other
return ret
<<<<<<< local
return 0
=======
>>>>>>> other
<<<<<<< local
def shortfunction(foo):
goo()
ret = foo + 5
=======
# some big block
ret ** 6
koin()
>>>>>>> other
return ret
warning: conflicts during merge.
[1]
The non-minimalist approach will properly produce a single set of conflict
markers, highlighting that the two chunks are unrelated. Such a conflict, from
unrelated content added at the same place, is usually solved by dropping the
markers and keeping both contents, something impossible with minimised markers.
$ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
<<<<<<< local
def longfunction():
if bla:
foo
else:
bar
try:
ret = some stuff
except Exception:
ret = None
if ret is not None:
return ret
return 0
def shortfunction(foo):
goo()
ret = foo + 5
return ret
=======
def otherlongfunction():
for x in xxx:
if coin:
break
tutu
else:
bar()
baz()
ret = week()
try:
groumpf = tutu
fool()
except Exception:
zoo()
pool()
if cond:
return ret
# some big block
ret ** 6
koin()
return ret
>>>>>>> other
warning: conflicts during merge.
[1]
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 09 Jun 2014 23:37:36 -0700] rev 22021
merge: refactor labels selection code
The code is simplified to prepare the future introduction of a third label for
the merge base.
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 01 Jul 2014 23:08:17 +0200] rev 22020
push: include phase push in the unified bundle2 push
Phase push is now included in the same bundle2 push as changesets. We use
multiple pushkey parts to transmit the information. Note that phase moves are
still not part of the repository "transaction".
Pierre-Yves David <pierre-yves.david@fb.com> [Wed, 30 Jul 2014 19:26:47 -0700] rev 22019
push: perform phases discovery before the push
This will allow including phase information in the same bundle2 as the
changesets.
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 01 Jul 2014 17:06:02 +0200] rev 22018
push: make discovery extensible
We need to gather all discovery before the unified bundle2 push. We
use the same pattern as bundle2 parts generation.
Pierre-Yves David <pierre-yves.david@fb.com> [Wed, 30 Jul 2014 19:04:50 -0700] rev 22017
push: rework the bundle2partsgenerators logic
Instead of a single list of functions, we now have a list of names and
a mapping of names to functions. This simplifies wrapping of steps
from extensions. In the same move, declaration becomes decorator-based
(syntax sugar, nom nom nom!).
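A minimal sketch of such a registry, assuming nothing about the real code
beyond the description above: an ordered list of step names, a mapping from
name to function, and a decorator that fills both. All names here
(``partsgenorder``, ``partsgenmapping``, ``partsgenerator``, ``wrapstep``) are
hypothetical.

```python
# Hypothetical names: an ordered list of step names plus a name -> function map.
partsgenorder = []
partsgenmapping = {}


def partsgenerator(stepname):
    """Decorator registering a function as the generator for `stepname`."""
    def decorator(func):
        if stepname in partsgenmapping:
            raise ValueError("step %r already registered" % stepname)
        partsgenmapping[stepname] = func
        partsgenorder.append(stepname)
        return func
    return decorator


@partsgenerator("changeset")
def addchangesets(parts):
    parts.append("changeset part")


@partsgenerator("phase")
def addphases(parts):
    parts.append("phase part")


def wrapstep(stepname, wrapper):
    """Replace one step by wrapper(orig, parts), as an extension might."""
    orig = partsgenmapping[stepname]
    partsgenmapping[stepname] = lambda parts: wrapper(orig, parts)


def buildparts():
    """Run every registered step in registration order."""
    parts = []
    for name in partsgenorder:
        partsgenmapping[name](parts)
    return parts


print(buildparts())
```

Because steps are looked up by name at call time, wrapping one of them only
means replacing a single entry in the mapping; the order list is untouched.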
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 01 Jul 2014 17:27:22 +0200] rev 22016
push: move common heads computation into pushop
Now that both options (push succeed or fall back) live in pushop, we
can move the common heads computation there too. It is a very commonly
accessed attribute so it makes a lot of sense to have it in pushop.
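The selection between the two sets of heads could look roughly like the sketch
below. The class and the ``ret``, ``fallbackheads`` and ``commonheads``
attribute names are assumptions made for illustration; only the idea of
choosing between the "push succeeded" and "fall back" heads comes from the
description above.

```python
class PushOperation(object):
    """Hypothetical stand-in for the push operation object."""

    def __init__(self, futureheads, fallbackheads):
        self.futureheads = futureheads      # common heads if the push succeeds
        self.fallbackheads = fallbackheads  # common heads if it does not
        self.ret = None                     # result of the changeset push

    @property
    def commonheads(self):
        """Heads known to be common once the changeset push is over."""
        if self.ret:
            return self.futureheads
        return self.fallbackheads


op = PushOperation(futureheads=["a", "b"], fallbackheads=["a"])
print(op.commonheads)
op.ret = 1
print(op.commonheads)
```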
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 01 Jul 2014 17:20:47 +0200] rev 22015
push: extract fallback heads computation into pushop
Similar motivation to `futureheads`, we extract the computation into pushop
to make it available early to all possibly interested parties.
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 01 Jul 2014 17:20:31 +0200] rev 22014
push: extract future heads computation into pushop
Bundle2 will allow pushing all different parts of the push in a single bundle.
This means that the discovery for each part needs to be done before trying to
push. Currently we may have different behaviors for phases and obsolescence markers
when the push of changesets fails. For example, information may still be
exchanged for a part of the history where changesets are common but where
phases mismatch. So the preparation of the push needs to determine what
information needs to be pushed in both situations. And it needs a different set of
heads for this. Therefore we are moving heads computation within pushop for easy
access by all parties. We start with the simplest set of heads.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:27 +0900] rev 22013
cmdutil: use '[committemplate]' section like as map file for style definition
Before this patch, each template definition for 'changeset*' in the
'[committemplate]' section had to be written fully from scratch,
even though many parts of them may be common.
This patch uses the '[committemplate]' section like the map file for
the style definition. All items other than 'changeset' can be referred
to from others.
This can reduce total cost of template customization in
'[committemplate]' section.
When a commit template other than '[committemplate] changeset'
is chosen by 'editform', putting the '[committemplate] changeset'
value into the cache of the templater causes an unexpected result,
because the templater stores the specified (= chosen) template
definition into its own cache as 'changeset' at construction time.
This is the reason why '[committemplate] changeset' can't be referred
to from others.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:27 +0900] rev 22012
cmdutil: look commit template definition up by specified 'editform'
Before this patch, '[committemplate] changeset' definition is shared
between all actions invoking 'commitforceeditor()'.
This prevents template definitions from showing action-specific
messages: for example, 'hg tag --remove' may need a specific
message to call attention, but showing it may be redundant for
other actions.
This patch looks the commit template definition up by the specified
'editform' introduced by prior patches. An 'editform' is a
dot-separated list of names, and is treated as a hierarchical one.
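One way to read "hierarchical" is a most-specific-first lookup that drops one
trailing name at a time. The sketch below assumes that reading; the
``lookup_template`` function, the plain dict standing in for the
'[committemplate]' section, and the 'changeset.' key prefix are all
assumptions for illustration.

```python
def lookup_template(config, editform):
    """Return (key, template) for the most specific matching definition.

    `config` is a plain dict standing in for the '[committemplate]'
    section; both it and this function name are assumptions.
    """
    names = ["changeset"] + [part for part in editform.split(".") if part]
    while names:
        key = ".".join(names)
        if key in config:
            return key, config[key]
        names.pop()  # drop the most specific component and retry
    return None, None


config = {
    "changeset": "generic template",
    "changeset.tag.remove": "template shown for 'hg tag --remove'",
}
print(lookup_template(config, "tag.remove"))
print(lookup_template(config, "commit.amend"))
```

Under these assumptions, 'tag.remove' finds its dedicated entry, while
'commit.amend' falls back to the generic 'changeset' definition.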
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:27 +0900] rev 22011
import: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
COMMAND[.ROUTE]
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch, 'normal' and 'bypass' are used as ROUTE.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:27 +0900] rev 22010
commit: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
COMMAND[.ROUTE]
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch, 'normal' and 'amend' are used as ROUTE.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:27 +0900] rev 22009
tag: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
COMMAND[.ROUTE]
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch, 'add' and 'remove' are used as ROUTE.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:27 +0900] rev 22008
graft: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
COMMAND[.ROUTE]
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch, ROUTE is omitted.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:27 +0900] rev 22007
backout: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
COMMAND[.ROUTE]
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch, ROUTE is omitted.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:26 +0900] rev 22006
transplant: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
EXTENSION[.COMMAND][.ROUTE]
- EXTENSION: name of extension
- COMMAND: name of command, if there are two or more commands in EXTENSION
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch, COMMAND and ROUTE are omitted.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:26 +0900] rev 22005
shelve: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
EXTENSION[.COMMAND][.ROUTE]
- EXTENSION: name of extension
- COMMAND: name of command, if there are two or more commands in EXTENSION
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch:
- 'shelve' is used as COMMAND
- ROUTE is omitted
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:26 +0900] rev 22004
rebase: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
EXTENSION[.COMMAND][.ROUTE]
- EXTENSION: name of extension
- COMMAND: name of command, if there are two or more commands in EXTENSION
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch:
- COMMAND is omitted
- 'normal' and 'collapse' are used as ROUTE
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:26 +0900] rev 22003
mq: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
EXTENSION[.COMMAND][.ROUTE]
- EXTENSION: name of extension
- COMMAND: name of command, if there are two or more commands in EXTENSION
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch:
- MQ command names (qnew/qrefresh/qfold) are used as COMMAND
- ROUTE is omitted
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:26 +0900] rev 22002
histedit: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
EXTENSION[.COMMAND][.ROUTE]
- EXTENSION: name of extension
- COMMAND: name of command, if there are two or more commands in EXTENSION
- ROUTE: name of route, if there are two or more routes for COMMAND
In this patch:
- 'edit', 'fold', 'mess' and 'pick' are used as COMMAND
- ROUTE is omitted
'histedit.pick' case is very rare, but possible if:
- target revision causes conflict at merging (= requires '--continue'), and
- description of it is empty ('hg commit -m " "' can create such one)
In the code path for 'histedit --continue' (the last patch hunk),
'canonaction' doesn't contain the entry for 'fold', because 'fold'
action causes:
- using temporary commit message forcibly, and
- making 'editopt' False always (= omit editor invocation if commit
message is specified)
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:26 +0900] rev 22001
gpg: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
EXTENSION[.COMMAND][.ROUTE]
- EXTENSION: name of extension
- COMMAND: name of command, if there are two or more commands in EXTENSION
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch, 'sign' is used as COMMAND, and ROUTE is omitted.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:26 +0900] rev 22000
fetch: pass 'editform' argument to 'cmdutil.getcommiteditor'
This patch passes 'editform' argument according to the format below:
EXTENSION[.COMMAND][.ROUTE]
- EXTENSION: name of extension
- COMMAND: name of command, if there are two or more commands in EXTENSION
- ROUTE: name of route, if there are two or more routes in COMMAND
In this patch, COMMAND and ROUTE are omitted.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 02 Aug 2014 21:46:26 +0900] rev 21999
cmdutil: introduce 'editform' to distinguish the purpose of commit text editing
This information will be used to switch '[committemplate] changeset'
definition according to its purpose in the subsequent patch.
This information also makes it easier to hook commit text editing only
in the specific cases.
Durham Goode <durham@fb.com> [Tue, 22 Jul 2014 22:40:16 -0700] rev 21998
log: allow patterns with -f
It's not uncommon for a user to want to run log with a pattern or directory name
on the history of their current commit. Currently we prevent that, but
I can't think of any reason to continue blocking that.
This commit removes the restriction and allows 'hg log -f <dir/pat>'.
Augie Fackler <raf@durin42.com> [Mon, 28 Jul 2014 19:48:59 -0400] rev 21997
run-tests: fix test result counts with --keyword specified or skips occurring
This preserves the current behavior that excludes ignored or skipped
tests from the number of tests run, except when tests are ignored due
to the --retest flag.
Augie Fackler <raf@durin42.com> [Tue, 29 Jul 2014 22:35:59 -0400] rev 21996
test-run-tests.t: add tests for skips
This will make some minor behavior changes in a future patch more obvious.