view tests/test-diff-upgrade.t @ 42222:57203e0210f8

copies: calculate mergecopies() based on pathcopies() When copies are stored in changesets, we need a changeset-centric version of mergecopies() just like we have a changeset-centric version of pathcopies(). I think the natural way of thinking about mergecopies() is in terms of pathcopies() from the base to each of the commits. So if we can rewrite mergecopies() based on two such pathcopies() calls, we'll get the changeset-centric version for free. That's what this patch does. A nice bonus is that it ends up being a lot simpler. mergecopies() has accumulated a lot of technical debt over time. One good example is the code for dealing with grafts (the "partial/incomplete/dirty" stuff). Since pathcopies() already deals with backwards renames and ping-pong renames, we get that for free. I've run tests with hard-coded debug logging for "fullcopy" and while I haven't looked at every difference it produces, all the ones I have looked at seemed reasonable to me. I'm a little surprised that no more tests fail when run with '--extra-config-opt experimental.copies.read-from=compatibility' compared to before this patch. This patch also fixes the broken cases in test-annotate.t and test-fastannotate.t. It also enables the part of test-copies.t that was previously disabled exactly because mergecopies() needed to get a changeset-centric version. One drawback of the rewritten code is that we may now make remotefilelog prefetch more files. We used to prefetch files that were unique to either side of the merge compared to the other. We now prefetch files that are unique to either side of the merge compared to the base. This means that if you added the same file to each side, we would not prefetch it before, but we would now. Such cases are probably quite rare, but one likely scenario where they happen is when moving from a commit to its successor (or the other way around). The user will probably already have the files in the cache in such cases, so it's probably not a big deal. Some timings for calculating mergecopies between two revisions (revisions shown on each line, all using the common ancestor as base): In the hg repo: 4.8 4.9: 0.21s -> 0.21s 4.0 4.8: 0.35s -> 0.63s In and old copy of the mozilla-unified repo: FIREFOX_BETA_60_BASE^ FIREFOX_BETA_60_BASE: 0.82s -> 0.82s FIREFOX_NIGHTLY_59_END FIREFOX_BETA_60_BASE: 2.5s -> 2.6s FIREFOX_BETA_59_END FIREFOX_BETA_60_BASE: 3.9s -> 4.1s FIREFOX_AURORA_50_BASE FIREFOX_BETA_60_BASE: 31s -> 33s So it's measurably slower in most cases. The most significant difference is in the hg repo between revisions 4.0 and 4.8. In that case it seems to come from the fact that pathcopies() uses fctx.isintroducedafter() (in _tracefile), while the old mergecopies() used fctx.linkrev() (in _checkcopies()). That results in a single call to filectx._adjustlinkrev(), which is responsible for the entire difference in time (in my repo). So we pay a performance penalty but we get more correct code (see change in test-mv-cp-st-diff.t). Deleting the "== f.filenode()" in _tracefile() recovers the lost performance in the hg repo. There were are few other optimizations in _checkcopies() that I could not measure any impact from. One was from the "seen" set. Another was from a "continue" when the file was not in the destination manifest (corresponding to "am" in _tracefile). Also note that merge copies are not calculated when updating with a clean working copy, which is probably the most common case. I therefore think the much simpler code is worth the slowdown. Differential Revision: https://phab.mercurial-scm.org/D6255
author Martin von Zweigbergk <martinvonz@google.com>
date Thu, 11 Apr 2019 23:22:54 -0700
parents 5abc47d4ca6b
children 0492002560f3
line wrap: on
line source

#require execbit

  $ cat <<EOF >> $HGRCPATH
  > [extensions]
  > autodiff = $TESTDIR/autodiff.py
  > [diff]
  > nodates = 1
  > EOF

  $ hg init repo
  $ cd repo



make a combination of new, changed and deleted file

  $ echo regular > regular
  $ echo rmregular > rmregular
  $ "$PYTHON" -c "open('bintoregular', 'wb').write(b'\0')"
  $ touch rmempty
  $ echo exec > exec
  $ chmod +x exec
  $ echo rmexec > rmexec
  $ chmod +x rmexec
  $ echo setexec > setexec
  $ echo unsetexec > unsetexec
  $ chmod +x unsetexec
  $ echo binary > binary
  $ "$PYTHON" -c "open('rmbinary', 'wb').write(b'\0')"
  $ hg ci -Am addfiles
  adding binary
  adding bintoregular
  adding exec
  adding regular
  adding rmbinary
  adding rmempty
  adding rmexec
  adding rmregular
  adding setexec
  adding unsetexec
  $ echo regular >> regular
  $ echo newregular >> newregular
  $ rm rmempty
  $ touch newempty
  $ rm rmregular
  $ echo exec >> exec
  $ echo newexec > newexec
  $ echo bintoregular > bintoregular
  $ chmod +x newexec
  $ rm rmexec
  $ chmod +x setexec
  $ chmod -x unsetexec
  $ "$PYTHON" -c "open('binary', 'wb').write(b'\0\0')"
  $ "$PYTHON" -c "open('newbinary', 'wb').write(b'\0')"
  $ rm rmbinary
  $ hg addremove -s 0
  adding newbinary
  adding newempty
  adding newexec
  adding newregular
  removing rmbinary
  removing rmempty
  removing rmexec
  removing rmregular

git=no: regular diff for all files

  $ hg autodiff --git=no
  diff -r a66d19b9302d binary
  Binary file binary has changed
  diff -r a66d19b9302d bintoregular
  Binary file bintoregular has changed
  diff -r a66d19b9302d exec
  --- a/exec
  +++ b/exec
  @@ -1,1 +1,2 @@
   exec
  +exec
  diff -r a66d19b9302d newbinary
  Binary file newbinary has changed
  diff -r a66d19b9302d newexec
  --- /dev/null
  +++ b/newexec
  @@ -0,0 +1,1 @@
  +newexec
  diff -r a66d19b9302d newregular
  --- /dev/null
  +++ b/newregular
  @@ -0,0 +1,1 @@
  +newregular
  diff -r a66d19b9302d regular
  --- a/regular
  +++ b/regular
  @@ -1,1 +1,2 @@
   regular
  +regular
  diff -r a66d19b9302d rmbinary
  Binary file rmbinary has changed
  diff -r a66d19b9302d rmexec
  --- a/rmexec
  +++ /dev/null
  @@ -1,1 +0,0 @@
  -rmexec
  diff -r a66d19b9302d rmregular
  --- a/rmregular
  +++ /dev/null
  @@ -1,1 +0,0 @@
  -rmregular

git=yes: git diff for single regular file

  $ hg autodiff --git=yes regular
  diff --git a/regular b/regular
  --- a/regular
  +++ b/regular
  @@ -1,1 +1,2 @@
   regular
  +regular

git=auto: regular diff for regular files and non-binary removals

  $ hg autodiff --git=auto regular newregular rmregular rmexec
  diff -r a66d19b9302d newregular
  --- /dev/null
  +++ b/newregular
  @@ -0,0 +1,1 @@
  +newregular
  diff -r a66d19b9302d regular
  --- a/regular
  +++ b/regular
  @@ -1,1 +1,2 @@
   regular
  +regular
  diff -r a66d19b9302d rmexec
  --- a/rmexec
  +++ /dev/null
  @@ -1,1 +0,0 @@
  -rmexec
  diff -r a66d19b9302d rmregular
  --- a/rmregular
  +++ /dev/null
  @@ -1,1 +0,0 @@
  -rmregular

  $ for f in exec newexec setexec unsetexec binary newbinary newempty rmempty rmbinary bintoregular; do
  >     echo
  >     echo '% git=auto: git diff for' $f
  >     hg autodiff --git=auto $f
  > done
  
  % git=auto: git diff for exec
  diff -r a66d19b9302d exec
  --- a/exec
  +++ b/exec
  @@ -1,1 +1,2 @@
   exec
  +exec
  
  % git=auto: git diff for newexec
  diff --git a/newexec b/newexec
  new file mode 100755
  --- /dev/null
  +++ b/newexec
  @@ -0,0 +1,1 @@
  +newexec
  
  % git=auto: git diff for setexec
  diff --git a/setexec b/setexec
  old mode 100644
  new mode 100755
  
  % git=auto: git diff for unsetexec
  diff --git a/unsetexec b/unsetexec
  old mode 100755
  new mode 100644
  
  % git=auto: git diff for binary
  diff --git a/binary b/binary
  index a9128c283485202893f5af379dd9beccb6e79486..09f370e38f498a462e1ca0faa724559b6630c04f
  GIT binary patch
  literal 2
  Jc${Nk0000200961
  
  
  % git=auto: git diff for newbinary
  diff --git a/newbinary b/newbinary
  new file mode 100644
  index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..f76dd238ade08917e6712764a16a22005a50573d
  GIT binary patch
  literal 1
  Ic${MZ000310RR91
  
  
  % git=auto: git diff for newempty
  diff --git a/newempty b/newempty
  new file mode 100644
  
  % git=auto: git diff for rmempty
  diff --git a/rmempty b/rmempty
  deleted file mode 100644
  
  % git=auto: git diff for rmbinary
  diff --git a/rmbinary b/rmbinary
  deleted file mode 100644
  index f76dd238ade08917e6712764a16a22005a50573d..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
  GIT binary patch
  literal 0
  Hc$@<O00001
  
  
  % git=auto: git diff for bintoregular
  diff --git a/bintoregular b/bintoregular
  index f76dd238ade08917e6712764a16a22005a50573d..9c42f2b6427d8bf034b7bc23986152dc01bfd3ab
  GIT binary patch
  literal 13
  Uc$`bh%qz(+N=+}#Ni5<5043uE82|tP
  


git=warn: regular diff with data loss warnings

  $ hg autodiff --git=warn
  diff -r a66d19b9302d binary
  Binary file binary has changed
  diff -r a66d19b9302d bintoregular
  Binary file bintoregular has changed
  diff -r a66d19b9302d exec
  --- a/exec
  +++ b/exec
  @@ -1,1 +1,2 @@
   exec
  +exec
  diff -r a66d19b9302d newbinary
  Binary file newbinary has changed
  diff -r a66d19b9302d newexec
  --- /dev/null
  +++ b/newexec
  @@ -0,0 +1,1 @@
  +newexec
  diff -r a66d19b9302d newregular
  --- /dev/null
  +++ b/newregular
  @@ -0,0 +1,1 @@
  +newregular
  diff -r a66d19b9302d regular
  --- a/regular
  +++ b/regular
  @@ -1,1 +1,2 @@
   regular
  +regular
  diff -r a66d19b9302d rmbinary
  Binary file rmbinary has changed
  diff -r a66d19b9302d rmexec
  --- a/rmexec
  +++ /dev/null
  @@ -1,1 +0,0 @@
  -rmexec
  diff -r a66d19b9302d rmregular
  --- a/rmregular
  +++ /dev/null
  @@ -1,1 +0,0 @@
  -rmregular
  data lost for: binary
  data lost for: bintoregular
  data lost for: newbinary
  data lost for: newempty
  data lost for: newexec
  data lost for: rmbinary
  data lost for: rmempty
  data lost for: setexec
  data lost for: unsetexec

git=abort: fail on execute bit change

  $ hg autodiff --git=abort regular setexec
  abort: losing data for setexec
  [255]

git=abort: succeed on regular file

  $ hg autodiff --git=abort regular
  diff -r a66d19b9302d regular
  --- a/regular
  +++ b/regular
  @@ -1,1 +1,2 @@
   regular
  +regular

  $ cd ..