diff tests/test-fix.t @ 48178:f12a19d03d2c

fix: reduce number of tool executions By grouping together (path, ctx) pairs according to the inputs they would provide to fixer tools, we can deduplicate executions of fixer tools to significantly reduce the amount of time spent running slow tools. This change does not handle clean files in the working copy, which could still be deduplicated against the files in the checked out commit. It's a little harder to do that because the filerev is not available in the workingfilectx (and it doesn't exist for added files). Anecdotally, this change makes some real uses cases at Google 10x faster. I think we were originally hesitant to do this because the benefits weren't obvious, and implementing it efficiently is kind of tricky. If we simply memoized the formatter execution function, we would be keeping tons of file content in memory. Also included is a regression test for a corner case that I broke with my first attempt at optimizing this code. Differential Revision: https://phab.mercurial-scm.org/D11280
author Danny Hooper <hooper@google.com>
date Thu, 02 Sep 2021 14:08:45 -0700
parents 066cdec8f74f
children 7a4d187479b6
line wrap: on
line diff
--- a/tests/test-fix.t	Thu Sep 02 14:07:55 2021 -0700
+++ b/tests/test-fix.t	Thu Sep 02 14:08:45 2021 -0700
@@ -1797,7 +1797,56 @@
   $ cat $LOGFILE | sort | uniq -c
         4 bar.log
         4 baz.log
-        4 foo.log
-        4 qux.log
+        3 foo.log
+        2 qux.log
 
   $ cd ..
+
+For tools that support line ranges, it's wrong to blindly re-use fixed file
+content for the same file revision if it appears twice with different baserevs,
+because the line ranges could be different. Since computing line ranges is
+ambiguous, this isn't a matter of correctness, but it affects the usability of
+this extension. It could maybe be simpler if baserevs were computed on a
+per-file basis to make this situation impossible to construct.
+
+In the following example, we construct two subgraphs with the same file
+revisions, and fix different sub-subgraphs to get different baserevs and
+different changed line ranges. The key precondition is that revisions 1 and 4
+have the same file revision, and the key result is that their successors don't
+have the same file content, because we want to fix different areas of that same
+file revision's content.
+
+  $ hg init differentlineranges
+  $ cd differentlineranges
+
+  $ printf "a\nb\n" > file.changed
+  $ hg commit -Aqm "0 ab"
+  $ printf "a\nx\n" > file.changed
+  $ hg commit -Aqm "1 ax"
+  $ hg remove file.changed
+  $ hg commit -Aqm "2 removed"
+  $ hg revert file.changed -r 0
+  $ hg commit -Aqm "3 ab (reverted)"
+  $ hg revert file.changed -r 1
+  $ hg commit -Aqm "4 ax (reverted)"
+
+  $ hg manifest --debug --template "{hash}\n" -r 0; \
+  > hg manifest --debug --template "{hash}\n" -r 3
+  418f692145676128d2fb518b027ddbac624be76e
+  418f692145676128d2fb518b027ddbac624be76e
+  $ hg manifest --debug --template "{hash}\n" -r 1; \
+  > hg manifest --debug --template "{hash}\n" -r 4
+  09b8b3ce5a507caaa282f7262679e6d04091426c
+  09b8b3ce5a507caaa282f7262679e6d04091426c
+
+  $ hg fix --working-dir -r 1+3+4
+  3 new orphan changesets
+
+  $ hg cat file.changed -r "successors(1)" --hidden
+  a
+  X
+  $ hg cat file.changed -r "successors(4)" --hidden
+  A
+  X
+
+  $ cd ..