Martin von Zweigbergk <martinvonz@google.com> [Thu, 18 Apr 2019 21:22:14 -0700] rev 42501
copies: don't filter out copy targets created on other side of merge commit
If file X is copied to Y on one side of a merge and the other side
creates Y (no copy), we would not mark that as a copy. In the
changeset-centric pathcopies() version, that was done by checking if
the copy target existed on the other branch. Even though merge commits
are pretty uncommon, it still turned out to be too expensive to load
the manifest of the parents of merge commits. In a repo of
mozilla-unified converted to storing copies in changesets, about 2m30s
of `hg debugpathcopies FIREFOX_BETA_59_END FIREFOX_BETA_60_BASE` is
spent on this check of merge commits.
I tried to think of a way of storing more information in the
changesets in order to cheaply detect these cases, but I couldn't
think of a solution. So this patch simply removes those checks.
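As a rough illustration of the check being removed, here is a toy model in Python. It is not Mercurial's code: `combine_merge_copies`, its parameters, and the exact drop condition are hypothetical, and plain sets stand in for the parent manifests that are expensive to load in a real repository.

```python
def combine_merge_copies(copies_p1, copies_p2, files_p1, files_p2,
                         check_other_side):
    """Combine per-parent copy maps (dst -> src) at a merge commit.

    files_p1 / files_p2 model the parents' manifests as sets of names.
    With check_other_side=True (modelling the old behaviour), a copy
    is dropped when its target also exists on the other side without
    being a copy there. With False (modelling this patch), no manifest
    lookup is needed and the copy is kept.
    """
    merged = {}
    sides = (
        (copies_p1, copies_p2, files_p2),
        (copies_p2, copies_p1, files_p1),
    )
    for copies, other_copies, other_files in sides:
        for dst, src in copies.items():
            if (check_other_side
                    and dst in other_files
                    and dst not in other_copies):
                # Target was created independently on the other side.
                continue
            merged.setdefault(dst, src)
    return merged


# X is copied to Y on side 1; side 2 creates Y without a copy.
copies_p1 = {"Y": "X"}
copies_p2 = {}
files = {"X", "Y"}
old = combine_merge_copies(copies_p1, copies_p2, files, files, True)
new = combine_merge_copies(copies_p1, copies_p2, files, files, False)
```

In this model `old` omits the X -> Y copy while `new` reports it, which mirrors the kind of extra copies listed in this message.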
For reference, these extra copies are reported from the aforementioned
command after this patch:
browser/base/content/sanitize.js -> browser/modules/Sanitizer.jsm
testing/mozbase/mozprocess/tests/process_normal_finish_python.ini -> testing/mozbase/mozprocess/tests/process_normal_finish.ini
testing/mozbase/mozprocess/tests/process_waittimeout_python.ini -> testing/mozbase/mozprocess/tests/process_waittimeout.ini
testing/mozbase/mozprocess/tests/process_waittimeout_10s_python.ini -> testing/mozbase/mozprocess/tests/process_waittimeout_10s.ini
Since these copies were created on one side of some merge, it still
seems reasonable to include them, so I'm not even sure it's worse than
filelog pathcopies(), just different.
Differential Revision: https://phab.mercurial-scm.org/D6420
Martin von Zweigbergk <martinvonz@google.com> [Thu, 18 Apr 2019 00:40:53 -0700] rev 42500
copies: do full filtering at end of _changesetforwardcopies()
As mentioned earlier, pathcopies() is very slow when copies are stored
in the changeset. Most of the cost comes from calling _chain() for
every changeset, which is slow because it needs to read manifests. It
needs to read manifests to be able to filter out copies that were
created in one commit and then deleted. (It also filters out copies
that were created from a file that didn't exist in the starting
revision, but that's a fixed revision across calls to _chain(), so
it's much cheaper.)
This patch changes from _chainandfilter() to just _chain() in the main
loop in _changesetforwardcopies(). It instead removes copies that have
subsequently been removed by using ctx.filesremoved(). We thus rely on
that to be fast.
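The idea can be sketched on a simplified linear history. This is a toy model under stated assumptions, not the actual _changesetforwardcopies(): `chain` and `forward_copies` are illustrative names, each commit is represented as a (copies, removed) pair, and the `removed` list stands in for what ctx.filesremoved() would report.

```python
def chain(a, b):
    """Compose two copy maps (dst -> src): a covers x..y, b covers y..z."""
    result = dict(a)
    for dst, src in b.items():
        # If src was itself a copy target earlier, follow it back.
        result[dst] = a.get(src, src)
    return result


def forward_copies(commits, base_files):
    """Trace copies over a linear run of commits, oldest first.

    commits: list of (copies, removed) pairs, where copies maps
    dst -> src and removed lists the files that commit deleted.
    base_files: set of file names present in the starting revision.
    """
    copies = {}
    for commit_copies, removed in commits:
        # Cheap step: chain only, with no manifest reads.
        copies = chain(copies, commit_copies)
        # Forget copies whose target has since been removed.
        for f in removed:
            copies.pop(f, None)
    # Full filtering once at the end: the source must exist in the
    # fixed starting revision.
    return {dst: src for dst, src in copies.items() if src in base_files}


# a is renamed to b, then b is renamed to c.
renames = [({"b": "a"}, ["a"]), ({"c": "b"}, ["b"])]
result = forward_copies(renames, {"a"})
```

Here `result` maps c back to a, and the intermediate b is dropped because a later commit removed it.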
I timed this command in mozilla-unified:
hg debugpathcopies FIREFOX_59_0b3_BUILD2 FIREFOX_BETA_59_END
It took 18s before and 1.1s after. It's still faster when copy
information is stored in filelogs: 0.70s. It also still gets slow when
there are merge commits involved, because we read manifests there
too. We'll deal with that later.
Differential Revision: https://phab.mercurial-scm.org/D6419