# HG changeset patch
# User Pulkit Goyal
# Date 1548767032 -10800
# Node ID fa7d61f9c512ba9b06d1e194eb483fe69aae408a
# Parent  73a33fe625bb75eec68493a76a4bb86c995cfc9a
changegroup: don't try to build changelog chunks if not required

When we extend a narrow clone without ellipsis, we don't download changelog
information because that's already present with the client. However, we still
try to build that chunk stream. Building that chunk stream involves calling a
lookup function and the store.emitrevisions() API. The lookup function is called
len(cl) times. On large repositories, where len(cl) is in the millions, calling
that lookup function is not a good idea. It's also not required to use the
store.emitrevisions() API because we already have the nodes present, which we
can use directly.

This patch short-circuits the state building logic if we are processing a
non-ellipsis case and the changelog is not required.

This saves ~20 seconds on our internal repo for a single extend call.

Differential Revision: https://phab.mercurial-scm.org/D5733

diff -r 73a33fe625bb -r fa7d61f9c512 mercurial/changegroup.py
--- a/mercurial/changegroup.py	Tue Jan 29 15:43:02 2019 +0300
+++ b/mercurial/changegroup.py	Tue Jan 29 16:03:52 2019 +0300
@@ -817,13 +817,13 @@
         self._verbosenote(_('uncompressed size of bundle content:\n'))
         size = 0
 
-        clstate, deltas = self._generatechangelog(cl, clnodes)
+        clstate, deltas = self._generatechangelog(cl, clnodes,
+                                                  generate=changelog)
         for delta in deltas:
-            if changelog:
-                for chunk in _revisiondeltatochunks(delta,
-                                                    self._builddeltaheader):
-                    size += len(chunk)
-                    yield chunk
+            for chunk in _revisiondeltatochunks(delta,
+                                                self._builddeltaheader):
+                size += len(chunk)
+                yield chunk
 
         close = closechunk()
         size += len(close)
@@ -917,12 +917,15 @@
         if clnodes:
             repo.hook('outgoing', node=hex(clnodes[0]), source=source)
 
-    def _generatechangelog(self, cl, nodes):
+    def _generatechangelog(self, cl, nodes, generate=True):
         """Generate data for changelog chunks.
 
         Returns a 2-tuple of a dict containing state and an iterable of
         byte chunks. The state will not be fully populated until the
         chunk stream has been fully consumed.
+
+        if generate is False, the state will be fully populated and no chunk
+        stream will be yielded
         """
         clrevorder = {}
         manifests = {}
@@ -937,6 +940,20 @@
             'clrevtomanifestrev': clrevtomanifestrev,
         }
 
+        if not (generate or self._ellipses):
+            # sort the nodes in storage order
+            nodes = sorted(nodes, key=cl.rev)
+            for node in nodes:
+                c = cl.changelogrevision(node)
+                clrevorder[node] = len(clrevorder)
+                # record the first changeset introducing this manifest version
+                manifests.setdefault(c.manifest, node)
+                # Record a complete list of potentially-changed files in
+                # this manifest.
+                changedfiles.update(c.files)
+
+            return state, ()
+
         # Callback for the changelog, used to collect changed files and
         # manifest nodes.
         # Returns the linkrev node (identity in the changelog case).
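
Note for reviewers: the docstring change describes two ways the returned state gets filled. By default it is populated lazily, as a side effect of consuming the chunk stream. With generate=False it is populated up front and the stream is empty. The standalone sketch below illustrates only that contract and is not Mercurial code. Entry, generate_changelog and the entries mapping are hypothetical stand-ins for the changelog and _generatechangelog, and the self._ellipses condition from the patch is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for a changelog revision (not a Mercurial type)."""
    rev: int
    manifest: str
    files: tuple


def generate_changelog(entries, nodes, generate=True):
    """Return (state, chunks), mimicking the shape described in the docstring."""
    clrevorder = {}
    manifests = {}
    changedfiles = set()
    state = {
        "clrevorder": clrevorder,
        "manifests": manifests,
        "changedfiles": changedfiles,
    }

    def record(node):
        entry = entries[node]
        clrevorder[node] = len(clrevorder)
        # Keep only the first node that introduces each manifest.
        manifests.setdefault(entry.manifest, node)
        changedfiles.update(entry.files)

    if not generate:
        # Eager path: fill the state now, in storage order, and return
        # an empty chunk stream.
        for node in sorted(nodes, key=lambda n: entries[n].rev):
            record(node)
        return state, ()

    def chunks():
        # Lazy path: the state fills up only as the stream is consumed.
        for node in nodes:
            record(node)
            yield node.encode("ascii")

    return state, chunks()


entries = {
    "a": Entry(rev=0, manifest="m1", files=("x",)),
    "b": Entry(rev=1, manifest="m1", files=("y",)),
}
```

With generate=False the caller can read the state immediately. With the default, the state stays empty until the returned iterable has been exhausted, which is why the old code had to iterate over every delta even when it discarded the chunks.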