Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 17:29:53 -0700] rev 38894
changegroup: define functions for creating changegroup packers
Currently, we have 3 classes for changegroup generation. Each class
handles a specific changegroup format. And each subsequent version's
class inherits from the previous one.
The interface for the classes is not very well defined and a lot of
version-specific behavior is behind overloaded functions. This
approach adds complexity and makes changegroup generation difficult
to reason about.
Upcoming commits will be consolidating these 3 classes so differences
between changegroup versions and changegroup generation are controlled
by parameters to a single constructor / type rather than by
overriding class attributes via inheritance.
We begin this process by building dedicated functions for creating
each changegroup packer instance. Currently they just call the
constructor on the appropriate class. This will soon change.
Differential Revision: https://phab.mercurial-scm.org/D4076
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 10:05:26 -0700] rev 38893
changegroup: capture revision delta in a data structure
The current changegroup generation code is tightly coupled to
the revlog API. This tight coupling makes it difficult to implement
alternate storage backends without requiring a large surface area
of the revlog API to be exposed. This is not desirable.
In order to support changegroup generation with non-revlog storage,
we'll need to abstract the concept of delta generation.
This commit is the first step down that road. We introduce a
data structure for representing a delta in a changegroup.
The API still leaves a lot to be desired. But at least we now
have separation between data and actions performed on it.
As part of this, we tweak behavior slightly: we no longer
concatenate the delta prefix with the metadata header. Instead,
we track and emit the prefix as a separate chunk. This shouldn't
have any meaningful impact since all the chunks just get sent to
the wire, the compressor, etc.
Because we're introducing a new object, this does add some
overhead to changegroup execution. `hg perfchangegroupchangelog`
on my clone of the Mercurial repo (~40,000 visible revisions in
the changelog) slows down a bit:
! wall 1.268600 comb 1.270000 user 1.270000 sys 0.000000 (best of 8)
! wall 1.419479 comb 1.410000 user 1.410000 sys 0.000000 (best of 8)
With for `hg bundle -t none-v2 -a /dev/null`:
before: real 6.610 secs (user 6.460+0.000 sys 0.140+0.000)
after: real 7.210 secs (user 7.060+0.000 sys 0.140+0.000)
I plan to claw back this regression in future commits. And I may
even do away with this data structure once the refactor is complete.
For now, it makes things easier to comprehend.
Differential Revision: https://phab.mercurial-scm.org/D4075
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 16:36:40 -0700] rev 38892
changegroup: inline ellipsisdata()
There's only one caller of it. I don't think it needs to exist as
a standalone function.
Differential Revision: https://phab.mercurial-scm.org/D4074
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 17:05:11 -0700] rev 38891
changegroup: rename "revlog" variables
"revlog" shadows the module import. But more importantly, changegroup
generation should be storage agnostic and not assume the existence
of revlogs. Let's rename the thing providing revision storage to
"store" to reflect this ideal property.
Differential Revision: https://phab.mercurial-scm.org/D4073
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 14:15:10 -0700] rev 38890
changegroup: move generate() modifications from narrow
Narrow had a custom version of generate() that was essentially a copy
of generate() with inline additions to facilitate ellipses serving.
This commit inlines those modifications into generate().
Differential Revision: https://phab.mercurial-scm.org/D4067
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 12:18:35 -0700] rev 38889
changegroup: move generatefiles() from narrow
The code is a bit ugly in that it overrides the linknodes
function that is passed in as a function. I'd like to think
that the caller of generatefiles() would pass in the appropriate
function. We can clean this up later.
Differential Revision: https://phab.mercurial-scm.org/D4066
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 12:12:12 -0700] rev 38888
changegroup: move _sortgroup() from narrow
Differential Revision: https://phab.mercurial-scm.org/D4065
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 09:52:01 -0700] rev 38887
changegroup: move close() from narrow
More of the same.
Differential Revision: https://phab.mercurial-scm.org/D4064
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 09:53:22 -0700] rev 38886
changegroup: move revchunk() from narrow
The monkeypatched revchunk for ellipses serving is a
completely independent implementation. We model it as such
in the changegroup code. revchunk() is now a simple proxy
function.
Again, I wish we had better APIs here. Especially since this
narrow code is part of cg1packer and cg1packer can't be used
with narrow. Class inheritance is wonky. And I will definitely
be making changes to changegroup code for delta generation.
As part of the code move, `node.nullrev` was replaced by
`nullrev`. And a reference to `orig` was replaced to call
`self._revchunknormal` directly.
Differential Revision: https://phab.mercurial-scm.org/D4063
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 09:40:18 -0700] rev 38885
changegroup: move deltaparent() from narrow
I'm not keen on performing the attribute sniff to test for
presence of ellipses mode: I'd rather we use a separate packer
instance that was ellipses mode specific. But I've tried to
formalize a better API without narrow in core and I can't
make sense of all the monkeypatching. My goal is to inline
as much of the monkeypatching as possible then refactor the
changegroup generation API.
We add this code to the cg2packer because narrow doesn't work
with cg1.
Differential Revision: https://phab.mercurial-scm.org/D4062