Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 15:59:26 -0700] rev 39700
localrepo: extract resolving of opener options to standalone functions
Requirements and config options are converted into a dict which is
available to the store vfs to consult. This is how storage options
are communicated from the repo layer to the storage layer.
Currently, we do that option resolution in a private method on the
repo instance. And there is a single method doing that resolution.
Opener options are logically specific to the storage backend they
apply to. And, opener options may wish to influence how the repo
object/type is constructed. So it makes sense to have more granular
storage option resolution that occurs before the repo object is
instantiated.
This commit extracts the code for resolving opener options into new
module-level functions. These functions are run before the repo
instance is constructed.
As part of the code move, we split the option resolution into
generic and revlog-specific options. After this commit, we no longer
add revlog-specific options to repos that don't have a revlog
requirement.
Some of these opener options and associated config options might make
sense on alternate storage backends. We can always reuse config
options and opener option names for other backends. But we shouldn't
be passing opener options to storage backends that won't recognize
them. I haven't done it here, but after this commit it should be
possible for store backends to validate the set of opener options
it receives.
Because localrepository.openerreqs is no longer used after this commit,
it has been removed.
I'm not super thrilled about the code outside of localrepo that is
adding requirements and updating opener options. We'll probably want
to create a more formal API for that use case that constructs a new
repo instance and poisons the old repo object. But this was a
pre-existing issue and can be dealt with later. I have little doubt
it will cause me troubles as I continue to refactor how repository
objects are instantiated.
.. api::
``localrepository.openerreqs`` has been removed. Override
``localrepo.resolvestorevfsoptions()`` to add custom opener options.
.. api::
``localrepository._applyopenerreqs()`` has been removed. Use
``localrepo.resolvestorevfsoptions()`` to add custom opener options.
Differential Revision: https://phab.mercurial-scm.org/D4576
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 15:17:47 -0700] rev 39699
localrepo: use boolean in opener options
Not sure why we're using an integer for a flag value here. I'm
pretty sure nothing relies on values being 1.
While we're here, convert to a dict comprehension.
Differential Revision: https://phab.mercurial-scm.org/D4575
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 15:07:27 -0700] rev 39698
localrepo: move store() from store module
I want logic related to requirements handling to be in the localrepo
module so it is all in one place.
I would have loved to inline this logic. Unfortunately, statichttprepo
also calls it. I didn't want to inline it twice. We could potentially
refactor statichttppeer. But meh.
Differential Revision: https://phab.mercurial-scm.org/D4574
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 15:05:51 -0700] rev 39697
localrepo: resolve store and cachevfs in makelocalrepository()
This is mostly a code move and refactor.
One change is that we now explicitly look for requirements indicating
a share is being used rather than blindly try to read from
.hg/sharedpath. Requirements *should* be all that is necessary to
dictate high-level behavior and I'm not sure why the previous code
was doing what it was.
The previous code has been in place since
87d1fd40f57e (authored in
2009). And the commit immediately after that (
971e38a9344b) introduced
``hg.share()`` and always wrote the ``shared`` requirement. And as far
as I can tell, every revision of ``hg.share()`` since has written
either the ``shared`` or ``relshared`` requirement. So I'm pretty
sure we don't need to maintain BC by always looking for and honoring
the ``.hg/sharedpath`` file even if a requirement isn't present.
.. bc::
A repository will no longer use shared storage if it has a
``.hg/sharedpath`` file but no entry in ``.hg/requires`` saying it
is shared.
This change should not have any end-user impact, as all shared
repos should have a ``.hg/requires`` file indicating this.
Differential Revision: https://phab.mercurial-scm.org/D4573
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 13:10:45 -0700] rev 39696
localrepo: document and test bug around opening shared repos
As part of refactoring this code, I realized that we don't
validate the requirements of a shared repository. This commit
documents that next to the requirements validation code and adds a
test demonstrating the buggy behavior.
I'm not sure if I'll fix this. But it is definitely a bug that
users could encounter, as LFS, narrow, and potentially other
extensions dynamically add requirements on first use. One part
of this I'm not sure about is how to handle loading the .hg/hgrc
of the shared repo. We need to do that in order to load extensions.
But we don't want that repo's hgrc to overwrite the current repo's.
Differential Revision: https://phab.mercurial-scm.org/D4572
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 15:03:17 -0700] rev 39695
localrepo: move requirements reasonability testing to own function
Just because we know how to handle each listed requirement doesn't
mean that set of requirements is reasonable.
This commit introduces an extension-wrappable function to validate
that a set of requirements makes sense.
We could combine this with ensurerequirementsrecognized(). But I think
having a line between basic membership testing and compatibility
checking is more powerful as it will help differentiate between
missing support and buggy behavior.
Differential Revision: https://phab.mercurial-scm.org/D4571
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 15:47:24 -0700] rev 39694
statichttprepo: use new functions for requirements validation
The new code in localrepo for requirements gathering and validation
is more robust than scmutil.readrequires(). Let's port statichttprepo
to it.
Since scmutil.readrequires() is no longer used, it has been removed.
It is possible extensions were monkeypatching this to supplement the
set of supported requirements. But the proper way to do that is to
register a featuresetupfuncs. I'm comfortable forcing the API break
because featuresetupfuncs is more robust and has been supported for
a while.
.. api::
``scmutil.readrequires()`` has been removed.
Use ``localrepo.featuresetupfuncs`` to register new repository
requirements.
Use ``localrepo.ensurerequirementsrecognized()`` to validate them.
Differential Revision: https://phab.mercurial-scm.org/D4570
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 14:54:17 -0700] rev 39693
localrepo: validate supported requirements in makelocalrepository()
This should be a glorified code move. I did take the opportunity to
refactor things. We now have a separate function for gathering
requirements and one for validating them.
I also mode cosmetic changes to the code, such as not using
abbreviations and using a set instead of list to model missing
requirements.
Differential Revision: https://phab.mercurial-scm.org/D4569
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 14:45:52 -0700] rev 39692
localrepo: read requirements file in makelocalrepository()
Previously, scmutil.readrequires() loaded the requirements file
and validated its content against what was supported.
Requirements translate to repository features and are critical to
our plans to dynamically create local repository types. So, we must
load them in makelocalrepository() before a repository instance is
constructed.
This commit moves the reading of the .hg/requires file to
makelocalrepository(). Because scmutil.readrequires() was performing
I/O and validation, we inlined the validation into
localrepository.__init__ and removed scmutil.readrequires().
I plan to remove scmutil.readrequires() in a future commit (we can't
do it now because statichttprepo uses it).
Differential Revision: https://phab.mercurial-scm.org/D4568
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 12:36:07 -0700] rev 39691
localrepo: check for .hg/ directory in makelocalrepository()
As part of this, we move the check to before .hg/hgrc is loaded,
as it makes sense to check for the directory before attempting to
open a file in it.
Differential Revision: https://phab.mercurial-scm.org/D4567
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 11:44:57 -0700] rev 39690
localrepo: load extensions in makelocalrepository()
Behavior does change subtly.
First, we now load the hgrc before optionally setting up the vfs ward.
That's fine: the vfs ward is for debugging and we know we won't hit it
when reading .hg/hgrc. If the loaded extension were performing repo/vfs
I/O, then we'd be worried. But extensions don't have access to the
repo object that loaded them when they are loaded. Unless they are
doing stack walking as part of module loading (which would be crazy),
they shouldn't have access to the repo that incurred their load.
Second, we now load extensions outside of the try..except IOError
block. Previously, if loading an extension raised IOError, it would
be silently ignored. I'm pretty sure the IOError is there for missing
.hgrc files and should never have been ignored for issues loading
extensions. I don't think this matters in reality because extension
loading traps I/O errors.
Differential Revision: https://phab.mercurial-scm.org/D4566
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 11:34:02 -0700] rev 39689
localrepo: copy ui in makelocalrepository()
We will want to load the .hg/hgrc file from makelocalrepository() so
we can consult its options as part of deriving the repository type.
This means we need to create our ui instance copy in that function.
Differential Revision: https://phab.mercurial-scm.org/D4565
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 11:31:14 -0700] rev 39688
localrepo: move some vfs initialization out of __init__
In order to make repository types more dynamic, we'll need to move the
logic for determining repository behavior out of
localrepository.__init__ so we can influence behavior before the type
is instantiated.
This commit starts that process by moving working directory and .hg/
vfs initialization to our new standalone function for instantiating
local repositories.
Aside from API changes, behavior should be fully backwards compatible.
.. api::
localrepository.__init__ now does less work and accepts new args
Use ``hg.repository()``, ``localrepo.instance()``, or
``localrepo.makelocalrepository()`` to obtain a new local repository
instance instead of calling the ``localrepository`` constructor
directly.
Differential Revision: https://phab.mercurial-scm.org/D4564
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 11:02:16 -0700] rev 39687
localrepo: create new function for instantiating a local repo object
Today, there is a single local repository class - localrepository. Its
__init__ is responsible for loading the .hg/requires file and taking
different actions depending on what is present.
In addition, extensions may define a "reposetup" function that
monkeypatches constructed repository instances, often by implementing
a derived type and changing the __class__ of the repo instance.
Work around alternate storage backends and partial clone has made it
clear to me that shoehorning all this logic into __init__ and operating
on an existing instance is too convoluted. For example, localrepository
assumes revlog storage and swapping in non-revlog storage requires
overriding e.g. file() to return something that isn't a revlog. I've
authored various patches that either:
a) teach various methods (like file()) about different states and
taking the appropriate code path at run-time
b) create methods/attributes/callables used for instantiating things
and populating these in __init__
"a" incurs run-time performance penalties and makes code more
complicated since various functions have a bunch of "if storage is X"
branches.
"b" makes localrepository quickly explode in complexity.
My plan for tackling this problem is to make the local repository type
more dynamic. Instead of a static localrepository class/type that
supports all of the local repository configurations (revlogs vs other,
revlogs with ellipsis, revlog v1 versus revlog v2, etc), we'll
dynamically construct a type providing the implementations that are
needed for the repository on disk, derived from the .hg/requires file
and configuration options. The constructed repository type will be
specialized and methods won't need to be taught about different
implementations nor overloaded.
We may also leverage this functionality for building types that don't
implement all attributes. For example, the "intents" feature allows
commands to declare that they are read only. By dynamically
constructing a repository type, we could return a repository instance
with no attributes related to mutating the repository. This could
include things like a "changelog" property implementation that doesn't
check whether it needs to invalidate the hidden revisions set on every
access.
This commit establishes a function for building a local repository
instance. Future commits will start moving functionality from
localrepository.__init__ to this function. Then we'll start dynamically
changing the returned type depending on options that are present.
This change may seem radical. But it should be fully compatible with
the reposetup() model - at least for now.
Differential Revision: https://phab.mercurial-scm.org/D4563
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:29:12 -0700] rev 39686
transaction: make entries a private attribute (API)
This attribute is tracking changes to append-only files. It is
an implementation detail and should not be exposed as part of
the public interface.
But code in repair was accessing it, so it seemingly does belong
as part of the public API. But that code in repair is making
assumptions about how storage works and is grossly wrong when
alternate storage backends are in play. We'll need some kind of
"strip" API at the storage layer that knows how to handle things
in a storage-agnostic manner. I don't think accessing a private
attribute on the transaction is any worse than what this code
is already doing. So I'm fine with violating the abstraction for
transactions.
And with this change, all per-instance attributes on transaction
have been made private except for "changes" and "hookargs." Both
are used by multiple consumers and look like they need to be
part of the public interface.
.. api::
Various attributes of ``transaction.transaction`` are now ``_``
prefixed to indicate they shouldn't be used by external
consumers.
Differential Revision: https://phab.mercurial-scm.org/D4634
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:19:55 -0700] rev 39685
transaction: make names a private attribute
This is used to report the transaction name in __repr__. It is
very obviously an implementation detail and doesn't need to be
exposed as part of the public interface. So mark it as private.
Differential Revision: https://phab.mercurial-scm.org/D4633
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:13:38 -0700] rev 39684
transaction: make map a private attribute
This is used to track which files are modified. It is an
implementation detail of current transactions and doesn't need
to be exposed to the public interface. So mark it as private.
Differential Revision: https://phab.mercurial-scm.org/D4632
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:11:25 -0700] rev 39683
transaction: make report a private attribute
This is a callable used for logging. It isn't used outside the
transaction code. It doesn't need to be part of the public interface.
Let's mark it as private.
Differential Revision: https://phab.mercurial-scm.org/D4631
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:08:02 -0700] rev 39682
transaction: make opener a private attribute
The VFS instance is an implementation detail of the transaction
and doesn't belong as part of the public interface. So mark it as
private.
Differential Revision: https://phab.mercurial-scm.org/D4630
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:04:52 -0700] rev 39681
transaction: make after a private attribute
This is another callable that is passed in at __init__ time. It
doesn't need to be part of the public interface.
Differential Revision: https://phab.mercurial-scm.org/D4629
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:02:53 -0700] rev 39680
transaction: make checkambigfiles a private attribute
This holds instance state that is passed in at __init__ time. It
doesn't need to be exposed as part of the public interface.
Differential Revision: https://phab.mercurial-scm.org/D4628
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:01:22 -0700] rev 39679
transaction: make validator a private attribute
This is similar to releasefn. It holds state that doesn't need to be
exposed as part of the public interface.
Differential Revision: https://phab.mercurial-scm.org/D4627
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 16:00:09 -0700] rev 39678
transaction: make releasefn a private attribute
This is a handle on a callable that is called when the journal
is closed. The value is specified at __init__ time. It doesn't
need to be exposed on the public interface. So mark it as private.
Differential Revision: https://phab.mercurial-scm.org/D4626
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 15:57:32 -0700] rev 39677
transaction: make file a private attribute
This holds a file handle for the journal file. This file handle
should not be touched outside the journal class and doesn't
belong on the public interface.
Differential Revision: https://phab.mercurial-scm.org/D4625
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 15:55:57 -0700] rev 39676
transaction: make journal a private attribute
This attribute tracks the name of the journal file. It is an
implementation detail of the current transaction and therefore
shouldn't be exposed as part of the interface. Let's mark it as
private.
Differential Revision: https://phab.mercurial-scm.org/D4624
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 15:52:59 -0700] rev 39675
transaction: make undoname a private attribute
This attribute tracks the file pattern to use for undo files.
It is an implementation detail of the current transaction semantics
and doesn't need to be part of the future transaction interface. So
mark it as private.
Differential Revision: https://phab.mercurial-scm.org/D4623
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 17 Sep 2018 15:51:19 -0700] rev 39674
transaction: make count and usages private attributes
I want to formalize the interface for transactions. As part of
doing that, let's take the opportunity to make some attributes
non-public.
"count" and "usages" track how many times the transaction has
been opened/nested/closed/released. This is internal state and
doesn't need to be part of the public API.
Differential Revision: https://phab.mercurial-scm.org/D4622
Pulkit Goyal <pulkit@yandex-team.ru> [Tue, 18 Sep 2018 13:41:16 +0300] rev 39673
narrow: don't send the changelog information when widening without ellipses
When we widen anon-ellipses narrow copy, the server sends the changelog
information of all the changesets. The code was copied from ellipses case and in
ellipses cases, it's required to send the new changelog data.
But in non-ellipses cases, we don't need to send the changelog data as we will
have all the changesets locally.
Before this patch, there was a overhead of ~8-10 mins on each widening call
because of all the changelog information being pulled and being applied. After
this patch, we no more pull the changelog information. So this patch can save ~5
mins on Mozilla repo on each widening and more on repos which have more
changesets.
When we apply an empty changelog from changegroup, there is a devel-warn. This
patch kind of hacks to silence that devel-warn.
Differential Revision: https://phab.mercurial-scm.org/D4639
Pulkit Goyal <pulkit@yandex-team.ru> [Mon, 17 Sep 2018 21:41:34 +0300] rev 39672
changegroup: add functionality to skip adding changelog data to changegroup
In narrow extension, when we have a non-ellipses narrow working copy and we
extend it, we pull all the changelog data again and the client tries to reapply
all that changelog data.
While downloading millions of changeset data is still not very expensive but
applying them on the client side is very expensive and takes ~10 minutes. These
10 minutes are added to every `hg tracked --addinclude <>` call and extending
a narrow copy becomes very slow.
This patch adds a new changelog argument to cgpacker.generate() fn. If the
changelog argument is set to False, we won't yield the changelog data. We still
have to iterate over the deltas returned by _generatechangelog() because that's
a generator and builds the data for clstate variable which is required for
calculating manifests and filelogs.
Differential Revision: https://phab.mercurial-scm.org/D4638
Pulkit Goyal <pulkit@yandex-team.ru> [Tue, 18 Sep 2018 10:46:19 -0700] rev 39671
tests: add debug output in test-narrow-widen-no-ellipsis.t
This will help us in understanding the upcoming patches better.
Differential Revision: https://phab.mercurial-scm.org/D4637