Mon, 24 Sep 2018 14:31:31 -0700 storageutil: move metadata parsing and packing from revlog (API)
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 14:31:31 -0700] rev 39878
storageutil: move metadata parsing and packing from revlog (API) Parsing and writing of revision text metadata is likely identical across storage backends. Let's move the code out of revlog so we don't need to import the revlog module in order to use it. Differential Revision: https://phab.mercurial-scm.org/D4754
Mon, 24 Sep 2018 14:23:54 -0700 storageutil: new module for storage primitives (API)
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 14:23:54 -0700] rev 39877
storageutil: new module for storage primitives (API) There will exist common code between storage backends. It would be nice to have a central place to put that code. This commit attempts to create that place by creating the "storageutil" module. The first thing we move is revlog.hash(), which is the function for computing the SHA-1 hash of revision fulltext and parents. Differential Revision: https://phab.mercurial-scm.org/D4753
Mon, 24 Sep 2018 13:35:50 -0700 filelog: stop proxying deltaparent() (API)
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 13:35:50 -0700] rev 39876
filelog: stop proxying deltaparent() (API) deltaparent() obtains the revision number of the base revision a delta in storage is stored against. It is highly revlog-centric and may not apply to other storage backends. As a result, it doesn't belong on the generic file storage interface. This method/proxy is no longer used in core. The last consumer was probably changegroup code and went away with the transition to emitrevisions(). Differential Revision: https://phab.mercurial-scm.org/D4751
Mon, 24 Sep 2018 12:49:17 -0700 filelog: stop proxying rawsize() (API)
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 12:49:17 -0700] rev 39875
filelog: stop proxying rawsize() (API) This method is no longer used by external consumers. The API is quite low-level and is effectively len(revision(raw=True)). I don't see a compelling reason to keep it around. Let's drop the API and make the file storage interface simpler. Differential Revision: https://phab.mercurial-scm.org/D4750
Mon, 24 Sep 2018 12:42:03 -0700 filelog: stop proxying "opener" (API)
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 12:42:03 -0700] rev 39874
filelog: stop proxying "opener" (API) The last consumer of it in upgrade code was removed as part of the previous commit. This attribute is revlog specific (because it assumes the existence of a vfs for performing I/O on tracked file data) and therefore isn't appropriate for a generic storage interface. So nuke it. Differential Revision: https://phab.mercurial-scm.org/D4749
Mon, 24 Sep 2018 11:16:33 -0700 filelog: stop proxying flags() (API)
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 11:16:33 -0700] rev 39873
filelog: stop proxying flags() (API) Per-revision storage flags are kinda a revlog-centric API. (Except for the fact that changegroup uses the same integer flags as revlog does and there's minimal verification that the server's flags map to the client's storage flags - but that's another problem.) The last user of flags() was in verify.py and that code was just moved into revlog.py and is accessed behind the verifyintegrity() file storage API. Since there are no more consumers, let's drop the proxy and remove the method from the file storage interface. This commit only drops the dedicated API for reading a single revision's storage flags: we still support reading and writing flags through the bulk data retrieval and add revision APIs. And since changegroups encode revlog integer flags over the wire, we'll always need to support flags at some level. The removal of individual storage flags may be too premature. But since flags() is now unused, I'd like to see how far we can get without that dedicated API - especially since it uses revision numbers instead of nodes. Differential Revision: https://phab.mercurial-scm.org/D4746
Mon, 24 Sep 2018 11:27:47 -0700 revlog: move revision verification out of verify
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 11:27:47 -0700] rev 39872
revlog: move revision verification out of verify File revision verification is performing low-level checks of file storage, namely that flags are appropriate and revision data can be resolved. Since these checks are somewhat revlog-specific and may not be appropriate for alternate storage backends, this commit moves those checks from verify.py to revlog.py. Because we're now emitting warnings/errors that apply to specific revisions, we taught the iverifyproblem interface to expose the problematic node and to report this node in verify output. This was necessary to prevent unwanted test changes. After this change, revlog.verifyintegrity() and file verify code in verify.py both iterate over revisions and resolve their fulltext. But they do so in separate loops. (verify.py needs to resolve fulltexts as part of calling renamed() - at least when using revlogs.) This should add overhead. But on the mozilla-unified repo: $ hg verify before: time: real 700.640 secs (user 585.520+0.000 sys 23.480+0.000) after: time: real 682.380 secs (user 570.370+0.000 sys 22.240+0.000) I'm not sure what's going on. Maybe avoiding the filelog attribute proxies shaved off enough time to offset the losses? Maybe fulltext resolution has less overhead than I thought? I've left a comment indicating the potential for optimization. But because it doesn't produce a performance regression on a large repository, I'm not going to worry about it. Differential Revision: https://phab.mercurial-scm.org/D4745
Wed, 26 Sep 2018 12:06:44 -0700 tests: de-flake test-narrow-debugrebuilddirstate.t
Martin von Zweigbergk <martinvonz@google.com> [Wed, 26 Sep 2018 12:06:44 -0700] rev 39871
tests: de-flake test-narrow-debugrebuilddirstate.t If the dirstate gets written much later (usually 1-2 s, depending on FS) than the working copy file (there's only one), then the `hg debugdirstate` command will include a timestamp. There's nothing wrong with that, so we should just allow it. Differential Revision: https://phab.mercurial-scm.org/D4758
Mon, 24 Sep 2018 12:39:34 -0700 upgrade: use storageinfo() for obtaining storage metadata
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 12:39:34 -0700] rev 39870
upgrade: use storageinfo() for obtaining storage metadata Let's switch to our new API for obtaining information about storage. This eliminates the last consumer of rawsize() and the opener proxy from the file storage interface! Differential Revision: https://phab.mercurial-scm.org/D4748
Mon, 24 Sep 2018 11:56:48 -0700 revlog: add method for obtaining storage info (API)
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 11:56:48 -0700] rev 39869
revlog: add method for obtaining storage info (API) We currently have a handful of methods on the file and manifest storage interfaces for obtaining metadata about storage. e.g. files() is used to obtain the files backing storage. rawsize() is to quickly compute the size of tracked revisions without resolving their fulltext. Code in upgrade and stream clone make heavy use of these methods. The existing APIs are generic and don't necessarily have the specialization that we need going forward. For example, files() doesn't distinguish between exclusive storage and shared storage. This makes stream clone difficult to implement when e.g. there may be a single file backing storage for multiple tracked paths. It also makes reporting difficult, as we don't know how many bytes are actually used by storage since we can't easily identify shared files. This commit implements a new method for obtaining storage metadata. It is designed to accept arguments specifying what metadata to request and to return a dict with those fields populated. We /could/ make each of these attributes a separate method. But this is a specialized API and I'm trying to avoid method bloat on the interfaces. There is also the possibility that certain callers will want to obtain multiple fields in different combinations and some backends may have performance issues obtaining all that data via separate method calls. Simple storage integration tests have been added. For now, we assume fields can't be "None" (ignoring the interface documentation). We can revisit this later. Differential Revision: https://phab.mercurial-scm.org/D4747
(0) -30000 -10000 -3000 -1000 -300 -100 -10 +10 +100 +300 +1000 +3000 +10000 tip