Thu, 05 Apr 2018 17:40:51 -0700 upgrade: sniff for filelog type
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 05 Apr 2018 17:40:51 -0700] rev 37444
upgrade: sniff for filelog type The upgrade code should never encounter a vanilla revlog instance: only changelog, manifestrevlog, and filelog should be seen. The previous code assumed !changelog & !manifestrevlog meant file data. So this change feels pretty safe. If nothing else, it will help tease out typing issues. Differential Revision: https://phab.mercurial-scm.org/D3152
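To illustrate the idea of sniffing for each concrete type instead of treating "everything else" as file data, here is a minimal sketch. The class names and the `storage_kind` helper are hypothetical stand-ins, not Mercurial's actual upgrade code.

```python
class Revlog:
    """Stand-in for a generic revlog (hypothetical, not Mercurial's class)."""


class Changelog(Revlog):
    pass


class ManifestRevlog(Revlog):
    pass


class Filelog(Revlog):
    pass


def storage_kind(rl):
    """Name the kind of storage object, refusing anything unexpected."""
    if isinstance(rl, Changelog):
        return "changelog"
    if isinstance(rl, ManifestRevlog):
        return "manifest"
    if isinstance(rl, Filelog):
        return "filelog"
    # A vanilla revlog should never reach this point, so fail loudly
    # rather than silently assuming it holds file data.
    raise TypeError("unexpected storage type: %s" % type(rl).__name__)
```

Checking all three types explicitly means an unexpected object surfaces as an error instead of being miscounted as a filelog.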
Thu, 05 Apr 2018 16:31:45 -0700 revlog: move censor logic into main revlog class
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 05 Apr 2018 16:31:45 -0700] rev 37443
revlog: move censor logic into main revlog class

Previously, the revlog class implemented dummy methods for various censor-related functionality. Revision censoring was (and will continue to be) only possible on filelog instances. So filelog implemented these methods to perform something reasonable.

A problem with implementing censoring on filelog is that it assumes filelog is a revlog. Upcoming work to formalize the filelog interface will make this not true.

Furthermore, the censoring logic is security-sensitive. I think action-at-a-distance with custom implementation of core revlog APIs in derived classes is a bit dangerous. I think at a minimum the censor logic should live in revlog.py.

I was tempted to create a "censored revlog" class that basically pulled these methods out of filelog. But, I wasn't a huge fan of overriding core methods in child classes. A reason to do that would be performance. However, the censoring code only comes into play when:

* hash verification fails
* deltas are generated
* deltas from changegroups are applied

The new code is conditional on an instance attribute. So the overhead for running the censored code when the revlog isn't censorable is an attribute lookup. All of these operations are at least an order of magnitude slower than a Python attribute lookup. So there shouldn't be a performance concern.

Differential Revision: https://phab.mercurial-scm.org/D3151
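The "conditional on an instance attribute" pattern might look roughly like the sketch below. Everything here (`SketchRevlog`, `mark_censored`, the exception names, and tracking censored nodes in a set) is an assumption made for illustration; it is not the code that landed in revlog.py.

```python
import hashlib


class CensoredNodeError(Exception):
    """Raised when a revision's content was deliberately censored."""


class IntegrityError(Exception):
    """Raised when stored content does not match its hash."""


class SketchRevlog:
    def __init__(self, censorable=False):
        # Only file storage would set this to True.
        self._censorable = censorable
        self._censored = set()  # hypothetical bookkeeping of censored nodes

    def mark_censored(self, node):
        self._censored.add(node)

    def checkhash(self, text, node):
        if hashlib.sha1(text).hexdigest() == node:
            return True
        # The censor check only runs on the slow failure path, and only
        # costs one attribute lookup when the revlog is not censorable.
        if self._censorable and node in self._censored:
            raise CensoredNodeError(node)
        raise IntegrityError(node)
```

With this shape, a non-censorable instance never distinguishes censored content from corruption, which keeps the censor behavior in one class rather than in overriding subclasses.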
Thu, 05 Apr 2018 18:22:35 -0700 revlog: move parsemeta() and packmeta() from filelog (API)
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 05 Apr 2018 18:22:35 -0700] rev 37442
revlog: move parsemeta() and packmeta() from filelog (API)

filelog.parsemeta() and filelog.packmeta() are used to decode and encode metadata for file copies and censor. An upcoming commit will move the core logic for censoring revlogs into revlog.py. This would create a cycle between revlog.py and filelog.py. So we move these metadata functions to revlog.py.

.. api::

   filelog.parsemeta() and filelog.packmeta() have been moved to the revlog module.

Differential Revision: https://phab.mercurial-scm.org/D3150
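For readers unfamiliar with what these functions encode, the following is a simplified Python 3 sketch of the metadata framing: a `\x01\n` marker, `key: value` lines, a closing `\x01\n` marker, then the file text. The names `pack_meta` and `parse_meta` are deliberately different from the real functions, and the sketch assumes keys and values never contain the marker.

```python
MARKER = b"\x01\n"


def pack_meta(meta, text):
    """Prefix text with a metadata header built from a dict of bytes."""
    lines = b"".join(b"%s: %s\n" % (key, meta[key]) for key in sorted(meta))
    return MARKER + lines + MARKER + text


def parse_meta(data):
    """Return (metadata dict, offset where the file text starts).

    Returns (None, None) when data carries no metadata header.
    """
    if not data.startswith(MARKER):
        return None, None
    end = data.index(MARKER, len(MARKER))
    meta = {}
    for line in data[len(MARKER):end].splitlines():
        key, value = line.split(b": ", 1)
        meta[key] = value
    return meta, end + len(MARKER)
```

A round trip such as `parse_meta(pack_meta({b"copy": b"old.txt"}, b"hello"))` yields the original dict plus the offset at which `b"hello"` begins.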
Thu, 05 Apr 2018 15:18:23 -0700 filelog: declare that filelog implements a storage interface
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 05 Apr 2018 15:18:23 -0700] rev 37441
filelog: declare that filelog implements a storage interface Now that we have a declared interface, let's declare that filelog implements it. Tests have been added that confirm the object conforms to the interface. The existing interface checks verify there are no extra public attributes outside the declared interface. filelog has several extra attributes. So we added a mechanism to suppress this check. The goal is to modify the filelog class so we can drop this check. Differential Revision: https://phab.mercurial-scm.org/D3149
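The kind of conformance check described here, including a switch to suppress the "no extra public attributes" rule, can be sketched with the standard library alone. This is an illustrative stand-in: `check_conformance`, `FileStore`, and `DECLARED` are hypothetical, and Mercurial's tests rely on their own interface machinery.

```python
def check_conformance(obj, declared, allow_extra=False):
    """Compare an object's public attributes against a declared interface.

    Returns (missing, extra) as sorted lists of attribute names.
    """
    public = {name for name in dir(obj) if not name.startswith("_")}
    declared = set(declared)
    missing = sorted(declared - public)
    # allow_extra mirrors the suppression mechanism: extra public
    # attributes are tolerated until the class is cleaned up.
    extra = [] if allow_extra else sorted(public - declared)
    return missing, extra


class FileStore:
    """Hypothetical storage class with one attribute outside the interface."""

    def read(self, node):
        return b""

    def add(self, text):
        return 0

    def legacy_helper(self):
        return None


DECLARED = {"read", "add"}
```

Calling `check_conformance(FileStore(), DECLARED)` reports `legacy_helper` as extra, while passing `allow_extra=True` suppresses that report.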
Thu, 05 Apr 2018 15:09:41 -0700 repository: define existing interface for file storage
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 05 Apr 2018 15:09:41 -0700] rev 37440
repository: define existing interface for file storage

Now that we have mostly successfully implemented an alternate storage backend for files data, let's start to define the interface for it!

This commit takes the mostly-working interface as defined by the simple store repo and codifies it as the file storage interface. The interface has been split into its logical components:

* index metadata
* fulltext data
* mutation
* everything else

I don't consider the existing interface to be great. But it will help to have it more formally defined so we can start chipping away at refactoring it.

Differential Revision: https://phab.mercurial-scm.org/D3148
Thu, 05 Apr 2018 11:16:54 -0700 tests: run some largefiles and lfs tests with simple store
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 05 Apr 2018 11:16:54 -0700] rev 37439
tests: run some largefiles and lfs tests with simple store Now that the simple store handles flags properly, a handful of the largefiles and lfs tests pass! Differential Revision: https://phab.mercurial-scm.org/D3147
Wed, 04 Apr 2018 21:27:02 -0700 commands: don't violate storage abstractions in `manifest --all`
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 21:27:02 -0700] rev 37438
commands: don't violate storage abstractions in `manifest --all`

Previously, we asked the store to emit its data files. For modern repos, this would use fncache to resolve the set of files then would stat() each file. For my copy of the mozilla-unified repository, this took 3.3-10s depending on the state of my filesystem cache to render 449,790 items.

The previous behavior was a massive layering violation because it assumed tracked files would have specific filenames in specific directories. Alternate storage backends would violate this assumption.

The new behavior scans the changelog entries for the set of files changed by each commit. It aggregates them into a set and then sorts and prints the result. This reliably takes ~16.3s on my machine. ~80% of the time is spent in zlib decompression.

The performance regression is unfortunate. If we want to claw it back, we can create a proper storage API to query for the set of tracked files. I'm not opposed to doing that. But I'm in no hurry because I suspect ~0 people care about the performance of `hg manifest --all`.

.. perf::

   `hg manifest --all` is likely slower due to changing its implementation to respect storage interface boundaries. If you are impacted by this regression in a meaningful way, please make noise on the development mailing list and it can be dealt with.

Differential Revision: https://phab.mercurial-scm.org/D3119
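The new approach (collect the files named by each changelog entry into a set, then sort) can be sketched as follows. The entry layout, a dict with a `"files"` list, and the `tracked_files` name are assumptions for illustration only; they do not reflect the real changelog API.

```python
def tracked_files(changelog_entries):
    """Return the sorted set of every file named by any changelog entry."""
    seen = set()
    for entry in changelog_entries:
        # Each entry lists the files changed by that commit.
        seen.update(entry["files"])
    return sorted(seen)


entries = [
    {"files": ["b.txt", "a.txt"]},
    {"files": ["a.txt", "dir/c.txt"]},
]

for path in tracked_files(entries):
    print(path)
```

Nothing in this loop depends on filenames or directories inside the store, which is the point of the change; the cost is that every changelog entry has to be read.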
Wed, 04 Apr 2018 21:09:47 -0700 commands: document the layering violation in `manifest --all`
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 21:09:47 -0700] rev 37437
commands: document the layering violation in `manifest --all` This commit fixes the last test failures when using the simple store extension! It turns out that `hg manifest --all` locks the repo and scans for revlogs. This feature was added by 71938479eff9 in 2011. I am debating changing the behavior. But that can occur in another commit. As part of debugging this, I realized that test-manifest.t is the only meaningful tester of `hg manifest --all` and that test was improperly disabled when bundlerepos aren't supported. The test is testing manifest behavior, not whether you can `hg pull` from a bundle. So I changed the test to `hg unbundle` instead. FWIW, I wasted a non-trivial amount of time tracking down this failure. I thought the issue involved Git, which is why I refactored the test to be more deterministic. Never in my mind would I have guessed that code in `hg manifest` would scan revlogs. I should have looked there to begin with. Doh. Differential Revision: https://phab.mercurial-scm.org/D3118
Wed, 04 Apr 2018 19:17:22 -0700 simplestore: correctly implement flag processors
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 19:17:22 -0700] rev 37436
simplestore: correctly implement flag processors There were a couple of bugs around the implementation of flags processing with the simple store. After these changes, test-flagprocessor.t now passes! test-flagprocessor.t was also updated to include explicit test coverage that pushed data is as expected on the server. The test extension used by test-flagprocessor.t has been updated so it monkeypatches the object returned from repo.file() instead of monkeypatching filelog.filelog. This allows it to work with extensions that return custom types from repo.file(). The monkeypatching is rather hacky and probably is performance prohibitive for real repos. We should probably come up with a better mechanism for registering flag processors so monkeypatching isn't needed. Differential Revision: https://phab.mercurial-scm.org/D3116
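The difference between patching a class and patching whatever `repo.file()` returns can be shown with a small sketch. All names below (`PlainStore`, `OtherStore`, `SketchRepo`, `install_default_flags`) are hypothetical, and the flag handling is a toy rather than real flag processing.

```python
class PlainStore:
    """Hypothetical file storage object."""

    def __init__(self, path):
        self.path = path
        self.revisions = []

    def addrevision(self, text, flags=0):
        self.revisions.append((text, flags))
        return len(self.revisions) - 1


class OtherStore(PlainStore):
    """Stands in for a custom type returned by an extension."""


class SketchRepo:
    def __init__(self, store_type):
        self._store_type = store_type

    def file(self, path):
        return self._store_type(path)


def install_default_flags(repo, extra):
    """Wrap repo.file() so every returned object ORs `extra` into flags."""
    orig_file = repo.file

    def file(path):
        store = orig_file(path)
        orig_add = store.addrevision

        def addrevision(text, flags=0):
            return orig_add(text, flags | extra)

        # Patch the instance, not its class, so any store type works.
        store.addrevision = addrevision
        return store

    repo.file = file
```

Because the wrapper runs on every `repo.file()` call, it also shows why this style of patching is likely too costly for real repositories.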
Wed, 04 Apr 2018 17:40:09 -0700 tests: `hg init` after resetting HGRCPATH
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 17:40:09 -0700] rev 37435
tests: `hg init` after resetting HGRCPATH Otherwise extensions loaded via --extra-config-opt could prevent access to the repo by introducing entries in the requirements file. This does mean that custom extensions loaded in this way won't impact this test. I'm fine with that. Differential Revision: https://phab.mercurial-scm.org/D3115
Wed, 04 Apr 2018 17:33:59 -0700 tests: work around potential repo incompatibility
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 17:33:59 -0700] rev 37434
tests: work around potential repo incompatibility test-run-tests.t invokes run-tests.py. But custom extensions providing new repo requirements may be in play and may not get inherited by the new run-tests.py. We ensure our repo is created with a vanilla config to mitigate extension-caused badness. Differential Revision: https://phab.mercurial-scm.org/D3114
Wed, 04 Apr 2018 17:29:02 -0700 tests: disable test-keyword.t with simple store
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 17:29:02 -0700] rev 37433
tests: disable test-keyword.t with simple store The keyword extension is hooking into repo.file() and defining its own filelog class. It will likely require a more formal storage interface before keywords are usable with alternate storage backends. Differential Revision: https://phab.mercurial-scm.org/D3113
Wed, 04 Apr 2018 17:12:00 -0700 tests: conditionalize test-treemanifest.t
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 17:12:00 -0700] rev 37432
tests: conditionalize test-treemanifest.t Parts of the test were assuming the use of revlogs with fnstore path encoding. Other parts of the test assumed we could create repos with different store encodings and that stream clone bundles worked. Make all of this conditional on running a revlog repo. Differential Revision: https://phab.mercurial-scm.org/D3112
Wed, 04 Apr 2018 17:02:54 -0700 tests: use unbundle in test-symlink-os-yes-fs-no.py
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 17:02:54 -0700] rev 37431
tests: use unbundle in test-symlink-os-yes-fs-no.py The test is verifying symlink behavior with regard to working directory operations. (It should probably be rewritten as a .t test: it was initially authored in 2009, which may have predated some test harness features allowing us to implement it as a .t test.) How it pulls bundle data into a repo is not relevant. So we can switch from pull to unbundle so we can support environments where bundlerepos don't work. Differential Revision: https://phab.mercurial-scm.org/D3111
Wed, 04 Apr 2018 16:49:22 -0700 tests: disable `hg clone --stream` test with simple store
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 04 Apr 2018 16:49:22 -0700] rev 37430
tests: disable `hg clone --stream` test with simple store We mass disabled stream clone tests in a previous commit. Looks like one was missed. Differential Revision: https://phab.mercurial-scm.org/D3110