Sat, 01 Jul 2017 20:51:19 -0700 localrepo: cache types for filtered repos (issue5043)
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 01 Jul 2017 20:51:19 -0700] rev 33389
localrepo: cache types for filtered repos (issue5043) Python introduces a reference cycle on dynamically created types via __mro__, making them very easy to leak. See https://bugs.python.org/issue17950. Previously, repo.filtered() created a type on every invocation. Long-running processes (like `hg convert`) could call this function thousands of times, leading to a steady memory leak. Since we're unable to stop the leak because this is a bug in Python, the next best thing is to contain it. This patch adds a cache of the dynamically generated repoview/filter types on the localrepo object. Since we only generate each type once, we cap the amount of memory that can leak to something reasonable. After this change, `hg convert` no longer leaks memory on every revision. The process will likely grow memory usage over time due to e.g. larger manifests. But there are no leaks.
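The containment idea can be illustrated with a minimal sketch. The class names, the `_filtertypes` attribute and the `filteredtype()` helper below are hypothetical stand-ins, not Mercurial's actual code; the point is only that each dynamically created type is built once per filter name and then reused.

```python
class RepoView(object):
    """Hypothetical stand-in for a filtering mixin."""
    filtername = None


class Repo(object):
    def __init__(self):
        # Maps filter name -> dynamically created class (assumed attribute name).
        self._filtertypes = {}

    def filteredtype(self, name):
        # Create the type only on first use; later calls reuse the cached one,
        # so at most one type per filter name can ever be leaked.
        cls = self._filtertypes.get(name)
        if cls is None:
            cls = type('filtered' + name, (RepoView, type(self)),
                       {'filtername': name})
            self._filtertypes[name] = cls
        return cls


repo = Repo()
first = repo.filteredtype('visible')
second = repo.filteredtype('visible')
print(first is second)
```

With this shape, calling the helper thousands of times for the same filter name adds nothing to the cache after the first call.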
Tue, 11 Jul 2017 02:10:04 +0900 convert: transcode CVS log messages by specified encoding (issue5597)
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Tue, 11 Jul 2017 02:10:04 +0900] rev 33388
convert: transcode CVS log messages by specified encoding (issue5597) Converting from CVS to Mercurial assumes that CVS log messages in "cvs rlog" output are encoded in UTF-8 (or basic Latin-1). In practice, however, cvs itself is usually unaware of the encoding of log messages. Therefore, if any commit has a log message encoded in something other than UTF-8, the log message of the corresponding revision in the converted repository will be broken. To avoid such broken log messages, this patch transcodes CVS log messages using the encoding specified via the "convert.cvsps.logencoding" configuration. This patch accepts multiple encodings for convenience, because a repository with mixed encodings occurs easily. For example, UTF-8 (recent POSIX), cp932 (Windows), and EUC-JP (legacy POSIX) are well-known encodings for Japanese.
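One plausible way to handle several candidate encodings is to try them in order and keep the first that decodes cleanly. The sketch below assumes that strategy; the `transcode()` helper and its fallback are illustrative and are not taken from the convert extension.

```python
def transcode(raw, encodings):
    """Decode raw bytes with the first encoding in the list that succeeds."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Assumed fallback for this sketch: keep the message, replacing bad bytes.
    return raw.decode('utf-8', 'replace')


message = u'\u3053\u3093\u306b\u3061\u306f'  # a Japanese greeting
candidates = ['utf-8', 'cp932']

fromutf8 = transcode(message.encode('utf-8'), candidates)
fromcp932 = transcode(message.encode('cp932'), candidates)
print(fromutf8 == message, fromcp932 == message)
```

Order matters with this approach: some byte sequences are valid in more than one legacy encoding, so the stricter encodings should usually come first.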
Mon, 10 Jul 2017 23:09:52 +0900 fsmonitor: execute setup procedures only if dirstate is already instantiated
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Mon, 10 Jul 2017 23:09:52 +0900] rev 33387
fsmonitor: execute setup procedures only if dirstate is already instantiated Before this patch, reposetup() of fsmonitor executed setup procedures for dirstate even if it had not yet been instantiated. On the other hand, dirstate might intentionally be instantiated before reposetup() (prefilling by chg, for example; see bf3af0eced44 for details). In that case, simply discarding the already instantiated dirstate in reposetup() causes problems. To resolve both issues, this patch executes setup procedures only if dirstate is already instantiated. This patch also removes "del repo.unfiltered().__dict__['dirstate']", because that cleanup is the responsibility of whichever code path instantiates dirstate before reposetup(). After this patch, using localrepo.isfilecached() should avoid creating the corresponding entry in repo.unfiltered().__dict__.
Mon, 10 Jul 2017 23:09:52 +0900 fsmonitor: centralize setup procedures for dirstate
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Mon, 10 Jul 2017 23:09:52 +0900] rev 33386
fsmonitor: centralize setup procedures for dirstate
Mon, 10 Jul 2017 23:09:52 +0900 fsmonitor: avoid needless instantiation of dirstate
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Mon, 10 Jul 2017 23:09:52 +0900] rev 33385
fsmonitor: avoid needless instantiation of dirstate Using repo.local() instead of util.safehasattr(repo, 'dirstate') also avoids executing setup procedures for remote repositories (including statichttprepo). This is why this patch also removes part of the subsequent comment, as well as the try/except for AttributeError when accessing repo.wvfs.
Mon, 10 Jul 2017 23:09:51 +0900 journal: use wrapfilecache instead of wrapfunction on func of filecache
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Mon, 10 Jul 2017 23:09:51 +0900] rev 33384
journal: use wrapfilecache instead of wrapfunction on func of filecache wrapfilecache() on a filecache-ed property works more strictly than wrapfunction() applied directly to func() of filecache.
Mon, 10 Jul 2017 23:09:51 +0900 journal: execute setup procedures for already instantiated dirstate
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Mon, 10 Jul 2017 23:09:51 +0900] rev 33383
journal: execute setup procedures for already instantiated dirstate If dirstate is instantiated before reposetup() of the journal extension, it doesn't have the "journalstorage" property, even if it is instantiated via wrapdirstate() wrapping repo.dirstate(), because wrapdirstate() behaves the same as the original function until reposetup() marks the repo as "journal"-ing. This issue can be reproduced by running test-journal.t or test-journal-share.t with fsmonitor-run-tests.py. On the other hand, just discarding an already instantiated dirstate in reposetup() prevents chg from filling dirstate before reposetup() (see bf3af0eced44 for details). Therefore, this patch explicitly executes setup procedures for an already instantiated dirstate in reposetup(). To centralize setup procedures for dirstate, this patch also factors them out of wrapdirstate().
Mon, 10 Jul 2017 23:09:51 +0900 localrepo: add isfilecached to check filecache-ed property is already cached
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Mon, 10 Jul 2017 23:09:51 +0900] rev 33382
localrepo: add isfilecached to check filecache-ed property is already cached isfilecached() encapsulates the internal implementation of filecache-ed properties. A check such as "name in repo.unfiltered().__dict__" can't be used for this purpose, because the corresponding entry in __dict__ might be discarded by repo.invalidate(), repo.invalidatedirstate(), and similar calls (fsmonitor does so, for example). This patch makes isfilecached() return not only whether the filecache-ed property is already cached, but also the cached value (or None). This avoids a subsequent access to the cached object via "repo.NAME", which would prevent the main Mercurial procedure from validating the cache after reposetup().
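A rough sketch of the described contract follows. The `fakerepo` and `cacheentry` classes, the `_filecache` attribute name and the (value, flag) ordering of the returned pair are assumptions made for illustration; only the idea of returning both the cached value and a "was it cached" flag comes from the description above.

```python
class cacheentry(object):
    """Hypothetical cache record holding the cached object."""
    def __init__(self, obj):
        self.obj = obj


class fakerepo(object):
    """Hypothetical repo whose cache lives apart from __dict__."""
    def __init__(self):
        self._filecache = {}

    def unfiltered(self):
        return self


def isfilecached(repo, name):
    # Consult the dedicated cache rather than __dict__, which may have been
    # cleared by an invalidation even though the cache entry still exists.
    entry = repo.unfiltered()._filecache.get(name)
    if entry is None:
        return None, False
    return entry.obj, True


repo = fakerepo()
before = isfilecached(repo, 'dirstate')
repo._filecache['dirstate'] = cacheentry('the-dirstate')
after = isfilecached(repo, 'dirstate')
print(before, after)
```

An extension's reposetup() could then act on the returned object directly instead of touching the repo attribute and instantiating it as a side effect.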
Mon, 10 Jul 2017 21:09:46 -0700 sslutil: check for missing certificate and key files (issue5598)
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 10 Jul 2017 21:09:46 -0700] rev 33381
sslutil: check for missing certificate and key files (issue5598) Currently, sslutil._hostsettings() performs validation that web.cacerts exists. However, client certificates are passed in to the function and not all callers may validate them. This includes httpconnection.readauthforuri(), which loads the [auth] section. If a missing file is specified, the ssl module will raise a generic IOException. And, it doesn't even give us the courtesy of telling us which file is missing! Mercurial then prints a generic "abort: No such file or directory" (or similar) error, leaving users to scratch their head as to what file is missing. This commit introduces explicit validation of all paths passed as arguments to wrapsocket() and wrapserversocket(). Any missing file is alerted about explicitly. We should probably catch missing files earlier - as part of loading the [auth] section. However, I think the sslutil functions should check for file presence regardless of what callers do because that's the only way to be sure that missing files are always detected.
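The explicit validation might look roughly like the sketch below. The `Abort` class, the `checkfiles()` helper and the message wording are hypothetical; the real sslutil functions and their error text may differ.

```python
import os


class Abort(Exception):
    """Stand-in for a user-facing abort error (hypothetical here)."""


def checkfiles(**paths):
    """Raise Abort naming the first configured path that does not exist.

    Keyword names double as human-readable labels; None means "not set".
    """
    for label, path in sorted(paths.items()):
        if path is not None and not os.path.exists(path):
            raise Abort('%s does not exist: %s' % (label, path))


try:
    checkfiles(certfile='/nonexistent/client.pem', keyfile=None)
except Abort as err:
    print('abort: %s' % err)
```

Running the check before handing paths to the ssl module means the error names the offending file instead of surfacing a generic "No such file or directory".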
Fri, 07 Jul 2017 08:55:12 -0700 match: override matchfn instead of __call__ for consistency
Martin von Zweigbergk <martinvonz@google.com> [Fri, 07 Jul 2017 08:55:12 -0700] rev 33380
match: override matchfn instead of __call__ for consistency The matchers that were recently moved into core from the sparse extension override __call__, while the previously existing matchers override matchfn. Let's switch to the latter for consistency.
Sun, 09 Jul 2017 17:02:09 -0700 match: express anypats(), not prefix(), in terms of the others
Martin von Zweigbergk <martinvonz@google.com> [Sun, 09 Jul 2017 17:02:09 -0700] rev 33379
match: express anypats(), not prefix(), in terms of the others When I added prefix() in 9789b4a7c595 (match: introduce boolean prefix() method, 2014-10-28), we already had always(), isexact(), and anypats(), so it made sense to write it in terms of them (a prefix matcher is one that isn't any of the other types). It's only now that I realize that it's much more natural to define prefix() explicitly (it's one that uses path: patterns, roughly speaking) and let anypats() be defined in terms of the others. Remember that these methods are all used for determining which fast paths are possible. anypats() simply means that no fast paths are possible (it could be called complex() instead). Further evidence is that rootfilesin:some/dir does not have any patterns, but it's still considered to be an anypats() matcher. That's because anypats() really just means that it's not a prefix() matcher (and not always() and not isexact()). This patch thus changes prefix() to return False by default and anypats() to return True only if the other three are False. Having anypats() be True by default also seems like a good thing, because it means forgetting to override it will lead only to performance bugs, not correctness bugs. Since the base class's implementation changes, we're also forced to update the subclasses. That change exposed and fixed a bug in the differencematcher: for example, when both of its input matchers were prefix matchers, we would say that the result was also a prefix matcher, which is incorrect, because e.g. "path:dir - path:dir/foo" no longer matches everything under "dir" (which is what prefix() means).
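The new relationship between the four predicates can be sketched as follows. These classes are simplified stand-ins rather than the real matcher classes in match.py; in particular, the `differencematcher` here only models the "never a prefix matcher" behaviour described above.

```python
class basematcher(object):
    def always(self):
        return False

    def isexact(self):
        return False

    def prefix(self):
        # Now False by default; only explicit prefix matchers override it.
        return False

    def anypats(self):
        # True only when none of the fast-path categories applies.
        return not self.always() and not self.isexact() and not self.prefix()


class prefixmatcher(basematcher):
    def prefix(self):
        return True


class differencematcher(basematcher):
    def __init__(self, m1, m2):
        self._m1 = m1
        self._m2 = m2
    # prefix() is deliberately inherited: "path:dir - path:dir/foo" does not
    # match everything under "dir", so the difference is not a prefix matcher.


diff = differencematcher(prefixmatcher(), prefixmatcher())
print(basematcher().anypats(), prefixmatcher().anypats(), diff.prefix())
```

A subclass that forgets to override anything falls into the slow anypats() category, which costs performance but cannot produce wrong results.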
Sun, 09 Jul 2017 15:19:27 -0700 match: make nevermatcher an exact matcher and a prefix matcher
Martin von Zweigbergk <martinvonz@google.com> [Sun, 09 Jul 2017 15:19:27 -0700] rev 33378
match: make nevermatcher an exact matcher and a prefix matcher The m.isexact() and m.prefix() methods are used by callers to determine whether m.files() can be used for fast paths. It seems safe to let callers take any fast paths they can that rely on the empty m.files().
Mon, 10 Jul 2017 10:56:40 -0700 revset: define successors revset
Jun Wu <quark@fb.com> [Mon, 10 Jul 2017 10:56:40 -0700] rev 33377
revset: define successors revset This revset returns all successors, including transit nodes and the source nodes (to be consistent with existing revsets like "ancestors"). To filter out transit nodes, use `successors(X)-obsolete()`. To filter out the divergent case, use `successors(X)-divergent()-obsolete()`. The revset could be useful to define a rebase destination, like: `max(successors(BASE)-divergent()-obsolete())`. The `max` is to deal with splits. There are other implementations where `successors` returns just one level of successors, and `allsuccessors` returns everything. I think `successors` returning all successors by default is more user friendly. We have seen cases in production where people use 1-level `successors` while they really want `allsuccessors`. So it seems better to just have one single revset returning all successors by default to avoid user errors. In the future we might want to add a `depth` keyword argument to it and to other revsets like `ancestors`, or even build some flexible indexing syntax [1] to satisfy people who need a depth limit. [1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-July/101140.html
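The "all successors, including the sources" semantics amounts to a transitive closure over successor links. The sketch below works on a plain dictionary with made-up node names; it is not the revset implementation and does not use Mercurial's obsolescence store.

```python
def allsuccessors(succmap, sources):
    """Return the sources plus every node reachable through successor links."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        node = stack.pop()
        for succ in succmap.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


# Hypothetical history: "a" was rewritten into "b", then "b" was split
# into "c" and "d". "b" is a transit node between "a" and the split results.
succmap = {'a': ['b'], 'b': ['c', 'd']}
result = allsuccessors(succmap, ['a'])
print(sorted(result))
```

In this toy history a one-level lookup from "a" would stop at "b", which is the user error the description warns about.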
Mon, 10 Jul 2017 21:55:43 -0700 sparse: shorten try..except block in updateconfig()
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 10 Jul 2017 21:55:43 -0700] rev 33376
sparse: shorten try..except block in updateconfig() It now only covers refreshwdir(). This is what importfromfiles() does. I think it is the more appropriate behavior.
Mon, 10 Jul 2017 21:43:19 -0700 sparse: clean up updateconfig()
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 10 Jul 2017 21:43:19 -0700] rev 33375
sparse: clean up updateconfig() * Use context manager for wlock * Rename oldsparsematch to oldmatcher * Always call parseconfig() because parsing an empty string yields the same result as the old code
Mon, 10 Jul 2017 21:39:49 -0700 sparse: move config updating function into core
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 10 Jul 2017 21:39:49 -0700] rev 33374
sparse: move config updating function into core As part of the move, the ui argument was dropped. Additional fixups will be made in a follow-up commit.
Sat, 08 Jul 2017 16:18:04 -0700 dirstate: expose a sparse matcher on dirstate (API)
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 08 Jul 2017 16:18:04 -0700] rev 33373
dirstate: expose a sparse matcher on dirstate (API) The sparse extension performs a lot of monkeypatching of dirstate to make it sparse aware. Essentially, various operations need to take the active sparse config into account. They do this by obtaining a matcher representing the sparse config and filtering paths through it. The monkeypatching is done by stuffing a reference to a repo on dirstate and calling sparse.matcher() (which takes a repo instance) during each function call. The reason this function takes a repo instance is because resolving the sparse config may require resolving file contents from filelogs, and that requires a repo. (If the current sparse config references "profile" files, the contents of those files from the dirstate's parent revisions is resolved.) I seem to recall people having strong opinions that the dirstate object not have a reference to a repo. So copying what the sparse extension does probably won't fly in core. Plus, the dirstate modifications shouldn't require a full repo: they only need a matcher. So there's no good reason to stuff a reference to the repo in dirstate. This commit exposes a sparse matcher to dirstate via a property that when looked up will call a function that eventually calls sparse.matcher(). The repo instance is bound in a closure, so it isn't exposed to dirstate. This approach is functionally similar to what the sparse extension does today, except it hides the repo instance from dirstate. The approach is not optimal because we have to call a proxy function and sparse.matcher() on every property lookup. There is room to cache the matcher instance in dirstate. After all, the matcher only changes if the dirstate's parents change or if the sparse config changes. It feels like we should be able to detect both events and update the matcher when this occurs. But for now we preserve the existing semantics so we can move the dirstate sparseness bits into core. 
Once in core, refactoring becomes a bit easier since it will be clearer how all these components interact. The sparse extension has been updated to use the new property. Because all references to the repo on dirstate have been removed, the code for setting it has been removed.
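The closure-based arrangement can be sketched as below. The class and attribute names are illustrative, and the prefix-based matcher is a toy stand-in for sparse.matcher(); the only point is that dirstate holds a zero-argument callable rather than a repo reference.

```python
class dirstate(object):
    def __init__(self, sparsematchfn):
        # A zero-argument callable; dirstate never stores the repo itself.
        self._sparsematchfn = sparsematchfn

    @property
    def _sparsematcher(self):
        # Recomputed on every lookup, i.e. the uncached behaviour that the
        # description above calls not optimal.
        return self._sparsematchfn()


class repo(object):
    def __init__(self, includes):
        self.includes = includes
        self.calls = 0

    def makedirstate(self):
        def sparsematchfn():
            # The repo is bound in this closure and hidden from dirstate.
            self.calls += 1
            return lambda path: any(path.startswith(p) for p in self.includes)
        return dirstate(sparsematchfn)


r = repo(['src/'])
ds = r.makedirstate()
insrc = ds._sparsematcher('src/a.py')
indocs = ds._sparsematcher('docs/x.txt')
print(insrc, indocs, r.calls)
```

Caching the matcher inside dirstate would remove the repeated calls, at the cost of having to detect parent and sparse config changes.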
Sat, 08 Jul 2017 15:42:11 -0700 sparse: use self instead of repo.dirstate
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 08 Jul 2017 15:42:11 -0700] rev 33372
sparse: use self instead of repo.dirstate "self" here is the dirstate instance. I'm pretty confident that self and repo.dirstate will be the exact same object. So remove a dependency on repo by just looking at self.
Sat, 08 Jul 2017 14:15:07 -0700 sparse: move code for importing rules from files into core
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 08 Jul 2017 14:15:07 -0700] rev 33371
sparse: move code for importing rules from files into core This is a pretty straightforward port. Some code cleanup was performed. But no major changes to the logic were made. I'm not a huge fan of this function because it does multiple things. I'd like to get things into core first to facilitate refactoring later. Please also note the added inline comment about the oddities of writeconfig() and the try..except to undo it. This is because of the hackiness in which the sparse matcher is obtained by various consumers, notably dirstate. We'll need a massive refactor to address this. That refactor is effectively blocked on having the sparse dirstate hacks live in core.
Sat, 08 Jul 2017 14:01:32 -0700 sparse: refactor activeprofiles into a generic function (API)
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 08 Jul 2017 14:01:32 -0700] rev 33370
sparse: refactor activeprofiles into a generic function (API) activeprofiles() is a special case of a more generic function. Furthermore, that generic function is essentially already implemented inline in the sparse extension. So, refactor activeprofiles() to a generic activeconfig(). Change the only consumer of activeprofiles() to use it. And have the inline implementation in the sparse extension use it.
Fri, 07 Jul 2017 15:11:11 -0400 check-code: prohibit `if False` antipattern
Augie Fackler <raf@durin42.com> [Fri, 07 Jul 2017 15:11:11 -0400] rev 33369
check-code: prohibit `if False` antipattern Differential Revision: https://phab.mercurial-scm.org/D20
Fri, 07 Jul 2017 15:08:23 -0400 convert: remove `if False` block
Augie Fackler <raf@durin42.com> [Fri, 07 Jul 2017 15:08:23 -0400] rev 33368
convert: remove `if False` block This code has never run since its introduction on July 18th, 2007. It's time for it to go. Differential Revision: https://phab.mercurial-scm.org/D19
Fri, 07 Jul 2017 15:07:36 -0400 filterpyflakes: move self-test into test file
Augie Fackler <raf@durin42.com> [Fri, 07 Jul 2017 15:07:36 -0400] rev 33367
filterpyflakes: move self-test into test file This will avoid a false positive on an upcoming check-code rule. Differential Revision: https://phab.mercurial-scm.org/D18
Sun, 09 Jul 2017 16:38:04 -0400 test-subrepo: demonstrate a status problem when merge deletes a file
Matt Harbison <matt_harbison@yahoo.com> [Sun, 09 Jul 2017 16:38:04 -0400] rev 33366
test-subrepo: demonstrate a status problem when merge deletes a file At the interactive update prompt, if (c) is chosen and then followed by `hg rm`, both `status -R` and `status -S` show the file as 'R', and `files -R` shows no files (OK, because explicitly removed files aren't supposed to be listed). If `rm` follows selecting (c), then both flavors of `status` list the file as '!', and `files -R` lists the missing file. So somehow, the (d) option has followed a third path.
Sun, 09 Jul 2017 16:13:30 -0400 subrepo: make the output references to subrepositories consistent
Matt Harbison <matt_harbison@yahoo.com> [Sun, 09 Jul 2017 16:13:30 -0400] rev 33365
subrepo: make the output references to subrepositories consistent Well, mostly. The annotation on subrepo functions tacks on a parenthetical to the abort message, which seems reasonable for a generic mechanism. But now all messages consistently spell out 'subrepository', and double quote the name of the repo. I noticed the inconsistency in the change for the last commit.
Sun, 09 Jul 2017 02:55:46 -0400 subrepo: consider the parent repo dirty when a file is missing
Matt Harbison <matt_harbison@yahoo.com> [Sun, 09 Jul 2017 02:55:46 -0400] rev 33364
subrepo: consider the parent repo dirty when a file is missing This simply passes the 'missing' argument down from the context of the parent repo, so the same rules apply. subrepo.bailifchanged() is hardcoded to care about missing files, because cmdutil.bailifchanged() is too. In the end, it looks like this addresses inconsistencies with 'archive', 'identify', blackbox logs, 'merge', and 'update --check'. I wasn't sure how to implement this in git, so that's left for someone more familiar with it.
Sun, 09 Jul 2017 02:46:03 -0400 archival: flag missing files as a dirty wdir() in the metadata file (BC)
Matt Harbison <matt_harbison@yahoo.com> [Sun, 09 Jul 2017 02:46:03 -0400] rev 33363
archival: flag missing files as a dirty wdir() in the metadata file (BC) Since the identify command adds a '+' for missing files, it's reasonable that this does too. Perhaps the node field's hex value should be p1+p2 for merges?
Sun, 09 Jul 2017 00:53:16 -0400 cmdutil: simplify the dirty check in howtocontinue()
Matt Harbison <matt_harbison@yahoo.com> [Sun, 09 Jul 2017 00:53:16 -0400] rev 33362
cmdutil: simplify the dirty check in howtocontinue() This is equivalent to the previous code. But it seems to me that if the user is going to be prompted that a commit is needed, missing files should be ignored, but branch and merge changes shouldn't be.
Sun, 09 Jul 2017 00:23:03 -0400 blackbox: simplify the dirty check
Matt Harbison <matt_harbison@yahoo.com> [Sun, 09 Jul 2017 00:23:03 -0400] rev 33361
blackbox: simplify the dirty check Same idea (and possibly incorrect behavior) as the previous commit.
Sun, 09 Jul 2017 00:19:03 -0400 identify: simplify the dirty check
Matt Harbison <matt_harbison@yahoo.com> [Sun, 09 Jul 2017 00:19:03 -0400] rev 33360
identify: simplify the dirty check This is equivalent to the previous code, but it seems better to be explicit about what aspects of dirty are being ignored. Perhaps they shouldn't be, since the help text says 'followed by a "+" if the working directory has uncommitted changes'. Both merges and branch changes are committable, even if the files are unchanged. Additionally, this will make the `identify` command notice missing subrepo files, once subrepos are taught to look for missing files.
Sun, 09 Jul 2017 00:05:31 -0400 tests: tweak the subrepo dirty state tests
Matt Harbison <matt_harbison@yahoo.com> [Sun, 09 Jul 2017 00:05:31 -0400] rev 33359
tests: tweak the subrepo dirty state tests This is a continuation of 439b4d005b4a. I overlooked that blackbox logs also have a dirty marker. Also, the `hg update --check` test was updating to a revision where the deleted file wasn't tracked, which is why status seemed to show the deleted file was restored.
Sun, 09 Jul 2017 23:01:11 -0700 match: combine regex code for path: and relpath:
Martin von Zweigbergk <martinvonz@google.com> [Sun, 09 Jul 2017 23:01:11 -0700] rev 33358
match: combine regex code for path: and relpath: The regexes for path: and relpath: patterns are the same (since the paths have already been normalized at the point we create the regexes). I don't think the "if pat == '.'" will have any effect on relpath: because relpath: patterns will have the root directory already normalized to '' by pathutil.canonpath() (unlike path:, for which the root gets normalized to '.' by util.normpath()).