hg-stable: Changelog

filectx: correct docstring about "changeid" The changeid argument must be a revnum (basefile.rev() is defined as "return self._changeid"), so fix the lie in the docstring. It seems to have been incorrect for at least 10 years (I didn't check further back). Differential Revision: https://phab.mercurial-scm.org/D4881

context: drop incorrect and superfluous docstring It's been incorrect at least since 8b86acc7aa64 (context: drop support for looking up context by ambiguous changeid (API), 2018-04-28). Differential Revision: https://phab.mercurial-scm.org/D4880

remotenames: follow-up on D3639 to make revset funcs take only one arg Per the review discussion on D3639, we want this to just take one argument. That ended up simplifying the code, so I'm sharing this as a follow-up to that revision rather than editing in-flight.

remotenames: add names argument to remotenames revset This patch adds a names argument to the revsets provided by the remotenames extension. The revsets are remotenames(), remotebranches() and remotebookmarks(). names can be a single name, a list of names, or empty, which makes it an optional argument. If names are passed, changesets which have those remotenames are returned. If no names are passed, changesets from all the remotenames are shown. Passing an invalid remotename does not raise an error. The names argument also supports pattern matching. Tests for the argument are added in tests/test-logexchange.t. Differential Revision: https://phab.mercurial-scm.org/D3639
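
The selection rules in this entry can be modelled with a small standalone sketch. This is not the extension's code: `select_remotenames`, the dict layout, and the `re:` / `literal:` prefix handling are assumptions made for illustration only.

```python
import re


def _matcher(pattern):
    """Build a predicate for one name pattern.

    Assumption: a "re:" prefix means regular expression and a
    "literal:" prefix (or no prefix) means exact match.
    """
    if pattern.startswith("re:"):
        regex = re.compile(pattern[3:])
        return lambda name: regex.search(name) is not None
    if pattern.startswith("literal:"):
        pattern = pattern[len("literal:"):]
    return lambda name: name == pattern


def select_remotenames(remotenames, names=()):
    """Return the changesets selected by the optional names.

    remotenames is a hypothetical dict mapping a remote name to a
    changeset id. With no names, every changeset is returned; an
    unknown name simply matches nothing instead of raising.
    """
    if not names:
        return set(remotenames.values())
    matchers = [_matcher(p) for p in names]
    return {
        cs
        for name, cs in remotenames.items()
        if any(m(name) for m in matchers)
    }
```

Calling `select_remotenames(known)` with no names returns every changeset, while a name that matches nothing yields an empty set rather than an error.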

copies: add time information to the debug information

copies: add a devel debug mode to trace what copy tracing does Mercurial can spend a lot of time finding renames between two commits. Having more information about that process helps to understand what makes it slow in an individual instance (e.g. many files vs. one file).

revlog: rewrite censoring logic I was able to corrupt a revlog relatively easily with the existing censoring code. The underlying problem is that the existing code doesn't fully take delta chains into account. When copying revisions that occur after the censored revision, the delta base can refer to a censored revision. Then at read time, things blow up due to the revision data not being a compressed delta. This commit rewrites the revlog censoring code to take a higher-level approach. We now create a new revlog instance pointing at temp files. We iterate through each revision in the source revlog and insert those revisions into the new revlog, replacing the censored revision's data along the way. The new implementation isn't as efficient as the old one. This is because it will fully engage delta computation on insertion. But I don't think it matters. The new implementation is a bit hacky because it attempts to reload the revlog instance with a new revlog index/data file. This is fragile. But this is needed because the index (which could be backed by C) would have a cached copy of the old, possibly changed data and that could lead to problems accessing index or revision data later. One benefit of the new approach is that we integrate with the transaction. The old revlog is backed up and if the transaction is rolled back, the original revlog is restored. As part of this, we had to teach the transaction about the store vfs. I'm not super keen about this. But this was the easiest way to hook things up to the transaction. We /could/ just ignore the transaction like we were doing before. But any file mutation should be governed by transaction semantics, including undo during rollback. Differential Revision: https://phab.mercurial-scm.org/D4869
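
A toy model of the copy-and-replace approach described in this entry, assuming a simplified store in which every entry is a delta against the previous revision. `ToyRevlog` and `censor_by_copy` are hypothetical names, not Mercurial's revlog API, and the temp files, index reload, and transaction backup are not modelled.

```python
def _common_prefix_len(a, b):
    """Length of the shared prefix of two byte strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class ToyRevlog:
    """Hypothetical append-only store of prefix deltas."""

    def __init__(self):
        # Each entry is (bytes kept from the previous text, new suffix).
        self._entries = []

    def __len__(self):
        return len(self._entries)

    def add(self, text):
        # The delta is always computed against this store's own
        # previous revision, so it can never point at stale data.
        base = self.revision(len(self) - 1) if self._entries else b""
        keep = _common_prefix_len(base, text)
        self._entries.append((keep, text[keep:]))
        return len(self) - 1

    def revision(self, rev):
        # Replay the delta chain from the first entry up to rev.
        text = b""
        for keep, suffix in self._entries[: rev + 1]:
            text = text[:keep] + suffix
        return text


def censor_by_copy(source, censored_rev, tombstone):
    """Copy every revision into a new store, swapping in a tombstone."""
    dest = ToyRevlog()
    for rev in range(len(source)):
        text = tombstone if rev == censored_rev else source.revision(rev)
        dest.add(text)
    return dest
```

Because each revision is re-inserted through `add`, deltas after the censored revision are recomputed against the tombstone instead of being copied verbatim. That is the property the rewrite relies on, at the cost of redoing delta computation.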

revlog: move loading of index data into own method This will allow us to "reload" a revlog instance from a rewritten index file, which will be used in a subsequent commit. Differential Revision: https://phab.mercurial-scm.org/D4868

revlog: clear revision cache on hash verification failure The revision cache is populated after raw revision fulltext is retrieved but before hash verification. If hash verification fails, the revision cache will be populated and subsequent operations to retrieve the invalid fulltext may return the cached fulltext instead of raising. This commit changes hash verification so it will invalidate the revision cache if the cached node fails hash verification. The side-effect is that subsequent operations to request the revision text - even the raw revision text - will always fail. The new behavior is consistent and is definitely less wrong. There is an open question of whether revision(raw=True) should validate hashes. But I'm going to punt on this problem. We can always change behavior later. And to be honest, I'm not sure we should expose raw=True on the storage interface at all. Another day... Differential Revision: https://phab.mercurial-scm.org/D4867
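
The cache behaviour described in this entry can be sketched with a minimal stand-in store. `ToyStore` and `IntegrityError` are hypothetical names, and hashing only the text with SHA-1 is a simplification rather than how a revlog derives its nodes.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when stored text does not match its node hash."""


class ToyStore:
    """Hypothetical single-entry-cache store; not Mercurial's API."""

    def __init__(self):
        self._data = {}     # node -> stored text
        self._cache = None  # (node, text) of the last revision read

    def add(self, text):
        node = hashlib.sha1(text).hexdigest()
        self._data[node] = text
        return node

    def corrupt(self, node, text):
        # Simulate corruption: the stored text no longer matches node.
        self._data[node] = text

    def revision(self, node):
        if self._cache is not None and self._cache[0] == node:
            return self._cache[1]
        text = self._data[node]
        # The cache is filled before verification, as in the entry above.
        self._cache = (node, text)
        self._checkhash(node, text)
        return text

    def _checkhash(self, node, text):
        if hashlib.sha1(text).hexdigest() != node:
            # Drop the cached entry so a retry cannot return bad text.
            if self._cache is not None and self._cache[0] == node:
                self._cache = None
            raise IntegrityError(node)
```

Without the invalidation in `_checkhash`, a second `revision()` call for the same node would hit the cache and return the corrupt text instead of raising.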

fuzz: new fuzzer for cext/manifest.c This is a bit messy, because lazymanifest is tightly coupled to the cpython API for performance reasons. As a result, we have to build a whole Python without pymalloc (so ASAN can help us out) and link against that. Then we have to use an embedded Python interpreter. We could manually drive the lazymanifest in C from that point, but experimentally just using PyEval_EvalCode isn't really any slower so we may as well do that and write the innermost guts of the fuzzer in Python. Leak detection is currently disabled for this fuzzer because there are a few global-lifetime things in our extensions that we more or less intentionally leak and I didn't want to take the detour to work around that for now. This should not be pushed to our repo until https://github.com/google/oss-fuzz/pull/1853 is merged, as this depends on having the Python tarball around. Differential Revision: https://phab.mercurial-scm.org/D4879