Mercurial: Changelog

rust-dirstate: architecture independence fix Apparently, c_char is u8 on ppc64le and i8 on amd64 Differential Revision: https://phab.mercurial-scm.org/D6473

context: get filesadded() and filesremoved() from changeset if configured This adds the read side for getting the sets of added and removed files from the changeset extras. I timed this command on the hg repo: hg log -T '{rev}\n {files}\n %:{file_mods}\n +{file_adds}\n -{file_dels}\n' It took 1m21s before and 6.4s after. I also used that command to check that the result didn't change compared to calculating the values from the manifests on the fly (it didn't change). In the mozilla-unified repo, the same command run on FIREFOX_BETA_58_END::FIREFOX_BETA_59_END went from 29s to 0.67s. Differential Revision: https://phab.mercurial-scm.org/D6417

changelog: optionally store added and removed files in changeset extras As mentioned in an earlier patch, copies._chain() is used a lot in the changeset-centric version of pathcopies(). It is expensive because it needs to look at the manifest in order to filter out copies whose target file has since been removed. I want to store the sets of added and removed files in the changeset in order to speed that up. This patch does the writing part of that. It could easily be a separate config, but it's currently tied to experimental.copies.write-to since that's the only real use case (it will also make the {file_*} template keywords faster, but I doubt that anyone cares enough about those to write extra metadata for them). The new information is stored in the changeset extras. Since they're always subsets of the changeset's "files" list, they're stored as indexes into that list. I've stored the indexes as stringified ints separated by NUL bytes. The size of 00changelog.d for the hg repo increased by 0.28% (compared to the size with only copy information in the changesets, which in turn is 0.17% larger than without copy information). We could store only the delta between the indexes and we could store them in binary, but the chosen format is more readable. We could also have implemented this as a cache outside the changelog. One advantage of doing it that way is that we would get the speedups from the {file_*} template keywords also on old repos. Another advantage is that we could rewrite the cache if we find a bug in how we calculate the set of files. A disadvantage is that it would be more complex. Another is that it would surely use more space. We already write the copy information to the changeset extras, so it seems like a small step to also write these file sets. Differential Revision: https://phab.mercurial-scm.org/D6416
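The storage format described in this entry (indexes into the changeset's "files" list, written as stringified ints separated by NUL bytes) can be illustrated with a minimal sketch. The function names `encode_file_indices` and `decode_file_indices` are hypothetical and chosen for illustration only; they are not claimed to match Mercurial's implementation, and the extras key under which the value would be stored is not shown because the entry does not name it.

```python
def encode_file_indices(files, subset):
    """Encode `subset` as NUL-separated decimal indexes into `files`.

    `files` is the changeset's list of touched files (bytes paths) and
    `subset` is the added or removed files, assumed to be a subset of it.
    Hypothetical helper, for illustration only.
    """
    wanted = set(subset)
    indices = [b'%d' % i for i, f in enumerate(files) if f in wanted]
    return b'\0'.join(indices)


def decode_file_indices(files, data):
    """Inverse of encode_file_indices(); an empty value means no files."""
    if not data:
        return []
    return [files[int(chunk)] for chunk in data.split(b'\0')]


if __name__ == '__main__':
    files = [b'a.txt', b'b.txt', b'c.txt', b'd.txt']
    added = [b'b.txt', b'd.txt']
    encoded = encode_file_indices(files, added)
    print(encoded)
    print(decode_file_indices(files, encoded))
```

Because only small integers are stored instead of full paths, the extra metadata stays compact while remaining readable when inspected by hand.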

templatekw: make {file_*} compare to both merge parents (issue4292) This redefines the {file_adds}, {file_dels}, {file_mods} template keywords by getting the lists from the recently introduced context methods instead of getting them from status compared to p1. As mentioned before, these are better defined on merge commits. The total number of files in the three lists now always adds up to the number of files in {files}. I timed this command: hg log -r 4.0::5.0 -T '{rev}\n {file_mods}\n {file_adds}\n {file_dels}\n' It went from 7.6s to 5.6s with this patch. So it's actually faster than before. Note that the "files:" field in the bazaar test log output was using "{file_mods}" (not "{files}" as one might think based on the label). Differential Revision: https://phab.mercurial-scm.org/D6369

narrowspec: use vfs.tryread() instead of reimplementing Note that parseconfig() works well with empty strings. Differential Revision: https://phab.mercurial-scm.org/D6465

help: remove a superfluous "the" in revlogs text Differential Revision: https://phab.mercurial-scm.org/D6466

setdiscovery: make progress on most connected groups each roundtrip Consider history like this:
    o
    | o
    | |
    | o
    | |
    | o
    |/
    o
    | o
    | |
    | o
    | |
    | o
    |/
    o
    | o
    | |
    | o
    | |
    | o
    |/
    o
    ~
Assume the left mainline is available in the remote repo and the other commits are only in the local repo. Also imagine that instead of 3 local branches with 3 commits on each, there are 1000 branches (the number of commits on each doesn't matter much here). In such a scenario, the current setdiscovery code will pick a sample size of 200 among these branches and ask the remote which of them it has. However, the discovery for each such branch is completely independent of the discovery for the others -- knowing whether the remote has a commit in one branch doesn't give us any information about the other branches. The discovery will therefore take at least 5 roundtrips (maybe more depending on which commit in each linear chain was sampled). Since the discovery for each branch is independent, there is no reason to let one branch wait for another, so this patch makes it so we sample at least as many commits as there are branches. It may still happen (it's very likely, even) that we get multiple samples from one branch and none from another, but that will even out over a few rounds and I think this is still a big improvement. Because of http header size limits, we still use the old behavior unless experimental.httppostargs=true. I've timed this by running `hg debugdiscovery mozilla-unified --debug` in the mozilla-try repo. Both repos were local.
Before this patch, last part of the output was:
    2249 total queries in 5276.4859s
    elapsed time: 5276.652634 seconds
    heads summary:
      total common heads: 13
        also local heads: 4
        also remote heads: 8
        both: 4
      local heads: 28317
        common: 4
        missing: 28313
      remote heads: 12
        common: 8
        unknown: 4
    local changesets: 2014901
      common: 530373
      missing: 1484528
    common heads: 1dad417c28ad 4a108e94d3e2 4d7ef530fffb 5350524bb654 777e60ca8853 7d97fafba271 9cd2ab4d0029 a55ce37217da d38398e5144e dcc6d7a0dc00 e09297892ada e24ec6070d7b fd559328eaf3
After this patch, the output was (including all the samples, since there were so few now):
    taking initial sample
    query 2; still undecided: 1599476, sample size is: 108195
    sampling from both directions
    query 3; still undecided: 810922, sample size is: 194158
    sampling from both directions
    query 4; still undecided: 325882, sample size is: 137302
    sampling from both directions
    query 5; still undecided: 111459, sample size is: 74586
    sampling from both directions
    query 6; still undecided: 26805, sample size is: 23960
    sampling from both directions
    query 7; still undecided: 2549, sample size is: 2528
    sampling from both directions
    query 8; still undecided: 21, sample size is: 21
    8 total queries in 24.5064s
    elapsed time: 24.670051 seconds
    heads summary:
      total common heads: 13
        also local heads: 4
        also remote heads: 8
        both: 4
      local heads: 28317
        common: 4
        missing: 28313
      remote heads: 12
        common: 8
        unknown: 4
    local changesets: 2014901
      common: 530373
      missing: 1484528
    common heads: 1dad417c28ad 4a108e94d3e2 4d7ef530fffb 5350524bb654 777e60ca8853 7d97fafba271 9cd2ab4d0029 a55ce37217da d38398e5144e dcc6d7a0dc00 e09297892ada e24ec6070d7b fd559328eaf3
Differential Revision: https://phab.mercurial-scm.org/D2647
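The sampling rule from the setdiscovery entry (sample at least as many commits as there are independent branches) can be modelled with a toy sketch. This is not the setdiscovery code: the real implementation works on the commit DAG, whereas here each branch is assumed to be a plain list of undecided commits, and `pick_sample` and `default_size` are hypothetical names, with 200 taken from the sample size mentioned in the entry.

```python
import random


def pick_sample(branches, default_size=200, rng=None):
    """Pick undecided commits to ask the remote about in one roundtrip.

    `branches` is a list of lists, one list of undecided commits per
    independent branch (a simplifying assumption for this sketch).
    """
    rng = rng or random.Random()
    undecided = [commit for branch in branches for commit in branch]
    # Grow the sample so that it is at least as large as the number of
    # branches; never ask for more commits than are still undecided.
    size = min(max(default_size, len(branches)), len(undecided))
    # As the entry notes, a uniform sample may still hit one branch
    # several times and miss another; that evens out over a few rounds.
    return rng.sample(undecided, size)


if __name__ == '__main__':
    # 1000 local branches with 3 commits each, as in the scenario above.
    branches = [[(b, i) for i in range(3)] for b in range(1000)]
    print(len(pick_sample(branches)))
```

With a fixed sample size of 200, a history with 1000 independent branches needs several roundtrips before every branch has been sampled at all; letting the sample grow with the branch count removes that bottleneck.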

help: clarify overlap of revlog header and first revlog entry Differential Revision: https://phab.mercurial-scm.org/D6449

py3: fix test-convert-svn-sink.t In cases where the root commit is an empty commit, None is returned as parents. This was implemented by 2c13e91ede6e. This breaks the test on py3 because `b'%s' % None` does not work. It does not matter whether we return `None` or `'None'`, since we skip the conversion-to-svn step by doing an early return. So let's return `'None'`. I tried to patch all the users to convert `None` to `'None'`, but there were more users than I expected. I hit 3 of them and decided to fix it this way around. Differential Revision: https://phab.mercurial-scm.org/D6458
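The py3 failure mentioned in this entry can be reproduced in isolation. The sketch below shows that bytes formatting with `%s` rejects `None` on Python 3 and accepts a bytes value. The helper name `try_format` is hypothetical, and the literal `b'None'` is an assumption about how the `'None'` in the entry ends up as bytes under py3.

```python
def try_format(value):
    """Return b'%s' % value, or None if bytes formatting rejects it."""
    try:
        return b'%s' % value
    except TypeError:
        # Python 3 only accepts bytes-like objects (or objects defining
        # __bytes__) for %s in bytes formatting, so None raises here.
        return None


if __name__ == '__main__':
    print(try_format(None))
    print(try_format(b'None'))
```

Returning the placeholder from the single producer avoids having to convert `None` at every call site that formats the value.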

commit: respect --no-edit in combination with --amend Differential Revision: https://phab.mercurial-scm.org/D6464