Sun, 06 Nov 2016 00:37:50 -0700 bdiff: don't check border condition in loop
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Nov 2016 00:37:50 -0700] rev 30321
bdiff: don't check border condition in loop
`plast = a + len - 1`, so this "for" loop iterates from "a" to "plast", inclusive. `p == plast` can therefore only be true on the final iteration of the loop, and checking for it on every iteration is wasteful.
This patch simply decreases the upper bound of the loop by 1 and adds an explicit check after iteration for the `p == plast` case. We can't simply add 1 to the initial value for "i" because that doesn't do the correct thing on empty input strings.
`perfbdiff -m 3041e4d59df2` on the Firefox repo becomes significantly faster:
! wall 0.072763 comb 0.070000 user 0.070000 sys 0.000000 (best of 100)
! wall 0.053221 comb 0.060000 user 0.060000 sys 0.000000 (best of 100)
For the curious, this code has its origins in 8b067bde6679, which is the changeset that introduced bdiff.c in 2005.
Also, GNU diffutils is able to perform a similar line-based diff in under 20ms, so there are likely more perf wins to be found in this code. One of them is the hashing algorithm. But it looks like mpm spent some time testing hash collisions in d0c48891dd4a. I'd like to do the same before switching away from lyhash, just to be on the safe side.
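The loop change described above can be sketched in Python. This is only an illustrative transliteration, not the C code from bdiff.c; the function names are hypothetical and the sketch assumes the loop counts lines, where a final byte always closes a line even when it is not a newline.

```python
def count_lines_checked(data: bytes) -> int:
    """Count lines, testing the 'last byte' border condition on every iteration."""
    last = len(data) - 1
    count = 0
    for i in range(len(data)):
        if data[i] == 0x0A or i == last:
            count += 1
    return count


def count_lines_hoisted(data: bytes) -> int:
    """Same result, with the border check moved out of the loop."""
    last = len(data) - 1
    count = 0
    # Stop one byte short of the end; range(-1) is empty for empty input.
    for i in range(last):
        if data[i] == 0x0A:
            count += 1
    # The final byte closes a line whether or not it is a newline.
    # Empty input has no final byte, so nothing is added.
    if data:
        count += 1
    return count
```

Starting the count at 1 instead of adding the explicit check would report one line for empty input, which is the case the commit message warns about.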
Sat, 05 Nov 2016 23:41:52 -0700 perf: add perfbdiff
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 05 Nov 2016 23:41:52 -0700] rev 30320
perf: add perfbdiff bdiff shows up a lot in profiling. I think it would be useful to have a perf command that runs bdiff over and over so we can find hot spots.
Sun, 06 Nov 2016 06:54:31 +0530 help: show help for disabled extensions (issue5228)
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 06 Nov 2016 06:54:31 +0530] rev 30319
help: show help for disabled extensions (issue5228) This patch does not exactly solve issue5228, but it improves the situation. For disabled extensions, we used to parse the module, take the first occurrence of a docstring, and return its first line as an introductory heading for the extension. This is what we get today. This patch returns the whole docstring of the module as the help for the extension, which is more informative. Some modules have little top-level docstring beyond the heading, so those are unaffected by this change. To follow the existing trend of showing commands, we would either have to load the extension or use a very ugly parsing method which doesn't even ensure correctness.
Sun, 06 Nov 2016 04:17:19 +0530 py3: make scmutil.rcpath() return bytes
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 06 Nov 2016 04:17:19 +0530] rev 30318
py3: make scmutil.rcpath() return bytes This patch makes sure scmutil.rcpath() returns bytes independent of which platform is used on Python 3. If we want to change the type for Windows, we can just conditionalize the return variable.
Sun, 06 Nov 2016 04:10:33 +0530 py3: use pycompat.ossep at certain places
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 06 Nov 2016 04:10:33 +0530] rev 30317
py3: use pycompat.ossep at certain places Certain instances of os.sep have been converted to pycompat.ossep where it was certain that only bytes are used. There are more such instances which need some more attention and will surely get it.
Sun, 06 Nov 2016 03:44:44 +0530 py3: have pycompat.ospathsep and pycompat.ossep
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 06 Nov 2016 03:44:44 +0530] rev 30316
py3: have pycompat.ospathsep and pycompat.ossep We needed bytes versions of os.sep and os.pathsep in py3, as they return unicodes.
Sun, 06 Nov 2016 03:33:22 +0530 py3: add a bytes version of os.name
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 06 Nov 2016 03:33:22 +0530] rev 30315
py3: add a bytes version of os.name os.name returns unicodes on py3. Most of our checks are like os.name == 'nt'. Because of the transformer, on the right hand side we have b'nt'. The condition will never be satisfied even if os.name returns 'nt', as that will be a unicode. We either need to encode every occurrence of os.name or have a new variable, which is much cleaner. Now we have pycompat.osname. There are around 53 occurrences of os.name in the codebase which need to be replaced by pycompat.osname to support Python 3.
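The comparison problem described above can be reproduced with a few lines of Python 3. This is a minimal sketch; the module-level `osname` here is a hypothetical stand-in for pycompat.osname, and the ASCII encoding is an assumption.

```python
import os

# Hypothetical stand-in for pycompat.osname: a bytes copy of os.name.
osname = os.name.encode('ascii')


def is_windows_naive() -> bool:
    # On Python 3 this compares str with bytes, so it is False on every platform.
    return os.name == b'nt'


def is_windows() -> bool:
    # Bytes compared with bytes: behaves as the original check intended.
    return osname == b'nt'
```

Keeping a single bytes variable means call sites only change the name they read, rather than each adding its own encode step.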
Sun, 06 Nov 2016 12:18:23 +0900 py3: make util.datapath a bytes variable
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 06 Nov 2016 12:18:23 +0900] rev 30314
py3: make util.datapath a bytes variable In this patch we make util.datapath a bytes variable, but we have to pass a unicode to gettext.translation otherwise it will cry. Used pycompat.fsdecode() to decode it back to unicode as it was converted to bytes using pycompat.fsencode().
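As a rough illustration of the bytes-inside, unicode-at-the-boundary pattern described above, the following sketch uses the Python 3 standard library functions os.fsencode() and os.fsdecode() directly. The variable names and the use of the current directory are assumptions for the example, not Mercurial's actual values.

```python
import os

# Keep the path as bytes internally (hypothetical stand-in for util.datapath).
datapath = os.fsencode(os.getcwd())

# Bytes paths can be joined with other bytes components.
localepath = os.path.join(datapath, b'locale')

# Decode only at the boundary, for an API that insists on str.
localedir = os.fsdecode(localepath)
```

Because os.fsdecode() reverses os.fsencode(), the decoded value names the same directory the bytes path does.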
Sun, 06 Nov 2016 03:12:40 +0530 py3: add os.fsdecode() as pycompat.fsdecode()
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 06 Nov 2016 03:12:40 +0530] rev 30313
py3: add os.fsdecode() as pycompat.fsdecode() We need to use os.fsdecode() but this was not present in Python 2. So added the function in pycompat.py
Fri, 04 Nov 2016 20:22:37 -0700 statprof: return state from stop()
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 04 Nov 2016 20:22:37 -0700] rev 30312
statprof: return state from stop() I don't like global variables. Have stop() return the captured state so callers can pass data to the display function.
Sat, 05 Nov 2016 13:20:53 +0900 hghave: check darcs version more strictly
Yuya Nishihara <yuya@tcha.org> [Sat, 05 Nov 2016 13:20:53 +0900] rev 30311
hghave: check darcs version more strictly test-convert-darcs.t suddenly started failing on my Debian sid machine. The reason was that Darcs was upgraded from 2.12.0 to 2.12.4, so the original pattern came to match the last two digits. Fix the pattern to match 2.2+.
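The failure mode described above can be shown with a pair of regular expressions. Both patterns below are hypothetical illustrations rather than the ones in hghave; they assume the check searches the version string for "2." followed by a minor version of 2 or higher.

```python
import re

# Hypothetical loose pattern: unanchored, so it can match inside "12.4".
LOOSE = re.compile(r'2\.[2-9]')

# Hypothetical stricter pattern: the "2" must start a word, and the
# minor version is either a single digit 2-9 or any two digits.
STRICT = re.compile(r'\b2\.([2-9]|\d{2})')


def matches(pattern, version_output: str) -> bool:
    """Return True if the pattern is found anywhere in the version output."""
    return pattern.search(version_output) is not None
```

With the loose pattern, "2.12.0" is rejected while "2.12.4" is accepted through the trailing "2.4", so the outcome depends on the patch level rather than on the minor version.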
Sat, 05 Nov 2016 13:16:40 +0900 tests: silence output of darcs command
Yuya Nishihara <yuya@tcha.org> [Sat, 05 Nov 2016 13:16:40 +0900] rev 30310
tests: silence output of darcs command It appears darcs is more verbose by default these days. I got a test failure with Darcs 2.12.4.
Wed, 02 Nov 2016 17:10:47 -0700 manifest: remove manifest.readshallowdelta
Durham Goode <durham@fb.com> [Wed, 02 Nov 2016 17:10:47 -0700] rev 30309
manifest: remove manifest.readshallowdelta This removes manifest.readshallowdelta and converts its one consumer to use manifestlog instead.
Wed, 02 Nov 2016 17:10:47 -0700 manifest: get rid of manifest.readshallowfast
Durham Goode <durham@fb.com> [Wed, 02 Nov 2016 17:10:47 -0700] rev 30308
manifest: get rid of manifest.readshallowfast This removes manifest.readshallowfast and converts its one user to use manifestlog instead.
Wed, 02 Nov 2016 17:10:47 -0700 manifest: add shallow option to treemanifestctx.readdelta and readfast
Durham Goode <durham@fb.com> [Wed, 02 Nov 2016 17:10:47 -0700] rev 30307
manifest: add shallow option to treemanifestctx.readdelta and readfast The old manifest had different functions for performing shallow reads, shallow readdeltas, and shallow readfasts. Since a lot of the code is duplicated (and since those functions don't make sense on a normal manifestctx), let's unify them into flags on the existing readdelta and readfast functions. A future diff will change consumers of these functions to use the manifestctx versions and will delete the old APIs.
Wed, 02 Nov 2016 17:10:47 -0700 manifest: change manifestlog mancache to be directory based
Durham Goode <durham@fb.com> [Wed, 02 Nov 2016 17:10:47 -0700] rev 30306
manifest: change manifestlog mancache to be directory based In the last patch we added a get() function that allows fetching directory level treemanifestctxs. It didn't handle caching at directory level though, so we need to change our mancache to support multiple directories.
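One way to picture a directory-based cache is a mapping from directory to a per-directory node cache. The class below is a hypothetical sketch with invented names; it does not reproduce manifestlog, and the cached value is a placeholder tuple rather than a real treemanifestctx.

```python
class DirCachingLog:
    """Hypothetical sketch: cache per (directory, node) instead of per node."""

    def __init__(self):
        self._dirmancache = {}  # directory -> {node: cached object}
        self.loads = 0          # counts cache misses, for illustration only

    def get(self, directory: bytes, node: bytes):
        cache = self._dirmancache.setdefault(directory, {})
        if node not in cache:
            self.loads += 1
            # Placeholder for constructing a directory-level manifest context.
            cache[node] = (directory, node)
        return cache[node]
```

Keying on the directory first keeps entries for different subdirectories apart even if they are requested with the same node.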
Wed, 02 Nov 2016 17:24:06 -0700 manifest: add manifestlog.get to obtain subdirectory instances
Durham Goode <durham@fb.com> [Wed, 02 Nov 2016 17:24:06 -0700] rev 30305
manifest: add manifestlog.get to obtain subdirectory instances Previously manifestlog only allowed obtaining root level manifests. Future patches will need direct access to subdirectory manifests as part of changegroup creation, so let's add a get() function that knows how to deal with subdirectories.
Wed, 02 Nov 2016 17:33:31 -0700 manifest: throw LookupError if node not in revlog
Durham Goode <durham@fb.com> [Wed, 02 Nov 2016 17:33:31 -0700] rev 30304
manifest: throw LookupError if node not in revlog When accessing a manifest via manifestlog[node], let's verify that the node actually exists and throw a LookupError if it doesn't. This matches the old read behavior, so we don't accidentally return invalid manifestctxs. We do this in manifestlog instead of in the manifestctx/treemanifestctx constructors because the treemanifest code currently relies on the fact that certain code paths can produce treemanifests without touching the revlogs (and it has tests that verify things work if certain revlogs are missing entirely, so they break if we add validation that tries to read them).
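The placement of the check can be sketched as follows: validation happens in the container's item lookup, not in the constructor of the returned object. The names are hypothetical, and Python's builtin LookupError stands in for Mercurial's own error class.

```python
class ManifestCtxSketch:
    """Hypothetical context object; its constructor does no validation."""

    def __init__(self, node: bytes):
        self.node = node


class ValidatingLog:
    """Hypothetical sketch of a log that validates nodes on item access."""

    def __init__(self, known_nodes):
        self._known = set(known_nodes)

    def __getitem__(self, node: bytes) -> ManifestCtxSketch:
        # Validate here so direct construction elsewhere stays unchecked.
        if node not in self._known:
            raise LookupError('no node %r' % (node,))
        return ManifestCtxSketch(node)
```

Code paths that build a context directly still work without touching any storage, which is the property the commit message says the treemanifest code relies on.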
Sun, 23 Oct 2016 10:40:33 -0700 revlog: optimize _chunkraw when startrev==endrev
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 23 Oct 2016 10:40:33 -0700] rev 30303
revlog: optimize _chunkraw when startrev==endrev
In many cases, _chunkraw() is called with startrev==endrev. When this is true, we can avoid an extra index lookup and some other minor operations.
On the mozilla-unified repo, `hg perfrevlogchunks -c` says this has the following impact:
! read w/ reused fd
! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)
! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30)
! read batch w/ reused fd
! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197)
! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196)
So, we've gone from ~25x slower than batch to ~22.5x slower. At this point, there's probably not much else we can do except implement an optimized function in the index itself, including in C.
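The saved index lookup can be illustrated with a simplified index. In this sketch each entry is a hypothetical (offset, length) pair, which is an assumption made for the example and not the real revlog index layout.

```python
def segment_range(index, startrev: int, endrev: int):
    """Return (start offset, byte length) covering startrev..endrev inclusive."""
    istart = index[startrev]
    start = istart[0]
    if startrev == endrev:
        # Fast path: reuse the entry already fetched, no second lookup.
        end = start + istart[1]
    else:
        iend = index[endrev]
        end = iend[0] + iend[1]
    return start, end - start
```

For example, with `index = [(0, 10), (10, 5), (15, 7)]`, asking for revision 1 alone touches one entry, while asking for revisions 0 through 2 touches two.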
Sat, 22 Oct 2016 15:41:23 -0700 revlog: inline start() and end() for perf reasons
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 22 Oct 2016 15:41:23 -0700] rev 30302
revlog: inline start() and end() for perf reasons
When I implemented `hg perfrevlogchunks`, one of the things that stood out was that N * _chunk() calls were ~38x slower than 1 _chunks() call. Specifically, on the mozilla-unified repo:
N*_chunk: 0.528997s
1*_chunks: 0.013735s
This repo has 352,097 changesets. So the average time per changeset comes out to:
N*_chunk: 1.502us
1*_chunks: 0.039us
If you extrapolate these numbers to a repository with 1M changesets, that comes out to 1.502s versus 0.039s, which is significant.
At these latencies, Python attribute lookups and function calls matter. So, this patch inlines some code to cut down on that overhead.
The impact of this patch on N*_chunk() calls is clear:
! wall 0.528997 comb 0.520000 user 0.500000 sys 0.020000 (best of 19)
! wall 0.367723 comb 0.370000 user 0.350000 sys 0.020000 (best of 27)
So, we go from ~38x slower to ~27x. A nice improvement. But there's still a long way to go.
It's worth noting that functionality like revsets performs changelog lookups one revision at a time. So this code path is worth optimizing.
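The per-changeset figures and slowdown ratios quoted above follow directly from the wall times. The short calculation below reproduces them; the variable names are chosen for this example only.

```python
changesets = 352097

n_chunk_before = 0.528997   # seconds, N * _chunk() before the patch
n_chunk_after = 0.367723    # seconds, N * _chunk() after the patch
one_chunks = 0.013735       # seconds, 1 * _chunks()

# Average cost per changeset, in microseconds.
per_rev_n_chunk_us = n_chunk_before / changesets * 1e6
per_rev_chunks_us = one_chunks / changesets * 1e6

# How many times slower N * _chunk() is than a single _chunks() call.
ratio_before = n_chunk_before / one_chunks
ratio_after = n_chunk_after / one_chunks

# Extrapolation to a repository with one million changesets, in seconds.
extrapolated_n_chunk_s = per_rev_n_chunk_us * 1_000_000 / 1e6
```

Rounded to three decimals, the per-changeset costs are the 1.502us and 0.039us quoted in the message, and the ratios fall between 38 and 39 before the patch and between 26 and 27 after it.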
Sun, 23 Oct 2016 09:34:55 -0700 revlog: reorder index accessors to match data structure order
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 23 Oct 2016 09:34:55 -0700] rev 30301
revlog: reorder index accessors to match data structure order Index entries are ordered tuples. We have accessors in the revlog class to map tuple offsets to names. To help reinforce the order, reorder the methods so they match the order of elements in the tuple. While I'm here, also sneak in some minimal documentation.
Thu, 03 Nov 2016 15:17:02 +0100 color: add the ability to display configured style to 'debugcolor'
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 15:17:02 +0100] rev 30300
color: add the ability to display configured style to 'debugcolor' The 'hg debugcolor' command gains a '--style' flag to display all the configured labels and their styles. This has many benefits:
* discovering documented labels,
* checking consistency between labels' styles,
* showing the actual style of a label.
Thu, 03 Nov 2016 15:15:47 +0100 color: sort output of 'debugcolor'
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 15:15:47 +0100] rev 30299
color: sort output of 'debugcolor' The previous ordering was provided by the set. The new output is more stable and rational. In addition, we have some logic to keep the '_background' versions together to help readability.
Thu, 03 Nov 2016 14:48:47 +0100 color: extract color and effect display from 'debugcolor'
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 14:48:47 +0100] rev 30298
color: extract color and effect display from 'debugcolor' We are about to introduce a second mode for 'hg debugcolor' that would list the known labels and their configuration, so we split the code related to color and effect out of the main function.
Thu, 03 Nov 2016 14:29:19 +0100 color: restore _style global after debugcolor ran
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 14:29:19 +0100] rev 30297
color: restore _style global after debugcolor ran Before this change, running 'debugcolor' would destroy all color styles for the rest of the process's life. We now properly back up and restore the variable content. Using a global variable is sketchy in general and could probably be removed. However, this is a quest for another adventure.
Thu, 03 Nov 2016 14:12:32 +0100 color: add basic documentation to 'debugcolor'
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 14:12:32 +0100] rev 30296
color: add basic documentation to 'debugcolor' This does not hurt.
Thu, 03 Nov 2016 05:12:23 +0100 tests: merge 'test-push-hook-lock.t' into 'test-push.t'
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 05:12:23 +0100] rev 30295
tests: merge 'test-push-hook-lock.t' into 'test-push.t' That test file is very small and is merged into the new 'test-push.t'. No logic is changed. We don't register this as a copy because it is actually a "ypoc", merging two files together without replacing the destination, and Mercurial cannot express that.
Thu, 03 Nov 2016 05:10:14 +0100 tests: merge 'test-push-validation.t' into 'test-push.t'
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 05:10:14 +0100] rev 30294
tests: merge 'test-push-validation.t' into 'test-push.t' That test file is very small and is merged into the new 'test-push.t'. No logic is changed, but repository names are updated to avoid collisions. We don't register this as a copy because it is actually a "ypoc", merging two files together without replacing the destination, and Mercurial cannot express that.
Thu, 03 Nov 2016 04:58:46 +0100 test: rename 'test-push-r.t' to 'test-push.t'
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 04:58:46 +0100] rev 30293
test: rename 'test-push-r.t' to 'test-push.t' We do not have a simple test for 'hg push' but we have multiple tiny tests for various aspects of it. We'll unify them into a single file, and we start with 'test-push-r.t'. The code is unchanged but we renamed the repository used to avoid collision with other tests we'll import in coming changesets.
Test timing for the record:
start   end     cuser   csys    real    Test
1.850   2.640   0.650   0.090   0.790   test-push-validation.t
2.640   3.520   0.760   0.090   0.880   test-push-hook-lock.t
0.000   1.850   1.560   0.210   1.850   test-push-r.t
Thu, 03 Nov 2016 05:05:34 +0100 tests: simplify command script in 'test-push-r.t'
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Thu, 03 Nov 2016 05:05:34 +0100] rev 30292
tests: simplify command script in 'test-push-r.t' I came across this code by chance. The script of this test is a bit messy with a lot of unnecessary intermediate commands. We simplify the script and unify repository access through '-R'. In the process the update after the unbundle is dropped as it does not add anything to the tests.
Thu, 03 Nov 2016 03:12:57 +0530 py3: use encoding.environ in ui.py
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 03 Nov 2016 03:12:57 +0530] rev 30291
py3: use encoding.environ in ui.py Using the source transformer, we add b'' everywhere, so there is no chance that those byte strings will work with os.environ on Py3, as that returns a dict of unicodes. We are relying on the errors; even if no error is raised in the future, these pieces of code will tend to do the wrong thing. If statements can evaluate to the wrong boolean, and certain errors can be raised while using this code. Let's not wait for them to happen; fix what is wrong. If this patch goes in, I will try to do it for all the cases. Leaving it as it is would be buggy.
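The mismatch between bytes keys and a str-keyed environment can be sketched with a plain dict standing in for the process environment. The helper name and the UTF-8 encoding are assumptions for this example; the real os.environ object may react differently to bytes keys than a plain dict does.

```python
def bytes_environ(env):
    """Build a bytes-keyed, bytes-valued copy of a str environment mapping."""
    return {k.encode('utf-8'): v.encode('utf-8') for k, v in env.items()}


# Plain dict standing in for a str-keyed environment.
sample_env = {'HGUSER': 'alice'}

# A bytes key silently misses in the str-keyed mapping.
naive_lookup = sample_env.get(b'HGUSER')

# The bytes copy answers the same lookup as intended.
fixed_lookup = bytes_environ(sample_env).get(b'HGUSER')
```

The silent miss is the kind of wrong boolean the message warns about: a check such as `if b'HGUSER' in env` would quietly take the wrong branch.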
Thu, 03 Nov 2016 02:17:01 +0530 py3: make scmposix.userrcpath() return bytes
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 03 Nov 2016 02:17:01 +0530] rev 30290
py3: make scmposix.userrcpath() return bytes We are making sure that we deal with bytes as much as we can. This is a part of fixing functions so that they return bytes if they have to. Used encoding.environ to return bytes. After this patch, scmposix.userrcpath() returns bytes and scmutil.osrcpath() will also return bytes if the platform is posix. Functions in scmposix return bytes on Python 3 now.