Gregory Szorc <gregory.szorc@gmail.com> [Mon, 13 Nov 2017 21:48:35 -0800] rev 35118
bundle2: inline changegroup.readexactly()
Profiling reveals this loop is pretty tight. Literally any
function call elimination can make a big difference.
This commit inlines the relatively trivial changegroup.readexactly()
method inside the loop.
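
As a rough illustration of the kind of inlining described above, here
is a minimal sketch. The names ShortReadError, iter_chunks_with_call
and iter_chunks_inlined are hypothetical, and this readexactly() is a
simplified stand-in that only assumes the helper reads n bytes and
fails on a short read; it is not the actual Mercurial code.

```python
import io


class ShortReadError(Exception):
    """Hypothetical error type raised when the stream ends early."""


def readexactly(stream, n):
    # Simplified stand-in for the helper: read n bytes or fail.
    data = stream.read(n)
    if len(data) < n:
        raise ShortReadError(
            "stream ended unexpectedly (got %d bytes, expected %d)"
            % (len(data), n)
        )
    return data


def iter_chunks_with_call(stream, sizes):
    # Before: one extra Python function call per loop iteration.
    for n in sizes:
        yield readexactly(stream, n)


def iter_chunks_inlined(stream, sizes):
    # After: the helper body is copied into the loop, and the bound
    # method is looked up once instead of on every iteration.
    read = stream.read
    for n in sizes:
        data = read(n)
        if len(data) < n:
            raise ShortReadError(
                "stream ended unexpectedly (got %d bytes, expected %d)"
                % (len(data), n)
            )
        yield data
```

Both generators yield the same chunks for the same input; the inlined
variant trades a little duplication for one fewer call per iteration.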
The results with `hg perfbundleread` on a bundle of the Firefox repo
speak for themselves (where two timings are listed, the first is
before this commit and the second is after):
! read(8k)
! wall 0.679730 comb 0.680000 user 0.140000 sys 0.540000 (best of 15)
! read(16k)
! wall 0.577228 comb 0.570000 user 0.080000 sys 0.490000 (best of 17)
! read(32k)
! wall 0.516060 comb 0.520000 user 0.040000 sys 0.480000 (best of 20)
! read(128k)
! wall 0.496378 comb 0.490000 user 0.010000 sys 0.480000 (best of 20)
! bundle2 iterparts()
! wall 3.460903 comb 3.460000 user 2.760000 sys 0.700000 (best of 3)
! wall 3.056811 comb 3.050000 user 2.340000 sys 0.710000 (best of 4)
! bundle2 iterparts() seekable
! wall 4.312722 comb 4.310000 user 3.480000 sys 0.830000 (best of 3)
! wall 4.007676 comb 4.000000 user 3.170000 sys 0.830000 (best of 3)
! bundle2 part seek()
! wall 6.754764 comb 6.740000 user 3.970000 sys 2.770000 (best of 3)
! wall 6.267110 comb 6.250000 user 3.480000 sys 2.770000 (best of 3)
! bundle2 part read(8k)
! wall 3.668004 comb 3.660000 user 2.960000 sys 0.700000 (best of 3)
! wall 3.404164 comb 3.400000 user 2.650000 sys 0.750000 (best of 3)
! bundle2 part read(16k)
! wall 3.489196 comb 3.480000 user 2.750000 sys 0.730000 (best of 3)
! wall 3.197972 comb 3.200000 user 2.490000 sys 0.710000 (best of 4)
! bundle2 part read(32k)
! wall 3.388569 comb 3.380000 user 2.640000 sys 0.740000 (best of 3)
! wall 3.060557 comb 3.060000 user 2.340000 sys 0.720000 (best of 4)
! bundle2 part read(128k)
! wall 3.276415 comb 3.270000 user 2.560000 sys 0.710000 (best of 4)
! wall 2.952209 comb 2.950000 user 2.230000 sys 0.720000 (best of 4)
Differential Revision: https://phab.mercurial-scm.org/D1392
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 13 Nov 2017 22:05:54 -0800] rev 35117
bundle2: inline debug logging
Profiling revealed that repeated calls to indebug() were
consuming a fair amount of CPU during bundle2 reading, with
most of the time spent in ui.configbool().
Inlining indebug() and avoiding extra attribute lookups speeds
things up substantially. Using `hg perfbundleread` with a Firefox
bundle (where two timings are listed, the first is before this
commit and the second is after):
! read(8k)
! wall 0.679730 comb 0.680000 user 0.140000 sys 0.540000 (best of 15)
! read(16k)
! wall 0.577228 comb 0.570000 user 0.080000 sys 0.490000 (best of 17)
! read(32k)
! wall 0.516060 comb 0.520000 user 0.040000 sys 0.480000 (best of 20)
! read(128k)
! wall 0.496378 comb 0.490000 user 0.010000 sys 0.480000 (best of 20)
! bundle2 iterparts()
! wall 6.983756 comb 6.980000 user 6.220000 sys 0.760000 (best of 3)
! wall 3.460903 comb 3.460000 user 2.760000 sys 0.700000 (best of 3)
! bundle2 iterparts() seekable
! wall 8.132131 comb 8.110000 user 7.160000 sys 0.950000 (best of 3)
! wall 4.312722 comb 4.310000 user 3.480000 sys 0.830000 (best of 3)
! bundle2 part seek()
! wall 10.860942 comb 10.840000 user 7.790000 sys 3.050000 (best of 3)
! wall 6.754764 comb 6.740000 user 3.970000 sys 2.770000 (best of 3)
! bundle2 part read(8k)
! wall 7.258035 comb 7.260000 user 6.470000 sys 0.790000 (best of 3)
! wall 3.668004 comb 3.660000 user 2.960000 sys 0.700000 (best of 3)
! bundle2 part read(16k)
! wall 7.099891 comb 7.080000 user 6.310000 sys 0.770000 (best of 3)
! wall 3.489196 comb 3.480000 user 2.750000 sys 0.730000 (best of 3)
! bundle2 part read(32k)
! wall 6.964685 comb 6.950000 user 6.130000 sys 0.820000 (best of 3)
! wall 3.388569 comb 3.380000 user 2.640000 sys 0.740000 (best of 3)
! bundle2 part read(128k)
! wall 6.852867 comb 6.850000 user 6.060000 sys 0.790000 (best of 3)
! wall 3.276415 comb 3.270000 user 2.560000 sys 0.710000 (best of 4)
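
As a rough illustration of the indebug() inlining described in this
commit message, here is a minimal sketch. FakeUI is a hypothetical
stand-in for the real ui object, and the config key, message prefix,
and function names are assumptions made for the example; the point is
only that the config lookup moves out of the loop and the debug method
is bound once.

```python
class FakeUI:
    """Hypothetical stand-in for a ui object; counts config lookups."""

    def __init__(self, debug_enabled):
        self._config = {("devel", "bundle2.debug"): debug_enabled}
        self.lookups = 0
        self.messages = []

    def configbool(self, section, name):
        self.lookups += 1
        return bool(self._config.get((section, name), False))

    def debug(self, msg):
        self.messages.append(msg)


def indebug(ui, message):
    # Helper form: performs a config lookup on every call.
    if ui.configbool("devel", "bundle2.debug"):
        ui.debug("bundle2-input: %s\n" % message)


def process_with_helper(ui, chunks):
    # Before: each iteration pays for a call plus a config lookup.
    total = 0
    for chunk in chunks:
        indebug(ui, "payload chunk size: %d" % len(chunk))
        total += len(chunk)
    return total


def process_inlined(ui, chunks):
    # After: resolve the flag and the bound method once, then test a
    # local variable inside the loop.
    dolog = ui.configbool("devel", "bundle2.debug")
    debug = ui.debug
    total = 0
    for chunk in chunks:
        if dolog:
            debug("bundle2-input: payload chunk size: %d\n" % len(chunk))
        total += len(chunk)
    return total
```

With logging disabled, the inlined loop also skips formatting the
message entirely, which the helper form cannot do because its argument
is built before the call.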
Differential Revision: https://phab.mercurial-scm.org/D1391