bundle2: inline struct operations
authorGregory Szorc <gregory.szorc@gmail.com>
Mon, 13 Nov 2017 21:54:46 -0800
changeset 35119 764e3ad1cf54
parent 35118 1fb0846ad792
child 35120 699b2a759319
bundle2: inline struct operations Before, we were calling struct.unpack() (via an alias) on every loop iteration. I'm not sure what Python does under the hood, but it would have to look at the struct format and determine what to do. This commit establishes a struct.Struct instance and reuses it for struct reading. We can see the impact from running `hg perfbundleread` on a Firefox bundle: ! read(8k) ! wall 0.679730 comb 0.680000 user 0.140000 sys 0.540000 (best of 15) ! read(16k) ! wall 0.577228 comb 0.570000 user 0.080000 sys 0.490000 (best of 17) ! read(32k) ! wall 0.516060 comb 0.520000 user 0.040000 sys 0.480000 (best of 20) ! read(128k) ! wall 0.496378 comb 0.490000 user 0.010000 sys 0.480000 (best of 20) ! bundle2 iterparts() ! wall 3.056811 comb 3.050000 user 2.340000 sys 0.710000 (best of 4) ! wall 2.992605 comb 2.990000 user 2.260000 sys 0.730000 (best of 4) ! bundle2 iterparts() seekable ! wall 4.007676 comb 4.000000 user 3.170000 sys 0.830000 (best of 3) ! wall 3.863810 comb 3.860000 user 3.000000 sys 0.860000 (best of 3) ! bundle2 part seek() ! wall 6.267110 comb 6.250000 user 3.480000 sys 2.770000 (best of 3) ! wall 6.213387 comb 6.200000 user 3.350000 sys 2.850000 (best of 3) ! bundle2 part read(8k) ! wall 3.404164 comb 3.400000 user 2.650000 sys 0.750000 (best of 3) ! wall 3.241099 comb 3.250000 user 2.560000 sys 0.690000 (best of 3) ! bundle2 part read(16k) ! wall 3.197972 comb 3.200000 user 2.490000 sys 0.710000 (best of 4) ! wall 3.003930 comb 3.000000 user 2.270000 sys 0.730000 (best of 4) ! bundle2 part read(32k) ! wall 3.060557 comb 3.060000 user 2.340000 sys 0.720000 (best of 4) ! wall 2.904695 comb 2.900000 user 2.160000 sys 0.740000 (best of 4) ! bundle2 part read(128k) ! wall 2.952209 comb 2.950000 user 2.230000 sys 0.720000 (best of 4) ! wall 2.776140 comb 2.780000 user 2.070000 sys 0.710000 (best of 4) Profiling now says most remaining time is spent in util.chunkbuffer. I already heavily optimized that data structure several releases ago. So we'll likely get little more performance out of bundle2 reading while still retaining util.chunkbuffer(). Differential Revision: https://phab.mercurial-scm.org/D1393
mercurial/bundle2.py
--- a/mercurial/bundle2.py	Mon Nov 13 21:48:35 2017 -0800
+++ b/mercurial/bundle2.py	Mon Nov 13 21:54:46 2017 -0800
@@ -1196,11 +1196,14 @@
     dolog = ui.configbool('devel', 'bundle2.debug')
     debug = ui.debug
 
-    headersize = struct.calcsize(_fpayloadsize)
+    headerstruct = struct.Struct(_fpayloadsize)
+    headersize = headerstruct.size
+    unpack = headerstruct.unpack
+
     readexactly = changegroup.readexactly
     read = fh.read
 
-    chunksize = _unpack(_fpayloadsize, readexactly(fh, headersize))[0]
+    chunksize = unpack(readexactly(fh, headersize))[0]
     indebug(ui, 'payload chunk size: %i' % chunksize)
 
     # changegroup.readexactly() is inlined below for performance.
@@ -1227,7 +1230,7 @@
                                 ' (got %d bytes, expected %d)') %
                               (len(s), chunksize))
 
-        chunksize = _unpack(_fpayloadsize, s)[0]
+        chunksize = unpack(s)[0]
 
         # indebug() inlined for performance.
         if dolog: