Gregory Szorc <gregory.szorc@gmail.com> [Mon, 13 Nov 2017 20:03:02 -0800] rev 35114
bundle2: implement consume() API on unbundlepart
We want bundle parts to not be seekable by default. That means
eliminating the generic seek() method.
A common pattern in bundle2.py is to seek to the end of the part
data. This is mainly used by the part iteration code to ensure
the underlying stream is advanced to the next bundle part.
In this commit, we establish a dedicated API for consuming a
bundle2 part's data. We switch users of seek() to it.
The old implementation of seek(0, os.SEEK_END) would effectively
call self.read(). The new implementation calls self.read(32768)
in a loop. The old implementation would therefore assemble a
buffer to hold all remaining data being seeked over. For seeking
over large bundle parts, this would involve a large allocation and
a lot of overhead to collect intermediate data! This overhead can
be seen in the results for `hg perfbundleread`:
! bundle2 iterparts()
! wall 10.891305 comb 10.820000 user 7.990000 sys 2.830000 (best of 3)
! wall 8.070791 comb 8.060000 user 7.180000 sys 0.880000 (best of 3)
! bundle2 part seek()
! wall 12.991478 comb 10.390000 user 7.720000 sys 2.670000 (best of 3)
! wall 10.370142 comb 10.350000 user 7.430000 sys 2.920000 (best of 3)
Of course, skipping over large payload data is likely not very common,
so I doubt the performance wins will be observed in the wild.
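As a rough illustration of the chunked approach (a sketch, not the actual bundle2.py code), the loop below drains any file-like object in fixed-size reads, so no buffer ever holds all of the remaining data. The standalone `consume` function and its `chunksize` parameter are hypothetical names chosen for this example; only the 32768-byte read size comes from the description above.

```python
import io


def consume(fh, chunksize=32768):
    """Read and discard everything left in fh; return the byte count skipped.

    Hypothetical helper: each read is dropped immediately, so peak memory
    is bounded by chunksize rather than by the size of the remaining data.
    """
    skipped = 0
    while True:
        chunk = fh.read(chunksize)
        if not chunk:
            break
        skipped += len(chunk)
    return skipped


part = io.BytesIO(b"x" * 100000)
part.read(10)          # pretend a caller looked at the first few bytes
print(consume(part))   # number of bytes skipped to reach the end
```

By contrast, a bare `fh.read()` would return the remaining data as one bytes object, which is the large allocation the message describes.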
Differential Revision: https://phab.mercurial-scm.org/D1388
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 12 Nov 2017 19:46:15 -0800] rev 35113
bundle2: implement generic part payload decoder
The previous commit extracted _payloadchunks() to a new derived class.
There was still a reference to this method in unbundlepart, making
unbundlepart unusable on its own.
This commit implements a generic version of a bundle2 part payload
decoder, without offset tracking. seekableunbundlepart._payloadchunks()
has been refactored to consume it, adding offset tracking like before.
We also implement unbundlepart._payloadchunks(), which is a thin
wrapper for it. Since we never instantiate unbundlepart directly,
this new method is not used. This will be changed in subsequent
commits.
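A minimal sketch of this layering, under stated assumptions: the framing used here (each chunk prefixed by a big-endian signed 32-bit length, with a zero length ending the payload) is a simplification for illustration, negative sizes are simply rejected, and the function names `decodepayloadchunks` and `decodewithoffsets` are hypothetical. The generic decoder knows nothing about offsets; the wrapper adds offset tracking on top of it.

```python
import io
import struct


def decodepayloadchunks(fh):
    """Yield payload chunks from a length-prefixed stream (assumed framing)."""
    headersize = struct.calcsize(">i")
    read = fh.read  # local alias avoids repeated attribute lookups
    while True:
        header = read(headersize)
        if len(header) < headersize:
            raise ValueError("stream ended before the terminating chunk")
        (size,) = struct.unpack(">i", header)
        if size == 0:
            return  # a zero-length chunk marks the end of the payload
        if size < 0:
            raise ValueError("unsupported chunk size: %d" % size)
        data = read(size)
        if len(data) < size:
            raise ValueError("truncated chunk")
        yield data


def decodewithoffsets(fh, index):
    """Wrap the generic decoder, appending (payload offset, stream offset)
    to index for every chunk. fh must support tell()."""
    payloadpos = 0
    streampos = fh.tell()
    for chunk in decodepayloadchunks(fh):
        index.append((payloadpos, streampos))
        payloadpos += len(chunk)
        streampos = fh.tell()  # start of the next chunk header
        yield chunk


raw = struct.pack(">i", 3) + b"abc" + struct.pack(">i", 2) + b"de" + struct.pack(">i", 0)
index = []
chunks = list(decodewithoffsets(io.BytesIO(raw), index))
print(chunks, index)
```

Callers that only need a linear read can use the generic decoder directly and skip the cost of building the index.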
The new implementation also inlines some simple code from unpackermixin
and adds some local variables to prevent extra function calls and
attribute lookups. `hg perfbundleread` on an uncompressed Firefox
bundle seems to show a minor win:
! bundle2 iterparts()
! wall 12.593258 comb 12.250000 user 8.870000 sys 3.380000 (best of 3)
! wall 10.891305 comb 10.820000 user 7.990000 sys 2.830000 (best of 3)
! bundle2 part seek()
! wall 13.173163 comb 11.100000 user 8.390000 sys 2.710000 (best of 3)
! wall 12.991478 comb 10.390000 user 7.720000 sys 2.670000 (best of 3)
! bundle2 part read(8k)
! wall 9.483612 comb 9.480000 user 8.420000 sys 1.060000 (best of 3)
! wall 8.599892 comb 8.580000 user 7.720000 sys 0.860000 (best of 3)
! bundle2 part read(16k)
! wall 9.159815 comb 9.150000 user 8.220000 sys 0.930000 (best of 3)
! wall 8.265361 comb 8.250000 user 7.360000 sys 0.890000 (best of 3)
! bundle2 part read(32k)
! wall 9.141308 comb 9.130000 user 8.220000 sys 0.910000 (best of 3)
! wall 8.290308 comb 8.280000 user 7.330000 sys 0.950000 (best of 3)
! bundle2 part read(128k)
! wall 8.880587 comb 8.850000 user 7.960000 sys 0.890000 (best of 3)
! wall 8.204900 comb 8.150000 user 7.210000 sys 0.940000 (best of 3)
Function call overhead in Python strikes again!
Of course, bundle2 decoding CPU overhead is likely small compared to
decompression and changegroup application. But every little bit helps.
Differential Revision: https://phab.mercurial-scm.org/D1387
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 13 Nov 2017 19:22:11 -0800] rev 35112
bundle2: extract logic for seeking bundle2 part into own class
Currently, unbundlepart classes support bi-directional seeking.
Most consumers of unbundlepart only ever seek forward - typically
as part of moving to the end of the bundle part so they can move
on to the next one. But regardless of the actual usage of the
part, instances maintain an index mapping offsets within the
underlying raw payload to offsets within the decoded payload.
Maintaining the mapping of offset data can be expensive in terms of
memory use. Furthermore, many bundle2 consumers don't have access
to an underlying seekable stream. This includes all compressed
bundles. So maintaining offset data when the underlying stream
can't be seeked anyway is wasteful. And since many bundle2 streams
can't be seeked, it seems like a bad idea to expose a seek API
in bundle2 parts by default. If you provide one, people will
attempt to use it.
Seekable bundle2 parts should be the exception, not the rule. This
commit starts the process of dividing unbundlepart into 2 classes: a
base class that supports linear, one-time reads and a child class
that supports bi-directional seeking. In this first commit, we
split various methods and attributes out into a new
"seekableunbundlepart" class. Previous instantiators of "unbundlepart"
now instantiate "seekableunbundlepart." This preserves backwards
compatibility. The coupling between the classes is still tight:
"unbundlepart" cannot be used on its own. This will be addressed
in subsequent commits.
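The intended end state of the split could look roughly like the sketch below. The class names `LinearPart` and `SeekablePart` are hypothetical stand-ins, not the real unbundlepart classes, and the seek logic is far simpler than anything that has to map raw offsets to decoded offsets. The point is only that the base class exposes no seek() at all, so callers cannot depend on it by accident.

```python
import io
import os


class LinearPart:
    """Forward-only reader: supports read() and nothing else."""

    def __init__(self, fh):
        self._fh = fh

    def read(self, size=None):
        if size is None:
            return self._fh.read()
        return self._fh.read(size)


class SeekablePart(LinearPart):
    """Adds tell()/seek(); requires a seekable underlying stream."""

    def __init__(self, fh):
        super().__init__(fh)
        self._start = fh.tell()  # where this part's data begins

    def tell(self):
        return self._fh.tell() - self._start

    def seek(self, offset, whence=os.SEEK_SET):
        if whence == os.SEEK_SET:
            target = self._start + offset
        elif whence == os.SEEK_CUR:
            target = self._fh.tell() + offset
        else:
            raise ValueError("unsupported whence: %r" % whence)
        # Never seek to before the start of this part.
        self._fh.seek(max(target, self._start))
        return self.tell()


stream = io.BytesIO(b"hdrpayload")
stream.read(3)                 # skip a pretend 3-byte header
part = SeekablePart(stream)
print(part.read(3), part.tell())
part.seek(0)
print(part.read())
```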
Differential Revision: https://phab.mercurial-scm.org/D1386
Augie Fackler <augie@google.com> [Wed, 29 Nov 2017 17:49:08 -0500] rev 35111
merge with i18n
Wagner Bruna <wbruna@softwareexpress.com.br> [Tue, 21 Nov 2017 13:50:25 -0200] rev 35110
i18n-pt_BR: synchronized with cabc840ffdee
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 13 Nov 2017 19:20:34 -0800] rev 35109
perf: add command to benchmark bundle reading
Upcoming commits will be refactoring bundle2 I/O code.
This commit establishes a `hg perfbundleread` command that measures
how long it takes to read a bundle using various mechanisms.
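For a sense of what the plain read(N) measurements involve, here is a small standalone timing loop. It is only a sketch: `timeread` is a hypothetical helper, it times an in-memory buffer rather than a bundle file, and it does not use Mercurial's perf extension or reproduce its output format.

```python
import io
import time


def timeread(makefh, size, repeat=3):
    """Return the best wall-clock time to drain makefh() using read(size)."""
    best = None
    for _ in range(repeat):
        fh = makefh()  # fresh stream for every run
        start = time.perf_counter()
        while fh.read(size):
            pass
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


data = b"\0" * (4 * 1024 * 1024)  # stand-in for a bundle file
for size in (8192, 16384, 32768, 131072):
    best = timeread(lambda: io.BytesIO(data), size)
    print("read(%dk): wall %f (best of 3)" % (size // 1024, best))
```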
As a baseline, here's output from an uncompressed bundle1
bundle of my Firefox repo (7,098,622,890 bytes):
! read(8k)
! wall 0.763481 comb 0.760000 user 0.160000 sys 0.600000 (best of 6)
! read(16k)
! wall 0.644512 comb 0.640000 user 0.110000 sys 0.530000 (best of 16)
! read(32k)
! wall 0.581172 comb 0.590000 user 0.060000 sys 0.530000 (best of 18)
! read(128k)
! wall 0.535183 comb 0.530000 user 0.010000 sys 0.520000 (best of 19)
! cg1 deltaiter()
! wall 0.873500 comb 0.880000 user 0.840000 sys 0.040000 (best of 12)
! cg1 getchunks()
! wall 6.283797 comb 6.270000 user 5.570000 sys 0.700000 (best of 3)
! cg1 read(8k)
! wall 1.097173 comb 1.100000 user 0.400000 sys 0.700000 (best of 10)
! cg1 read(16k)
! wall 0.810750 comb 0.800000 user 0.200000 sys 0.600000 (best of 13)
! cg1 read(32k)
! wall 0.671215 comb 0.670000 user 0.110000 sys 0.560000 (best of 15)
! cg1 read(128k)
! wall 0.597857 comb 0.600000 user 0.020000 sys 0.580000 (best of 15)
And from an uncompressed bundle2 bundle (6,070,036,163 bytes):
! read(8k)
! wall 0.676997 comb 0.680000 user 0.160000 sys 0.520000 (best of 15)
! read(16k)
! wall 0.592706 comb 0.590000 user 0.080000 sys 0.510000 (best of 17)
! read(32k)
! wall 0.529395 comb 0.530000 user 0.050000 sys 0.480000 (best of 16)
! read(128k)
! wall 0.491270 comb 0.490000 user 0.010000 sys 0.480000 (best of 19)
! bundle2 forwardchunks()
! wall 2.997131 comb 2.990000 user 2.270000 sys 0.720000 (best of 4)
! bundle2 iterparts()
! wall 12.247197 comb 10.670000 user 8.170000 sys 2.500000 (best of 3)
! bundle2 part seek()
! wall 11.761675 comb 10.500000 user 8.240000 sys 2.260000 (best of 3)
! bundle2 part read(8k)
! wall 9.116163 comb 9.110000 user 8.240000 sys 0.870000 (best of 3)
! bundle2 part read(16k)
! wall 8.984362 comb 8.970000 user 8.110000 sys 0.860000 (best of 3)
! bundle2 part read(32k)
! wall 8.758364 comb 8.740000 user 7.860000 sys 0.880000 (best of 3)
! bundle2 part read(128k)
! wall 8.749040 comb 8.730000 user 7.830000 sys 0.900000 (best of 3)
We already see some interesting data. Notably, bundle2 has
significant overhead compared to bundle1. This matters for e.g. stream
clone bundles, which can be applied at >1Gbps.
Differential Revision: https://phab.mercurial-scm.org/D1385
Zuzanna Mroczek <zuza@fb.com> [Mon, 20 Nov 2017 01:40:26 -0800] rev 35108
sshpeer: add a configurable hint for the ssh error message
Add the ability to configure a hint that is shown when there is a problem with SSH. An example of such a hint is "Please see http://company/internalwiki/ssh.html".
Test Plan:
- Ran hg pull with broken link and verified the output has no hint by default:
```
pulling from ssh://brokenrepository.com//repo
remote: ssh: Could not resolve hostname brokenrepository.com: Name or service not known
abort: no suitable response from remote hg!
```
- Ran hg pull --config ui.ssherrorhint="Please see http://company/internalwiki/ssh.html" and verified the output includes the hint:
```
pulling from ssh://brokenrepository.com//repo
remote: ssh: Could not resolve hostname brokenrepository.com: Name or service not known
abort: no suitable response from remote hg!
(Please see http://company/internalwiki/ssh.html)
```
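The behavior exercised by the test plan can be sketched as follows. This is an illustration only, not the sshpeer code: `formatabort` is a hypothetical function and the configuration is modeled as a plain dict; only the ui.ssherrorhint name and the message text come from the output above.

```python
def formatabort(message, config):
    """Build the abort output, appending the configured hint if there is one."""
    lines = ["abort: %s" % message]
    hint = config.get("ui.ssherrorhint")
    if hint:
        # The hint is shown in parentheses on its own line.
        lines.append("(%s)" % hint)
    return "\n".join(lines)


msg = "no suitable response from remote hg!"
print(formatabort(msg, {}))
print(formatabort(msg, {"ui.ssherrorhint": "Please see http://company/internalwiki/ssh.html"}))
```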
Differential Revision: https://phab.mercurial-scm.org/D1431