py3: avoid use of basestring
"In this case, result is a source variable of a list to be returned, it
shouldn't be unicode. Hence we can use bytes instead of basestring here." -Yuya
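A minimal sketch of the kind of check being described, assuming a hypothetical helper named aslist (not the function touched by this patch). Because the value can never be unicode, testing against bytes is enough, and it works on both Python 2 (where bytes is str) and Python 3 (where basestring does not exist):

```python
def aslist(result):
    """Return result as a list, wrapping a single bytes value.

    Hypothetical helper for illustration only.
    """
    # `basestring` is gone on Python 3. The value is never unicode here,
    # so `bytes` is a sufficient (and portable) type to test against.
    if isinstance(result, bytes):
        return [result]
    return list(result)
```

Without the isinstance check, list(b'abc') on Python 3 would yield a list of integers rather than a one-element list.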
py3: use unicodes in __slots__
__slots__ in Python 3 accepts only unicodes, and there is no harm in using
unicodes in __slots__. So just adding u'' is fine. Previous occurrences of this
problem were treated the same way.
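For illustration, a small sketch with a made-up class named node (not one of the classes changed by this patch). Python 3 rejects bytes slot names with a TypeError, while u''-prefixed names are accepted on both Python 2 and Python 3:

```python
class node(object):
    # u'' keeps the slot names as text regardless of how unprefixed
    # string literals are treated by the surrounding code base.
    __slots__ = (u'name', u'parent')

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
```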
memctx: allow the metadataonlyctx that's reusing the manifest node
When we have a lot of files, writing a new manifest revision can be expensive.
This commit makes it possible for memctx to reuse a manifest from a different
commit. This can be beneficial for commands that create metadata changes
without changing any actual files, like "hg metaedit" in the evolve extension.
I will send the change for evolve that leverages this once this is accepted.
localrepo: make it possible to reuse manifest when committing context
This makes the commit function understand a context that reuses a manifest.
httppeer: assign Vary request header last
In preparation for adding another value to it in a subsequent patch.
While I was here, I added some empty lines because walls of text
are hard to read.
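The pattern can be sketched roughly as follows. The function name buildheaders and the header names in the usage note are invented for illustration and are not taken from httppeer. The idea is to collect the names of varying headers in a list and assign Vary once at the end, so later code can add another value without rewriting the header:

```python
def buildheaders(argheaders, extraheaders=None):
    """Build a request header dict, assigning Vary last.

    Hypothetical sketch: both arguments are lists of (name, value) pairs.
    """
    headers = {}
    varyheaders = []

    for name, value in argheaders:
        headers[name] = value
        varyheaders.append(name)

    # A later change can contribute more varying headers here.
    for name, value in (extraheaders or []):
        headers[name] = value
        varyheaders.append(name)

    # Assigned last, once every contributor is known.
    if varyheaders:
        headers['Vary'] = ','.join(varyheaders)

    return headers
```

For example, buildheaders([('X-Example-Arg-1', 'a=1')], [('X-Example-Proto', 'v1')]) produces a Vary value listing both header names.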
wireproto: only advertise HTTP-specific capabilities to HTTP peers (BC)
Previously, the capabilities list was protocol agnostic and we
advertised the same capabilities list to all clients, regardless of
transport protocol.
A few capabilities are specific to HTTP. I see no good reason why we
should advertise them to SSH clients. So this patch limits their
advertisement to HTTP clients.
This patch is BC, but SSH clients shouldn't be using the removed
capabilities so there should be no impact.
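A rough sketch of the behavior, assuming a hypothetical capabilities() function that receives the transport name. The capability names below are placeholders chosen for illustration and are not the actual lists used by wireproto:

```python
# Placeholder capability lists, for illustration only.
COMMONCAPS = ['lookup', 'branchmap']
HTTPONLYCAPS = ['httpheader=1024']


def capabilities(protoname):
    """Return the capabilities to advertise for the given transport name."""
    caps = list(COMMONCAPS)
    # HTTP-specific capabilities are only advertised to HTTP peers.
    if protoname == 'http':
        caps.extend(HTTPONLYCAPS)
    return caps
```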
protocol: declare transport protocol name
We add an attribute to the HTTP and SSH protocol implementations
identifying the transport so future patches can conditionally
expose capabilities on a per-transport basis.
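As a sketch of the idea, with invented class names that stand in for the real protocol implementations, each transport carries a name attribute that callers can branch on:

```python
class httptransport(object):
    # Hypothetical stand-in for the HTTP protocol implementation.
    name = 'http'


class sshtransport(object):
    # Hypothetical stand-in for the SSH protocol implementation.
    name = 'ssh'


def ishttp(proto):
    """Return True if proto declares itself as the HTTP transport."""
    return proto.name == 'http'
```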
bdiff: early pruning of common prefix before doing expensive computations
It seems quite common that files don't change completely. New lines are often
pretty much appended, and modifications will often only change a small section
of the file which on average will be in the middle.
There can thus be a big win by pruning a common prefix before starting the more
expensive search for longest common substrings.
Worst case, it will scan through a long sequence of similar bytes without
encountering a newline. Splitlines will then have to do the same again ...
twice for each side. If similar lines are found, splitlines will save the
double iteration and hashing of the lines ... plus there will be fewer lines to
find common substrings in.
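The pruning step can be sketched in Python as below. This is an illustration of the idea under the assumption that the prefix is cut at the last newline seen before the first differing byte; it is not the C code in bdiff, and the function names are made up. It assumes Python 3 bytes, where indexing yields integers:

```python
NEWLINE = ord(b'\n')


def commonprefixlen(a, b):
    """Length of the common prefix of a and b, cut back to a line boundary.

    Scans until the first differing byte and remembers the position just
    after the last newline seen. Returns 0 if no full line is shared.
    """
    limit = min(len(a), len(b))
    lastnl = 0
    i = 0
    while i < limit and a[i] == b[i]:
        if a[i] == NEWLINE:
            lastnl = i + 1
        i += 1
    return lastnl


def pruneprefix(a, b):
    """Split off the shared leading lines; return (prefix, resta, restb)."""
    n = commonprefixlen(a, b)
    return a[:n], a[n:], b[n:]
```

Only the two remainders would then need to be split into lines and searched for common substrings. The worst case mentioned above shows up here as a long scan that never sees a newline, which returns 0 and leaves both inputs untouched.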
This change might in some cases make the algorithm pick shorter or less optimal
common substrings. We can't have the cake and eat it.
This makes hg --time bundle --base null -r 4.0 go from 14.5 to 15 s - a 3%
increase.
On mozilla-unified:
perfbdiff -m 3041e4d59df2
! wall 0.053088 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) to
! wall 0.024618 comb 0.020000 user 0.020000 sys 0.000000 (best of 116)
perfbdiff 0e9928989e9c --alldata --count 10
! wall 0.702075 comb 0.700000 user 0.700000 sys 0.000000 (best of 15) to
! wall 0.579235 comb 0.580000 user 0.580000 sys 0.000000 (best of 18)