Matt Harbison <matt_harbison@yahoo.com> [Sun, 21 Oct 2018 22:26:00 -0400] rev 40386
lfs: consult the narrow matcher when extracting pointers from ctx (issue5794)
I added a testcase for lfs to all narrow tests, and the following failed:
test-narrow-acl.t
test-narrow-exchange.t
test-narrow-patterns.t
test-narrow-strip.t
test-narrow-trackedcmd.t
test-narrow-widen.t
test-narrow.t
The first two still have errors in the pretxnchangegroup on clone and (receiving
a) push, which I'm still looking into (4d63f3bc1e1a fixed something in this area
already). These two modified tests seem to cover the things that failed in the
remaining narrow tests, i.e. `hg tracked` and `hg strip`, so I didn't bother
enabling the testcases elsewhere. Maybe we should, but it's 68 tests total.
Yuya Nishihara <yuya@tcha.org> [Sat, 20 Oct 2018 20:25:56 +0900] rev 40385
statprof: fix overflow while skipping boilerplate parts
I got IndexError randomly because of stack[i] where i = len(stack).
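A minimal sketch of the failure mode described above, not statprof's actual code: the names `skip_boilerplate` and `is_boilerplate` are hypothetical. It shows why the index must be bounds-checked before `stack[i]` is read.

```python
def skip_boilerplate(stack, is_boilerplate):
    """Return the index of the first non-boilerplate frame.

    Returns len(stack) when every frame is boilerplate.
    """
    i = 0
    # Check the bound *before* indexing. Without it, stack[i] raises
    # IndexError once i reaches len(stack).
    while i < len(stack) and is_boilerplate(stack[i]):
        i += 1
    return i
```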
Yuya Nishihara <yuya@tcha.org> [Sat, 20 Oct 2018 20:15:48 +0900] rev 40384
statprof: fix indent level of fp.write() (issue6004)
It was changed at 9d3034348c4f by mistake.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 19 Oct 2018 22:31:47 -0400] rev 40383
py3: stringify setupversion on Windows
This was stringified a few lines above for non Windows platforms, but `version`
remains bytes. The old code effectively undid the conversion, and triggered a
warning in setuptools when building.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 19 Oct 2018 23:47:38 -0400] rev 40382
tests: add coverage for some untested areas of hgweb
The fact that these mimetype guesses weren't blowing up anywhere on py3 prior to
9310037f0636 was the giveaway. The annotate function is a bit unusual in that
it renders the page with a 500 in the middle, so I left the HTML output. For
the other functions, checking the access log is enough.
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 19 Oct 2018 23:30:56 +0300] rev 40381
statprof: update the name as i increases (issue6003)
2864f8d3fcd6, while working on a py3 fix, moved the name building out of the
loop, so we were no longer building a new stack name for each i; instead we were
reusing the first one again and again.
The test changes show that the profile is now working.
Differential Revision: https://phab.mercurial-scm.org/D5172
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 19 Oct 2018 23:18:29 +0300] rev 40380
test: show more profile lines in test-profile.t
This shows that we don't output anything after the first line and demonstrates
issue6003.
Differential Revision: https://phab.mercurial-scm.org/D5171
Augie Fackler <augie@google.com> [Fri, 19 Oct 2018 11:45:51 -0400] rev 40379
keepalive: use getattr to avoid AttributeErrors when vcr is in use
Fixes test-phabricator.t.
Differential Revision: https://phab.mercurial-scm.org/D5160
Augie Fackler <augie@google.com> [Fri, 19 Oct 2018 11:45:25 -0400] rev 40378
phabricator: do more of the VCR work in demandimport.deactivated()
If I don't do this, VCR gets confused looking for pycurl and other
libraries. I have no idea how this ever worked.
Differential Revision: https://phab.mercurial-scm.org/D5159
Augie Fackler <augie@google.com> [Fri, 19 Oct 2018 11:28:29 -0400] rev 40377
tests: sleep longer in test-logtoprocess.t
We should probably write some sort of helper that can wait N seconds
for all specified values to appear in a file or something, but for now
this will fix the FreeBSD buildbot.
Differential Revision: https://phab.mercurial-scm.org/D5157
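The helper suggested above might look roughly like the following. This is only a sketch: the name `wait_for_values`, its parameters, and the polling interval are all assumptions, and nothing like it is claimed to exist in the test suite.

```python
import time


def wait_for_values(path, values, timeout=5.0, interval=0.1):
    """Poll `path` until every string in `values` appears in it.

    Returns True if all values were seen, False if `timeout` seconds
    elapsed first.
    """
    deadline = time.time() + timeout
    while True:
        try:
            with open(path, 'r') as fp:
                content = fp.read()
        except IOError:
            # The file may not exist yet; treat it as empty and retry.
            content = ''
        if all(v in content for v in values):
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval)
```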
Augie Fackler <augie@google.com> [Fri, 19 Oct 2018 11:31:18 -0400] rev 40376
tests: fix pyflakes warning in test-duplicateoptions.py
Differential Revision: https://phab.mercurial-scm.org/D5158
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 19 Oct 2018 16:34:45 +0200] rev 40375
branchmap: avoid changelog and attribute lookups in replacecache()
This should make things faster. I'm not sure which operations would benefit
from it though. Maybe branchmap application on clone?
Differential Revision: https://phab.mercurial-scm.org/D5162
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 19 Oct 2018 16:16:17 +0200] rev 40374
branchmap: pass changelog into branchmap functions
As part of building the branchmap, we loop over revs and call branchmap()
or _branchmap(). Previously, these functions were accessing repo.changelog.
We know from past experience that repo.changelog in loops is bad for
performance.
This commit teaches the branchmap code to pass a changelog instance into
branchmap() and _branchmap() so we don't need to pay this penalty.
On my MBP, this appears to show a speedup on a clone of the
mozilla-unified repo:
$ hg perfbranchmap --clear-revbranch
! base
! wall 21.078160 comb 21.070000 user 20.920000 sys 0.150000 (best of 3)
! wall 20.574682 comb 20.560000 user 20.400000 sys 0.160000 (best of 3)
$ hg perfbranchmap
! base
! wall 4.880413 comb 4.870000 user 4.860000 sys 0.010000 (best of 3)
! wall 4.573968 comb 4.560000 user 4.550000 sys 0.010000 (best of 3)
Differential Revision: https://phab.mercurial-scm.org/D5161
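The pattern behind this change can be sketched as follows. `FakeRepo` and both helper functions are hypothetical stand-ins, not Mercurial code; the access counter only makes the cost of a repeated property lookup visible.

```python
class FakeRepo(object):
    """Hypothetical repo whose `changelog` property counts its accesses."""

    def __init__(self):
        self.lookups = 0
        self._changelog = {0: b'default', 1: b'stable'}

    @property
    def changelog(self):
        # Stands in for a property that does real work on every access.
        self.lookups += 1
        return self._changelog


def branches_slow(repo, revs):
    # Looks up repo.changelog once per revision.
    return [repo.changelog[r] for r in revs]


def branches_fast(repo, revs):
    # Looks up the changelog once and reuses it for every revision.
    cl = repo.changelog
    return [cl[r] for r in revs]
```

Both functions return the same list; only the number of property lookups differs (one per revision versus one in total).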
Augie Fackler <augie@google.com> [Thu, 18 Oct 2018 16:36:10 -0400] rev 40373
fuzz: move many initialization steps into LLVMFuzzerInitialize
Doing this means that things we intentionally leak (eg type objects)
no longer confuse AddressSanitizer, so now we can run the fuzzer MUCH
longer.
Differential Revision: https://phab.mercurial-scm.org/D5154
Martin von Zweigbergk <martinvonz@google.com> [Thu, 17 Nov 2016 15:51:33 -0800] rev 40372
bundle2: fix broken compression engine assertion
bundletype() is a function, so it needs to be called, and it is
documented to return a 2-tuple. This code is untested, so that's why
we haven't noticed the bad assertion.
Differential Revision: https://phab.mercurial-scm.org/D5155
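A small illustration of why an assertion on an uncalled method can never fail. `FakeEngine` and the two check functions are hypothetical, and which tuple element the real assertion inspects is an assumption made for this sketch.

```python
class FakeEngine(object):
    """Hypothetical engine whose bundletype() returns a 2-tuple."""

    def bundletype(self):
        return (None, None)


def buggy_check(engine):
    # Bug: this tests the bound method object, which is always truthy,
    # so the check passes no matter what bundletype() would return.
    return bool(engine.bundletype)


def fixed_check(engine):
    # Call the function and inspect the first element of the 2-tuple.
    first, _second = engine.bundletype()
    return first is not None
```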
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Oct 2018 17:54:07 -0400] rev 40371
tests: glob over a difference between Windows 7 and Windows 10
The error value is 11001 on Windows 10. I have no idea why it changed, but it
seems unimportant.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Oct 2018 18:11:16 -0400] rev 40370
py3: fix module imports in test-highlight.t
The hash changes are because the *.py file is committed to the repo.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 17 Oct 2018 23:33:43 -0400] rev 40369
py3: fix module imports in tests, as flagged by test-check-module-imports.t
I have no idea why these aren't flagged with python2. I excluded
test-highlight.t for now to make this easier to review- the changed code is
committed to a repo, which has cascading changes on the rest of the test.
There's a mix of bytes and str in the imports dict of contrib/import-checker.py
that crashed it half way through listing out these errors. I couldn't figure
out how to fix that properly, so I was lazy and applied this on py3, to find the
rest of the errors:
diff --git a/contrib/import-checker.py b/contrib/import-checker.py
--- a/contrib/import-checker.py
+++ b/contrib/import-checker.py
@@ -626,7 +626,12 @@ def find_cycles(imports):
top.foo -> top.qux -> top.foo
"""
cycles = set()
- for mod in sorted(imports.keys()):
+ def sort(v):
+ if isinstance(v, bytes):
+ return v.decode('ascii')
+ return v
+
+ for mod in sorted(imports.keys(), key=sort):
try:
checkmod(mod, imports)
except CircularImport as e:
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Oct 2018 21:55:47 -0400] rev 40368
lfs: don't add extension to hgrc after conversion (BC)
This is in the spirit of bcf72d7b1524.
Yuya Nishihara <yuya@tcha.org> [Thu, 18 Oct 2018 21:00:07 +0900] rev 40367
addremove: add "ui." prefix to message color keys
I don't like fully-colorized status/warning messages, and I want to disable
them altogether. If we supported a syntax like 'color.ui.*=none', I could
easily turn addremove.added/removed off as well as ui.error. This patch is
just for that.
Since addremove colors, which were added at ddc1da134772, aren't released yet,
there are no compatibility concerns.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 09 Feb 2017 09:17:40 -0800] rev 40366
update: clarify update() call sites by specifying argument names
merge.update() takes a lot of parameters and I get confused all the
time which is which.
Differential Revision: https://phab.mercurial-scm.org/D5153
Martin von Zweigbergk <martinvonz@google.com> [Thu, 18 Oct 2018 10:11:08 -0700] rev 40365
debugcommands: avoid stack trace from debugindexstats in pure mode
This has been broken since I added it in d71e0ba34d9b (debugcommands: add a
debugindexstats command, 2018-08-08). This patch also fixes the test.
Differential Revision: https://phab.mercurial-scm.org/D5152
Augie Fackler <augie@google.com> [Thu, 18 Oct 2018 11:24:20 -0400] rev 40364
tests: fix up pure case of test-sqlitestore.t
This is clearly what the line should read based on the "force to zlib"
section below, so I'm guessing it just got overlooked during development.
Differential Revision: https://phab.mercurial-scm.org/D5151
Augie Fackler <augie@google.com> [Thu, 18 Oct 2018 11:14:04 -0400] rev 40363
tests: don't emit false failures when sqlite3 is missing
I'm honestly surprised we have buildbot coverage for this, but we do!
Differential Revision: https://phab.mercurial-scm.org/D5150
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 08:48:23 +0200] rev 40362
py3: get around IOError variants in test-commandserver.t
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 08:41:58 +0200] rev 40361
py3: don't use traceback.print_exc() in commandserver.py
It doesn't support a bytes stream on Python 3. This causes a traceback to be
sent in a single frame, but that shouldn't matter.
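One way to write a traceback to a bytes stream on Python 3 is to format it as text first and encode it in a single write. The helper name `write_traceback` and the UTF-8 encoding are assumptions for this sketch, not necessarily what commandserver.py does.

```python
import traceback


def write_traceback(stream, encoding='utf-8'):
    """Write the current exception's traceback to a bytes stream.

    Must be called from inside an `except` block.
    """
    # format_exc() returns the whole traceback as one str, so it
    # reaches the stream as a single write.
    stream.write(traceback.format_exc().encode(encoding))
    stream.flush()
```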
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 08:29:24 +0200] rev 40360
py3: invalidate repository cache with system-string keys
# skip-blame just a few r'' prefixes
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 08:20:03 +0200] rev 40359
py3: system-stringify file mode in commandserver.py
# skip-blame just r'' prefixes
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:57:40 +0200] rev 40358
py3: alias next to __next__ in commandserver.py
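A sketch of the aliasing idiom with a hypothetical iterator class. Python 3 calls `__next__()`, Python 2 calls `next()`, and a class-level alias lets one method serve both. Which of the two names commandserver.py defines first is not stated here.

```python
class Countdown(object):
    """Hypothetical iterator that counts down from n to 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

    # Python 2 looks for next(); point it at the same function.
    next = __next__
```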
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:57:05 +0200] rev 40357
py3: system-stringify list of attributes to be forwarded from commandserver.py
# skip-blame just some r'' prefixes
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:52:56 +0200] rev 40356
py3: import StringIO from test utility to test-commandserver.t
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:38:31 +0200] rev 40355
py3: use bprint() helper in test-commandserver.t
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:35:29 +0200] rev 40354
py3: byte-stringify most literals in test-commandserver.t
print() calls will be replaced by bprint().
# skip-blame just tons of b'' prefixes.
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 09:50:21 -0400] rev 40353
localrepo: ensure we properly %-format int in exception throw
I'm not thrilled with this, but it'll do.
Differential Revision: https://phab.mercurial-scm.org/D5107
Pulkit Goyal <pulkit@yandex-team.ru> [Thu, 18 Oct 2018 14:41:14 +0300] rev 40352
py3: add a r'' prefix in mercurial/exchange.py
# skip-blame because just r'' prefix
This fixes test-narrow-acl.t on py3 which was broken by one of the
earlier patches.
Differential Revision: https://phab.mercurial-scm.org/D5149
Pulkit Goyal <pulkit@yandex-team.ru> [Thu, 18 Oct 2018 14:37:38 +0300] rev 40351
py3: add 5 new passing tests to whitelist caught by buildbot
Thanks to everyone who is putting effort into making hg py3 compatible.
Differential Revision: https://phab.mercurial-scm.org/D5148
Matt Harbison <matt_harbison@yahoo.com> [Wed, 17 Oct 2018 21:54:49 -0400] rev 40350
py3: fix test-import-context.t
Matt Harbison <matt_harbison@yahoo.com> [Mon, 15 Oct 2018 22:02:10 -0400] rev 40349
py3: restore perfstartup() prior to b456b2e0ad9f on Windows
Otherwise the test errors out with:
--- c:/Users/Matt/projects/hg_py3/tests/test-contrib-perf.t
+++ c:/Users/Matt/projects/hg_py3/tests/test-contrib-perf.t.err
@@ -184,6 +184,8 @@
$ hg perfrevrange
$ hg perfrevset 'all()'
$ hg perfstartup
+ 'b'c:' is not recognized as an internal or external command,
+ operable program or batch file.
$ hg perfstatus
$ hg perftags
$ hg perftemplating
Matt Harbison <matt_harbison@yahoo.com> [Wed, 17 Oct 2018 21:05:43 -0400] rev 40348
help: document the server capabilities added by the LFS extension
I didn't bother marking these experimental because it references the extension
that is already marked experimental.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 27 Sep 2018 21:54:13 -0400] rev 40347
py3: fix test-propertycache.py on Windows
Anton Shestakov <av6@dwimlabs.net> [Wed, 17 Oct 2018 21:00:36 +0800] rev 40346
commands: adjust metavariables as appropriate
Apart from looking better in hg help command, these strings are also helpful
when generating shell completions programmatically.
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 11:16:22 -0400] rev 40345
match: fix up a repr to not crash on Python 3
Differential Revision: https://phab.mercurial-scm.org/D5120
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 Oct 2018 11:07:34 -0700] rev 40344
narrow: when widening, don't include manifests the client already has
When widening, we already don't include the changelog (since
f1844a10ee19) and files that the client already has (since
c73c7653dfb9). However, we still include all manifests needed for the
new narrowspec. When using flat manifests, that means we resend all
the manifests even though the client necessarily has all of them. For
tree manifests, we unnecessarily resend the root manifests and any
subdirectory manifests that the client already has.
This patch makes it so we no longer resend manifests that the client
already has. It does so by passing an extra matcher to the changegroup
packer and it uses that for filtering out directories matching the old
matcher's visitdir(). For consistency between directories and files,
it also makes the filtering of files look at both old and new matcher
rather than passing in a diff matcher as we did before.
Differential Revision: https://phab.mercurial-scm.org/D4895
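The file-side filtering described above can be sketched with plain callables standing in for the old and new matchers. The function name `files_to_send` is hypothetical, and the directory case (which uses the old matcher's visitdir()) is left out of this simplification.

```python
def files_to_send(paths, oldmatch, newmatch):
    """Return the paths that the widened narrowspec newly covers.

    `oldmatch` and `newmatch` are callables taking a path and returning
    a bool; they stand in for the old and new narrow matchers.
    """
    # Keep what the new spec wants, minus what the client already has.
    return [p for p in paths if newmatch(p) and not oldmatch(p)]
```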
Martin von Zweigbergk <martinvonz@google.com> [Wed, 17 Oct 2018 09:30:07 -0700] rev 40343
tests: add test for widening from an empty clone
Narrow clones that track no paths currently don't even include the
root manifest (which is the only manifest when using flat
manifests). That means that when we widen from such a clone, we need
to make sure that we send the root manifest (and other manifests if
using tree manifests). That currently works because we always resend
all manifest that match the new narrowspec. However, we're about to
stop resending manifests that the client already has and there's a
risk of this breaking then, so let's add a test.
Differential Revision: https://phab.mercurial-scm.org/D5143
Martin von Zweigbergk <martinvonz@google.com> [Wed, 17 Oct 2018 11:43:39 -0700] rev 40342
subrepo: access status members by name instead of by position
Taking my first Mercurial project closer to completion.
Differential Revision: https://phab.mercurial-scm.org/D5144
Kyle Lippincott <spectral@google.com> [Tue, 16 Oct 2018 07:21:00 -0700] rev 40341
revisions: when using prefixhexnode, ensure we prefix "0"
Previously, if using `experimental.revisions.disambiguatewithin` (and it didn't
include rev0), and '0' was the shortest identifier in that disambiguation set,
we printed it as the shortest *without* a prefix. This was because we had logic
to determine "if the prefix is a pure integer, but starts with 0, we don't need
to prefix with 'x': 01 is not a synonym for revision #1", but didn't handle the
case where prefix == 0 (which is a pure integer, and starts with 0... but it
*is* "rev0").
Differential Revision: https://phab.mercurial-scm.org/D5113
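The rule described above can be sketched as a small predicate. The name `needs_x_prefix` is hypothetical, and the sketch ignores whether a number actually names an existing revision.

```python
def needs_x_prefix(prefix):
    """Return True if a hex prefix could be mistaken for a revision number."""
    if not prefix.isdigit():
        # Contains a-f (or is empty), so it cannot be read as an integer.
        return False
    if prefix == '0':
        # The case fixed here: "0" starts with 0 but *is* revision 0.
        return True
    if prefix.startswith('0'):
        # "01" is not a synonym for revision 1, so it is unambiguous.
        return False
    return True
```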
Pulkit Goyal <pulkit@yandex-team.ru> [Wed, 03 Oct 2018 16:45:24 +0300] rev 40340
store: pass matcher to store.datafiles()
To get narrow stream clones working, we need a way to filter the storage files
using a matcher. This patch adds matcher as an argument to store.walk() and
store.datafiles() so that we can filter the files returned according to the
matcher.
Differential Revision: https://phab.mercurial-scm.org/D4850
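A minimal sketch of filtering storage entries through an optional matcher. `filter_datafiles` and the `(path, size)` entry shape are assumptions for illustration and do not reproduce the real store.datafiles() signature.

```python
def filter_datafiles(entries, matcher=None):
    """Yield (path, size) entries, optionally filtered by a matcher.

    `matcher` is a callable taking a path and returning a bool. With no
    matcher, every entry is yielded unchanged.
    """
    for path, size in entries:
        if matcher is not None and not matcher(path):
            continue
        yield path, size
```

Defaulting the matcher to `None` keeps existing callers working while letting narrow-aware callers pass a filter.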
Pulkit Goyal <pulkit@yandex-team.ru> [Wed, 03 Oct 2018 17:59:05 +0300] rev 40339
streamclone: pass narrowing related info in _walkstreamfiles()
This patch builds a matcher using the include and exclude arguments we have in
generatev2() and passes that matcher into _walkstreamfiles(). This will help us
filter the files we stream depending on the includes and excludes passed in
by the user.
Differential Revision: https://phab.mercurial-scm.org/D4851
Pulkit Goyal <pulkit@yandex-team.ru> [Wed, 26 Sep 2018 17:20:04 +0300] rev 40338
streamclone: new server config and some API changes for narrow stream clones
This patch introduces a new server config
`experimental.server.stream-narrow-clones` which if set to True will advertise
that the server supports narrow stream clones.
This patch also passes on the includes and excludes from the getbundle command to
the streamclone generation code.
There is a test added to show that the includepats and excludepats are correctly
passed.
Upcoming patches will implement storage layer filtering for streamclones, and
then we can remove the temporary error and plug the whole logic together to
make narrow stream clones work.
Differential Revision: https://phab.mercurial-scm.org/D5137
Pulkit Goyal <pulkit@yandex-team.ru> [Wed, 10 Oct 2018 17:36:59 +0300] rev 40337
narrow: only send the narrowspecs back if ACL in play
I cannot think of a reason why we need to send narrowspecs back from the server.
The current code adds a 'narrow:spec' part to each changegroup that is generated
when the narrow extension is enabled, so we are sending narrowspecs on pull too.
There is a problem with sending the narrowspecs the way we are doing it right
now. We add include and exclude as parameters of the 'narrow:spec' bundle2 part.
When the length of the include or exclude string exceeds 255, which is common
when working on large repos, the bundle2 generation code breaks. For more on
that, refer to issue5952 on bugzilla.
While wondering why we need to send the narrowspecs back, I deleted the
'narrow:spec' bundle2 part generation code and found that only the narrow-acl
test has some failures.
With this patch, we will only send the 'narrow:spec' bundle2 part if ACL is
enabled, because the original narrowspecs in those cases can be a subset of the
narrowspecs the user requested.
There are phase-related output changes in a couple of tests. The output changes
show that we are now dealing in public phases completely. So maybe sending the
narrow:spec bundle2 part was preventing phases being exchanged or phase bundle2
data being applied.
Differential Revision: https://phab.mercurial-scm.org/D4931
Anton Shestakov <av6@dwimlabs.net> [Wed, 17 Oct 2018 22:32:50 +0800] rev 40336
zsh_completion: add -l/--list flag for hg bookmarks completion
Flags in parentheses are mutually exclusive. Logic is taken from commands.py:
selactions = [k for k in ['delete', 'rename', 'list'] if opts.get(k)]
if len(selactions) > 1:
raise error.Abort(_('--%s and --%s are incompatible')
% tuple(selactions[:2]))
...
if rev and action in {'delete', 'rename', 'list'}:
raise error.Abort(_("--rev is incompatible with --%s") % action)
if inactive and action in {'delete', 'list'}:
raise error.Abort(_("--inactive is incompatible with --%s") % action)
Differential Revision: https://phab.mercurial-scm.org/D5142
Anton Shestakov <av6@dwimlabs.net> [Wed, 17 Oct 2018 22:31:34 +0800] rev 40335
zsh_completion: fix a couple of flags still not being perfect
Differential Revision: https://phab.mercurial-scm.org/D5141
Anton Shestakov <av6@dwimlabs.net> [Wed, 17 Oct 2018 22:27:10 +0800] rev 40334
zsh_completion: use $_hg_remote_opts after it is defined
Before this patch, zsh wouldn't complete --ssh, --remotecmd or --insecure for
hg clone.
While at it, replace --uncompressed by --stream.
Differential Revision: https://phab.mercurial-scm.org/D5140
Martin von Zweigbergk <martinvonz@google.com> [Wed, 17 Oct 2018 11:56:03 -0700] rev 40333
tests: fix "running x tests using y ... " output in a few more places
These seem to have been missed by 1039404c5e1d (run-tests: print number of
tests and parallel process count, 2018-10-13).
Differential Revision: https://phab.mercurial-scm.org/D5145
Mark Thomas <mbthomas@fb.com> [Sun, 14 Oct 2018 09:34:21 +0000] rev 40332
py3: fix test-hardlinks.t
Differential Revision: https://phab.mercurial-scm.org/D5096
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 14 Sep 2018 14:56:13 -0700] rev 40331
exchange: support declaring pull depth
Upcoming commits will teach exchangev2 how to perform a shallow
clone. This commit teaches hg.clone(), exchange.pull(), and
exchange.pulloperation to recognize a request for a shallow clone
by having the caller specify a numeric depth of the maximum number of
ancestor changesets to fetch.
There are certainly other ways we could control shallow-ness. But this
one is simple to implement and is also how the narrow extension
controls things. So it seems to make sense to start here.
Differential Revision: https://phab.mercurial-scm.org/D5136
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Oct 2018 10:10:05 +0200] rev 40330
exchangev2: support for calling rawstorefiledata to retrieve raw files
This is somewhat hacky. For that I apologize.
At the 4.8 Sprint, we decided we wanted to land support in wireprotov2 for doing
a partial clone with changelog and manifestlog bootstrapped from a "stream clone"
like primitive.
This commit implements the client-side bits necessary to facilitate that.
If the new server-side command for obtaining raw files data is available, we
call it to get the raw files for the changelog and manifestlog. Then we
fall through to an incremental pull. But when fetching files data, instead
of using the list of changesets and manifests that we fetched via the
"changesetdata" command, we do a linear scan of the repo and resolve the
changeset and manifest nodes along with the manifest linkrevs.
Differential Revision: https://phab.mercurial-scm.org/D5135
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 16 Oct 2018 21:31:21 +0200] rev 40329
wireprotov2: implement command for retrieving raw store files
Implementing shallow clone of the changelog is hard. We want the 4.8
release to have a fast implementation of partial clone in wireprotov2. In
order to be fast, we can't use deltas for transferring changelog and
manifestlog data.
Per discussions at the 4.8 sprint, this commit implements a somewhat hacky
and likely-to-be-changed-drastically-or-dropped command in wireprotov2 that
facilitates access to raw store files, namely the changelog and manifestlog.
Using this command, clients can perform a "stream clone" of sorts for just
the changelog and manifestlog. This will allow clients to fetch the changelog
and manifest revlogs, stream them to disk (which should be fast), then follow
up with filesdata requests for files revision data for a particular changeset.
Differential Revision: https://phab.mercurial-scm.org/D5134
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 16 Oct 2018 21:35:33 +0200] rev 40328
wireprotov2: add response type that serializes to indefinite length bytestring
This will be needed in a future patch.
Differential Revision: https://phab.mercurial-scm.org/D5133
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 26 Sep 2018 14:38:43 -0700] rev 40327
exchangev2: recognize narrow patterns when pulling
pulloperation instances were recently taught to record file
include and exclude patterns to facilitate narrow file transfer.
Teaching the exchangev2 code to transfer a subset of files is
as simple as constructing a narrow matcher from these patterns and
filtering all seen file paths through it.
Keep in mind that this change only influences file data: we're
still fetching all changeset and manifest data. So, there's still
a ton of "partial clone" to implement in exchangev2.
On a personal note, I derive gratification that this feature requires
very few lines of new code to implement.
To test this, we implemented a minimal extension which allows us to specify
--include/--exclude to clone. While the narrow extension provides these
arguments, I explicitly wanted to test this functionality without the
narrow extension enabled, as that extension monkeypatches various things
and I want to isolate the behavior of core Mercurial.
Differential Revision: https://phab.mercurial-scm.org/D5132
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 09 Oct 2018 08:50:13 -0700] rev 40326
sqlitestore: file storage backend using SQLite
This commit provides an extension which uses SQLite to store file
data (as opposed to revlogs).
As the inline documentation describes, there are still several
aspects to the extension that are incomplete. But it's a start.
The extension does support basic clone, checkout, and commit
workflows, which makes it suitable for simple use cases.
One notable missing feature is support for "bundlerepos." This is
probably responsible for the most test failures when the extension
is activated as part of the test suite.
All revision data is stored in SQLite. Data is stored as zstd
compressed chunks (default if zstd is available), zlib compressed
chunks (default if zstd is not available), or raw chunks (if
configured or if a compressed delta is not smaller than the raw
delta). This makes things very similar to revlogs.
Unlike revlogs, the extension doesn't yet enforce a limit on delta
chain length. This is an obvious limitation and should be addressed.
This is somewhat mitigated by the use of zstd, which is much faster
than zlib to decompress.
There is a dedicated table for storing deltas. Deltas are stored
by the SHA-1 hash of their uncompressed content. The "fileindex" table
has columns that reference the delta for each revision and the base
delta that delta should be applied against. A recursive SQL query
is used to resolve the delta chain along with the delta data.
By storing deltas by hash, we are able to de-duplicate delta storage!
With revlogs, the same deltas in different revlogs would result in
duplicate storage of that delta. In this scheme, inserting the
duplicate delta is a no-op and delta chains simply reference the
existing delta.
When initially implementing this extension, I did not have
content-indexed deltas and deltas could be duplicated across files
(just like revlogs). When I implemented content-indexed deltas, the
size of the SQLite database for a full clone of mozilla-unified
dropped:
before: 2,554,261,504 bytes
after: 2,488,754,176 bytes
Surprisingly, this is still larger than the bytes size of revlog
files:
revlog files: 2,104,861,230 bytes
du -b: 2,254,381,614
I would have expected storage to be smaller since we're not limiting
delta chain length and since we're using zstd instead of zlib. I
suspect the SQLite indexes and per-column overhead account for the
bulk of the differences. (Keep in mind that revlog uses a 64-byte
packed struct for revision index data and deltas are stored without
padding. Aside from the 12 unused bytes in the 32 byte node field,
revlogs are pretty efficient.) Another source of overhead is file
name storage. With revlogs, file names are stored in the filesystem.
But with SQLite, we need to store file names in the database. This is
roughly equivalent to the size of the fncache file, which for the
mozilla-unified repository is ~34MB.
Since the SQLite database isn't append-only and since delta chains
can reference any delta, this opens some interesting possibilities.
For example, we could store deltas in reverse, such that fulltexts
are stored for newer revisions and deltas are applied to reconstruct
older revisions. This is likely a more optimal storage strategy for
version control, as new data tends to be more frequently accessed
than old data. We would obviously need wire protocol support for
transferring revision data from newest to oldest. And we would
probably need some kind of mechanism for "re-encoding" stores. But
it should be doable.
This extension is very much experimental quality. There are a handful
of features that don't work. It probably isn't suitable for day-to-day
use. But it could be used in limited cases (e.g. read-only checkouts
like in CI). And it is also a good proving ground for alternate
storage backends. As we continue to define interfaces for all things
storage, it will be useful to have a viable alternate storage backend
to see how things shake out in practice.
test-storage.py passes on Python 2 and introduces no new test failures on
Python 3. Having the storage-level unit tests has proved to be insanely
useful when developing this extension. Those tests caught numerous bugs
during development and I'm convinced this style of testing is the way
forward for ensuring alternate storage backends work as intended. Of
course, test coverage isn't close to what it needs to be. But it is
a start. And what coverage we have gives me confidence that basic store
functionality is implemented properly.
Differential Revision: https://phab.mercurial-scm.org/D4928
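The content-indexed delta storage and recursive chain lookup described above can be sketched with Python's `sqlite3` module. The schema, column names, and helper functions below are simplified assumptions, not the extension's actual schema. Deltas are kept as opaque blobs and no delta application is performed. The sketch also assumes each delta has a single base.

```python
import hashlib
import sqlite3


def make_db():
    """Create an in-memory database with a simplified, assumed schema."""
    db = sqlite3.connect(':memory:')
    db.execute('CREATE TABLE delta ('
               'id INTEGER PRIMARY KEY, hash BLOB UNIQUE, data BLOB)')
    db.execute('CREATE TABLE fileindex ('
               'id INTEGER PRIMARY KEY, deltaid INTEGER, '
               'deltabaseid INTEGER)')
    return db


def insert_delta(db, data):
    """Store a delta keyed by its SHA-1; a duplicate insert is a no-op."""
    digest = hashlib.sha1(data).digest()
    db.execute('INSERT OR IGNORE INTO delta (hash, data) VALUES (?, ?)',
               (digest, data))
    row = db.execute('SELECT id FROM delta WHERE hash = ?', (digest,))
    return row.fetchone()[0]


def add_revision(db, data, deltabaseid=None):
    """Record a revision whose delta applies against `deltabaseid`."""
    deltaid = insert_delta(db, data)
    db.execute('INSERT INTO fileindex (deltaid, deltabaseid) VALUES (?, ?)',
               (deltaid, deltabaseid))
    return deltaid


def delta_chain(db, deltaid):
    """Return the chain's delta blobs, base first, via a recursive query."""
    rows = db.execute('''
        WITH RECURSIVE chain(deltaid, deltabaseid, depth) AS (
            SELECT deltaid, deltabaseid, 0
            FROM fileindex WHERE deltaid = ?
            UNION
            SELECT f.deltaid, f.deltabaseid, chain.depth + 1
            FROM fileindex AS f
            JOIN chain ON f.deltaid = chain.deltabaseid
        )
        SELECT delta.data FROM chain
        JOIN delta ON delta.id = chain.deltaid
        ORDER BY chain.depth DESC
    ''', (deltaid,)).fetchall()
    return [r[0] for r in rows]
```

Because `hash` is declared `UNIQUE`, `INSERT OR IGNORE` makes the duplicate insert a no-op, and the follow-up `SELECT` returns the id of the delta that is already stored.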
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 16 Oct 2018 15:36:19 +0200] rev 40325
storageutil: extract most of peek_censored from revlog
This function is super hacky and isn't correct 100% of the time. I'm going
to need this functionality on a future non-revlog store.
Let's copy things to storageutil so this code only exists once.
Differential Revision: https://phab.mercurial-scm.org/D5118
Matt Harbison <matt_harbison@yahoo.com> [Thu, 20 Sep 2018 17:27:01 -0700] rev 40324
lfs: autoload the extension when cloning from repo with lfs enabled
This is based on a patch by Gregory Szorc. I made small adjustments to
clean up the messaging when the server has the extension enabled, but the
client has it disabled (to prevent autoloading). Additionally, I added
a second server capability to distinguish between the server having the
extension enabled, and the server having LFS commits. This helps prevent
unnecessary requirement propagation- the client shouldn't add a requirement
that the server doesn't have, just because the server had the extension
loaded. The TODO I had about advertising a capability when the server can
natively serve up blobs isn't relevant anymore (we've had 2 releases that
support this), so I dropped it.
Currently, we lazily add the "lfs" requirement to a repo when we first
encounter LFS data. Due to a pretxnchangegroup hook that looks for LFS
data, this can happen at the end of clone.
Now that we have more control over how repositories are created, we can
do better.
This commit adds a repo creation option to add the "lfs" requirement.
hg.clone() sets this creation option if the remote peer is advertising
lfs usage (as opposed to just support needed to push).
So, what this change effectively does is have cloned repos
automatically inherit the "lfs" requirement.
Differential Revision: https://phab.mercurial-scm.org/D5130
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 16 Oct 2018 16:24:46 +0200] rev 40323
testing: switch to inserting deltas
As the comment in the test specifies, this was relying on storage backend
implementation details. We switch to inserting a raw delta, skipping the
regular insert path to ensure we have the desired outcome. This required
implementing support for handling deltas in the revlog testing code.
Differential Revision: https://phab.mercurial-scm.org/D5116
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 16 Oct 2018 15:24:06 +0200] rev 40322
testing: remove expectation of error on bad node insert
addgroup() doesn't necessarily validate the hashes of each incoming revision.
This is an optimization that allows delta group application to complete faster.
The fact that revlog raises in this particular test is an implementation detail
due to the way revlogs are testing multiple deltas.
Differential Revision: https://phab.mercurial-scm.org/D5115
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 16 Oct 2018 17:45:39 +0200] rev 40321
storageutil: convert fileid to bytes to avoid cast to %s
test-storage.py manages to trigger this on Python 3.
Differential Revision: https://phab.mercurial-scm.org/D5117
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 16 Oct 2018 17:48:28 +0200] rev 40320
tests: use byte literals in test-storage.py
This fixes a Python 3 breakage: an unknown key caused by a str/bytes type
mismatch.
# skip-blame just b'' literals
Differential Revision: https://phab.mercurial-scm.org/D5114
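The failure mode described above can be reproduced in isolation. This is a minimal sketch, not code from test-storage.py: the `table` dict and `lookup()` helper are hypothetical names used only for illustration.

```python
# On Python 3, b"node" and "node" are different keys, so a lookup with the
# wrong literal type misses even though the spelling is identical.
table = {b"node": 1}  # hypothetical table keyed by bytes


def lookup(key):
    """Return the stored value, or None when the key is absent."""
    return table.get(key)


print(lookup(b"node"))  # found: the key type matches
print(lookup("node"))   # missing on Python 3: str never equals bytes
```

Prefixing the literal with b'' is what makes the lookup succeed on both Python 2 and Python 3.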
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:19:38 +0200] rev 40319
py3: byte-stringify literals in test-keyword.t
# skip-blame just some b'' prefixes
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:18:30 +0200] rev 40318
py3: flush std streams before/after running user code in heredoctest.py
Otherwise, things written to stdout.buffer would be interleaved.
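The interleaving can be sketched without touching the real standard streams. The example below is an illustration under assumptions, not the heredoctest.py change itself: an in-memory `io.BytesIO` stands in for `stdout.buffer`, an `io.TextIOWrapper` stands in for `sys.stdout`, and `write_mixed()` is a hypothetical helper.

```python
import io


def write_mixed(flush):
    """Write text through a wrapper, then bytes to the underlying stream."""
    raw = io.BytesIO()  # stand-in for stdout.buffer
    text = io.TextIOWrapper(raw, encoding="ascii")  # stand-in for sys.stdout
    text.write("text first\n")  # held in the wrapper's pending buffer
    if flush:
        text.flush()  # push the pending text down before writing bytes
    raw.write(b"bytes second\n")  # bypasses the text layer entirely
    text.flush()
    return raw.getvalue()


print(write_mixed(flush=True))
print(write_mixed(flush=False))
```

Without the intermediate flush, the bytes written directly to the underlying stream land ahead of text that was written earlier but was still buffered in the text layer.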
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 08:06:17 +0200] rev 40317
py3: rewrite StringIO fallback for Python 3
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:04:07 +0200] rev 40316
py3: reinvent print() function for contrib/hgclient.py
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:08:12 +0200] rev 40315
py3: work around unicode stdio streams in contrib/hgclient.py
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 07:00:41 +0200] rev 40314
py3: convert string literals to bytes in contrib/hgclient.py
# skip-blame just many b'' prefixes
Augie Fackler <augie@google.com> [Tue, 16 Oct 2018 08:16:11 -0400] rev 40313
merge with stable
Martijn Pieters <mj@octobus.net> [Fri, 31 Aug 2018 19:58:41 +0100] rev 40312
branchmap: remove redundant sort
There is absolutely no benefit in sorting a list that's being merged into a set
on the next line. The changelog.ancestors() call later on also doesn't benefit
from a sorted sequence of revs.
Differential Revision: https://phab.mercurial-scm.org/D5111
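The reasoning can be checked with a small sketch. The helper names below are hypothetical and this is not the branchmap code; it only shows that a set discards insertion order, so sorting beforehand cannot change the result.

```python
def merge_sorted(seen, revs):
    """Merge revs into a copy of seen, sorting first (the redundant step)."""
    merged = set(seen)
    merged.update(sorted(revs))
    return merged


def merge_unsorted(seen, revs):
    """Merge revs into a copy of seen without sorting."""
    merged = set(seen)
    merged.update(revs)
    return merged


revs = [7, 3, 9, 3, 1]
print(merge_sorted({2}, revs) == merge_unsorted({2}, revs))
```

Both helpers build the same set, so the sort only adds O(n log n) work.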
Boris Feld <boris.feld@octobus.net> [Thu, 11 Oct 2018 03:15:04 +0200] rev 40311
revset: drop special case of 'revset(...)' function in analyze
We now have a valid no-op function. We no longer need the special case.
Boris Feld <boris.feld@octobus.net> [Thu, 11 Oct 2018 03:13:53 +0200] rev 40310
revset: document the `revset(...)` syntax
We introduce a new "no-op" function to bear the documentation. In practice, the
parsing step is skipping it so it is not even called. This will get fixed in
the next changeset.
Yuya Nishihara <yuya@tcha.org> [Tue, 16 Oct 2018 12:39:21 +0200] rev 40309
check-commit: update test expectation per removal of "double empty line" rule
Follow up for 47084b5ffd80.
Martijn Pieters <mj@octobus.net> [Sun, 14 Oct 2018 15:40:16 +0200] rev 40308
style: drop requirement to only use single lines between top-level objects
Differential Revision: https://phab.mercurial-scm.org/D5105
Matt Harbison <matt_harbison@yahoo.com> [Sun, 14 Oct 2018 13:05:53 -0400] rev 40307
py3: byteify extension in test-relink.t
Augie Fackler <augie@google.com> [Sat, 13 Oct 2018 04:20:22 -0400] rev 40306
f: fix a Python 3 bytes/string issue
I suspect we should test this tool in isolation, but we don't yet. Oh well.
Differential Revision: https://phab.mercurial-scm.org/D5061
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 08:55:30 -0400] rev 40305
tests: use regex instead of Python versions for archive hash changes
It turns out this behavior changed between versions of Python 3. Let's
just always accept either size or sha1, and move on.
Differential Revision: https://phab.mercurial-scm.org/D5104
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 05:29:00 -0400] rev 40304
notify: a ton of encoding dancing to deal with the email module
Almost fixes test-keyword.t on Python 3, but leaves us with some
extremely confusing failures at the end of the test that seem related
to the command server?
Differential Revision: https://phab.mercurial-scm.org/D5100
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 11:06:21 -0400] rev 40303
tests: add missing b prefix in test-context-metadata.t
# skip-blame just a b prefix
Differential Revision: https://phab.mercurial-scm.org/D5109
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 11:05:41 -0400] rev 40302
context: raise runtime errors with sysstrs
We should probably *not* use RuntimeError for this, but let's deal
with that later, rather than as part of the Python 3 effort.
Differential Revision: https://phab.mercurial-scm.org/D5108
Georges Racinet <gracinet@anybox.fr> [Mon, 15 Oct 2018 11:16:12 +0200] rev 40301
rust: rustfmt config for hg-direct-ffi
For now, we're duplicating it, but it would probably be a good idea
to use a single one for the whole workspace (would have implications on the
other crates as well).
Georges Racinet <gracinet@anybox.fr> [Mon, 08 Oct 2018 19:11:41 +0200] rev 40300
rust: rustlazyancestors.__contains__
This changeset provides a Rust implementation of
the iteration performed by lazyancestor.__contains__.
It has the advantage over the Python iteration of using
the 'seen' set encapsulated in the dedicated iterator (self._containsiter),
rather than storing emitted items in another set (self._containsseen),
and hence should reduce the memory footprint.
Also, there's no need to convert intermediate emitted revisions back into
Python integers.
At this point, it would be tempting to implement the whole lazyancestor object
in Rust, but that would lead to more C wrapping code (two objects) for
little expected benefits.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 14 Oct 2018 01:39:22 -0400] rev 40299
help: fix a missing quote character in ui.tweakdefaults
Georges Racinet <gracinet@anybox.fr> [Thu, 27 Sep 2018 16:55:44 +0200] rev 40298
rust: hooking into Python code
We introduce a new class called 'rustlazyancestors'
in the ancestors module, which is used only if
parsers.rustlazyancestors does exist.
The implementation of __contains__ stays unchanged,
but is now backed by the Rust iterator. It would
probably be a good candidate for further development,
though, as it is mostly looping, and duplicates the
'seen' set.
The Rust code could be further optimized, however it already
gives rise to performance improvements:
median timing from hg perfancestors:
- on pypy:
    before: 0.077566s
    after:  0.016676s -79%
- on mozilla central:
    before: 0.190037s
    after:  0.082225s -58%
- on a private repository (about one million revisions):
    before: 0.567085s
    after:  0.108816s -80%
- on another private repository (about 400 000 revisions):
    before: 1.440918s
    after:  0.290116s -80%
median timing for hg perfbranchmap base:
- on pypy:
    before: 1.383413s
    after:  0.507993s -63%
- on mozilla central:
    before: 2.821940s
    after:  1.258902s -55%
- on a private repository (about one million revisions):
    before: 77.065076s
    after:  16.158475s -80%
- on another private repository (about 401 000 revisions):
    before: 7.835503s
    after:  3.545331s -54%
Mark Thomas <mbthomas@fb.com> [Sun, 14 Oct 2018 14:10:38 +0000] rev 40297
py3: fix test-propertycache.py
Differential Revision: https://phab.mercurial-scm.org/D5101
Mark Thomas <mbthomas@fb.com> [Sun, 14 Oct 2018 14:02:32 +0000] rev 40296
py3: fix test-dirstate-race.t
Differential Revision: https://phab.mercurial-scm.org/D5106
Rodrigo Damazio <rdamazio@google.com> [Fri, 12 Oct 2018 18:49:11 +0200] rev 40295
help: adding a proper declaration for shortlist/basic commands (API)
We previously used the '^' prefix to indicate that a command
should be shown on the short list (shown for just "hg"), but
that's a horrible hack, so I'm removing it.
Differential Revision: https://phab.mercurial-scm.org/D5069
Rodrigo Damazio <rdamazio@google.com> [Fri, 12 Oct 2018 18:06:32 +0200] rev 40294
help: assigning topic categories
Differential Revision: https://phab.mercurial-scm.org/D5068
rdamazio@google.com [Sat, 13 Oct 2018 02:17:41 -0700] rev 40293
help: assigning categories to existing commands
I'm separating this into its own commit so people can bikeshed over the actual
categorization (vs the support for categories). These categories are based on
the help implementation we've been using internally at Google, and have had
zero complaints.
Differential Revision: https://phab.mercurial-scm.org/D5067
Rodrigo Damazio <rdamazio@google.com> [Fri, 12 Oct 2018 17:57:36 +0200] rev 40292
help: splitting the topics by category
Differential Revision: https://phab.mercurial-scm.org/D5066
rdamazio@google.com [Sat, 13 Oct 2018 05:03:50 -0700] rev 40291
help: adding support for command categories
Differential Revision: https://phab.mercurial-scm.org/D5065
Yuya Nishihara <yuya@tcha.org> [Sun, 14 Oct 2018 13:35:47 +0200] rev 40290
notify: just use email.errors
email.Errors is a proxy object to email.errors on Python 2.
Yuya Nishihara <yuya@tcha.org> [Sat, 06 Oct 2018 21:13:59 +0900] rev 40289
rust-chg: add struct holding information needed to spawn server process
The Locator will handle the initialization of the connection. It will spawn
server processes as needed.
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 11:32:42 +0900] rev 40288
rust-chg: install logger if $CHGDEBUG is set
This is modeled after the example logger and debugmsg() of chg/util.c.
https://docs.rs/log/0.4.5/log/#implementing-a-logger
Yuya Nishihara <yuya@tcha.org> [Sat, 06 Oct 2018 20:07:11 +0900] rev 40287
rust-chg: depend on log and tokio_timer
I'll start porting the daemon management functions from the C chg, which
will be difficult to debug without some logging facility. AFAIK, the log
crate is easy-to-use and widely used.
tokio_timer provides sleep() helper to be used while spawning a server
process.
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 20:55:51 +0900] rev 40286
rust-chg: suppress panic while writing chg error to stderr
Otherwise "chg >/dev/full 2>&1" would exit with 101. Spotted by test-basic.t.
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 04:37:25 -0400] rev 40285
logcmdutil: add a helpful assertion to catch mistyped templates early
This would have made a defect in test-notify.t much easier to figure out.
Differential Revision: https://phab.mercurial-scm.org/D5097
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 05:28:01 -0400] rev 40284
notify: adapt to new location of email module's errors
Differential Revision: https://phab.mercurial-scm.org/D5099
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 04:33:47 -0400] rev 40283
notify: add some b prefixes
# skip-blame just b prefixes
Differential Revision: https://phab.mercurial-scm.org/D5098
Mark Thomas <mbthomas@fb.com> [Sun, 14 Oct 2018 09:24:36 +0000] rev 40282
py3: fix test-diff-color.t
Differential Revision: https://phab.mercurial-scm.org/D5095
Mark Thomas <mbthomas@fb.com> [Sun, 14 Oct 2018 09:07:43 +0000] rev 40281
py3: fix test-revlog.t
The mpatchError has a trailing comma on Python 2 but not on Python 3, so
use a glob to handle both Python 2 and Python 3.
Differential Revision: https://phab.mercurial-scm.org/D5093
Augie Fackler <augie@google.com> [Sun, 14 Oct 2018 04:11:35 -0400] rev 40280
fuzz: try *even harder* to prevent Python from looking up usernames
Differential Revision: https://phab.mercurial-scm.org/D5092
Connor Sheehan <sheehan@mozilla.com> [Sun, 14 Oct 2018 03:42:43 -0400] rev 40279
wireproto: fix incorrect function name in docstring
The docstring for `iwireprotocolcommandcacher` references
an `onoutputfinished` method. The actual name of the function
is `onfinished`.
Differential Revision: https://phab.mercurial-scm.org/D5090
Mark Thomas <mbthomas@fb.com> [Sat, 13 Oct 2018 15:32:52 +0000] rev 40278
py3: fix test-status.t
Differential Revision: https://phab.mercurial-scm.org/D5089
Yuya Nishihara <yuya@tcha.org> [Sun, 14 Oct 2018 07:25:01 +0200] rev 40277
formatter: make debug output prettier
"(glob)" won't be needed since pprintgen() can print dict items in stable
order.
Yuya Nishihara <yuya@tcha.org> [Sun, 14 Oct 2018 07:23:02 +0200] rev 40276
stringutil: allow specifying the initial indent level of pprint()
I want to pprint() an inner object, which starts with level=1 indent.
Yuya Nishihara <yuya@tcha.org> [Sun, 14 Oct 2018 07:18:19 +0200] rev 40275
stringutil: make level parameter of pprintgen() 0-origin
I think this makes more sense in that the level is incremented where nesting
goes one level deeper.
Yuya Nishihara <yuya@tcha.org> [Sun, 14 Oct 2018 06:51:19 +0200] rev 40274
formatter: use stringutil.pprint() in debug output to drop b''
Georges Racinet <gracinet@anybox.fr> [Thu, 27 Sep 2018 16:56:15 +0200] rev 40273
rust: exposing in parsers module
To build with the Rust code, set the HGWITHRUSTEXT
environment variable.
At this point, it's possible to instantiate and use
a rustlazyancestors object from a Python interpreter.
The changes in setup.py are obviously a quick hack,
just good enough to test/bench without much
refactoring. We'd be happy to improve on that with
help from the community.
The Rust bindings crate gets compiled as a static library,
which in turn gets linked into 'parsers.so'.
With respect to the plans at
https://www.mercurial-scm.org/wiki/OxidationPlan
this would probably qualify as "roll our own FFI".
Also, it doesn't quite meet the target of getting
rid of C code, since it actually brings more, yet:
- the new C code does nothing other than parse
  arguments and call Rust functions.
  In particular, there's no complex allocation involved.
- subsequent changes could rewrite more of revlog.c, this
  time resulting in an overall decrease of C code and
  unsafety.
Georges Racinet <gracinet@anybox.fr> [Thu, 27 Sep 2018 16:51:36 +0200] rev 40272
rust: iterator bindings to C code
In this changeset, still made of Rust code only,
we expose the Rust iterator for instantiation and
consumption from C code.
The idea is that both the index and index_get_parents()
will be passed from the C extension, hence avoiding a hard
link dependency to parsers.so, so that the crate can
still be built and tested independently.
On the other hand, parsers.so will use the symbols
defined in this changeset.
Georges Racinet <gracinet@anybox.fr> [Thu, 27 Sep 2018 17:03:16 +0200] rev 40271
rust: pure Rust lazyancestors iterator
This is the first of a patch series aiming to provide an
alternative implementation in the Rust programming language
of the _lazyancestorsiter from the ancestor module.
This iterator has been brought to our attention by the people at
Octobus, as a potentially good candidate for incremental "oxidation"
(rewriting in Rust), because it has shown performance issues lately
and it merely deals with ints (revision numbers) obtained by calling
the index, which should be directly callable from Rust code,
being itself implemented as a C extension.
The idea behind this series is to provide a minimal example of Rust code
collaborating with existing C and Python code, opening the way to gradually
rewriting more of Mercurial's Python code in Rust without being forced to pay
a large initial cost of rewriting the existing fast core in Rust.
This patch does not introduce any bindings to other Mercurial code
yet. Instead, it introduces the necessary abstractions to address the problem
independently, and unit-test it.
Since this is the first use of Rust as a Python module within Mercurial,
the hg-core crate gets created within this patch. See its Cargo.toml for more
details.
Someone with a rustc/cargo installation may chdir into rust/hg-core and
run the tests by issuing:
cargo test --lib
The algorithm is a bit simplified (see details in docstrings),
and at its simplest becomes rather trivial, showcasing that Rust has
batteries included too: BinaryHeap, the Rust analog of Python's heapq,
actually does all the work.
The implementation can be further optimized and probably be made more
idiomatic Rust.
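The heap-driven iteration described above can be sketched in Python with heapq. This is a simplified illustration under assumptions, not the Rust code and not Mercurial's _lazyancestorsiter: the `lazy_ancestors()` name, its signature, and the `parents` callable (returning a pair of parent revisions, with -1 for "no parent") are hypothetical.

```python
import heapq

NULLREV = -1  # assumed marker for "no parent"


def lazy_ancestors(parents, initrevs, stoprev=0, inclusive=False):
    """Yield ancestors of initrevs in decreasing revision order."""
    seen = set()
    heap = []  # heapq is a min-heap, so revisions are stored negated

    def visit(rev):
        # Queue each revision at most once, ignoring anything below stoprev.
        if rev != NULLREV and rev >= stoprev and rev not in seen:
            seen.add(rev)
            heapq.heappush(heap, -rev)

    for rev in initrevs:
        if inclusive:
            visit(rev)
        else:
            for parent in parents(rev):
                visit(parent)

    while heap:
        rev = -heapq.heappop(heap)  # largest pending revision
        yield rev
        for parent in parents(rev):
            visit(parent)


# Tiny hypothetical graph: 3 merges 1 and 2, which both descend from 0.
graph = {0: (-1, -1), 1: (0, -1), 2: (0, -1), 3: (1, 2), 4: (3, -1)}
print(list(lazy_ancestors(graph.__getitem__, [4])))
```

Keeping the 'seen' set inside the iterator is what lets a membership test reuse it instead of maintaining a second set of emitted revisions.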
Matt Harbison <matt_harbison@yahoo.com> [Sat, 13 Oct 2018 23:08:29 -0400] rev 40270
run-tests: restore quoting the python executable for running *.py tests
This was accidentally dropped in 8cf459d8b111.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 13 Oct 2018 19:49:33 -0400] rev 40269
tests: replace `cd ..` with an absolute path in a couple ssh tests
These tests are broken under py3 on Windows to the point where the `cd ..` was
actually escaping into the system wide $TEMP. The subsequent `hg init` created
a repo there, and then added a local extension to the hgrc. This breaks every
single subsequent test when it tries to `hg init` in its $TESTTMP, and can't
load the localwrite.py extension. And since I botched this the first time and
replaced the wrong `cd ..`, this just replaces all of them. I've noticed test
garbage in $TEMP recently, and maybe this will help.
Perhaps `hg init` shouldn't load the config for the local repo, but this is an
easy enough workaround for now.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 04 Oct 2018 00:17:26 -0400] rev 40268
lfs: register the flag processors per repository
Previously, enabling the extension for any repo in commandserver or hgweb would
enable the flags on all repos. Since localrepo.resolverevlogstorevfsoptions()
is called so early, the check to see if the extension is enabled on the repo
(which hasn't been instantiated yet) is a bit awkward. But I don't see a better
way.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 09 Oct 2018 21:53:21 -0400] rev 40267
revlog: allow flag processors to be applied via store options
This allows flag processors to be registered to specific repos in an extension
by wrapping localrepo.resolverevlogstorevfsoptions(). I wanted to add the
processors via a function on localrepo, but some of the places where the
processors are globally registered don't have a repository available. This
makes targeting specific repos in the wrapper awkward, but still manageable.