Matt Harbison <matt_harbison@yahoo.com> [Fri, 17 Nov 2017 00:06:45 -0500] rev 35480
lfs: verify lfs object content when transferring to and from the remote store
This avoids inserting corrupt files into the usercache, and local and remote
stores. One downside is that the bad file won't be available locally for
forensic purposes after a remote download. I'm thinking about adding an
'incoming' directory to the local lfs store to handle the download, and then
move it to the 'objects' directory after it passes verification. That would
have the additional benefit of not concatenating each transfer chunk in memory
until the full file is transferred.
Verification isn't needed when the data is passed back through the revlog
interface or when the oid was just calculated, but otherwise it is on by
default. The additional overhead should be well worth avoiding problems with
file based remote stores, or buggy lfs servers.
Having two different verify functions is a little sad, but the full data of the
blob is mostly passed around in memory, because that's what the revlog interface
wants. The upload function, however, chunks up the data. It would be ideal if
that was how the content is always handled, but that's probably a huge project.
I don't really like printing the long hash, but `hg debugdata` isn't a public
interface, and is the only way to get it. The filelog and revision info is
nowhere near this area, so recommending `hg verify` is the easiest thing to do.
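
A minimal sketch of the two verification styles described above, assuming the oid is the hex SHA-256 of the blob content. The function names `verify_bytes` and `verify_stream` are hypothetical and are not the names used in the lfs extension:

```python
import hashlib
import io


def verify_bytes(oid, data):
    """Check a blob that is already fully held in memory."""
    return hashlib.sha256(data).hexdigest() == oid


def verify_stream(oid, fp, chunksize=1024 * 1024):
    """Check a blob chunk by chunk, without holding the whole blob in memory."""
    sha = hashlib.sha256()
    # iter() with a sentinel stops when read() returns the empty byte string.
    for chunk in iter(lambda: fp.read(chunksize), b""):
        sha.update(chunk)
    return sha.hexdigest() == oid


if __name__ == "__main__":
    blob = b"some lfs blob content"
    oid = hashlib.sha256(blob).hexdigest()
    print(verify_bytes(oid, blob))
    print(verify_stream(oid, io.BytesIO(blob), chunksize=4))
```

Both functions compute the same digest; the streaming form is the shape that an upload path which already chunks its data could use.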
Matt Harbison <matt_harbison@yahoo.com> [Mon, 04 Dec 2017 21:41:04 -0500] rev 35479
lfs: narrow the exceptions that trigger a transfer retry
The retries were added to work around TCP RESETs in fb-experimental fc8c131314a9.
I have no idea if that's been debugged yet, but this wide net caught local I/O
errors, bad hostnames and other things that shouldn't be retried. The next
patch will validate objects as they are uploaded, and there's no need to retry
those errors.
The spec[1] does mention that certain HTTP errors can be retried, including 500.
But let's work through the corruption detection issues first.
[1] https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
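
A rough sketch of the idea, not the extension's actual retry loop: retry only on a connection reset and let everything else (local I/O errors, bad hostnames, validation failures) propagate immediately. The helper name `transfer_with_retry` and its parameters are hypothetical, and Python 3 exception classes are assumed:

```python
import time


def transfer_with_retry(transfer, retries=3, delay=0.0):
    """Call transfer(), retrying only when the connection is reset."""
    attempt = 0
    while True:
        try:
            return transfer()
        except ConnectionResetError:
            # Only this narrow class is retried; any other exception,
            # including other OSError subclasses, is raised to the caller.
            attempt += 1
            if attempt > retries:
                raise
            time.sleep(delay)
```

With a broad `except Exception` in the same position, a missing local file would be retried several times before failing, which is the behavior this change moves away from.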
Matt Harbison <matt_harbison@yahoo.com> [Thu, 16 Nov 2017 22:52:53 -0500] rev 35478
test-lfs: add tests around corrupted lfs objects
These are mostly tests against file:// based remote stores, because that's what
we have the most control over.
The test uploading a corrupt blob to lfs-test-server demonstrates an overly
broad exception handler in the retry loop. A corrupt blob is actually
transferred in a download, but eventually caught when it is accessed (only after
it leaves the corrupt file in a couple places locally). I don't think we want
to trust random 3rd party implementations, and this would be a problem if there
were a `debuglfsdownload` command that simply cached the files. And given the
cryptic errors, we should probably validate the file hash locally before
uploading, and also after downloading.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 19 Dec 2017 17:53:44 -0500] rev 35477
lfs: add note messages indicating what store holds the lfs blob
The following corruption related patches were written prior to adding the user
level cache, and it took a while to track down why the tests changed. (It
generally made things more resilient.) But I think this will be useful to the
end user as well. I didn't make it --debug level, because there can be a ton of
info coming out of clone/push/pull --debug. The pointers are sorted for test
stability.
I opted for ui.note() instead of checking ui.verbose and then using ui.write()
for convenience, but I see most of this extension does the latter. I have no
idea what the preferred form is.
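
For comparison, a small sketch of the two forms. `FakeUI` is a hypothetical stand-in for Mercurial's ui object so that the example runs on its own, and the message text is illustrative only:

```python
import io


class FakeUI:
    """Hypothetical stand-in for Mercurial's ui object (illustration only)."""

    def __init__(self, verbose=False):
        self.verbose = verbose
        self.out = io.StringIO()

    def write(self, msg):
        self.out.write(msg)

    def note(self, msg):
        # Emits the message only when verbose output is enabled.
        if self.verbose:
            self.write(msg)


def report_with_note(ui, oids, store):
    # Sorted so that the output order is stable.
    for oid in sorted(oids):
        ui.note("found %s in the %s store\n" % (oid, store))


def report_with_check(ui, oids, store):
    if ui.verbose:
        for oid in sorted(oids):
            ui.write("found %s in the %s store\n" % (oid, store))
```

Both functions produce the same output; the explicit check only pays off when building the message is expensive enough to be worth skipping.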
Matt Harbison <matt_harbison@yahoo.com> [Wed, 20 Dec 2017 20:46:33 -0500] rev 35476
tests: teach `f` to handle sha256 checksums
Matt Harbison <matt_harbison@yahoo.com> [Wed, 20 Dec 2017 20:41:12 -0500] rev 35475
tests: fix a bug in `f` that prevented calculating sha1sum on a file
Yuya Nishihara <yuya@tcha.org> [Thu, 21 Dec 2017 22:17:39 +0900] rev 35474
templater: look up symbols/resources as if they were separated (issue5699)
It wouldn't be easy to split the mapping dict into (symbols, resources). This
patch instead rejects invalid lookups, taking resources.keys() as the source of
truth.
The doctest is updated since mapping['repo'] is now reserved for a repo
object.
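
One way to picture the rule, as a sketch under assumptions rather than the templater's real code: a key that appears in resources.keys() is never served to a symbol lookup from the mapping, and a key that does not appear there is never served to a resource lookup. The class name `SeparatedEngine` and its layout are hypothetical:

```python
class SeparatedEngine:
    """Hypothetical engine holding per-engine defaults and resources."""

    def __init__(self, defaults, resources):
        self._defaults = defaults
        self._resources = resources

    def symbol(self, mapping, key):
        v = None
        # Resource keys are reserved, so the mapping is skipped for them.
        if key not in self._resources:
            v = mapping.get(key)
        if v is None:
            v = self._defaults.get(key)
        return v

    def resource(self, mapping, key):
        v = None
        # Only known resource keys may be read from the mapping.
        if key in self._resources:
            v = mapping.get(key)
        if v is None:
            v = self._resources.get(key)
        return v
```

With this rule a template symbol named `repo` can no longer reach a repo object stored in the mapping, which is the kind of internal-data exposure the patch is meant to reject.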
Yuya Nishihara <yuya@tcha.org> [Thu, 21 Dec 2017 22:05:30 +0900] rev 35473
templater: move repo, ui and cache to per-engine resources
Yuya Nishihara <yuya@tcha.org> [Thu, 21 Dec 2017 21:29:06 +0900] rev 35472
templater: keep default resources per template engine (API)
This allows us to register a repo object as a resource in hgweb template,
without losing the '{repo}' symbol:
symbol('repo') -> mapping['repo'] (n/a) -> defaults['repo']
resource('repo') -> mapping['repo'] (n/a) -> resources['repo']
I'm thinking of redesigning the templatekw API to take (context, mapping)
in place of **(context._resources + mapping), but that will be a big change
and not implemented yet.
props['templ'] is ported to the resources dict as an example.
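
The two lookup chains above can be sketched as follows. This is an illustration only: the class name `Engine` and its method signatures are assumptions, not the templater's actual interface:

```python
class Engine:
    """Hypothetical template engine with separate fallback tables."""

    def __init__(self, defaults, resources):
        self._defaults = defaults      # fallback table for template symbols
        self._resources = resources    # fallback table for internal resources

    def symbol(self, mapping, key):
        # symbol(key) -> mapping[key], else defaults[key]
        v = mapping.get(key)
        if v is None:
            v = self._defaults.get(key)
        return v

    def resource(self, mapping, key):
        # resource(key) -> mapping[key], else resources[key]
        v = mapping.get(key)
        if v is None:
            v = self._resources.get(key)
        return v


if __name__ == "__main__":
    repoobj = object()
    engine = Engine(defaults={"repo": "reponame"}, resources={"repo": repoobj})
    print(engine.symbol({}, "repo"))
    print(engine.resource({}, "repo") is repoobj)
```

Because each lookup has its own fallback table, the same key can resolve to a repository name as a symbol and to a repo object as a resource when the mapping does not define it.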
.. api::
mapping does not contain all template resources. Use context.resource()
in template functions.
Yuya Nishihara <yuya@tcha.org> [Thu, 21 Dec 2017 21:03:25 +0900] rev 35471
templater: look up mapping table through template engine
These functions are stubs for symbol/resource separation. This series is
intended to address the following problems:
a) internal data may be exposed to user (issue5699)
b) defaults['repo'] (a repository name) will conflict with mapping['repo']
(a repo object) in hgweb