Thu, 21 Dec 2017 14:13:39 -0500 lfs: only hardlink between the usercache and local store if the blob verifies
Matt Harbison <matt_harbison@yahoo.com> [Thu, 21 Dec 2017 14:13:39 -0500] rev 35477
lfs: only hardlink between the usercache and local store if the blob verifies This fixes the issue where verify (and other read commands) would propagate corrupt blobs. I originalled coded this to only hardlink if 'verify=True' for store.read(), but then good blobs weren't being linked, and this broke a bunch of tests. (The blob in repo5 that is being corrupted seems to be linked into repo5 in the loop running dumpflog.py prior to it being corrupted, but only if verify=False is handled too.) It's probably better to do a one time extra verification in order to create these files, so that the repo can be copied to a removable drive. Adding the same check to store.write() was only for completeness, but also needs to do a one time extra verification to avoid breaking tests.
Fri, 17 Nov 2017 00:06:45 -0500 lfs: verify lfs object content when transferring to and from the remote store
Matt Harbison <matt_harbison@yahoo.com> [Fri, 17 Nov 2017 00:06:45 -0500] rev 35476
lfs: verify lfs object content when transferring to and from the remote store This avoids inserting corrupt files into the usercache, and local and remote stores. One down side is that the bad file won't be available locally for forensic purposes after a remote download. I'm thinking about adding an 'incoming' directory to the local lfs store to handle the download, and then move it to the 'objects' directory after it passes verification. That would have the additional benefit of not concatenating each transfer chunk in memory until the full file is transferred. Verification isn't needed when the data is passed back through the revlog interface or when the oid was just calculated, but otherwise it is on by default. The additional overhead should be well worth avoiding problems with file based remote stores, or buggy lfs servers. Having two different verify functions is a little sad, but the full data of the blob is mostly passed around in memory, because that's what the revlog interface wants. The upload function, however, chunks up the data. It would be ideal if that was how the content is always handled, but that's probably a huge project. I don't really like printing the long hash, but `hg debugdata` isn't a public interface, and is the only way to get it. The filelog and revision info is nowhere near this area, so recommending `hg verify` is the easiest thing to do.
Mon, 04 Dec 2017 21:41:04 -0500 lfs: narrow the exceptions that trigger a transfer retry
Matt Harbison <matt_harbison@yahoo.com> [Mon, 04 Dec 2017 21:41:04 -0500] rev 35475
lfs: narrow the exceptions that trigger a transfer retry The retries were added to workaround TCP RESETs in fb-experimental fc8c131314a9. I have no idea if that's been debugged yet, but this wide net caught local I/O errors, bad hostnames and other things that shouldn't be retried. The next patch will validate objects as they are uploaded, and there's no need to retry those errors. The spec[1] does mention that certain http errors can be retried, including 500. But let's work through the corruption detection issues first. [1] https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
Thu, 16 Nov 2017 22:52:53 -0500 test-lfs: add tests around corrupted lfs objects
Matt Harbison <matt_harbison@yahoo.com> [Thu, 16 Nov 2017 22:52:53 -0500] rev 35474
test-lfs: add tests around corrupted lfs objects These are mostly tests against file:// based remote stores, because that's what we have the most control over. The test uploading a corrupt blob to lfs-test-server demonstrates an overly broad exception handler in the retry loop. A corrupt blob is actually transferred in a download, but eventually caught when it is accessed (only after it leaves the corrupt file in a couple places locally). I don't think we want to trust random 3rd party implementations, and this would be a problem if there were a `debuglfsdownload` command that simply cached the files. And given the cryptic errors, we should probably validate the file hash locally before uploading, and also after downloading.
Tue, 19 Dec 2017 17:53:44 -0500 lfs: add note messages indicating what store holds the lfs blob
Matt Harbison <matt_harbison@yahoo.com> [Tue, 19 Dec 2017 17:53:44 -0500] rev 35473
lfs: add note messages indicating what store holds the lfs blob The following corruption related patches were written prior to adding the user level cache, and it took awhile to track down why the tests changed. (It generally made things more resilient.) But I think this will be useful to the end user as well. I didn't make it --debug level, because there can be a ton of info coming out of clone/push/pull --debug. The pointers are sorted for test stability. I opted for ui.note() instead of checking ui.verbose and then using ui.write() for convenience, but I see most of this extension does the latter. I have no idea what the preferred form is.
Wed, 20 Dec 2017 20:46:33 -0500 tests: teach `f` to handle sha256 checksums
Matt Harbison <matt_harbison@yahoo.com> [Wed, 20 Dec 2017 20:46:33 -0500] rev 35472
tests: teach `f` to handle sha256 checksums
(0) -30000 -10000 -3000 -1000 -300 -100 -30 -10 -6 +6 +10 +30 +100 +300 +1000 +3000 +10000 tip