largefiles: fix commit when using relative paths from subdirectory
Remove cwd handling from getstandinmatcher - it did not belong there, as proven
by the tests.
largefiles: allow use of urls with #revision
largefiles tried to create a peer directly with the specified url. That caused
abort: unsupported URL component: "..."
if a revision was specified in the url.
The branch name does not matter for largefiles' use of remote peers. Largefiles
will be shared among all branches anyway.
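As a rough illustration of the idea, the revision part can be split off the url
before anything tries to open a connection. This is only a sketch using the
Python standard library, not the code from this change; split_revision is a
hypothetical helper name, and Mercurial has its own url parsing.

```python
from urllib.parse import urldefrag


def split_revision(url):
    """Split 'base#revision' into (base, revision), revision is None if absent."""
    base, fragment = urldefrag(url)
    return base, (fragment or None)


# Only the base part would be used to create the peer; the revision is ignored
# because largefiles are shared among all branches.
base, rev = split_revision("https://example.com/repo#stable")
print(base, rev)
```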
largefiles: don't verify largefile hashes on servers when processing statlfile
When changesets referencing largefiles are pushed then the corresponding
largefiles will be pushed too - unless the target already has them. The client
will use statlfile to make sure it only sends largefiles that the target
doesn't have. The server would however on every statlfile check that the
content of the largefile had the expected hash. What should be cheap thus
became an expensive operation that thrashed the disk and the cache.
Largefile hashes are already checked by putlfile before being stored on the
server. A server should thus be able to keep its largefile store free of
errors - even more than it can keep revlogs free of errors. Verification should
happen when running 'hg verify' locally on the server. Rehashing every
largefile on every remote stat is too expensive.
Clients will also stat lfiles before downloading them. When the server verified
the hash in stat it meant that it had to read the file twice to serve it.
With this change the server will assume its own hashes are ok without checking
them on every statlfile.
Some consequences of this change:
- in case of server side corruption the problem will be detected by the
existing check on the client side - not on server side
- clients that could upload an uncorrupted largefile when pushing will no
longer magically heal the server (and break hardlinks) - a client will now
only upload its uncorrupted files after the corrupted file has been removed
on the server side
- client side verify will no longer report corruption in files it doesn't have
(Issue3123 discussed related problems - and how they have been fixed.)
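The cost difference between the two kinds of stat can be sketched as below.
This is a simplified model and not the server code: it assumes a store
directory where each file is named after the SHA-1 of its content, and the
function names and the return codes (0 present, 1 corrupt, 2 missing) are
assumptions made for the example.

```python
import hashlib
import os


def stat_cheap(storedir, expectedhash):
    """Trust the store: only check that a file with that name exists."""
    return 0 if os.path.exists(os.path.join(storedir, expectedhash)) else 2


def stat_rehash(storedir, expectedhash):
    """Read the whole file on every stat to recompute its hash."""
    path = os.path.join(storedir, expectedhash)
    if not os.path.exists(path):
        return 2
    hasher = hashlib.sha1()
    with open(path, "rb") as fp:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fp.read(128 * 1024), b""):
            hasher.update(chunk)
    return 0 if hasher.hexdigest() == expectedhash else 1
```

With the cheap variant a corrupt file in the store is reported as present, which
is the first consequence listed above: the client side check has to catch it.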
tests: clarify test for pushing corrupted largefile
The test no longer tested that the server prevented pushing a corrupt
largefile. At the same time it tested what happened when the server already had
a corrupt largefile.
These two cases are now separated.
largefiles: verify all files in each revision and report errors in any revision
Verify used 'any' and would stop verifying after the first failure in each
changeset.
The exit code only reported the result from the last changeset.
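Both problems can be reproduced in a few lines. The sketch below is not the
verify code itself; check, verify_buggy and verify_fixed are hypothetical
names, and each revision is modelled as a plain list of file names.

```python
def check(name, corrupt, log):
    """Pretend to verify one file; return True if verification failed."""
    log.append(name)
    return name in corrupt


def verify_buggy(revisions, corrupt, log):
    failed = False
    for files in revisions:
        # any() on a generator stops at the first failure in the revision,
        # and the assignment forgets failures from earlier revisions.
        failed = any(check(f, corrupt, log) for f in files)
    return failed


def verify_fixed(revisions, corrupt, log):
    failed = False
    for files in revisions:
        # The list is built first, so every file is checked.
        results = [check(f, corrupt, log) for f in files]
        failed = any(results) or failed
    return failed
```

With revisions [["a", "b"], ["c"]] and only "a" corrupt, the buggy variant never
checks "b" and reports success, while the fixed variant checks all three files
and reports the failure.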
tests: better test coverage of largefiles localstore verify
This demonstrates problems that will be fixed later.
largefiles: adapt remotestore._getfile to batched statlfile
9e1616307c4c introduced batching of statlfile, but not all codepaths got
converted.
_getfile gave _stat garbage and got garbage back. The garbage didn't match the
expected error codes and was thus interpreted as success. It could thus end up
trying to fetch a largefile that didn't exist.
Instead we now pass _stat valid input and handle both correct and invalid
output correctly.
This makes the code work as intended ... but it would probably be better if it
didn't abort on missing largefiles, just like it happened to do before.
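One way a batched interface can be misused is sketched below. It is a guess at
the general shape of the problem, not the actual remotestore code: batch_stat
and can_fetch are hypothetical, and the codes 0 (present) and 2 (missing) are
assumed for the example.

```python
def batch_stat(store, hashes):
    """Hypothetical batched stat: map each hash to 0 (present) or 2 (missing)."""
    return {h: (0 if h in store else 2) for h in hashes}


def can_fetch(store, wanted):
    # Always pass a list, even for a single hash. A bare string would be
    # iterated character by character and give a useless result.
    result = batch_stat(store, [wanted]).get(wanted)
    if result == 0:
        return True
    if result == 2:
        return False
    # Anything else is invalid output and must not be read as success.
    raise ValueError("unexpected stat result: %r" % (result,))
```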
largefiles: don't allow corruption to propagate after detection
basestore.get uses util.atomictempfile when checking and receiving a new
largefile ... but the close/discard logic was too clever for largefiles.
Largefiles relied on being able to discard the file and thus prevent it from
being written to the store. That was however too brittle. lfutil.copyandhash
closes the outfile after writing to it ... with a 'blecch' comment. The discard
was thus a silent noop, and as a result of that corruption would be detected
... and then the corrupted files would be used anyway.
Instead we now use a tmp file and rename or unlink it after validating it.
A better solution should be implemented ... but not now.
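The tmp file approach can be sketched as follows. This is a minimal
illustration and not the basestore code: store_verified is a hypothetical
helper, the ".tmp" suffix is an assumption, and the store is again modelled as
a directory of files named after their SHA-1.

```python
import hashlib
import os


def store_verified(chunks, storedir, expectedhash):
    """Write chunks to a tmp file and keep it only if its SHA-1 matches."""
    finalpath = os.path.join(storedir, expectedhash)
    tmppath = finalpath + ".tmp"
    hasher = hashlib.sha1()
    with open(tmppath, "wb") as fp:
        for chunk in chunks:
            hasher.update(chunk)
            fp.write(chunk)
    if hasher.hexdigest() != expectedhash:
        # Corrupt data is removed and never appears under the final name.
        os.unlink(tmppath)
        return False
    # Rename only after validation, so readers never see unverified content.
    os.replace(tmppath, finalpath)
    return True
```

Because the decision is an explicit rename or unlink, it does not depend on
whether some helper has already closed the file object.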
largefiles: adapt verify to batched remote statlfile (issue3780)
9e1616307c4c introduced batching of statlfile, but not all codepaths got
converted.
'hg verify' with a remotestore could thus crash with
TypeError: 'builtin_function_or_method' object is not iterable
Also, the 'hash' variable was used without assigning to it. Don't use variable
names that collide with Python built-in functions. Instead we use 'expecthash'
as in localstore.
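The error message is what Python gives when the name 'hash' is never assigned
and therefore resolves to the built-in function. A small reproduction, with
broken_loop as a hypothetical stand-in for the affected code:

```python
def broken_loop():
    """Iterate over 'hash' without ever assigning it."""
    try:
        # No local or global 'hash' exists, so this is the built-in function.
        for item in hash:
            pass
    except TypeError as exc:
        return str(exc)
    return None


print(broken_loop())
```

A distinct name such as 'expecthash' would instead fail with a NameError when
it is used before assignment, which points directly at the bug.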
The tests for this issue cover an untested area. The tests happen to also
reveal incorrect attempts at getting non-existing largefiles, bad server side
handling of that, and corruption issues - all to be fixed later.
largefiles: let wirestore._stat return stats as expected by remotestore verify
- preparing for fixing verify crash.