Gregory Szorc <gregory.szorc@gmail.com> [Tue, 13 Nov 2018 12:30:59 -0800] rev 40670
revlog: detect incomplete revlog reads
_readsegment() is supposed to return N bytes of revlog revision
data starting at a file offset. Surprisingly, its behavior before
this patch never verified that it actually read and returned N
bytes! Instead, it would perform the read(), then return whatever
data was available. And even more surprisingly, nothing in the
call chain appears to have been validating that it received all
the data it was expecting.
This behavior could lead to partial or incomplete revision chunks
being operated on. This could result in e.g. cached deltas being
applied against incomplete base revisions. The delta application
process would happily perform this operation. Only hash
verification would detect the corruption and save us.
This commit changes the behavior of raw revlog reading to validate
that we actually read() the number of bytes that were requested.
We will raise a more specific error faster, rather than possibly
have it go undetected or manifest later in the call stack, at
delta application or hash verification.
Differential Revision: https://phab.mercurial-scm.org/D5266
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 30 Oct 2018 16:50:05 -0700] rev 40669
revlog: use single file handle when de-inlining revlog
_getsegmentforrevs() will eventually call into _datareadfp() to
resolve a file handle to read revision data. If no file handle
is passed into _getsegmentforrevs(), it opens a new one.
Explicit is better than implicit.
This commit changes _enforceinlinesize() to open a file handle
explicitly when converting inline revlogs to split revlogs and
to pass this file handle into _getsegmentforrevs().
I haven't measured, but this change should improve performance,
as we no longer reopen the revlog for reading for every revision
in the revlog when it is converted from inline to split. Instead,
we open it at most once and use it for the duration of the
operation. That being said, I /think/ the chunk cache may mitigate
the number of file opens required.
Differential Revision: https://phab.mercurial-scm.org/D5265
Pulkit Goyal <pulkit@yandex-team.ru> [Tue, 13 Nov 2018 18:44:09 +0300] rev 40668
store: raise ProgrammingError if unable to decode a storage path
Right now, the function magically return False which is dangerous, so let's
raise ProgrammingError.
Suggested by Augie in D5139.
Differential Revision: https://phab.mercurial-scm.org/D5264
Matt Harbison <matt_harbison@yahoo.com> [Tue, 13 Nov 2018 23:54:23 -0500] rev 40667
tests: document a known failing interaction between narrow and lfs
This is one of the two remaining aborts I found looking into issue5794. I've
got no idea what's wrong with the hook, since the changes there fixed the other
two problems noted in that bug report. It seems like it might go away when the
narrow issue is fixed, but let's make sure this doesn't get lost.
The stacktrace for the hook seems to indicate that the missing file *is* in ctx:
remote: Traceback (most recent call last):
remote: File "c:\Users\Matt\projects\hg\hgext\lfs\__init__.py", line 253, in checkrequireslfs
remote: if any(f in ctx and match(f) and ctx[f].islfs() for f in ctx.files()):
remote: File "c:\Users\Matt\projects\hg\hgext\lfs\__init__.py", line 253, in <genexpr>
remote: if any(f in ctx and match(f) and ctx[f].islfs() for f in ctx.files()):
remote: File "c:\Users\Matt\projects\hg\hgext\lfs\wrapper.py", line 191, in filectxislfs
remote: return _islfs(self.filelog(), self.filenode())
remote: File "c:\Users\Matt\projects\hg\mercurial\context.py", line 631, in filenode
remote: return self._filenode
remote: File "c:\Users\Matt\projects\hg\mercurial\util.py", line 1528, in __get__
remote: result = self.func(obj)
remote: File "c:\Users\Matt\projects\hg\mercurial\context.py", line 579, in _filenode
remote: return self._filelog.lookup(self._fileid)
remote: File "c:\Users\Matt\projects\hg\mercurial\filelog.py", line 68, in lookup
remote: self._revlog.indexfile)
remote: File "c:\Users\Matt\projects\hg\mercurial\utils\storageutil.py", line 218, in fileidlookup
remote: raise error.LookupError(fileid, identifier, _('no match found'))
remote: LookupError: data/inside2/f.i@f59b4e021835: no match found
Yuya Nishihara <yuya@tcha.org> [Sun, 11 Nov 2018 12:55:58 +0900] rev 40666
logtoprocess: drop support for ui.log() call with invalid msg arguments (BC)
Before, the logtoprocess extension put a formatted message into $MSG1, and
its arguments to $MSG2... If the specified arguments couldn't be formatted
because of a caller bug, an unformatted message was passed in to $MSG1
instead of exploding. This behavior doesn't make sense.
Since I'm planning to formalize the ui.log() interface such that we'll no
longer have to extend the ui class, I want to remove any features not
conforming to the ui.log() API. So this patch removes the support for
ill-formed arguments, and $MSG{n} (where n > 1) parameters which seems
useless as long as the message can be formatted. The $MSG1 variable isn't
renamed for the maximum compatibility.
In future patches, a formatted msg will be passed to a processlogger object,
instead of overriding the ui.log() function.
.. bc::
The logtoprocess extension no longer supports invalid ``ui.log()``
arguments. A log message is always formatted and passed in to the
``$MSG1`` environment variable.
Yuya Nishihara <yuya@tcha.org> [Sun, 11 Nov 2018 12:35:38 +0900] rev 40665
py3: byte-stringify inline extension in test-logtoprocess.t
Yuya Nishihara <yuya@tcha.org> [Sun, 11 Nov 2018 12:33:14 +0900] rev 40664
logtoprocess: rewrite dict building in py3-compatible way