subrepo: open files in 'rb' mode to read exact data in (issue3926)
Before this patch, "subrepo._calcfilehash()" opened files with "open()"
without specifying a mode, which implies "text mode" on Windows.
When the target file contains a '\x00' byte, "read()" in "text mode"
returns the file contents without the data after '\x00'.
This caused "subrepo._calcfilehash()" to calculate an invalid SHA1 hash.
This patch opens files in 'rb' mode to read the exact data.
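
As a rough illustration of the fix (a minimal sketch, not the actual
"subrepo._calcfilehash()" code; the function name "calc_file_hash" and the
chunk size are assumptions made for this example), hashing a file opened in
'rb' mode looks like this:

```python
import hashlib


def calc_file_hash(path, chunk_size=65536):
    """Return the SHA1 hex digest of the exact bytes stored at path.

    Hypothetical helper for illustration only. The file is opened in 'rb'
    mode so that no platform-specific text-mode translation is applied
    to the data before it is hashed.
    """
    digest = hashlib.sha1()
    with open(path, 'rb') as fp:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fp.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the file is read as raw bytes, the digest should match a SHA1
computed directly over the bytes that were written, regardless of platform.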
patch: use scmutil.marktouched instead of scmutil.addremove
addremove required paths relative to the cwd, which meant a lot of extra code
that transformed paths into relative ones. That code is now gone as well.
scmutil: add a function to mark that files have been operated on
Several places use scmutil.addremove as a means to declare that certain files
have been operated on. This is ugly because:
- addremove takes patterns relative to the cwd, not paths relative to the root,
  which means extra contortions for callers.
- addremove doesn't make clear what happens to files whose status hasn't
  changed.
This new function accepts filenames relative to the repo root, and has a much
clearer contract. It also allows future modifications that do more with files
whose status hasn't changed.
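
To picture the "extra contortions" that cwd-relative patterns impose on
callers, here is a minimal sketch of the kind of path conversion a caller
might need. The helper "root_to_cwd_relative" is hypothetical and is not part
of scmutil; posixpath is used only to keep the example platform-independent.

```python
import posixpath


def root_to_cwd_relative(root, cwd, files):
    """Convert repo-root-relative paths into cwd-relative ones.

    Hypothetical helper for illustration only: this is the sort of extra
    step a caller needs when an API expects cwd-relative patterns.
    """
    return [posixpath.relpath(posixpath.join(root, f), cwd) for f in files]


# Paths as a caller typically has them: relative to the repo root.
touched = ['src/a.py', 'docs/b.txt']

# The same files expressed relative to a cwd of /repo/src.
converted = root_to_cwd_relative('/repo', '/repo/src', touched)
```

An API that accepts root-relative filenames directly makes this conversion
step unnecessary.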
scmutil.addremove: factor out code to mark added/removed/renames
An upcoming patch will reuse this code in another function.
scmutil.addremove: factor out code to find renames
This code will be used in a different context in upcoming patches.
scmutil.addremove: rename local 'copies' to 'renames'
An upcoming patch will refactor some code out into a method called
_findrenames. Having a line saying "copies = _findrenames..." is confusing.
Besides, 'renames' is a more precise name for this local anyway.
scmutil.addremove: factor out dirstate walk into another function
Upcoming patches will reuse and expand on this function for other purposes.
filecontext: use 'is not None' to check for filelog existence
Previously we used 'if filelog:' to check if the filelog existed. If the
instance did exist, this pattern would then call len() on the filelog to see
if it was empty. I'm developing a filelog replacement that doesn't have
len() implemented, so it's better to do an explicit 'is not None' check
here instead.
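
The difference between the two checks can be sketched as follows. The classes
here are stand-ins invented for this example, not real filelog
implementations: one reports a length of zero, the other refuses len()
entirely.

```python
class EmptyLog(object):
    """Stand-in for a log object that exists but contains no entries."""

    def __len__(self):
        return 0


class NoLenLog(object):
    """Stand-in for a replacement object that does not support len()."""

    def __len__(self):
        raise NotImplementedError("len() is not supported")


def exists_truthy(log):
    # Truth testing falls back to __len__ when no explicit truth method
    # is defined, so this also asks "is it non-empty?".
    return bool(log)


def exists_explicit(log):
    # Identity comparison never calls into the object.
    return log is not None
```

With these stand-ins, the truthiness check treats an existing but empty log
as missing and propagates the error from the log that refuses len(), while
the 'is not None' check only answers whether an object is present.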
Also change _changeid() to return the _changeid attribute if one is set.
Previously it would try to obtain it from the _changectx(), and if that
did not exist it would construct the _changectx() using the linkrev. In
the extension I'm working on, filectx objects don't have easy access to
linkrevs, so avoiding this when possible is better.
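
The lookup order described above can be sketched roughly like this. It is a
simplified illustration under assumed names ("FileCtxSketch",
"FakeChangeCtx" and "linkrev_lookup" are all hypothetical), not the actual
filectx code:

```python
class FakeChangeCtx(object):
    """Hypothetical stand-in for a changectx that knows its revision."""

    def __init__(self, rev):
        self._rev = rev

    def rev(self):
        return self._rev


class FileCtxSketch(object):
    """Hypothetical, simplified model of the lookup order."""

    def __init__(self, changeid=None, changectx=None, linkrev_lookup=None):
        # Only set the attributes that the caller actually supplied.
        if changeid is not None:
            self._changeid = changeid
        if changectx is not None:
            self._changectx = changectx
        self._linkrev_lookup = linkrev_lookup

    def changeid(self):
        # 1. Prefer an explicitly stored change id.
        if '_changeid' in self.__dict__:
            return self._changeid
        # 2. Otherwise ask an existing change context.
        if '_changectx' in self.__dict__:
            return self._changectx.rev()
        # 3. Only as a last resort fall back to the linkrev lookup.
        return self._linkrev_lookup()
```

In this sketch the linkrev lookup is only reached when neither a stored
change id nor a change context is available, which is the behavior an
implementation without cheap linkrev access would want.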