util: add iterfile to work around a fileobj.__iter__ issue with EINTR
The fileobj.__iter__ implementation in Python 2.7.12 (hg changeset
45d4cea97b04) is buggy: it cannot handle EINTR correctly.
In Objects/fileobject.c:
    size_t Py_UniversalNewlineFread(....) {
      ....
      if (!f->f_univ_newline)
        return fread(buf, 1, n, stream);
      ....
    }
According to the "fread" man page:
    If an error occurs, or the end of the file is reached, the return value
    is a short item count (or zero).
Therefore it's possible for "fread" (and "Py_UniversalNewlineFread") to
return a positive value while errno is set to EINTR and ferror(stream)
changes from zero to non-zero.
There are multiple callers of "Py_UniversalNewlineFread": "file_read",
"file_readinto", "file_readlines", "readahead". While the first 3 have code
to handle the EINTR case, the last one, "readahead", doesn't:
    static int readahead(PyFileObject *f, Py_ssize_t bufsize) {
      ....
      chunksize = Py_UniversalNewlineFread(
        f->f_buf, bufsize, f->f_fp, (PyObject *)f);
      ....
      if (chunksize == 0) {
        if (ferror(f->f_fp)) {
          PyErr_SetFromErrno(PyExc_IOError);
          ....
        }
      }
      ....
    }
This means "readahead" can ignore EINTR if "Py_UniversalNewlineFread"
returns a non-zero value. The next time "readahead" is executed, if
"Py_UniversalNewlineFread" returns 0, "readahead" raises a Python error
with an incorrect errno (possibly 0), hence "IOError: [Errno 0] Error".
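The sequence above can be replayed with a small toy model. This is only a
sketch in Python, not the CPython code: "FakeStream", its "script" argument,
"simulate", and the explicit reset of errno between the two calls are all
assumptions made for illustration.

```python
import errno


class FakeStream:
    """Hypothetical stand-in for a C FILE*: scripted reads plus error state."""

    def __init__(self, script):
        # script: list of (data, errno_or_None) pairs, one per fread call
        self.script = list(script)
        self.error = False  # models ferror(stream), which stays set
        self.errno = 0      # models the global errno

    def fread(self, n):
        data, err = self.script.pop(0)
        if err is not None:
            self.error = True
            self.errno = err
        return data[:n]


def readahead(stream, bufsize):
    """Mirror the quoted control flow: check errors only on an empty chunk."""
    chunk = stream.fread(bufsize)
    if len(chunk) == 0:
        if stream.error:
            raise IOError(stream.errno, "Error")
    return chunk


def simulate():
    # First read: short but non-empty, interrupted by a signal (EINTR).
    # Second read: empty, no new errno, but the error flag is still set.
    stream = FakeStream([(b"line 1\n", errno.EINTR), (b"", None)])
    first = readahead(stream, 8192)  # the EINTR goes unnoticed here
    stream.errno = 0  # assumption: errno is overwritten in between
    try:
        readahead(stream, 8192)
    except IOError as exc:
        return first, str(exc)
    return first, None
```

In this model the first call returns data without complaint, and the second
call reports the stale error flag together with whatever errno holds by then.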
The only user of "readahead" is "readahead_get_line_skip".
The only user of "readahead_get_line_skip" is "file_iternext", aka
"fileobj.__iter__", so "fileobj.__iter__" should be avoided.
There are multiple places where the pattern "for x in fp" is used. This
patch adds an "iterfile" function in "util.py" so we can migrate our code
from "for x in fp" to "for x in util.iterfile(fp)".
$ hg init repo
$ cd repo
$ echo line 1 > foo
$ hg ci -qAm 'add foo'
copy foo to bar and change both files
$ hg cp foo bar
$ echo line 2-1 >> foo
$ echo line 2-2 >> bar
$ hg ci -m 'cp foo bar; change both'
in another branch, change foo in a way that doesn't conflict with
the other changes
$ hg up -qC 0
$ echo line 0 > foo
$ hg cat foo >> foo
$ hg ci -m 'change foo'
created new head
we get conflicts that shouldn't be there
$ hg merge -P
changeset: 1:484bf6903104
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: cp foo bar; change both
$ hg merge --debug
searching for copies back to rev 1
unmatched files in other:
bar
all copies found (* = to merge, ! = divergent, % = renamed and deleted):
src: 'foo' -> dst: 'bar' *
checking for directory renames
resolving manifests
branchmerge: True, force: False, partial: False
ancestor: e6dc8efe11cc, local: 6a0df1dad128+, remote: 484bf6903104
preserving foo for resolve of bar
preserving foo for resolve of foo
starting 4 threads for background file closing (?)
bar: remote copied from foo -> m (premerge)
picked tool ':merge' for bar (binary False symlink False changedelete False)
merging foo and bar to bar
my bar@6a0df1dad128+ other bar@484bf6903104 ancestor foo@e6dc8efe11cc
premerge successful
foo: versions differ -> m (premerge)
picked tool ':merge' for foo (binary False symlink False changedelete False)
merging foo
my foo@6a0df1dad128+ other foo@484bf6903104 ancestor foo@e6dc8efe11cc
premerge successful
0 files updated, 2 files merged, 0 files removed, 0 files unresolved
(branch merge, don't forget to commit)
contents of foo
$ cat foo
line 0
line 1
line 2-1
contents of bar
$ cat bar
line 0
line 1
line 2-2
$ cd ..