hg
author Matt Harbison <matt_harbison@yahoo.com>
Tue, 21 Jan 2020 11:32:33 -0500
changeset 44273 43eea17ae7b3
parent 43659 99e231afc29c
child 45830 c102b704edb5
permissions -rwxr-xr-x
lfs: fix the stall and corruption issue when concurrently uploading blobs We've avoided the issue up to this point by gating worker usage with an experimental config. See 10e62d5efa73, and the thread linked there for some of the initial diagnosis, but essentially some data was being read from the blob before an error occurred and `keepalive` retried, but didn't rewind the file pointer. So the leading data was lost from the blob on the server, and the connection stalled, trying to send more data than available. In trying to recreate this, I was unable to do so uploading from Windows to CentOS 7. But it reproduced every time going from CentOS 7 to another CentOS 7 over https. I found recent fixes in the FaceBook repo to address this[1][2]. The commit message for the first is: The KeepAlive HTTP implementation is bugged in it's retry logic, it supports reading from a file pointer, but doesn't support rewinding of the seek cursor when it performs a retry. So it can happen that an upload fails for whatever reason and will then 'hang' on the retry event. The sequence of events that get triggered are: - Upload file A, goes OK. Keep-Alive caches connection. - Upload file B, fails due to (for example) failing Keep-Alive, but LFS file pointer has been consumed for the upload and fd has been closed. - Retry for file B starts, sets the Content-Length properly to the expected file size, but since file pointer has been consumed no data will be uploaded, causing the server to wait for the uploaded data until either client or server reaches a timeout, making it seem as our mercurial process hangs. This is just a stop-gap measure to prevent this behavior from blocking Mercurial (LFS has retry logic). A proper solutions need to be build on top of this stop-gap measure: for upload from file pointers, we should support fseek() on the interface. Since we expect to consume the whole file always anyways, this should be safe. This way we can seek back to the beginning on a retry. I ported those two patches, and it works. But I see that `url._sendfile()` does a rewind on `httpsendfile` objects[3], so maybe it's better to keep this all in one place and avoid a second seek. We may still want the first FaceBook patch as extra protection for this problem in general. The other two uses of `httpsendfile` are in the wire protocol to upload bundles, and to upload largefiles. Neither of these appear to use a worker, and I'm not sure why workers seem to trigger this, or if this could have happened without a worker. Since `httpsendfile` already has a `close()` method, that is dropped. That class also explicitly says there's no `__len__` attribute, so that is removed too. The override for `read()` is necessary to avoid the progressbar usage per file. [1] https://github.com/facebookexperimental/eden/commit/c350d6536d90c044c837abdd3675185644481469 [2] https://github.com/facebookexperimental/eden/commit/77f0d3fd0415e81b63e317e457af9c55c46103ee [3] https://www.mercurial-scm.org/repo/hg/file/5.2.2/mercurial/url.py#l176 Differential Revision: https://phab.mercurial-scm.org/D7962
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     1
#!/usr/bin/env python
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     2
#
1698
ad4a2eefe4d7 Update copyright notice
Matt Mackall <mpm@selenic.com>
parents: 515
diff changeset
     3
# mercurial - scalable distributed SCM
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     4
#
4635
63b9d2deed48 Updated copyright notices and add "and others" to "hg version"
Thomas Arendsen Hein <thomas@intevation.de>
parents: 3877
diff changeset
     5
# Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
     6
#
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 7672
diff changeset
     7
# This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 8225
diff changeset
     8
# GNU General Public License version 2 or any later version.
33896
1900381b6a6e hg: update top-level script to use modern import conventions
Augie Fackler <raf@durin42.com>
parents: 32424
diff changeset
     9
from __future__ import absolute_import
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff changeset
    10
12661
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    11
import os
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    12
import sys
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    13
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    14
libdir = '@LIBDIR@'
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    15
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    16
if libdir != '@' 'LIBDIR' '@':
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    17
    if not os.path.isabs(libdir):
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    18
        libdir = os.path.join(
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    19
            os.path.dirname(os.path.realpath(__file__)), libdir
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    20
        )
12661
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    21
        libdir = os.path.abspath(libdir)
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    22
    sys.path.insert(0, libdir)
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
diff changeset
    23
39592
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    24
from hgdemandimport import tracing
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    25
39592
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    26
with tracing.log('hg script'):
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    27
    # enable importing on demand to reduce startup time
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    28
    try:
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    29
        if sys.version_info[0] < 3 or sys.version_info >= (3, 6):
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    30
            import hgdemandimport
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    31
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    32
            hgdemandimport.enable()
39592
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    33
    except ImportError:
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    34
        sys.stderr.write(
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    35
            "abort: couldn't find mercurial libraries in [%s]\n"
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    36
            % ' '.join(sys.path)
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    37
        )
39592
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    38
        sys.stderr.write("(check your install and PYTHONPATH)\n")
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    39
        sys.exit(-1)
5197
55860a45bbf2 Enable demandimport only in scripts, not in importable modules (issue605)
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5178
diff changeset
    40
39592
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    41
    from mercurial import dispatch
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43073
diff changeset
    42
39592
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
diff changeset
    43
    dispatch.run()